- Up - | Next >> |
MakeScript
MakeScript
takes as argument a list of atoms representing the words of a sentence to be analyzed and returns a predicate appropriate for encapsulated search. This predicate characterizes a solution: a list Nodes
of lexical nodes.
fun {MakeScript Words}
proc {$ Nodes}
N = {Length Words}
<DG MakeScript: Defs>
in
<DG MakeScript: create lexical nodes>
<DG MakeScript: sentence partitioning rule>
<DG MakeScript: install strict yields>
<DG MakeScript: word order constraints>
<DG MakeScript: distribution strategy>
end
end
First we create the lexical nodes and then we install some binary constraints between every pair of them. Note
Nodes = {List.mapInd Words fun {$ I W} {Word2Node I W N Topo} end}
{ForAll Nodes
proc {$ W1}
{ForAll Nodes
proc {$ W2}
<W2 is daughter of W1 iff W1 is mother of W2>
<Try all possible edges from W1 to W2>
end}
end}
A german sentence is partitioned into 4 fields: (1) the Vorfeld VF
, (2) the root field RF
which is a singleton containing just the finite Root
of the sentence, (3) the Mittelfeld MF
, and (4) the Nachfeld NF
for extrapositions. There is an additional restriction that the Vorfeld must contain just one constituent: thus we denote by VFR
the root of the Vorfeld. The fields are sequential and form a partition of the Sentence
. Variable Topo
denotes a record that contains all useful information pertaining to fields.
VF = {FS.var.upperBound 1#N}
Root = {FD.int 1#N}
RF = {FS.var.upperBound 1#N}
{FS.cardRange 1 1 RF}
{FS.include Root RF}
MF = {FS.var.upperBound 1#N}
NF = {FS.var.upperBound 1#N}
VFR= {FD.int 1#N}
{FS.include VFR VF}
Topo = o(vf:VF rf:RF mf:MF nf:NF list:[VF RF MF NF] vfr:VFR root:Root)
Sentence = {FS.value.make 1#N}
{FS.int.seq Topo.list}
{FS.partition Topo.list Sentence}
{FS.reified.include W2.index W1.daughters}=
{FS.reified.include W1.index W2.mother}
{ForAll Roles
proc {$ R}
thread
or {FS.include W2.index W1.role.R}
{Gamma R W1 W2}
[] {FS.exclude W2.index W1.role.R}
end
end
end}
The sentence partitioning rule states that each word of the input sentence fills precisely one role: either it is root or it fills a role on some other node. Thus the root field RF
together with the daughter sets of all nodes must form a partition of the input Sentence
.
{FS.partition
{FoldL Nodes
fun {$ L N} {Append {Record.toList N.role} L} end [RF]}
Sentence}
The strict yield of W1
is the union of contributions from all nodes. W2
contributes nothing if it is not a daughter of W1
, otherwise it contributes its full yield. This condition can be expressed using the weighted set constraint WeightC
.
{ForAll Nodes
proc {$ W1}
W1.yieldS =
{FS.unionN
{Map Nodes
fun {$ W2}
{WeightC
{FS.reified.include W2.index W1.daughters}
W2.yield}
end}}
end}
Local word order constraints are primarily enforced by procedure WordOrderLocal
.
{ForAll Nodes WordOrderLocal}
So far, we just enforce the condition that the prenominal domain of an NP is convex (i. e. forms a contiguous interval with no holes, no insertions). Here, we take advantage of the fact that roles det
and adj
can only occur on NPs; thus, it is harmless to enforce the constraint also when the Node
is not an NP.
proc {WordOrderLocal Node}
SELF = {FS.value.singl Node.index}
L1 = [Node.role.det Node.role.adj SELF]
in
{FS.int.seq L1}
{FS.int.convex {FS.unionN L1}}
end
In order to solve the CSP, we need a labeling strategy. Here is an obvious one that doesn't attempt to be particularly clever:
apply the default naive labeling strategy on the collection of mother sets:
apply first-fail on the collection of lexicon entry selectors:
finally use again the default naive labeling strategy on the collection of daughter sets:
{FS.distribute naive {Map Nodes fun {$ N} N.mother end}}
{FD.distribute ff {Map Nodes fun {$ N} N.entryIndex end}}
{FS.distribute naive
{FoldL Nodes
fun {$ L N} {Append {Record.toList N.role} L} end nil}}
- Up - | Next >> |