<< Prev | - Up - | Next >> |
In chapter Chapter 5, we described bottom-up parsing in terms of an inference process. Here, we are going to improve on this description to arrive at chart parsing. One notion that we introduce is that of span: we write SPAN C[i,j]
to state that we have recognized category C
and that it corresponds to the subsequence of the input starting at position i
(included) and ending at position j
(excluded).
The second important notion is that of a partially recognized category (or equivalently a partially applied rule). We write ITEM C [i,j] C1 ... Cn
to state that we have partially recognized C
: the portion that has been recognized spans [i,j]
and what remains to be recognized is the sequence C1 ... Cn
.
Finally, as before, we write RULE C -> C1 ... Cn
to state that the grammar contains a rule C -> C1 ... Cn
. We also write INPUT C1 ... Cn
to state that the input consists of the sequence C1 ... Cn
.
Let's consider first the rules that initialize the inference process. If Ci
is in the input then it spans itself:
INPUT C1 ... Cn
---------------
SPAN Ci[i,i+1]
In the following, we will write N
for the size of the input. At every position i
, we have partially recognized every rule with an empty span [i,i]
:
RULE C -> C1 ... Ck
----------------------
ITEM C [i,i] C1 ... Ck
Now the rule which extends a partial recognition:
ITEM C [i,j] C1 C2 ... Cp
SPAN C1 [j,k]
--------------------------
ITEM C [i,k] C2 ... Cp
and the rule which notices a fully recognized category:
ITEM C [i,j]
------------
SPAN C [i,j]
and finally a rule which notices that a category covers the entire input. We write ACCEPT C
to state that the entire input has been recognized as category C
.
SPAN C [1,N]
------------
ACCEPT C
Thus the question of whether C1 ... Cn
is in the language generated by the grammar can be replaced by the question of whether, for some C
, ACCEPT C
is in the inferential closure of INPUT C1 ... Cn
.
<< Prev | - Up - | Next >> |