9.1.2 Chart Parsing as Inferential Closure

In chapter Chapter 5, we described bottom-up parsing in terms of an inference process. Here, we are going to improve on this description to arrive at chart parsing. One notion that we introduce is that of span: we write SPAN C[i,j] to state that we have recognized category C and that it corresponds to the subsequence of the input starting at position i (included) and ending at position j (excluded).

The second important notion is that of a partially recognized category (or equivalently a partially applied rule). We write ITEM C [i,j] C1 ... Cn to state that we have partially recognized C: the portion that has been recognized spans [i,j] and what remains to be recognized is the sequence C1 ... Cn.

Finally, as before, we write RULE C -> C1 ... Cn to state that the grammar contains a rule C -> C1 ... Cn. We also write INPUT C1 ... Cn to state that the input consists of the sequence C1 ... Cn.

Let's consider first the rules that initialize the inference process. If Ci is in the input then it spans itself:

INPUT C1 ... Cn --------------- SPAN Ci[i,i+1]

In the following, we will write N for the size of the input. At every position i, we have partially recognized every rule with an empty span [i,i]:

RULE C -> C1 ... Ck ---------------------- ITEM C [i,i] C1 ... Ck

Now the rule which extends a partial recognition:

ITEM C [i,j] C1 C2 ... Cp SPAN C1 [j,k] -------------------------- ITEM C [i,k] C2 ... Cp

and the rule which notices a fully recognized category:

ITEM C [i,j] ------------ SPAN C [i,j]

and finally a rule which notices that a category covers the entire input. We write ACCEPT C to state that the entire input has been recognized as category C.

SPAN C [1,N] ------------ ACCEPT C

Thus the question of whether C1 ... Cn is in the language generated by the grammar can be replaced by the question of whether, for some C, ACCEPT C is in the inferential closure of INPUT C1 ... Cn.