9.1.1 What's Wrong with the Naive Parser?

The parser we developed in Chapter 5 was dumb because it had no memory: it did not recall which constituents it had already built and often did reparse a constituent it had parsed at some previous processing step. When a big grammar is used, this re-parsing of constituents can become very costly.

This problem is neither specific to our recogniser, nor to the specific (bottom-up) parsing strategy it uses. It also affects top-down parsers. As a simple example suppose the string to be analysed is the verb phrase "saw the man talk with the actress" and the grammar contains the following rules:

VP --> V NP VP --> V NP PP VP --> V NP VP

And suppose that we use a top-down parser. Then assuming the rules are treated sequentially and in the order given above, a chartless parser applies the first rule thereby parsing the verb "saw" and the NP "the man"; as the first rule fails, the second rule is tried out: again the verb "saw" and the NP "the man" are parsed. Finally the third rule is tried out and the parser succeeds. In total, both the verb "saw" and the NP "the man" are parsed three times which obviously is a waste of precious time.

To remedy this shortcoming, chart parsers use a table or "chart" in which parsed constituents are stored. The parsing strategy then ensures that no constituent is added to the chart which is already in it.