13.1.1 Feature Terms

Suppose we want to associate with the word "she" the information that it is a noun phrase in the third person whose number is singular and whose case is nominative. Using a ground term, we can represent this information as follows:

  
  feat(np, third, sg, nom)

The tuple has exactly four arguments whose positions indicate which information is filled in (i.e. category, person, number and case). It can only unify with a term of identical arity and compatible information. For instance, it unifies with the following term where _ represents an undefined value:

  
  feat(np, _ , _ , nom)

but not with:

  
  feat(np, sg, third, nom)
  feat(np)
  feat(np, third, pl, nom)
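The positional behaviour described above can be modelled in a few lines of Python. This is only an illustrative sketch, not Oz: the names ANY and unify_tuple are hypothetical, a term is modelled as a label plus a list of arguments, and each _ is treated as an independent anonymous variable.

```python
# Hypothetical Python model of positional term unification (not Oz itself).
ANY = object()  # stands in for the anonymous variable "_"


def unify_tuple(label1, args1, label2, args2):
    """Return the merged argument list, or None if unification fails."""
    if label1 != label2 or len(args1) != len(args2):
        return None  # different label or different arity
    merged = []
    for a, b in zip(args1, args2):
        if a is ANY:
            merged.append(b)
        elif b is ANY or a == b:
            merged.append(a)
        else:
            return None  # clash between two distinct values
    return merged


she = ["np", "third", "sg", "nom"]

# Undefined values are filled in from the other term.
partial = unify_tuple("feat", she, "feat", ["np", ANY, ANY, "nom"])

# Each of the following fails: swapped positions, wrong arity, clashing number.
swapped = unify_tuple("feat", she, "feat", ["np", "sg", "third", "nom"])
short = unify_tuple("feat", she, "feat", ["np"])
plural = unify_tuple("feat", she, "feat", ["np", "third", "pl", "nom"])
```

Here partial equals she, while swapped, short and plural are all None. The swapped case shows the weakness of the positional encoding: the information is the same, but it sits in the wrong argument positions.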

A more explicit way to represent the same information is to use a record (feature tree). The features are then named explicitly, and the order of the feature-value pairs is irrelevant. For our example, the record can be completely described by the following ground feature term:

  
  feat(cat:np pers:third nb:sg 'case':nom)

which can unify with the following feature terms:

  
  feat(cat:np pers:_ nb:sg 'case':nom)
  feat(pers:third cat:np nb:sg 'case':nom)

but not with:

  
  feat(pers:third cat:np nb:pl 'case':nom)
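The record-based behaviour can be sketched in the same spirit. Again this is an assumed Python model rather than Oz: a closed feature term is represented as a (label, dict) pair, and ANY and unify_closed are hypothetical names. Two closed terms unify only if they have the same label, exactly the same feature names, and compatible values.

```python
# Hypothetical Python model of closed feature term unification (not Oz itself).
ANY = object()  # stands in for the anonymous variable "_"


def unify_closed(term1, term2):
    """Unify two closed feature terms given as (label, features) pairs."""
    label1, feats1 = term1
    label2, feats2 = term2
    # Closed terms must agree on the label and on the set of feature names.
    if label1 != label2 or feats1.keys() != feats2.keys():
        return None
    merged = {}
    for name in feats1:
        a, b = feats1[name], feats2[name]
        if a is ANY:
            merged[name] = b
        elif b is ANY or a == b:
            merged[name] = a
        else:
            return None  # clash between two distinct values
    return (label1, merged)


she = ("feat", {"cat": "np", "pers": "third", "nb": "sg", "case": "nom"})

# An undefined person value and a different feature order both unify with she.
no_pers = ("feat", {"cat": "np", "pers": ANY, "nb": "sg", "case": "nom"})
reordered = ("feat", {"pers": "third", "cat": "np", "nb": "sg", "case": "nom"})

# A plural number clashes with sg, so unification fails.
plural = ("feat", {"pers": "third", "cat": "np", "nb": "pl", "case": "nom"})

r1 = unify_closed(she, no_pers)
r2 = unify_closed(she, reordered)
r3 = unify_closed(she, plural)
```

Because Python dictionaries compare by content rather than by insertion order, r1 and r2 both equal she, whereas r3 is None.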

In short, feature terms give the grammar writer more explicit, and therefore easier, access to features: s/he does not need to know which position a feature occupies in the record, but can simply refer to features by name, in whatever order s/he pleases.

Note, however, that in both cases the term fixes the arity of the feature tree it describes. This can make grammar writing very cumbersome because a feature must be represented even if its value is undefined. For instance, suppose that the feature tree for a verb has four features, say Category, Form, Number and Pers. A feature term representing this feature tree must then always have four arguments, even when some of the features are unspecified. For the verb "saw", for example, the features Number and Pers are undefined, and the feature tree can be described by the following feature term: feat(cat:verb form:fin nb:_ pers:_), where pers:_ and nb:_ are in some sense superfluous.

When writing big grammars, such issues become crucial. There are basically two solutions to the problem: use open feature terms, or let the grammar writer define the grammar in the intuitive way (i.e. omitting features with undefined values) and write a compiler which translates the grammar so written into a record-based grammar.

The first solution involves working with open feature terms, i.e. feature terms which do not restrict the arity of the described tree. It yields just what the grammar writer needs, but at the cost of efficiency: a grammar written this way will be processed much more slowly than one written with closed structures (structures with fixed arity). Oz supports such open feature terms, though again much less efficiently than closed ones. Thus, instead of the closed terms shown above, we could use open terms (the three dots indicate openness):

  
  feat(cat:np pers:third nb:pl 'case':nom ...)

which will unify with any other term with identical label and compatible information, e.g.

  
  feat(cat:np)
  feat(pers:third cat:np)
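A minimal Python sketch of this open behaviour follows. It is a model under stated assumptions, not the Oz implementation: both operands are treated as open (partial) descriptions, a term is a (label, dict) pair, and ANY and unify_open are hypothetical names.

```python
# Hypothetical Python model of open feature term unification (not Oz itself).
# Assumption: both operands are open, i.e. partial descriptions.
ANY = object()  # stands in for the anonymous variable "_"


def unify_open(term1, term2):
    """Unify two open feature terms given as (label, features) pairs."""
    label1, feats1 = term1
    label2, feats2 = term2
    if label1 != label2:
        return None
    merged = dict(feats1)  # start from a copy of the first description
    for name, b in feats2.items():
        a = merged.get(name, ANY)  # a missing feature behaves like "_"
        if a is ANY:
            merged[name] = b
        elif b is ANY or a == b:
            merged[name] = a
        else:
            return None  # clash between two distinct values
    return (label1, merged)


open_np = ("feat", {"cat": "np", "pers": "third", "nb": "pl", "case": "nom"})

# Neither the number of features nor their order matters.
r1 = unify_open(open_np, ("feat", {"cat": "np"}))
r2 = unify_open(open_np, ("feat", {"pers": "third", "cat": "np"}))

# Two partial descriptions accumulate their features.
r3 = unify_open(("feat", {"cat": "np"}), ("feat", {"pers": "third"}))

# Incompatible information still fails.
r4 = unify_open(open_np, ("feat", {"cat": "vp"}))
```

In this model r1 and r2 both equal open_np, r3 carries both the cat and the pers feature, and r4 is None. The sketch says nothing about the efficiency cost of open terms; it only illustrates which terms are compatible.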

Neither the order of the feature-value pairs nor their number plays a role. In practice, however, efficiency considerations strongly suggest the use of a grammar compiler.
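The core step of such a grammar compiler can be sketched as follows. This is a hedged illustration in Python, not the compiler used with Oz: the table SIGNATURES, the function close_term and the "_" placeholder are all assumptions made for the example. The idea is simply to look up the fixed feature list for a category and to pad every omitted feature with an undefined value, so that the grammar writer can leave it out.

```python
# Hypothetical sketch of the padding step of a grammar compiler (not Oz itself).
UNDEFINED = "_"  # placeholder for an undefined feature value

# Assumed signature table: the fixed feature list for each category.
SIGNATURES = {
    "verb": ["cat", "form", "nb", "pers"],
    "np": ["cat", "pers", "nb", "case"],
}


def close_term(features, signatures=SIGNATURES):
    """Expand a partial feature description into a fixed-arity record."""
    names = signatures[features["cat"]]
    unknown = set(features) - set(names)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    # Emit every feature of the signature, in signature order,
    # filling omitted ones with the undefined placeholder.
    return {name: features.get(name, UNDEFINED) for name in names}


# The grammar writer only states what is known about "saw".
saw = close_term({"cat": "verb", "form": "fin"})
```

Here saw contains all four verb features, with nb and pers set to the placeholder, which corresponds to the closed term feat(cat:verb form:fin nb:_ pers:_) discussed earlier. Rejecting unknown feature names is a design choice of this sketch: it catches misspelt features at compile time.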


Denys Duchier, Claire Gardent and Joachim Niehren
Version 1.3.99 (20050412)