7.1.4.3 Ending a Terminal Production

Only the adaptive bottom-up parser, abu-parser, supports a specifier that marks the end of a production with terminal symbols on its right-hand side. The specifier has the format

#pt-NonterminalSymbol

where NonterminalSymbol is the nonterminal symbol on the left-hand side of a production from a source PCFG. NonterminalSymbol is a string consisting of digits, English letters, and ‘_’.
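
The format above can be checked with a short regular expression. The following Python sketch is an illustration only and is not taken from abu-parser; the names SPECIFIER_RE and parse_end_specifier are hypothetical, and the sketch assumes that NonterminalSymbol is non-empty and limited to ASCII characters.

```python
import re

# Assumption: the name is one or more ASCII digits, English letters, or '_'.
SPECIFIER_RE = re.compile(r"#pt-([0-9A-Za-z_]+)")


def parse_end_specifier(token):
    """Return the nonterminal name if token is a #pt- specifier, else None."""
    match = SPECIFIER_RE.fullmatch(token)
    return match.group(1) if match else None
```

For instance, parse_end_specifier("#pt-t1a") yields the name t1a, while a token such as "%a" is rejected.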

When the parser begins processing the regular expression of a nonterminal symbol in a top-down template grammar, it clears a list that collects the terminal symbols for the right-hand side of a source PCFG production. While processing the regular expression, the parser appends each consumed source terminal symbol to the list. On encountering a #pt-NonterminalSymbol specifier, the parser registers a source PCFG production and clears the list again. The left-hand side of the registered production is NonterminalSymbol, and its right-hand side is the sequence of source terminal symbols held in the list at that point.
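
The clear, append, and register steps can be sketched in Python as follows. This is a simplified model under stated assumptions, not the parser's actual implementation: the function name register_terminal_productions and the ("term", ...) / ("end", ...) event encoding are hypothetical, and the events stand for the symbols and specifiers met along one path through the regular expression.

```python
def register_terminal_productions(events):
    """Replay one pass over a nonterminal's regular expression.

    events: iterable of ("term", symbol) for a consumed source terminal
            symbol, or ("end", nonterminal) for a #pt-nonterminal specifier.
    Returns the registered productions as (lhs, rhs) tuples, in order.
    """
    productions = []
    pending = []  # the list starts empty for each regular expression
    for kind, value in events:
        if kind == "term":
            pending.append(value)
        elif kind == "end":
            # Register lhs -> collected terminals, then clear the list.
            productions.append((value, tuple(pending)))
            pending = []
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return productions
```

Terminals that remain in the list when the events run out are simply dropped in this sketch; the text does not say how the parser treats them.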

Source terminal symbols are virtual terminal symbols with the prefix STR stripped, where STR is specified by the option --term-prefix=STR. The prefix distinguishes virtual terminal symbols from virtual nonterminal symbols. The default prefix is ‘%’.
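
As a rough illustration of the prefix handling, the Python sketch below maps a virtual terminal symbol to its source terminal symbol. The helper name to_source_terminal is hypothetical, and the sketch assumes a non-empty prefix; it does not reflect how abu-parser itself is written.

```python
def to_source_terminal(symbol, prefix="%"):
    """Strip prefix from a virtual terminal symbol.

    Returns None when symbol lacks the prefix, that is, when it is
    treated as a virtual nonterminal symbol. Assumes prefix is non-empty.
    """
    if not symbol.startswith(prefix):
        return None
    return symbol[len(prefix):]
```

With the default prefix, "%a" maps to the source terminal symbol a, whereas "C0" is not a virtual terminal symbol and yields None.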

Example

A grammar contains the production:

C0: "%a" "%b" ( #pt-t1a "%c" "%d" #pt-t1b ...
              | #pt-t2a "%e" "%f" #pt-t2b ...
              )
;

After processing the sequence of virtual terminal symbols ‘%a %b %c %d’, the parser registers the source productions:

t1a: "a" "b" ;
t1b: "c" "d" ;

After processing the sequence of virtual terminal symbols ‘%a %b %e %f’, the parser registers the source productions:

t2a: "a" "b" ;
t2b: "e" "f" ;