

Terminal Symbol Expansions

Terminal symbol expansions are sequences of terminal symbols consumed while parsing nonterminal symbols of a template regular expression grammar. Use the following command-line format to dump terminal symbol expansions collected while processing a training terminal symbol sequence:

$ topdown --qe[=NONT1] ... --qe[=NONTn] [--oe=FILE] [--fe=fq_min]  \
          [--nle=num_lower] [--nue=num_upper] [--pe=prob_min]      \
          REGEX_GRAM_FILE SYM_SEQ_FILE

The following command-line options are applicable to dumping terminal symbol expansions:

--fe=fq_min

The minimum number of occurrences (frequency) a terminal symbol expansion must have to be included in the output. The default value is 0.

--nle=num_lower

If possible, dump at least num_lower terminal symbol expansions for every nonterminal symbol. The default value is 0.

--nue=num_upper

Dump at most num_upper terminal symbol expansions for every nonterminal symbol. The parser dumps the most probable expansions. By default, there is no limit.

--oe=FILE

Write terminal symbol expansions to FILE. If FILE is ‘-’, write the expansions to stdout. By default, write the expansions to stdout.

--pe=prob_min

The minimum probability a terminal symbol expansion must have to be included in the output. The probability of a terminal symbol expansion is its number of occurrences (frequency) divided by the total number of occurrences of all terminal symbol expansions for the same nonterminal symbol. The default value is 0.

--qe[=NONT]

Query terminal symbol expansions: dump the terminal symbol expansions for the nonterminal symbol NONT of the template regular expression grammar to the file specified by the option --oe=FILE. Pass the option --qe=NONT multiple times to dump terminal symbol expansions for multiple nonterminal symbols. If the option --oe=FILE is not supplied, dump the queried terminal symbol expansions to stdout. If NONT is not supplied, dump terminal symbol expansions for all nonterminal symbols.

If you pass any of the options above other than --qe[=NONT] and the option --qe[=NONT] itself is absent from the command line, the parser implicitly turns on the option --qe.
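
The probability definition given for --pe, and the way the two thresholds --fe and --pe restrict the output, can be illustrated with a minimal Python sketch. This is not the parser's implementation: the function names and the frequency table are hypothetical, the sketch assumes that an expansion must satisfy both thresholds to be kept, and it ignores --nle and --nue.

```python
def expansion_probabilities(freqs):
    """Return frequency / total frequency for each expansion of one nonterminal."""
    total = sum(freqs.values())
    if total == 0:
        return {}
    return {expansion: count / total for expansion, count in freqs.items()}


def filter_expansions(freqs, fq_min=0, prob_min=0.0):
    """Keep expansions meeting both thresholds, most probable first.

    Assumption: fq_min and prob_min are applied together, as the
    descriptions of --fe and --pe suggest; --nle and --nue are not modeled.
    """
    probs = expansion_probabilities(freqs)
    kept = [
        (probs[expansion], count, expansion)
        for expansion, count in freqs.items()
        if count >= fq_min and probs[expansion] >= prob_min
    ]
    # Sort by descending probability; break ties by the expansion itself.
    kept.sort(key=lambda item: (-item[0], item[2]))
    return kept


# Hypothetical frequencies of three expansions of a single nonterminal.
freqs = {
    ('"a"', '"b"'): 6,
    ('"b"', '"a"'): 3,
    ('"a"', '"a"'): 1,
}

for prob, count, expansion in filter_expansions(freqs, fq_min=2, prob_min=0.2):
    print(f"{prob:.8f} {count:10d}  {' '.join(expansion)}")
```

With these hypothetical counts the total is 10, so the three probabilities are 0.6, 0.3, and 0.1, and only the first two expansions pass the thresholds.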

Example:

$ cat >expan1.pcfg <<EOF
S: "a" "b" "c"
;
EOF
$ pcfg-generate-seq -n100 -o expan1.seq expan1.pcfg
$ cat >expan.rg <<EOF
S: . . . . .
 | A A
;

A: . . . ;
EOF
$ topdown -N10 --qe=S expan.rg expan1.seq
0.50000000         85  S: "a" "b" "c" "a" "b" "c"
0.18235294         31  S: "c" "a" "b" "c" "a" "b"
0.10588235         18  S: "a" "b" "c" "a" "b"
0.08235294         14  S: "b" "c" "a" "b" "c"
0.08235294         14  S: "c" "a" "b" "c" "a"
0.04705882          8  S: "b" "c" "a" "b" "c" "a"

Every line with a terminal symbol expansion consists of:

  1. The probability of the expansion.
  2. The number of occurrences (frequency) of the expansion.
  3. A parsed nonterminal symbol followed by ‘:’.
  4. A sequence of terminal symbols.
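
The four fields listed above can be read back with a short script, for example to post-process a dump. The following is a minimal Python sketch, not part of the tool: the names Expansion and parse_expansion_line are hypothetical, and the sketch assumes that fields are separated by whitespace, that the nonterminal name contains no ‘:’, and that no terminal symbol contains whitespace.

```python
from typing import NamedTuple, Tuple


class Expansion(NamedTuple):
    probability: float
    frequency: int
    nonterminal: str
    terminals: Tuple[str, ...]


def parse_expansion_line(line):
    """Split one dumped line into probability, frequency, nonterminal, terminals."""
    prob, freq, rest = line.split(None, 2)
    # The nonterminal is everything before the first ':'.
    nonterminal, _, symbols = rest.partition(":")
    return Expansion(float(prob), int(freq), nonterminal.strip(),
                     tuple(symbols.split()))


# The six output lines from the example above.
dump = '''\
0.50000000         85  S: "a" "b" "c" "a" "b" "c"
0.18235294         31  S: "c" "a" "b" "c" "a" "b"
0.10588235         18  S: "a" "b" "c" "a" "b"
0.08235294         14  S: "b" "c" "a" "b" "c"
0.08235294         14  S: "c" "a" "b" "c" "a"
0.04705882          8  S: "b" "c" "a" "b" "c" "a"
'''

expansions = [parse_expansion_line(line) for line in dump.splitlines()]
total = sum(e.frequency for e in expansions)

# Each printed probability should equal frequency / total up to rounding.
for e in expansions:
    assert abs(e.probability - e.frequency / total) < 1e-8
```

Here the frequencies sum to 170, so the first line's probability is 85 / 170 = 0.5, which matches the first column of the dump.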
