

8.5.1 Command-Line Format

Invoke the program using one of the following command-line formats:

$ rege-asm [--nterm-min=INT] [--dump-asm=extended] REGEX
$ rege-asm --dump-gram[=specific|replace] [--recurs=right] REGEX
$ rege-asm --dump-stats REGEX

The argument REGEX is a regular expression. The first format dumps an assembler program for the regular expression, the second dumps a context-free grammar for it, and the third dumps statistics on it. Refer to General Production Format for the syntax of regular expressions.
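
The three formats can be assembled programmatically. The following is a minimal sketch, assuming Python is used as a wrapper. The helper names asm_command, grammar_command, and stats_command are hypothetical and not part of rege-asm, and "REGEX" stands in for a real regular expression, whose syntax is defined elsewhere in this manual.

```python
from typing import List, Optional


def asm_command(regex: str, nterm_min: Optional[int] = None,
                extended: bool = False) -> List[str]:
    """Build an argument list for the first format (assembler program)."""
    cmd = ["rege-asm"]
    if nterm_min is not None:
        cmd.append(f"--nterm-min={nterm_min}")
    # A bare --dump-asm selects the documented default, simple.
    cmd.append("--dump-asm=extended" if extended else "--dump-asm")
    cmd.append(regex)
    return cmd


def grammar_command(regex: str, kind: Optional[str] = None,
                    right_recursive: bool = False) -> List[str]:
    """Build an argument list for the second format (context-free grammar)."""
    if kind not in (None, "specific", "dot", "replace"):
        raise ValueError(f"unknown grammar kind: {kind!r}")
    cmd = ["rege-asm"]
    # A bare --dump-gram selects the documented default, dot.
    cmd.append("--dump-gram" if kind is None else f"--dump-gram={kind}")
    if right_recursive:
        cmd.append("--recurs=right")
    cmd.append(regex)
    return cmd


def stats_command(regex: str) -> List[str]:
    """Build an argument list for the third format (statistics)."""
    return ["rege-asm", "--dump-stats", regex]
```

Each returned list can be passed to subprocess.run from the Python standard library if rege-asm is installed and on the search path.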

The program rege-asm supports the following command-line options:

--dump-asm[=simple|extended]

Dump an assembler program for probabilistic parsing of a terminal symbol sequence according to the regular expression:

simple

Dump a simple assembler program. The regular expression must not contain terminal symbol classes or specific terminal symbols, but it may contain ‘.’.

extended

Dump an assembler program containing instructions for setting up a correspondence between parts of this assembler program and the productions of a context-free grammar for the regular expression. The expression may contain terminal symbol classes and specific terminal symbols.

If the option argument is not specified, the program rege-asm uses --dump-asm=simple. If the option is not specified, the program rege-asm does not dump the assembler program.

--dump-gram[=specific|dot|replace]

Dump a context-free grammar for the regular expression:

specific

Dump the grammar with the following subexpressions replaced with auxiliary nonterminal symbols _E_iT and _E_iTj: individual terminal symbol classes, individual ‘.’, and the sequences (groups) of specific terminal symbols, terminal symbol classes, and ‘.’ with lengths greater than 1.

dot

Dump the grammar with the following subexpressions replaced with auxiliary nonterminal symbols _E_iT and _E_iTj: individual terminal symbol classes and the sequences (groups) of specific terminal symbols, terminal symbol classes, and ‘.’ with lengths greater than 1.

replace

Dump the grammar with any sequences (groups) of specific terminal symbols, terminal symbol classes, and ‘.’ replaced with auxiliary nonterminal symbols _E_iT and _E_iTj.

In _E_iT and _E_iTj, i is the ordinal number of the auxiliary nonterminal symbol, and j is the length of the replaced sequence, given only if that length is greater than 1.
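
The naming scheme can be illustrated with a short sketch. The function aux_nonterminal is a hypothetical helper, not part of rege-asm. It assumes that i and j are written as plain decimal numbers substituted directly into the notation above. Whether ordinal numbers start at 0 or 1 is not stated here, so the sketch accepts any non-negative value.

```python
def aux_nonterminal(i: int, length: int = 1) -> str:
    """Return an auxiliary nonterminal name following the _E_iT / _E_iTj notation.

    i      -- ordinal number of the auxiliary nonterminal symbol
    length -- length of the replaced sequence; it is appended as j
              only when greater than 1
    """
    if i < 0:
        raise ValueError("ordinal number must be non-negative")
    if length < 1:
        raise ValueError("sequence length must be at least 1")
    name = f"_E_{i}T"
    if length > 1:
        name += str(length)
    return name
```

Under these assumptions, aux_nonterminal(3) yields _E_3T and aux_nonterminal(3, 4) yields _E_3T4.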

If the option argument is not specified, the program uses --dump-gram=dot. If the option is not specified, the program does not dump the context-free grammar.
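
The default rules for --dump-asm and --dump-gram can be summarized in a small sketch. It models only the documented behavior (a bare option selects the default argument, an absent option disables the dump) and is not the program's actual option parser. The function effective_mode is hypothetical, and letting the last occurrence of a repeated option win is an assumption.

```python
from typing import List, Optional

# Documented defaults used when the option is given without an argument.
DEFAULT_ARGUMENT = {"--dump-asm": "simple", "--dump-gram": "dot"}


def effective_mode(argv: List[str], option: str) -> Optional[str]:
    """Return the mode in effect for `option`, or None if it is absent."""
    mode = None
    for arg in argv:
        if arg == option:
            # Bare option: fall back to the documented default argument.
            mode = DEFAULT_ARGUMENT[option]
        elif arg.startswith(option + "="):
            # Explicit argument: take the text after the equals sign.
            mode = arg[len(option) + 1:]
    return mode
```

For example, effective_mode(["--dump-gram", "REGEX"], "--dump-gram") returns "dot", while the same call on ["--dump-stats", "REGEX"] returns None because no grammar dump was requested.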

--dump-stats

Dump statistics on the regular expression.

--nterm-min=INT

The minimum number of terminal symbols. With --dump-asm=extended, the program rege-asm generates an assembler program that references the terminal symbols contained in the regular expression; pass --nterm-min=INT to generate an assembler program for a larger set of terminal symbols. With --dump-asm=simple, pass --nterm-min=INT to generate an assembler program for the specified number of terminal symbols. The default minimum number of terminal symbols for generated assembler programs is 2.
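
One way to read this rule is as a lower bound on the size of the terminal symbol set. The sketch below encodes that reading only. The function effective_terminal_count is hypothetical, and the use of a maximum is an interpretation of the word "minimum", not a confirmed description of the program's behavior.

```python
def effective_terminal_count(terminals_in_regex: int, nterm_min: int = 2) -> int:
    """Estimate the number of terminal symbols the generated program covers.

    terminals_in_regex -- number of distinct terminal symbols in the regular
                          expression (0 for a simple program, which contains
                          no specific terminal symbols)
    nterm_min          -- value of --nterm-min; the documented default is 2
    """
    if terminals_in_regex < 0 or nterm_min < 0:
        raise ValueError("counts must be non-negative")
    # Assumed reading: --nterm-min acts as a lower bound.
    return max(terminals_in_regex, nterm_min)
```

Under this reading, a simple program with --nterm-min=5 covers 5 terminal symbols, and an extended program whose regular expression already contains 7 terminal symbols is unaffected by --nterm-min=5.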

--recurs=left|right

Recursion type (left or right) for the productions of the context-free grammar dumped with the option --dump-gram[=specific|dot|replace]. By default, the program generates left-recursive productions.
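
The difference between the two recursion types can be shown on a generic one-or-more repetition. The sketch below illustrates the general notions of left and right recursion only. The function repetition, the symbol names, and the tuple representation are hypothetical and do not reproduce the grammar format that rege-asm dumps.

```python
from typing import List, Tuple

# A production is a left-hand side and a tuple of right-hand side symbols.
Production = Tuple[str, Tuple[str, ...]]


def repetition(lhs: str, item: str, recurs: str = "left") -> List[Production]:
    """Return productions deriving one or more occurrences of `item`."""
    if recurs == "left":
        recursive_rhs = (lhs, item)   # A -> A x  (recursion on the left)
    elif recurs == "right":
        recursive_rhs = (item, lhs)   # A -> x A  (recursion on the right)
    else:
        raise ValueError(f"unknown recursion type: {recurs!r}")
    # The second production, A -> x, terminates the recursion.
    return [(lhs, recursive_rhs), (lhs, (item,))]
```

Both variants derive the same sequences. They differ only in which side of the production carries the recursive nonterminal.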

