NAME

pt::peg::to::param - PEG Conversion. Write PARAM format

SYNOPSIS

package require Tcl 8.5
package require pt::peg::to::param ?1?
package require pt::peg
package require pt::pe

pt::peg::to::param reset
pt::peg::to::param configure
pt::peg::to::param configure option
pt::peg::to::param configure option value...
pt::peg::to::param convert serial

DESCRIPTION

Are you lost? Do you have trouble understanding this document? In that case please read the overview provided by the Introduction to Parser Tools. That document is the entry point to the whole system of which the current package is a part.

This package implements the converter from parsing expression grammars to PARAM markup.

It resides in the Export section of the Core Layer of Parser Tools, and can be used either directly with the other packages of this layer, or indirectly through the export manager provided by pt::peg::export. The latter is intended for use in untrusted environments and is done through the corresponding export plugin pt::peg::export::param, which sits between converter and export manager.

API

The API provided by this package satisfies the specification of the Converter API found in the Parser Tools Export API specification.
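As a rough illustration of the commands listed in the synopsis, the following sketch converts a tiny one-rule grammar. The grammar itself is made up for this example. The call to pt::peg canonicalize is an assumption taken from the pt::peg package rather than from this page; it is used here because the hand-written value contains extra whitespace and so is only a regular serialization.

```tcl
package require Tcl 8.5
package require pt::peg
package require pt::peg::to::param

# Hypothetical grammar for illustration: a single rule matching 'a' or 'b'.
set regular {pt::grammar::peg {
    rules {
        Letter {is {/ {t a} {t b}} mode value}
    }
    start {n Letter}
}}

# Assumption: pt::peg canonicalize returns the canonical form of a
# regular serialization (command of pt::peg, not documented on this page).
set serial [pt::peg canonicalize $regular]

# Return the converter to its default configuration, then convert.
pt::peg::to::param reset
set text [pt::peg::to::param convert $serial]

# The result is the PARAM assembler text as one string.
puts $text
```

The result should contain one labeled block per nonterminal, following the sym_<name> pattern visible in the example further down.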

Options

The converter to PARAM markup recognizes the following configuration variables and changes its behaviour as they specify.

PARAM code representation of parsing expression grammars

The PARAM code representation of parsing expression grammars is assembler-like text using the instructions of the virtual machine documented in the PackRat Machine Specification, plus a few more for control flow (jump ok, jump fail, call symbol, return).

It is not really useful, except possibly as a tool demonstrating how a grammar is compiled in general, without getting distracted by the incidentals of a framework, such as the supporting C and Tcl code generated by the other PARAM-derived formats.

It has no direct formal specification beyond what was said above.

Example

Assuming the following PEG for simple mathematical expressions

PEG calculator (Expression)
    Digit      <- '0'/'1'/'2'/'3'/'4'/'5'/'6'/'7'/'8'/'9'       ;
    Sign       <- '-' / '+'                                     ;
    Number     <- Sign? Digit+                                  ;
    Expression <- Term (AddOp Term)*                            ;
    MulOp      <- '*' / '/'                                     ;
    Term       <- Factor (MulOp Factor)*                        ;
    AddOp      <- '+'/'-'                                       ;
    Factor     <- '(' Expression ')' / Number                   ;
END;

one possible PARAM serialization for it is

# -*- text -*-
# Parsing Expression Grammar 'TEMPLATE'.
# Generated for unknown, from file 'TEST'

#
# Grammar Start Expression
#

<<MAIN>>:
         call              sym_Expression
         halt

#
# value Symbol 'AddOp'
#

sym_AddOp:
# /
#     '-'
#     '+'

         symbol_restore    AddOp
  found! jump              found_7
         loc_push

         call              choice_5

   fail! value_clear
     ok! value_leaf        AddOp
         symbol_save       AddOp
         error_nonterminal AddOp
         loc_pop_discard

found_7:
     ok! ast_value_push
         return

choice_5:
# /
#     '-'
#     '+'

         error_clear

         loc_push
         error_push

         input_next        "t -"
     ok! test_char         "-"

         error_pop_merge
     ok! jump              oknoast_4

         loc_pop_rewind
         loc_push
         error_push

         input_next        "t +"
     ok! test_char         "+"

         error_pop_merge
     ok! jump              oknoast_4

         loc_pop_rewind
         status_fail
         return

oknoast_4:
         loc_pop_discard
         return
#
# value Symbol 'Digit'
#

sym_Digit:
# /
#     '0'
#     '1'
#     '2'
#     '3'
#     '4'
#     '5'
#     '6'
#     '7'
#     '8'
#     '9'

         symbol_restore    Digit
  found! jump              found_22
         loc_push

         call              choice_20

   fail! value_clear
     ok! value_leaf        Digit
         symbol_save       Digit
         error_nonterminal Digit
         loc_pop_discard

found_22:
     ok! ast_value_push
         return

choice_20:
# /
#     '0'
#     '1'
#     '2'
#     '3'
#     '4'
#     '5'
#     '6'
#     '7'
#     '8'
#     '9'

         error_clear

         loc_push
         error_push

         input_next        "t 0"
     ok! test_char         "0"

         error_pop_merge
     ok! jump              oknoast_19

         loc_pop_rewind
         loc_push
         error_push

         input_next        "t 1"
     ok! test_char         "1"

         error_pop_merge
     ok! jump              oknoast_19

         loc_pop_rewind
         loc_push
         error_push

         input_next        "t 2"
     ok! test_char         "2"

         error_pop_merge
     ok! jump              oknoast_19

         loc_pop_rewind
         loc_push
         error_push

         input_next        "t 3"
     ok! test_char         "3"

         error_pop_merge
     ok! jump              oknoast_19

         loc_pop_rewind
         loc_push
         error_push

         input_next        "t 4"
     ok! test_char         "4"

         error_pop_merge
     ok! jump              oknoast_19

         loc_pop_rewind
         loc_push
         error_push

         input_next        "t 5"
     ok! test_char         "5"

         error_pop_merge
     ok! jump              oknoast_19

         loc_pop_rewind
         loc_push
         error_push

         input_next        "t 6"
     ok! test_char         "6"

         error_pop_merge
     ok! jump              oknoast_19

         loc_pop_rewind
         loc_push
         error_push

         input_next        "t 7"
     ok! test_char         "7"

         error_pop_merge
     ok! jump              oknoast_19

         loc_pop_rewind
         loc_push
         error_push

         input_next        "t 8"
     ok! test_char         "8"

         error_pop_merge
     ok! jump              oknoast_19

         loc_pop_rewind
         loc_push
         error_push

         input_next        "t 9"
     ok! test_char         "9"

         error_pop_merge
     ok! jump              oknoast_19

         loc_pop_rewind
         status_fail
         return

oknoast_19:
         loc_pop_discard
         return
#
# value Symbol 'Expression'
#

sym_Expression:
# /
#     x
#         '\('
#         (Expression)
#         '\)'
#     x
#         (Factor)
#         *
#             x
#                 (MulOp)
#                 (Factor)

         symbol_restore    Expression
  found! jump              found_46
         loc_push
         ast_push

         call              choice_44

   fail! value_clear
     ok! value_reduce      Expression
         symbol_save       Expression
         error_nonterminal Expression
         ast_pop_rewind
         loc_pop_discard

found_46:
     ok! ast_value_push
         return

choice_44:
# /
#     x
#         '\('
#         (Expression)
#         '\)'
#     x
#         (Factor)
#         *
#             x
#                 (MulOp)
#                 (Factor)

         error_clear

         ast_push
         loc_push
         error_push

         call              sequence_27

         error_pop_merge
     ok! jump              ok_43

         ast_pop_rewind
         loc_pop_rewind
         ast_push
         loc_push
         error_push

         call              sequence_40

         error_pop_merge
     ok! jump              ok_43

         ast_pop_rewind
         loc_pop_rewind
         status_fail
         return

ok_43:
         ast_pop_discard
         loc_pop_discard
         return

sequence_27:
# x
#     '\('
#     (Expression)
#     '\)'

         loc_push
         error_clear

         error_push

         input_next        "t ("
     ok! test_char         "("

         error_pop_merge
   fail! jump              failednoast_29
         ast_push
         error_push

         call              sym_Expression

         error_pop_merge
   fail! jump              failed_28
         error_push

         input_next        "t )"
     ok! test_char         ")"

         error_pop_merge
   fail! jump              failed_28

         ast_pop_discard
         loc_pop_discard
         return

failed_28:
         ast_pop_rewind

failednoast_29:
         loc_pop_rewind
         return

sequence_40:
# x
#     (Factor)
#     *
#         x
#             (MulOp)
#             (Factor)

         ast_push
         loc_push
         error_clear

         error_push

         call              sym_Factor

         error_pop_merge
   fail! jump              failed_41
         error_push

         call              kleene_37

         error_pop_merge
   fail! jump              failed_41

         ast_pop_discard
         loc_pop_discard
         return

failed_41:
         ast_pop_rewind
         loc_pop_rewind
         return

kleene_37:
# *
#     x
#         (MulOp)
#         (Factor)

         loc_push
         error_push

         call              sequence_34

         error_pop_merge
   fail! jump              failed_38
         loc_pop_discard
         jump              kleene_37

failed_38:
         loc_pop_rewind
         status_ok
         return

sequence_34:
# x
#     (MulOp)
#     (Factor)

         ast_push
         loc_push
         error_clear

         error_push

         call              sym_MulOp

         error_pop_merge
   fail! jump              failed_35
         error_push

         call              sym_Factor

         error_pop_merge
   fail! jump              failed_35

         ast_pop_discard
         loc_pop_discard
         return

failed_35:
         ast_pop_rewind
         loc_pop_rewind
         return
#
# value Symbol 'Factor'
#

sym_Factor:
# x
#     (Term)
#     *
#         x
#             (AddOp)
#             (Term)

         symbol_restore    Factor
  found! jump              found_60
         loc_push
         ast_push

         call              sequence_57

   fail! value_clear
     ok! value_reduce      Factor
         symbol_save       Factor
         error_nonterminal Factor
         ast_pop_rewind
         loc_pop_discard

found_60:
     ok! ast_value_push
         return

sequence_57:
# x
#     (Term)
#     *
#         x
#             (AddOp)
#             (Term)

         ast_push
         loc_push
         error_clear

         error_push

         call              sym_Term

         error_pop_merge
   fail! jump              failed_58
         error_push

         call              kleene_54

         error_pop_merge
   fail! jump              failed_58

         ast_pop_discard
         loc_pop_discard
         return

failed_58:
         ast_pop_rewind
         loc_pop_rewind
         return

kleene_54:
# *
#     x
#         (AddOp)
#         (Term)

         loc_push
         error_push

         call              sequence_51

         error_pop_merge
   fail! jump              failed_55
         loc_pop_discard
         jump              kleene_54

failed_55:
         loc_pop_rewind
         status_ok
         return

sequence_51:
# x
#     (AddOp)
#     (Term)

         ast_push
         loc_push
         error_clear

         error_push

         call              sym_AddOp

         error_pop_merge
   fail! jump              failed_52
         error_push

         call              sym_Term

         error_pop_merge
   fail! jump              failed_52

         ast_pop_discard
         loc_pop_discard
         return

failed_52:
         ast_pop_rewind
         loc_pop_rewind
         return
#
# value Symbol 'MulOp'
#

sym_MulOp:
# /
#     '*'
#     '/'

         symbol_restore    MulOp
  found! jump              found_67
         loc_push

         call              choice_65

   fail! value_clear
     ok! value_leaf        MulOp
         symbol_save       MulOp
         error_nonterminal MulOp
         loc_pop_discard

found_67:
     ok! ast_value_push
         return

choice_65:
# /
#     '*'
#     '/'

         error_clear

         loc_push
         error_push

         input_next        "t *"
     ok! test_char         "*"

         error_pop_merge
     ok! jump              oknoast_64

         loc_pop_rewind
         loc_push
         error_push

         input_next        "t /"
     ok! test_char         "/"

         error_pop_merge
     ok! jump              oknoast_64

         loc_pop_rewind
         status_fail
         return

oknoast_64:
         loc_pop_discard
         return
#
# value Symbol 'Number'
#

sym_Number:
# x
#     ?
#         (Sign)
#     +
#         (Digit)

         symbol_restore    Number
  found! jump              found_80
         loc_push
         ast_push

         call              sequence_77

   fail! value_clear
     ok! value_reduce      Number
         symbol_save       Number
         error_nonterminal Number
         ast_pop_rewind
         loc_pop_discard

found_80:
     ok! ast_value_push
         return

sequence_77:
# x
#     ?
#         (Sign)
#     +
#         (Digit)

         ast_push
         loc_push
         error_clear

         error_push

         call              optional_70

         error_pop_merge
   fail! jump              failed_78
         error_push

         call              poskleene_73

         error_pop_merge
   fail! jump              failed_78

         ast_pop_discard
         loc_pop_discard
         return

failed_78:
         ast_pop_rewind
         loc_pop_rewind
         return

optional_70:
# ?
#     (Sign)

         loc_push
         error_push

         call              sym_Sign

         error_pop_merge
   fail! loc_pop_rewind
     ok! loc_pop_discard
         status_ok
         return

poskleene_73:
# +
#     (Digit)

         loc_push

         call              sym_Digit

   fail! jump              failed_74

loop_75:
         loc_pop_discard
         loc_push
         error_push

         call              sym_Digit

         error_pop_merge
     ok! jump              loop_75
         status_ok

failed_74:
         loc_pop_rewind
         return
#
# value Symbol 'Sign'
#

sym_Sign:
# /
#     '-'
#     '+'

         symbol_restore    Sign
  found! jump              found_86
         loc_push

         call              choice_5

   fail! value_clear
     ok! value_leaf        Sign
         symbol_save       Sign
         error_nonterminal Sign
         loc_pop_discard

found_86:
     ok! ast_value_push
         return
#
# value Symbol 'Term'
#

sym_Term:
# (Number)

         symbol_restore    Term
  found! jump              found_89
         loc_push
         ast_push

         call              sym_Number

   fail! value_clear
     ok! value_reduce      Term
         symbol_save       Term
         error_nonterminal Term
         ast_pop_rewind
         loc_pop_discard

found_89:
     ok! ast_value_push
         return

#
#

PEG serialization format

Here we specify the format used by the Parser Tools to serialize Parsing Expression Grammars as immutable values for transport, comparison, etc.

We distinguish between regular and canonical serializations. While a PEG may have more than one regular serialization, exactly one of them will be canonical.

Example

Assuming the following PEG for simple mathematical expressions

PEG calculator (Expression)
    Digit      <- '0'/'1'/'2'/'3'/'4'/'5'/'6'/'7'/'8'/'9'       ;
    Sign       <- '-' / '+'                                     ;
    Number     <- Sign? Digit+                                  ;
    Expression <- Term (AddOp Term)*                            ;
    MulOp      <- '*' / '/'                                     ;
    Term       <- Factor (MulOp Factor)*                        ;
    AddOp      <- '+'/'-'                                       ;
    Factor     <- '(' Expression ')' / Number                   ;
END;

then its canonical serialization (except for whitespace) is

pt::grammar::peg {
    rules {
        AddOp      {is {/ {t -} {t +}}                                                  mode value}
        Digit      {is {/ {t 0} {t 1} {t 2} {t 3} {t 4} {t 5} {t 6} {t 7} {t 8} {t 9}}  mode value}
        Expression {is {x {n Term} {* {x {n AddOp} {n Term}}}}                          mode value}
        Factor     {is {/ {x {t (} {n Expression} {t )}} {n Number}}                    mode value}
        MulOp      {is {/ {t *} {t /}}                                                  mode value}
        Number     {is {x {? {n Sign}} {+ {n Digit}}}                                   mode value}
        Sign       {is {/ {t -} {t +}}                                                  mode value}
        Term       {is {x {n Factor} {* {x {n MulOp} {n Factor}}}}                      mode value}
    }
    start {n Expression}
}
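
Because the serialization is an ordinary Tcl value, its structure can be inspected with plain list and dict commands. The following is a minimal sketch using only core Tcl; the variable names are chosen for this illustration. It reads the value as a two-element list (type tag and dictionary) and walks the rules dictionary.

```tcl
package require Tcl 8.5

# The serialization of the calculator grammar, written as a Tcl literal.
set serial {pt::grammar::peg {
    rules {
        AddOp      {is {/ {t -} {t +}} mode value}
        Digit      {is {/ {t 0} {t 1} {t 2} {t 3} {t 4} {t 5} {t 6} {t 7} {t 8} {t 9}} mode value}
        Expression {is {x {n Term} {* {x {n AddOp} {n Term}}}} mode value}
        Factor     {is {/ {x {t (} {n Expression} {t )}} {n Number}} mode value}
        MulOp      {is {/ {t *} {t /}} mode value}
        Number     {is {x {? {n Sign}} {+ {n Digit}}} mode value}
        Sign       {is {/ {t -} {t +}} mode value}
        Term       {is {x {n Factor} {* {x {n MulOp} {n Factor}}}} mode value}
    }
    start {n Expression}
}}

# Split the value into its type tag and the grammar dictionary.
lassign $serial tag grammar

# Each entry of 'rules' maps a nonterminal to a dictionary
# holding the parsing expression ('is') and the semantic mode ('mode').
set symbols {}
dict for {symbol def} [dict get $grammar rules] {
    lappend symbols $symbol
    puts [format "%-10s mode=%s is=%s" $symbol [dict get $def mode] [dict get $def is]]
}

# The start expression is itself a serialized parsing expression.
puts "start: [dict get $grammar start]"
```

Note that the nonterminals come out in sorted order here only because the example lists them that way; dict for simply follows the order of the keys in the value.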

PE serialization format

Here we specify the format used by the Parser Tools to serialize Parsing Expressions as immutable values for transport, comparison, etc.

We distinguish between regular and canonical serializations. While a parsing expression may have more than one regular serialization, exactly one of them will be canonical.

Example

Assuming the parsing expression shown on the right-hand side of the rule

Expression <- Term (AddOp Term)*

then its canonical serialization (except for whitespace) is

{x {n Term} {* {x {n AddOp} {n Term}}}}
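
A serialized parsing expression is a nested Tcl list whose first element is the operator, so it can be processed by simple recursion. The sketch below collects the nonterminals referenced by an expression. It is an illustration in core Tcl, not part of the package; the procedure name is hypothetical, and only the operators appearing in this document (n, t, x, /, *, +, ?) are handled, with anything else treated as containing no nonterminals.

```tcl
package require Tcl 8.5

# Hypothetical helper: return the nonterminals used in a serialized
# parsing expression, in order of appearance.
proc nonterminals {pe} {
    # First element is the operator, the rest are its arguments.
    set arguments [lassign $pe op]
    switch -exact -- $op {
        n {
            # Nonterminal reference: the single argument is the symbol name.
            return [list [lindex $arguments 0]]
        }
        x - / - * - + - ? {
            # Sequence, choice, and repetition operators: the arguments
            # are parsing expressions themselves, so recurse into them.
            set result {}
            foreach child $arguments {
                lappend result {*}[nonterminals $child]
            }
            return $result
        }
        default {
            # Terminals and any operator not covered by this sketch.
            return {}
        }
    }
}

puts [nonterminals {x {n Term} {* {x {n AddOp} {n Term}}}}]
# prints: Term AddOp Term
```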

Bugs, Ideas, Feedback

This document, and the package it describes, will undoubtedly contain bugs and other problems. Please report such in the category pt of the Tcllib Trackers. Please also report any ideas for enhancements you may have for either package and/or documentation.

When proposing code changes, please provide unified diffs, i.e. the output of diff -u.

Note further that attachments are strongly preferred over inlined patches. Attachments can be made by going to the Edit form of the ticket immediately after its creation, and then using the left-most button in the secondary navigation bar.

KEYWORDS

EBNF, LL(k), PARAM, PEG, TDPL, context-free languages, conversion, expression, format conversion, grammar, matching, parser, parsing expression, parsing expression grammar, push down automaton, recursive descent, serialization, state, top-down parsing languages, transducer

CATEGORY

Parsing and Grammars

COPYRIGHT

Copyright © 2009 Andreas Kupries