pt::ast - Abstract Syntax Tree Serialization

package require Tcl 8.5 9
package require pt::ast ?1.2?

::pt::ast verify serial ?canonvar?
::pt::ast verify-as-canonical serial
::pt::ast canonicalize serial
::pt::ast print serial
::pt::ast bottomup cmdprefix ast
cmdprefix ast
::pt::ast topdown cmdprefix ast
::pt::ast equal seriala serialb
::pt::ast new0 s loc ?child...?
::pt::ast new s start end ?child...?


Are you lost? Do you have trouble understanding this document? In that case please read the overview provided by the Introduction to Parser Tools. That document is the entry point to the whole system the current package is a part of.

This package provides commands to work with the serializations of abstract syntax trees as managed by the Parser Tools, and specified in section AST serialization format.

This is a supporting package in the Core Layer of Parser Tools.


AST serialization format

Here we specify the format used by the Parser Tools to serialize Abstract Syntax Trees (ASTs) as immutable values for transport, comparison, etc.

Each node in an AST represents a nonterminal symbol of a grammar, and the range of tokens/characters in the input covered by it. ASTs do not contain terminal symbols, i.e. tokens/characters. These can be recovered from the input given a symbol's location.
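As a minimal sketch of that recovery step, the snippet below pulls the text covered by a node back out of the input. The input string, the node, and the helper name nodeText are hypothetical and chosen only for illustration; the sketch assumes that a node's start and end locations are inclusive character offsets.

```tcl
# Hypothetical input and node, used only for illustration.
set input "120+5"
set node  {Number 0 2}

# Return the part of the input covered by a serialized node.
# Assumes a node is a list of the form: symbol start end ?child...?
# and that start and end are inclusive offsets into the input.
proc nodeText {input node} {
    lassign $node symbol start end
    return [string range $input $start $end]
}

puts [nodeText $input $node]
```

Because string range also treats both of its indices as inclusive, the locations can be passed through unchanged.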

We distinguish between regular and canonical serializations. While a tree may have more than one regular serialization, exactly one of them is canonical.
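One way in which two regular serializations of the same tree can differ is whitespace. The sketch below illustrates this with plain Tcl lists. It assumes that the canonical form corresponds to the canonical string representation of a pure Tcl list, which this section does not spell out, and the helper name normalize is hypothetical. The package's own ::pt::ast canonicalize command remains the authoritative way to obtain the canonical serialization.

```tcl
# Rebuild a serialization with [list] and [lappend], so that the
# resulting value carries no superfluous whitespace.
# Assumes a node is a list of the form: symbol start end ?child...?
proc normalize {ast} {
    # lassign returns the elements it did not assign, i.e. the children.
    set children [lassign $ast symbol start end]
    set result [list $symbol $start $end]
    foreach child $children {
        lappend result [normalize $child]
    }
    return $result
}

# Two hypothetical serializations of the same tree.
set a {Number 0 2 {Digit 0 0} {Digit 1 1} {Digit 2 2}}
set b {Number 0 2
    {Digit 0 0}
    {Digit   1 1}
    {Digit 2 2}
}

puts [expr {$a eq $b}]                          ;# compares the raw strings
puts [expr {[normalize $a] eq [normalize $b]}]  ;# compares the rebuilt lists
```

The first comparison looks at the raw strings, the second at the whitespace-normalized values.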


Assuming the parsing expression grammar below

PEG calculator (Expression)
    Digit      <- '0'/'1'/'2'/'3'/'4'/'5'/'6'/'7'/'8'/'9'       ;
    Sign       <- '-' / '+'                                     ;
    Number     <- Sign? Digit+                                  ;
    Expression <- Term (AddOp Term)*                            ;
    MulOp      <- '*' / '/'                                     ;
    Term       <- Factor (MulOp Factor)*                        ;
    AddOp      <- '+'/'-'                                       ;
    Factor     <- '(' Expression ')' / Number                   ;

and the input string


then a parser should deliver the abstract syntax tree below (except for whitespace)

set ast {Expression 0 4
    {Factor 0 4
        {Term 0 2
            {Number 0 2
                {Digit 0 0}
                {Digit 1 1}
                {Digit 2 2}
            }
        }
        {AddOp 3 3}
        {Term 4 4
            {Number 4 4
                {Digit 4 4}
            }
        }
    }
}
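Since the serialization is a nested Tcl list, it can be walked with ordinary list commands. The sketch below is one possible way to turn the tree of this example into an indented outline, one line per node. The proc name outline and the output layout are hypothetical and not part of the package, which offers ::pt::ast print for pretty-printing.

```tcl
# The tree of the example, with all closing braces written out.
set ast {Expression 0 4
    {Factor 0 4
        {Term 0 2
            {Number 0 2
                {Digit 0 0}
                {Digit 1 1}
                {Digit 2 2}}}
        {AddOp 3 3}
        {Term 4 4
            {Number 4 4
                {Digit 4 4}}}}}

# Return a list of lines, one per node, indented by nesting depth.
# Assumes a node is a list of the form: symbol start end ?child...?
proc outline {ast {depth 0}} {
    set children [lassign $ast symbol start end]
    set lines [list "[string repeat {  } $depth]$symbol $start..$end"]
    foreach child $children {
        lappend lines {*}[outline $child [expr {$depth + 1}]]
    }
    return $lines
}

puts [join [outline $ast] \n]
```

Returning the lines as a list instead of printing them directly keeps the traversal separate from the output, so a caller can filter or reformat the result.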


Bugs, Ideas, Feedback

This document, and the package it describes, will undoubtedly contain bugs and other problems. Please report them in the category pt of the Tcllib Trackers. Please also report any ideas for enhancements you may have for either the package or the documentation.

When proposing code changes, please provide unified diffs, i.e. the output of diff -u.

Note further that attachments are strongly preferred over inlined patches. Attachments can be made by going to the Edit form of the ticket immediately after its creation, and then using the left-most button in the secondary navigation bar.


Keywords

EBNF, LL(k), PEG, TDPL, context-free languages, expression, grammar, matching, parser, parsing expression, parsing expression grammar, push down automaton, recursive descent, state, top-down parsing languages, transducer


Category

Parsing and Grammars


Copyright © 2009 Andreas Kupries