pt::param - PackRat Machine Specification

package require Tcl 8.5


Are you lost? Do you have trouble understanding this document? In that case, please read the overview provided by the Introduction to Parser Tools. That document is the entry point to the whole system the current package is a part of.

Welcome to the PackRat Machine (short: PARAM), a virtual machine geared towards the support of recursive descent parsers, especially packrat parsers. To this end it provides caching and reuse of partial results, caching of the input encountered so far, and the ability to backtrack in both the input and the AST under construction.

This document specifies the machine in terms of its architectural state and instruction set.

Architectural State

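Any PARAM implementation has to manage at least the following state. The abbreviations are used throughout the rest of this document:

IN   Input: the source the characters to process are read from.
CC   Current Character: the character currently being tested.
CL   Current Location: the location of CC in the input.
LS   Location Stack: saved locations, used for backtracking in the input.
ST   Status: the result of the last test, either ok or fail.
SV   Semantic Value: the current semantic value, either empty or an AST node.
ARS  AST Reduction Stack: the partial ASTs collected while a nonterminal is processed.
AS   AST Stack: saved states of ARS, used for backtracking in AST creation.
ER   Error Status: the current error information, possibly empty.
ES   Error Stack: saved error states.
NC   Nonterminal Cache: saved results of nonterminal symbols.
TC   Terminal Cache: the input characters read so far.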
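
As a rough illustration, the state could be held in a single record. The sketch below is a Python model, not the pt::rde implementation. The field names are the abbreviations used throughout this document; the types, the default values, the use of a plain string for IN, and the choice of -1 as the initial CL are all assumptions of this sketch.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ParamState:
    """Illustrative container for the PARAM state (types are assumptions)."""
    IN: str = ""                              # input the characters come from
    CC: str = ""                              # current character
    CL: int = -1                              # current location (-1: nothing read yet)
    LS: list = field(default_factory=list)    # location stack
    ST: bool = False                          # status of the last test
    SV: Any = None                            # semantic value (None: empty)
    ARS: list = field(default_factory=list)   # AST reduction stack
    AS: list = field(default_factory=list)    # AST stack
    ER: Optional[tuple] = None                # error status (None: no error)
    ES: list = field(default_factory=list)    # error stack
    NC: dict = field(default_factory=dict)    # nonterminal cache
    TC: list = field(default_factory=list)    # terminal cache


state = ParamState(IN="abc")
```

Each instance gets its own stacks and caches, so several machines can run side by side without sharing state.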

Instruction Set

With the machine's architectural state specified it is now possible to specify the instruction set operating on that state and to be implemented by any realization of the PARAM. The 37 instructions are grouped roughly by the state they influence and/or query during their execution.

Input Handling

The instructions in this section mainly access IN, pulling the characters to process into the machine.
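
The following Python sketch models how an input_next-style instruction might pull characters from IN while recording them in TC, so that a later rewind of CL can replay them without touching IN again. It is not the pt::rde code. The class name `Machine`, the representation of ER as a (location, set of messages) pair, the initial CL of -1, and the rule that a successful read clears ER are assumptions of this sketch.

```python
class Machine:
    def __init__(self, text):
        self.IN = iter(text)   # input source, consumed one character at a time
        self.TC = []           # terminal cache: every character read so far
        self.CL = -1           # location of the current character (assumed start)
        self.CC = ""           # current character
        self.ST = False        # status
        self.ER = None         # error status: (location, {messages}) or None

    def input_next(self, msg):
        loc = self.CL + 1
        if loc >= len(self.TC):
            # Not cached yet: try to read one more character from IN.
            ch = next(self.IN, None)
            if ch is None:
                # End of input: fail and record msg at the wanted location.
                self.ST = False
                self.ER = (loc, {msg})
                return
            self.TC.append(ch)
        # Cached (possibly just now): advance and succeed.
        self.CL = loc
        self.CC = self.TC[loc]
        self.ST = True
        self.ER = None


m = Machine("ab")
m.input_next("eof")   # reads "a" from IN
m.input_next("eof")   # reads "b" from IN
m.CL = -1             # simulate backtracking to the start
m.input_next("eof")   # served from TC, IN is not consulted
```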

Character Processing

The instructions in this section mainly access CC, testing it against character classes, ranges, and individual characters.
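
A minimal Python sketch of three such tests follows. Per the interaction table later in this document, each test reads CC and writes ST and ER. Everything else is an assumption of this sketch: the class name `Machine`, the helper `_test`, the wording of the messages, the (location, set of messages) shape of ER, and the use of Python's `str.isdigit` as the digit class.

```python
class Machine:
    def __init__(self, cc="", cl=0):
        self.CC = cc       # current character
        self.CL = cl       # current location, used only to tag errors here
        self.ST = False    # status
        self.ER = None     # error status: (location, {messages}) or None

    def _test(self, ok, msg):
        # Shared tail of all tests: set ST, and set or clear ER.
        self.ST = ok
        self.ER = None if ok else (self.CL, {msg})

    def test_char(self, char):
        self._test(self.CC == char, f"expected {char!r}")

    def test_range(self, chars, chare):
        # chars and chare are the inclusive start and end of the range.
        self._test(chars <= self.CC <= chare,
                   f"expected {chars!r}..{chare!r}")

    def test_digit(self):
        self._test(self.CC.isdigit(), "expected digit")


m = Machine(cc="7")
m.test_digit()           # succeeds: ST is true, ER is cleared
m.test_char("x")         # fails: ST is false, ER records the expectation
```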

Error Handling

The instructions in this section mainly access ER and ES.
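
The sketch below models three of these instructions in Python: error_clear, error_push, and error_pop_merge. The merge rule used here is an assumption, chosen because it is common in packrat parsers: the error at the later location wins, and errors at the same location have their message sets combined. The (location, set of messages) shape of ER and the names `Machine` and `merge` are likewise assumptions, and error_nonterminal is left out.

```python
def merge(a, b):
    """Combine two error values; None stands for 'no error'."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] != b[0]:
        return a if a[0] > b[0] else b   # the later location wins
    return (a[0], a[1] | b[1])           # same location: union of messages


class Machine:
    def __init__(self):
        self.ER = None   # error status
        self.ES = []     # error stack

    def error_clear(self):
        self.ER = None

    def error_push(self):
        self.ES.append(self.ER)

    def error_pop_merge(self):
        self.ER = merge(self.ES.pop(), self.ER)


m = Machine()
m.ER = (3, {"expected digit"})
m.error_push()                    # remember the error of the first alternative
m.ER = (3, {"expected letter"})   # error of the second alternative
m.error_pop_merge()               # ER now carries both expectations
```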

Status Control

The instructions in this section directly manipulate ST.
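
Modeled in Python, with ST assumed to be a plain boolean, the three status instructions reduce to the following sketch. The class name `Machine` is hypothetical.

```python
class Machine:
    def __init__(self):
        self.ST = False   # status; True stands for ok, False for fail

    def status_ok(self):
        self.ST = True

    def status_fail(self):
        self.ST = False

    def status_negate(self):
        # Invert the outcome of the preceding test.
        self.ST = not self.ST


m = Machine()
m.status_ok()
m.status_negate()   # ST is now fail
```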

Location Handling

The instructions in this section access CL and LS.
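
The interplay of CL and LS is what makes backtracking in the input possible. A minimal Python sketch, following the inputs and outputs listed in the interaction table later in this document; the class name `Machine` and the initial CL of -1 are assumptions.

```python
class Machine:
    def __init__(self):
        self.CL = -1   # current location (assumed start value)
        self.LS = []   # location stack

    def loc_push(self):
        # Remember where we are before trying something that may fail.
        self.LS.append(self.CL)

    def loc_pop_rewind(self):
        # The attempt failed: go back to the remembered location.
        self.CL = self.LS.pop()

    def loc_pop_discard(self):
        # The attempt succeeded: forget the remembered location.
        self.LS.pop()


m = Machine()
m.loc_push()         # save location -1
m.CL = 4             # stand-in for reading five characters
m.loc_pop_rewind()   # backtrack: CL is -1 again
```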

Nonterminal Execution

The instructions in this section access and manipulate NC.
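
The following Python sketch shows one way the cache could work. It assumes that results are keyed by the pair of start location and symbol, that symbol_save takes the start location from the top of LS, that symbol_restore looks up the entry for CL, and that symbol_restore reports whether it found an entry. None of these details are stated in this section. The class name `Machine` and the AST tuple in the usage lines are hypothetical.

```python
class Machine:
    def __init__(self):
        self.CL = -1      # current location
        self.ST = False   # status
        self.ER = None    # error status
        self.SV = None    # semantic value
        self.LS = []      # location stack
        self.NC = {}      # nonterminal cache: (location, symbol) -> result

    def symbol_save(self, sym):
        start = self.LS[-1]   # where parsing of sym began (assumed)
        self.NC[(start, sym)] = (self.CL, self.ST, self.ER, self.SV)

    def symbol_restore(self, sym):
        entry = self.NC.get((self.CL, sym))
        if entry is None:
            return False      # not cached: the caller has to parse sym
        self.CL, self.ST, self.ER, self.SV = entry
        return True


m = Machine()
m.LS.append(m.CL)                                   # parsing of Number starts at -1
m.CL, m.ST, m.SV = 2, True, ("Number", 0, 2)        # stand-in for a successful parse
m.symbol_save("Number")
m.CL, m.ST, m.SV = -1, False, None                  # backtrack to the start
found = m.symbol_restore("Number")                  # replays the cached result
```

With such a cache each nonterminal is parsed at most once per input location, which is the property packrat parsers rely on.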

Value Construction

The instructions in this section manipulate SV.
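
A minimal Python sketch of the three value instructions follows. The interaction table later in this document says that value_leaf reads LS and CL and that value_reduce additionally reads ARS. The rest is assumed: AST nodes are modeled as tuples of the form (symbol, start, end, children...), the covered range runs from one past the top of LS up to CL, and value_reduce takes every entry currently on ARS as a child. An implementation may well restrict the children to those pushed since the enclosing nonterminal began.

```python
class Machine:
    def __init__(self):
        self.CL = -1    # current location
        self.LS = []    # location stack
        self.ARS = []   # AST reduction stack
        self.SV = None  # semantic value (None: empty)

    def _range(self):
        # Assumed convention: LS holds the location before the first
        # character of the symbol, CL the location of its last character.
        return self.LS[-1] + 1, self.CL

    def value_clear(self):
        self.SV = None

    def value_leaf(self, symbol):
        start, end = self._range()
        self.SV = (symbol, start, end)

    def value_reduce(self, symbol):
        start, end = self._range()
        self.SV = (symbol, start, end, *self.ARS)


m = Machine()
m.LS.append(-1)          # the symbol started before location 0
m.CL = 2                 # and consumed locations 0..2
m.value_leaf("Digits")   # SV becomes a leaf node for that range
leaf = m.SV
m.ARS.append(leaf)       # stand-in for ast_value_push
m.value_reduce("Number") # SV becomes an inner node with one child
```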

AST Construction

The instructions in this section manipulate ARS and AS.
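
The sketch below models the four AST instructions in Python, following the inputs and outputs listed in the interaction table later in this document. It assumes that AS stores only the size of ARS at the time of ast_push, so that a rewind can cut ARS back to that size. Storing a full copy of ARS would be an equally valid reading. The class name `Machine` is hypothetical.

```python
class Machine:
    def __init__(self):
        self.SV = None  # semantic value
        self.ARS = []   # AST reduction stack
        self.AS = []    # AST stack, here a stack of saved ARS sizes

    def ast_value_push(self):
        # Keep the current semantic value as a partial AST.
        self.ARS.append(self.SV)

    def ast_push(self):
        # Remember the current state of ARS.
        self.AS.append(len(self.ARS))

    def ast_pop_rewind(self):
        # Drop everything pushed to ARS since the matching ast_push.
        mark = self.AS.pop()
        del self.ARS[mark:]

    def ast_pop_discard(self):
        # Keep ARS as it is and forget the saved state.
        self.AS.pop()


m = Machine()
m.SV = "a"
m.ast_value_push()
m.ast_push()          # save point: ARS holds one entry
m.SV = "b"
m.ast_value_push()    # ARS holds two entries
m.ast_pop_rewind()    # failed alternative: back to one entry
```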

Control Flow

Normally this section would contain the specifications of the control flow instructions of the PARAM, i.e. (un)conditional jumps and the like. However, this part of the PARAM is intentionally left unspecified. This allows the implementations to freely choose how to implement control flow.

The implementation of this machine in Parser Tools, i.e. the package pt::rde, is not only coded in Tcl, but also relies on Tcl commands to provide it with control flow (instructions).
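
The same idea can be sketched in any host language. The Python model below is not pt::rde. It implements a handful of instructions in simplified form (no ER, no TC, IN as a plain string) and lets ordinary Python statements take the place of jumps. The names `Machine`, `match_char`, and `a_or_b` are hypothetical.

```python
class Machine:
    def __init__(self, text):
        self.IN = text   # input, a plain string in this sketch
        self.CL = -1     # current location
        self.CC = ""     # current character
        self.ST = False  # status
        self.LS = []     # location stack

    def input_next(self):
        if self.CL + 1 < len(self.IN):
            self.CL += 1
            self.CC = self.IN[self.CL]
            self.ST = True
        else:
            self.ST = False

    def test_char(self, char):
        self.ST = self.CC == char

    def loc_push(self):
        self.LS.append(self.CL)

    def loc_pop_rewind(self):
        self.CL = self.LS.pop()

    def loc_pop_discard(self):
        self.LS.pop()


def match_char(m, char):
    m.input_next()
    if m.ST:                 # a host-language 'if' replaces a conditional jump
        m.test_char(char)


def a_or_b(m):
    """Ordered choice between 'a' and 'b', backtracking between the two."""
    for char in ("a", "b"):  # a host-language loop replaces the jump chain
        m.loc_push()
        match_char(m, char)
        if m.ST:
            m.loc_pop_discard()
            return
        m.loc_pop_rewind()


m = Machine("b")
a_or_b(m)                    # first alternative fails, second one matches
```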

Interaction of the Instructions with the Architectural State

Instruction		Inputs				Outputs
======================= =======================		====================
ast_pop_discard		AS			->	AS
ast_pop_rewind		AS			->	AS, ARS
ast_push		ARS, AS			->	AS
ast_value_push		SV, ARS			->	ARS
======================= =======================		====================
error_clear		-			->	ER
error_nonterminal sym	ER, LS			->	ER
error_pop_merge   	ES, ER			->	ER
error_push		ES, ER			->	ES
======================= =======================		====================
input_next msg		IN			->	TC, CL, CC, ST, ER
======================= =======================		====================
loc_pop_discard		LS			->	LS
loc_pop_rewind		LS			->	LS, CL
loc_push		CL, LS			->	LS
======================= =======================		====================
status_fail		-			->	ST
status_negate		ST			->	ST
status_ok		-			->	ST
======================= =======================		====================
symbol_restore sym	NC			->	CL, ST, ER, SV
symbol_save    sym	CL, ST, ER, SV, LS	->	NC
======================= =======================		====================
test_alnum  		CC			->	ST, ER
test_alpha		CC			->	ST, ER
test_ascii		CC			->	ST, ER
test_char char		CC			->	ST, ER
test_ddigit		CC			->	ST, ER
test_digit		CC			->	ST, ER
test_graph		CC			->	ST, ER
test_lower		CC			->	ST, ER
test_print		CC			->	ST, ER
test_punct		CC			->	ST, ER
test_range chars chare	CC			->	ST, ER
test_space		CC			->	ST, ER
test_upper		CC			->	ST, ER
test_wordchar		CC			->	ST, ER
test_xdigit		CC			->	ST, ER
======================= =======================		====================
value_clear		-			->	SV
value_leaf symbol	LS, CL			->	SV
value_reduce symbol	ARS, LS, CL		->	SV
======================= =======================		====================

Bugs, Ideas, Feedback

This document, and the package it describes, will undoubtedly contain bugs and other problems. Please report such in the category pt of the Tcllib Trackers. Please also report any ideas for enhancements you may have for either package and/or documentation.

When proposing code changes, please provide unified diffs, i.e. the output of diff -u.

Note further that attachments are strongly preferred over inlined patches. Attachments can be made by going to the Edit form of the ticket immediately after its creation, and then using the left-most button in the secondary navigation bar.


Keywords

EBNF, LL(k), PEG, TDPL, context-free languages, expression, grammar, matching, parser, parsing expression, parsing expression grammar, push down automaton, recursive descent, state, top-down parsing languages, transducer, virtual machine

Category

Parsing and Grammars


Copyright © 2009 Andreas Kupries