Tcl Library Source Code

Bounty program for improvements to Tcl and certain Tcl packages.

[ Main Table Of Contents | Table Of Contents | Keyword Index | Categories | Modules | Applications ]


grammar::fa - Create and manipulate finite automatons

Table Of Contents


package require Tcl 8.4
package require snit 1.3
package require struct::list
package require struct::set
package require grammar::fa::op ?0.2?
package require grammar::fa ?0.4?

::grammar::fa faName ?=|:=|<--|as|deserialize src|fromRegex re ?over??
faName option ?arg arg ...?
faName destroy
faName clear
faName = srcFA
faName --> dstFA
faName serialize
faName deserialize serialization
faName states
faName state add s1 ?s2 ...?
faName state delete s1 ?s2 ...?
faName state exists s
faName state rename s snew
faName startstates
faName start add s1 ?s2 ...?
faName start remove s1 ?s2 ...?
faName start? s
faName start?set stateset
faName finalstates
faName final add s1 ?s2 ...?
faName final remove s1 ?s2 ...?
faName final? s
faName final?set stateset
faName symbols
faName [email protected] s ?d?
faName [email protected] stateset
faName symbol add sym1 ?sym2 ...?
faName symbol delete sym1 ?sym2 ...?
faName symbol rename sym newsym
faName symbol exists sym
faName next s sym ?--> next?
faName !next s sym ?--> next?
faName nextset stateset sym
faName is deterministic
faName is complete
faName is useful
faName is epsilon-free
faName reachable_states
faName unreachable_states
faName reachable s
faName useful_states
faName unuseful_states
faName useful s
faName epsilon_closure s
faName reverse
faName complete
faName remove_eps
faName trim ?what?
faName determinize ?mapvar?
faName minimize ?mapvar?
faName complement
faName kleene
faName optional
faName union fa ?mapvar?
faName intersect fa ?mapvar?
faName difference fa ?mapvar?
faName concatenate fa ?mapvar?
faName fromRegex regex ?over?


This package provides a container class for finite automatons (Short: FA). It allows the incremental definition of the automaton, its manipulation and querying of the definition. While the package provides complex operations on the automaton (via package grammar::fa::op), it does not have the ability to execute a definition for a stream of symbols. Use the packages grammar::fa::dacceptor and grammar::fa::dexec for that. Another package related to this is grammar::fa::compiler. It turns a FA into an executor class which has the definition of the FA hardwired into it. The output of this package is configurable to suit a large number of different implementation languages and paradigms.

For more information about what a finite automaton is see section FINITE AUTOMATONS.


The package exports the API described here.


All automatons provide the following methods for their manipulation:



For the mathematically inclined, a FA is a 5-tuple (S,Sy,St,Fi,T) where

In computer theory a FA is most often shown as a graph where the nodes represent the states, and the edges between the nodes encode the transition function: For all n in S' = T (s, sy) we have one edge between the nodes representing s and n resp., labeled with sy. The start and accepting states are encoded through distinct visual markers, i.e. they are attributes of the nodes.

FA's are used to process streams of symbols over Sy.

A specific FA is said to accept a finite stream sy_1 sy_2 ... sy_n if there is a path in the graph of the FA beginning at a state in St and ending at a state in Fi whose edges have the labels sy_1, sy_2, etc. to sy_n. The set of all strings accepted by the FA is the language of the FA. One important equivalence is that the set of languages which can be accepted by an FA is the set of regular languages.

Another important concept is that of deterministic FAs. A FA is said to be deterministic if for each string of input symbols there is exactly one path in the graph of the FA beginning at the start state and whose edges are labeled with the symbols in the string. While it might seem that non-deterministic FAs to have more power of recognition, this is not so. For each non-deterministic FA we can construct a deterministic FA which accepts the same language (--> Thompson's subset construction).

While one of the premier applications of FAs is in parsing, especially in the lexer stage (where symbols == characters), this is not the only possibility by far.

Quite a lot of processes can be modeled as a FA, albeit with a possibly large set of states. For these the notion of accepting states is often less or not relevant at all. What is needed instead is the ability to act to state changes in the FA, i.e. to generate some output in response to the input. This transforms a FA into a finite transducer, which has an additional set OSy of output symbols and also an additional output function O which maps from "S x (Sy + epsilon)" to "(Osy + epsilon)", i.e a combination of state and input, possibly empty to an output symbol, or nothing.

For the graph representation this means that edges are additional labeled with the output symbol to write when this edge is traversed while matching input. Note that for an application "writing an output symbol" can also be "executing some code".

Transducers are not handled by this package. They will get their own package in the future.

Bugs, Ideas, Feedback

This document, and the package it describes, will undoubtedly contain bugs and other problems. Please report such in the category grammar_fa of the Tcllib Trackers. Please also report any ideas for enhancements you may have for either package and/or documentation.

When proposing code changes, please provide unified diffs, i.e the output of diff -u.

Note further that attachments are strongly preferred over inlined patches. Attachments can be made by going to the Edit form of the ticket immediately after its creation, and then using the left-most button in the secondary navigation bar.


automaton, finite automaton, grammar, parsing, regular expression, regular grammar, regular languages, state, transducer


Grammars and finite automata


Copyright © 2004-2009 Andreas Kupries