Tcl Library Source Code

Documentation
Login


[ Main Table Of Contents | Table Of Contents | Keyword Index | Categories | Modules | Applications ]

NAME

struct::list - Procedures for manipulating lists

Table Of Contents

SYNOPSIS

package require Tcl 8.4
package require struct::list ?1.8.5?

::struct::list longestCommonSubsequence sequence1 sequence2 ?maxOccurs?
::struct::list longestCommonSubsequence2 sequence1 sequence2 ?maxOccurs?
::struct::list lcsInvert lcsData len1 len2
::struct::list lcsInvert2 lcs1 lcs2 len1 len2
::struct::list lcsInvertMerge lcsData len1 len2
::struct::list lcsInvertMerge2 lcs1 lcs2 len1 len2
::struct::list reverse sequence
::struct::list shuffle list
::struct::list assign sequence varname ?varname?...
::struct::list flatten ?-full? ?--? sequence
::struct::list map sequence cmdprefix
::struct::list mapfor var sequence script
::struct::list filter sequence cmdprefix
::struct::list filterfor var sequence expr
::struct::list split sequence cmdprefix ?passVar failVar?
::struct::list fold sequence initialvalue cmdprefix
::struct::list shift listvar
::struct::list iota n
::struct::list equal a b
::struct::list repeat size element1 ?element2 element3...?
::struct::list repeatn value size...
::struct::list dbJoin ?-inner|-left|-right|-full? ?-keys varname? {keycol table}...
::struct::list dbJoinKeyed ?-inner|-left|-right|-full? ?-keys varname? table...
::struct::list swap listvar i j
::struct::list firstperm list
::struct::list nextperm perm
::struct::list permutations list
::struct::list foreachperm var list body

DESCRIPTION

The ::struct::list namespace contains several useful commands for processing Tcl lists. Generally speaking, they implement algorithms more complex or specialized than the ones provided by Tcl itself.

It exports only a single command, struct::list. All functionality provided here can be reached through a subcommand of this command.

COMMANDS

LONGEST COMMON SUBSEQUENCE AND FILE COMPARISON

The longestCommonSubsequence subcommand forms the core of a flexible system for doing differential comparisons of files, similar to the capability offered by the Unix command diff. While this procedure is quite rapid for many tasks of file comparison, its performance degrades severely if sequence2 contains many equal elements (as, for instance, when using this procedure to compare two files, a quarter of whose lines are blank. This drawback is intrinsic to the algorithm used (see the Reference for details).

One approach to dealing with the performance problem that is sometimes effective in practice is arbitrarily to exclude elements that appear more than a certain number of times. This number is provided as the maxOccurs parameter. If frequent lines are excluded in this manner, they will not appear in the common subsequence that is computed; the result will be the longest common subsequence of infrequent elements. The procedure longestCommonSubsequence2 implements this heuristic. It functions as a wrapper around longestCommonSubsequence; it computes the longest common subsequence of infrequent elements, and then subdivides the subsequences that lie between the matches to approximate the true longest common subsequence.

TABLE JOIN

This is an operation from relational algebra for relational databases.

The easiest way to understand the regular inner join is that it creates the cartesian product of all the tables involved first and then keeps only all those rows in the resulting table for which the values in the specified key columns are equal to each other.

Implementing this description naively, i.e. as described above will generate a huge intermediate result. To avoid this the cartesian product and the filtering of row are done at the same time. What is required is a fast way to determine if a key is present in a table. In a true database this is done through indices. Here we use arrays internally.

An outer join is an extension of the inner join for two tables. There are three variants of outerjoins, called left, right, and full outer joins. Their result always contains all rows from an inner join and then some additional rows.

  1. For the left outer join the additional rows are all rows from the left table for which there is no key in the right table. They are joined to an empty row of the right table to fit them into the result.

  2. For the right outer join the additional rows are all rows from the right table for which there is no key in the left table. They are joined to an empty row of the left table to fit them into the result.

  3. The full outer join combines both left and right outer join. In other words, the additional rows are as defined for left outer join, and right outer join, combined.

We extend all the joins from two to n tables (n > 2) by executing

(...((table1 join table2) join table3) ...) join tableN

Examples for all the joins:

Inner Join

{0 foo}              {0 bagel}    {0 foo   0 bagel}
{1 snarf} inner join {1 snatz}  = {1 snarf 1 snatz}
{2 blue}             {3 driver}

Left Outer Join

{0 foo}                   {0 bagel}    {0 foo   0 bagel}
{1 snarf} left outer join {1 snatz}  = {1 snarf 1 snatz}
{2 blue}                  {3 driver}   {2 blue  {} {}}

Right Outer Join

{0 foo}                    {0 bagel}    {0 foo   0 bagel}
{1 snarf} right outer join {1 snatz}  = {1 snarf 1 snatz}
{2 blue}                   {3 driver}   {{} {}   3 driver}

Full Outer Join

{0 foo}                   {0 bagel}    {0 foo   0 bagel}
{1 snarf} full outer join {1 snatz}  = {1 snarf 1 snatz}
{2 blue}                  {3 driver}   {2 blue  {} {}}
                                       {{} {}   3 driver}

REFERENCES

  1. J. W. Hunt and M. D. McIlroy, "An algorithm for differential file comparison," Comp. Sci. Tech. Rep. #41, Bell Telephone Laboratories (1976). Available on the Web at the second author's personal site: http://www.cs.dartmouth.edu/~doug/

  2. Donald E. Knuth, "Fascicle 2b of 'The Art of Computer Programming' volume 4". Available on the Web at the author's personal site: http://www-cs-faculty.stanford.edu/~knuth/fasc2b.ps.gz.

Bugs, Ideas, Feedback

This document, and the package it describes, will undoubtedly contain bugs and other problems. Please report such in the category struct :: list of the Tcllib Trackers. Please also report any ideas for enhancements you may have for either package and/or documentation.

When proposing code changes, please provide unified diffs, i.e the output of diff -u.

Note further that attachments are strongly preferred over inlined patches. Attachments can be made by going to the Edit form of the ticket immediately after its creation, and then using the left-most button in the secondary navigation bar.

KEYWORDS

Fisher-Yates, assign, common, comparison, diff, differential, equal, equality, filter, first permutation, flatten, folding, full outer join, generate permutations, inner join, join, left outer join, list, longest common subsequence, map, next permutation, outer join, permutation, reduce, repeating, repetition, reshuffle, reverse, right outer join, shuffle, subsequence, swapping

CATEGORY

Data structures

COPYRIGHT

Copyright © 2003-2005 by Kevin B. Kenny. All rights reserved
Copyright © 2003-2012 Andreas Kupries