Tcl Improvement Proposals: 170.tip at [f9cf529b8c]

File tip/170.tip artifact 9f3237bb50 part of check-in f9cf529b8c

TIP:		170
Title:		Better Support for Nested Lists
Version:	$Revision: 1.2 $
Author:		Sergey Babkin <[email protected]>
State:		Draft
Type:		Project
Tcl-Version:	8.5
Vote:		Pending
Created:	30-Jan-2004
Post-History:	

~Abstract

Nested lists are easy to create with Tcl but then manipulating them is
not easy. For example, think about how to change a value nested in a
list 2 levels deep?  How about 4 levels deep?  The proposed new
commands make such manipulation easy, and make the nested lists a
great replacement for arrays and structures in C-like languages.

~Rationale and Specification

The new proposed commands start with the prefix "ldeep". They are
desgined to resemble the classic list commands with "l" names. In the
cases when the meaning of "ldeep" command differs substantially from
the "l" command, the name has been selected to be different (such as
"ldeeprep", not "ldeepreplace").

The commands have been extensively used in the Not A Commander project
[http://nac.sf.net/] and have been adjusted to the observed needs of
practical use.

All these command use the concept of "path". The idea is that you can
identify a location in a nested list by a sequence of indexes. For
example, the path "1 2 3" identifies the element that can be accessed
as

|  [lindex [lindex  [lindex $list 1] 2] 3]

A path can be naturally represented as a list itself. In practice it
is also convenient to specify paths as two or more lists that get
logically concatenated within a command, to represent a "base
location" (the first list) and "location relative to the base" (the
second list). An empty path list is also valid, it means "the whole
original list". The current implementation does not support "end" as a
valid element of the path since it does not seem to make much sense:
the location identified by such a path would float as the length of
the intermediate lists changes. But it can be added if desired.

The new proposed commands are:

|ldeepindex list path ?path?...

 > Extract an element from a nested list. The element is identified by
   the logical concatenation of the paths. It returns the extracted
   element.

|ldeeplen list path ?path?...

 > Find the length of a nested sublist. The sublist is identified by
   the logical concatenation of the paths.  It returns the found
   length.  A non-existing sublist is assumed to have the length of 0.

|ldeeprange list path ?path?... first last

 > Extract a sublist from a nested list. This command is a convenient
   equivalent of

|   lrange [ldeepindex list path ?path?] first last]

 > "end" is supported as the first or last index.  It returns the
   extracted sublist.

|ldeepset listvar path ?path?... value

 > Set a value in a nested list variable. If the variable did not
   exist previously, it is created. If the intermediate lists
   specified in the path did not exist, they are created as empty
   lists. This includes the "filler" elements: for example, if listvar
   contained a list of one element and the path starts with "5 ...",
   the elements with indexes 1, 2, 3, 4 and 5 will all be created and
   the the further creation within the element at index 5 will
   proceed.

 > A special meaning is assigned by this command to the negative
   indexes in the path: they mean "add a new element at the end of the
   list". So this command also doubles as a nested version of
   lappend. For example,

|   ldeepset listvar -1 value

 > means the same thing as

|   lappend listvar value

 > This merging has happened because it's often neccessary to add
   elements to the lists in the middle of the path. The particular
   value used to indicate the addition of an element can be changed to
   something more symbolic, for example to "append" instead of a
   negative value.

 > The ''ldeepset'' command returns nothing. Since the version without
   value as in the common set can not be used, returning the value did
   not seem to make sense.  Also when experimenting with large lists
   from the command line, returning the value that is a large list
   itself would cause a long and unpleasant printout of it.

|ldeepincr lstvar path ?path?... int-value

 > Increase a value within a nested list by int-value. Note that since
   the amount of increase has to be differentiated from the path, it's
   mandatory even for the value of 1. This is a convenient and often
   used shortcut for the ldeepindex-expr-ldeepset sequence.  It
   returns the value of the element after increase.

|ldeeprep lstvar path ?path?... first last element-list

 > Replace a range of elements in a sublist identified by the path
   with the elements from the element-list.  It returns nothing.

 > This command is different from lreplace in two ways, hence the name
   change.  First, it acts on data in a variable, not on a list as an
   argument. Second, the elements for replacement are contained in a
   list, not as separate elements on command line. Both differences
   were created for convenience of practical use, plus to allow the
   path to pick up the variable number of arguments. I have found that
   I always need to replace elements in a variable, not as a
   pass-through operator, and that I almost always need to insert
   elements from another list, not just some fixed set of elements.

|ldeeppop lstvar path ?path?... count

 > Remove count elements from the end of the sublist identified by
   path in the variable and return them in a list.

 > This command was inspired by the pop operator in Perl. Somehow I've
   never has a very string need for the other similar commands (which
   would be ''ldeeppush'' to add elements, and ''ldeepshift'' and
   ''ldeepunshift'' for operations on the start of the list) but they
   can be easily added as well for completeness.

 > The command returns the list of the popped elements in the original
   order. For example, if lstvar contained {0 1 2 3 4 5}, "ldeeppop
   lstvar {} 2" would return {4 5}, NOT {5 4}.

~Other Extensions for List Support

In my practice I have found that a few other commands make working
with lists much more convenient. They are not directly related to the
nested lists but to the lists in general.

|lconcat sublist ?sublist?...

 > Concatenate the argument lists and return the resulting list.  This
   command is similar to concat but avoids converting the values to
   strings, concatenating the strings and then re-parsing the
   strings. When the lists involved grow to a few megabytes, concat
   becomes very inefficient both in the sense of time and memory
   usage. ''lconcat'' resolved this inefficiency. Note that it does
   ''not'' replace ''concat'', which can be used to assemble lists
   from pieces in different argument strings.  The command returns the
   concatenated list.
  
|mset list-of-variable-names list-of-values

 > Set the values from the value list to the variable in the variable
   list at the same index. The command name stands for "multiple
   set". This command is inspired by assignments in Perl.

 > If there were more variables than the values, the rest of variables
   are set to an empty value. If there were more values than
   variables, then the last variable is assigned the whole end of a
   list (as a list).

 > A special variable name "-" can be used to throw away a value, or
   the whole end of the value list if it's specified as the last one
   in the variable list.

 > The command returns nothing. It can be argued that it would make
   sense to return the length of the original list, or the difference
   between the length of the values list and the length of the
   variables list. I don't know which one is better.

 > This command is particularly convenient for returning multiple
   values from a procedure.  For example:

|   proc xxx {a b} {
|      return [list [expr $b+$a] [expr $b*$a]]
|   }
|   mset {sum product} [xxx 1 2]

|ldup count element ?element?...

 > This returns a list produced by duplicating the sequence of
   elements ''count'' times.  This command is inspired by the operator
   "x" in Perl, and it is a logical equivalent of:

|   proc ldup {count args} {
|      set res {}
|      for {set i 0} {$i < $count} {incr i} {
|         set res [lconcat $res $args]
|      }
|      return $res
|   }


|lvdup count element-list

 > This returns a list produced by duplicating the element-list count
   times.  This command is inspired by the operator "x" in Perl, and
   it is a logical equivalent of

|   proc lvdup {count list} {
|      set res {}
|      for {set i 0} {$i < $count} {incr i} {
|         set res [lconcat $res $list]
|      }
|      return $res
|   }

~Reference Implementation

The reference implementation is available as part of the Not A
Commander project [http://nac.sf.net/], the source file ''cutil.c''.
To include the new commands into Tcl, the error messages will have to
be adjusted to match the style used in Tcl, and the man pages will
have to be written.  The current implementation has been tested in
fair amounts for both correctness and efficiency by usage in the Not A
Commander project, the formal test suite would have to be written.
Further progress in this direction depends on acceptance of this
proposal.

~Copyright

This document has been placed in the public domain.