# TIP 22: Multiple Index Arguments to lindex
Author: David Cuthbert <[email protected]>
Author: Kevin Kenny <[email protected]>
Author: Don Porter <[email protected]>
Author: Donal K. Fellows <[email protected]>
State: Final
Type: Project
Vote: Done
Created: 19-Jan-2001
Post-History:
Discussions-To: news:comp.lang.tcl,mailto:[email protected]
Keywords: lindex,multiple arguments,sublists
Tcl-Version: 8.4a2
-----
# Abstract
Obtaining access to elements of sublists in Tcl often requires nested
calls to the _lindex_ command. The indices are syntactically listed
in most-nested to least-nested order, which is the reverse from other
notations. In addition, the nesting of command substitution brackets
further decreases readability. This proposal describes an extension
to the _lindex_ command that allows it to accept multiple index
arguments, in least-nested to most-nested order, to automatically
extract elements of sublists.
# Rationale
The heterogeneous nature of Tcl lists allows them to be applied to a
number of useful data structures. In particular, lists can contain
elements that are, themselves, valid lists. In this document, these
elements are referred to as _sublists._
Extracting elements from sublists often requires nested calls to
_lindex._ Consider, for example, the following Tcl script that
prints the center element of a 3-by-3 matrix:
set A {{1 2 3} {4 5 6} {7 8 9}}
puts [lindex [lindex $A 2] 2]
When these calls are deeply nested - e.g., embedded in an _expr_
arithmetic expression, having results extracted through _lrange,_
etc. - the results are difficult to read:
# Print the sum of the center indices of two 3x3 matrices
set p [expr {[lindex [lindex $A 2] 2] + [lindex [lindex $A 2] 2]}]
# Get all but the last font in the following parsed structure:
set pstruct {text {ignored-data
{ ... }
}
{valid-styles
{justifiction {left centered right full}}
{font {courier helvetica times}}
}
}
return [lrange [lindex [lindex [lindex $pstruct 1] 2] 2] 0 end-1]
Note that the list of indices in the latter example is listed in the
reverse order of vector indices. In most other languages/domains, the
last line might take on one of the following forms:
return list_range(pstruct[2][2][1], 0, end-1);
return pstruct[[2, 2, 1]][[0:-1]]
temp = pstruct(2, 2, 1);
result = range(temp, 0, length(temp) - 1);
Allowing the _lindex_ command to accept multiple arguments would
allow this more-natural style of coding to be written in Tcl.
# Specification
1. Under this proposal, the syntax for the _lindex_ command is to be
modified to accept one of two forms:
lindex list indexList
or
lindex list index1 index2...
> In either form of the command, the _list_ parameter is
expected to be a well-formed Tcl list.
> In the first form of the command, the _indexList_ argument is
expected to be a Tcl list comprising one or more list indices,
each of which must be an integer, the literal string _end_, or
the literal string _end-_ followed by an integer with no
intervening whitespace. The existing _lindex_ command is a
degenerate form of this first form, where the list comprises a
single index.
> In the second form of the command, each of the _index_
arguments is expected to be a list index, which again may be an
integer, the literal string _end_, or the literal string
_end-_ followed by an integer with no intervening whitespace.
2. In either form of the command, once the _indexList_ parameter
is expanded, there is a single _list_ parameter and one or
more _index_ parameters. If there is a single _index_
parameter _N_, the behavior is identical to today's _lindex_
command, and the command returns the _N_th element of
_list_; if _N_ is less than zero or at least the length of
the list, a null string is returned.
3. If more than one _index_ parameter is given, then the behavior is
defined recursively; the result of
lindex $list $index0 $index1 ...
> or
lindex $list [list $index0 $index1 ...]
> is indentical to that of
lindex [lindex $list $index0] $index1...
> or, equivalently,
lindex [lindex $list $index0] [list $index1...]
> \(This specification does not constrain the implementation, which
may be iterative, recursive, or even expanded inline.\)
4. When an invalid index is given, an error of the form, _bad index
"invalid\_index": must be integer or end?-integer?_, where
_invalid\_index_ is the first invalid index encountered, must be
returned.
5. If the list argument is malformed, the error resulting from an
attempt to convert the list argument to a list must be returned.
This behaviour is unchanged from the current implementation.
# Side Effects
1. Whether the result of the _lindex_ operation is successful, the
underlying Tcl\_Obj that represents the list argument may have its
internal representation invalidated or changed to that of a list.
# Discussion
Some attention must be paid to giving the _lindex_ command adequate
performance. In particular, the implementation should address the
common case of
lindex $list $i
where _$i_ is a "pure" integer \(that is, one whose string
representation has not yet been formed\). If the above specification
is followed naively, the flow will be as follows.
Since _objc_ is three, _objv[[2]](2.md)_ is expected to be a list.
Since it is not, it must be converted from its string representation.
It does not have one yet, so the string representation must be formed.
Now the string representation is parsed as a list. An array \(of
length one\) of Tcl\_Obj pointers is allocated to hold the list in
question, and a Tcl\_Obj is allocated to hold the single element.
Memory is allocated to hold the element's string representation.
Now the _lindex_ command converts the first element of the list to
an index \(in this case an integer\).
This elaborate ballet of type shimmering requires converting the
integer to a string and back again. It also requires four calls to
_ckalloc:_
1. Allocate a buffer for the string representation.
2. Allocate the array of Tcl\_Obj pointers for the list
representation.
3. Allocate the Tcl\_Obj that represents the first \(and only\) element
of the list.
4. Allocate a buffer for the string representation of that element.
And at the end, the result is the same integer that was passed as a
parameter originally.
To avoid all this overhead in the common case, the proposed
implementation shall \(in the case where _objc==3_\)
1. Test whether _objv[[2]](2.md)_ designates an object whose internal
representation holds an integer. If so, simply use it as an
index.
2. Test whether _objv[[2]](2.md)_ designates an object whose internal
representation holds a list. If so, perform the recursive
extraction of indexed elements from sublists described above.
3. Form the string representation of _objv[[2]](2.md)_ and test whether
it is _end_ or _end-_ followed by an integer. If so, use it
as an index.
4. Attempt to coerce _objv[[2]](2.md)_ to an integer; if successful, use
the result as an integer.
5. Attempt to coerce _objv[[2]](2.md)_ to a list; if successful, use the
result as an index list.
6. Report a malformed _index_ argument; the _indexList_ parameter
is not a well-formed list.
This logic handles all the cases of singleton lists transparently; it
is effectively a simple-minded type inference that optimizes away
needless conversions.
Assuming that the related [[33]](33.md) is approved, this logic will most
likely be combined with the identical logic required in that proposal
for parsing _index_ arguments to the _lset_ command.
----
# Comments
_Don Porter <[email protected]>_
> I agree that it would be helpful to many programmers to
provide a multi-dimensional array data structure that can
be accessed in the manner described in this TIP. In the
_struct_ module of _tcllib_, several other data structures
are being developed: graph, tree, queue, stack. I would support
adding another data structure to that module that provides an
interface like the one described in this TIP, with the intent that
all of these helpful data structures find their way into the
BI distribution.
> I don't see any advantage to adding complexity to [lindex]
as an alternative to development of a multi-dimensional array
structure. Without a compelling advantage, I'm inclined against
making [lindex] more complex. I like having Tcl's built-in
commands provide primitive operations, and leave it to
packages to combine the primitives into more useful, more
complex resources.
> This TIP should also consider how any changes to [lindex] mesh
with the whole [listx] overhaul of Tcl's [list] command that
has been discussed.
_Dave Cuthbert <[email protected]> responds_
> Don makes a good point -- with a good set of data structures in
tcllib, the need for this TIP is lessened or even eliminated.
Nonetheless, I see this as a way of implementing the structures he
describes. In other words, a more powerful primitive \(which, in
reality, adds fairly little complexity when measured in number of
lines of code changed\) would benefit these structures.
> As for the [listx] overhaul, there are many competing proposals
for the specification it is difficult to come up with a metric. In
writing this TIP, I assumed a vacuum -- that is, a listx command
would not be added to the core in the near future.
_Donal K. Fellows <[email protected]> points out_
> Although there is tcllib and [listx] to think about, they are
certainly not reasons for rejecting this TIP out of hand. The
availability of tcllib is not currently anything like universal
\(not stating whether this is a good, bad or ugly thing\) and all the
[listx] work will need its own TIP to make it into the core \(you
tend to have availability problems if it is an extension.\) It is
not as if the core is short of syntactic sugar right now \(the
[foreach] command is ample demonstration of this.\)
_Don Porter <[email protected]> follows up_
> I'll leave the discussion above in place so the history
of this TIP is preserved, but I have withdrawn my
objection.
----
There was quite a discussion on news:comp.lang.tcl about using lindex
to return multiple arguments. For example:
% set list {a {b1 b2} c d e}
% lindex $list 1
b1 b2
% lindex $list 1 0
{b1 b2} a
% lindex $list {1 0}
b1
In other words, the list index arguments can, themselves, be lists.
Only when the argument is a list would the "recursive selection"
procedure of the TIP be used. For multiple arguments, the behaviour
is akin to
lindex $list a b c ->
list [lindex $list a] [lindex $list b] [lindex $list c]
_Summarised by Dave Cuthbert <[email protected]>_
_Donal K. Fellows <[email protected]> points out_
> The problems with the above version of multiple indexing are that
it loses the property that [lindex] always returns a single
element \(making writing robust code harder\) and that it forces use
of the [list] constructor a lot or inefficient type handling when
some of the indices must be computed. Then there is the whole
question of what happens when you have indexes that are lists of
lists, which is a major can of worms.
> Luckily, we could always put this sort of behaviour into a separate
command \(e.g. called [lselect]\) which addresses at least the
majority of my concerns, and which \(in my opinion\) need not even
form part of this TIP.
_Dave Cuthbert <[email protected]> adds_
> I intentionally left [lselect] out of the original TIP \(and it
is still not present in the 02-April-2001 version\). As Donal points
out, it is a major can of worms and, though there was general
agreement on c.l.t that such a command would be useful, people had
differing opinions on what form it should take.
> Perhaps my view on TIPs is incorrect, but I try to include only
sure-fire "yeah, we ought to have done that a few versions ago"
items.
----
# Notes on History of this TIP
_This TIP was originally written by Dave Cuthbert <[email protected]>,
but ownership has passed \(beginning of April\) to Kevin Kenny
<[email protected]>._
This TIP underwent substantial revision in May of 2001, to add the
syntax where all the _index_ parameters could be grouped as a list
rather than placed inline on the command line.
# See Also
[[33]](33.md).
# Copyright
This document has been placed in the public domain.