225.tip at [71abb0b4df]

Login

File tip/225.tip artifact 866298d78a part of check-in 71abb0b4df


TIP:            225
Title:          Arithmetic Series with Optimized Space Complexity
Version:        $Revision: 1.5 $
Author:         Salvatore Sanfilippo <[email protected]>
Author:         Miguel Sofer <[email protected]>
State:          Draft
Type:           Project
Vote:           Pending
Created:        25-Oct-2004
Post-History:   
Tcl-Version:    8.5

~ Abstract

This TIP proposes to add a new command to generate arithmetic
sequences as Tcl lists that may be stored in constant space in many
practical situations.  The only change from the point of view of the
Tcl programmer is the addition of a new command named '''range'''.

~ Rationale

An idiomatic way to assign successive elements of an arithmetic series
to a variable is to use the '''for''' command. Usually the loop
variable is initialized to the first element of the sequence, and
incremented at every iteration of a given step using '''incr'''. The
'''for''' test condition is used to limit the sequence generation to a
given element, like in the following example:

|for {set i 0} {$i < 10} {incr i} {
|    puts $i
|}

The Tcl programming language is at higher level than the C language,
where this idiom firstly appeared, so it may be desiderable to be able
to generate arithmetic sequences of integer numbers in a more
comfortable way. Being the Tcl list a central data structure of the
Tcl language, it apperas natural to generate a Tcl list of integers,
and possibly use the '''foreach''' command to loop over every element,
so that the above '''for''' loop can be translated into the following
fragment of code:

|foreach i [range 0 10] {
|    puts $i
|}

The '''range''' command can be also conveniently used in different
contexts. The following code generates a list of squares of 0, 1, 2,
3, ... 9.

|proc map {varname script mylist} {
|    upvar $varname var
|    set res {}
|    foreach var $mylist {
|        lappend res [uplevel 1 $script]
|    }
|    return $res
|}
|
|puts [map x {expr {$x*$x}} [range 0 10]]
|
|# Will output "0 1 4 9 16 25 36 49 64 81"

The '''range''' command can be implemented in a way that makes it
possible to internally store the arithmetic sequences genereated in
constant space if they are only accessed using '''foreach''',
'''llength''' and '''lindex''' commands ('''lrange''' may also be
handled in a special way).  When needed, the object will be converted
into a List object automatically.  From the Tcl programmer point of
view this optimization is transparent.

~ Specification of the Behaviour

The '''range''' command takes three arguments in the complete format,
named ''start'', ''end'' and ''step'', and generates a sequence of
integers accordingly to the following algorithm in pseudo code:

|RangeLen(start, end, step)
|1. if step = 0
|2.     then ERROR
|3. if start = end
|4.     then return 0
|5. if step > 0 AND start > end
|6.     then ERROR
|7. if setp < 0 AND end > start
|8.     then ERROR
|9. return 1+((ABS(end-start)-1)/ABS(step))
|
|Range(start, end, step)
|1. result <- EMPTY LIST
|2. len <- RangeLen(start, end, step)
|3. for i <- 0 to len - 1
|4.     result.append(start+(i*step))
|6. return result

The ''step'' argument can be omitted, and default to the value of 1,
so [['''range''' 0 10 1]] is the same as [['''range''' 0 10]]. It's
also possible to call the '''range''' command with a single argument,
omitting both the ''start'' and ''step'' argument that will default
respectively to 0 and 1, so that the following three commands will
generate the same sequence of integers:

|range 0 10 1
|range 0 10
|range 10

The following are examples of outputs.

|[range 0 10 1] => 0 1 2 3 4 5 6 7 8 9
|[range 10 0 -2] => 10 8 6 4 2
|[range 10 10] => empty list
|[range 10 20 -3] => ERROR
|[range 5] -> 0 1 2 3 4

Infinite series and series resulting in lists bigger than the maximum
list length that the Tcl code can handle are detected and reported as
an error. ''start'', ''end'', and ''step'' can be anything can fit
into a Tcl wide integer.

Note that there is a practical justification for the fact that the
elements generated never reach the value of the End argument, with the
effect of [['''range''' 0 10 1]] generating the sequence 0, 1, 2, ...,
9 and a range with the same value of ''start'' and ''end'' always
generating an emtpy list. This is needed in order to make it
comfortable to use '''range''' and '''foreach''' instead of '''for'''
loops like in the following example:

|foreach i [range [llength $mylist]] {
|    foobar [lindex $mylist $i]
|}

Because Tcl indexes are mostly zero-based, and it is often useful to
access every element of a sequence given it's length, this appears to
be the more sensible behaviour (this semantic is very similar to the
range() function of the Python programming language, where range() is
fully used to replace C-like ''for'' loops.)

Unfortunately this behaviour is not as comfortable to run the indexes
in reverse order:

|foreach i [range [expr {[llength $mylist]-1}] -1 -1] {
|    foobar [lindex $mylist $i]
|}

But the access from the first to the last element is far more common
in programs, and the range command needs to be consistent when the
step is negative.

An alternative syntax for reverse-indexing is:

|foreach i [range [llength $mylist]] {
|   foobar [lindex $mylist end-$i]
|}

~ Proposed Change

The change proposed is to modify the Tcl core in order to handle a new
object type called ArithSeries, that is recongnized and handled as a
special case by at least the '''llength''', '''lindex''' and
'''foreach''' commands. Syntactically, the ArithSeries object will
have the string representation that is exactly that that would be
produced by creating a list with the elements that would be iterated
over by '''foreach''' as previously described. This TIP also proposes
to add logic into ''SetListFromAny'' method of the List type in order
to convert an Arithmetic Series object into a List directly without to
pass from the string representation.

This TIP proposes to add a '''range''' command to the Tcl core having
the semantics specified above, and returning an Arithmetic Series
object.  Formally, the syntax is:

 > '''range''' ?''start''? ''end'' ?''step''?

The proposed changes are available as a Patch against HEAD that can be
found in the SourceForge Tcl patch 1052584
[http://sf.net/tracker/?func=detail&aid=1052584&group_id=10894&atid=310894]

~ Copyright

This document has been placed in the public domain.

----

~ Appendix: Reference Pure-Tcl Implementation

It may be useful to test the behaviour of the '''range''' command
without having to apply the Patch, so the following is a pure Tcl
implementation that should be exactly equivalent in the semantic to
the specification in this TIP, but of course not able to store ranges
in O(1) space.

|# RangeLen(start, end, step)
|# 1. if step = 0
|# 2.     then ERROR
|# 3. if start = end
|# 4.     then return 0
|# 5. if step > 0 AND start > end
|# 6.     then ERROR
|# 7. if setp < 0 AND end > start
|# 8.     then ERROR
|# 9. return 1+((ABS(end-start)-1)/ABS(step))
|proc rangeLen {start end step} {
|    if {$step == 0} {return -1}
|    if {$start == $end} {return 0}
|    if {$step > 0 && $start > $end} {return -1}
|    if {$step < 0 && $end > $start} {return -1}
|    expr {1+((abs($end-$start)-1)/abs($step))}
|}
|
|# Range(start, end, step)
|# 1. result <- EMPTY LIST
|# 2. len <- RangeLen(start, end, step)
|# 3. for i <- 0 to len - 1
|# 4.     result.append(start+(i*step))
|# 6. return result
|proc range args {
|    # Check arity
|    set l [llength $args]
|    if {$l == 1} {
|        set start 0
|        set step 1
|        set end [lindex $args 0]
|    } elseif {$l == 2} {
|        set step 1
|        foreach {start end} $args break
|    } elseif {$l == 3} {
|        foreach {start end step} $args break
|    } else {
|        error {wrong # of args: should be "range ?start? end ?step?"}
|    }
|
|    # Generate the range
|    set rlen [rangeLen $start $end $step]
|    if {$rlen == -1} {
|        error {invalid (infinite?) range specified}
|    }
|    set result {}
|    for {set i 0} {$i < $rlen} {incr i} {
|        lappend result [expr {$start+($i*$step)}]
|    }
|    return $result
|}

----

~ Discussion

Does the TIP include a C-level api to ranges, or are they transparent also in C - in the sense that they are addressable with any of the list-oriented functions of the Tcl api? What if any changes and caveats are necessary in the documentation of Tcl's C api? ''Miguel''

Ranges are transparent to C level too, in the proposed patch,
because the logic is put inside the commands,
so directly in the code implementing lindex, foreach, ...
In all the other cases, when a SetListFromAny() call occurs the
range is converted into a normal Tcl list object. ''Salvatore''