Tcl Source Code

New abstract list representations
Login

The proposal below introduces new Tcl_Obj internal representations based on the TIP 636 abstract list API. Because there are no changes to public Tcl commands or C interfaces, the changes will not go through a formal TIP process.

Motivation

The primary purpose of the changes is memory efficiency for large lists. As a side benefit, affected list operations are now constant time though at a minor cost in indexed access. This document quantifies the tradeoffs in order to justify the proposal.

The following new abstract list Tcl_Obj types are added.

  • repeatedList - holds repeated elements (lrepeat)

  • reversedList - holds elements of an existing list in reverse order (lreverse)

  • rangeList - holds elements of a range within an existing list (lrange)

All these share the following characteristics.

Benefits:

  • Most important, the resulting list is memory efficient, either through sharing memory with the source list or a more efficient internal representation. This makes certain list operations on large lists feasible that were previously impractical.

  • The associated operations (lrepeat for repeatedList, lreverse for reversedList etc.) are very fast (orders of magnitude for even moderately long lists).

  • There are likely other benefits for large lists that are difficult to measure like minimizing cache use, swapping and the cost of freeing the additional memory.

Drawbacks:

Indexing and iterating an abstract list is less efficient, roughly by 4-10%. Unlike the existing arithseries, which has a higher overhead due to additional memory allocation of the indexed element, here the efficiency loss is due to not having direct support in the byte code execution engine. In all likelihood, this can be eliminated by modifying the BCE if so deemed necessary for acceptance. For simplicity, the current implementation does not modify the BCE but instead only uses the abstract list representations for lengths greater than a threshold (100 at the time of writing) where the memory savings is seen as substantial.

Caveats:

All abstract lists, including the existing arithseries (lseq) are only memory efficient as long they are not modified. Any write operation shimmers them into a non-abstract list so for such cases abstract lists essentially do "lazy" shimmering, postponing the time and memory cost of modification to the actual modification. Read operations, which include iteration, are however frequent enough to be worth optimizing.

Metrics

All timings in the tables below are in microseconds as measured by timerate. Memory growth is the change in memory after a test run. Thus for both, smaller values are better.

The repeatedList abstract list

Generated by the lrepeat command. Comparison with 9.0.0 in memory use and performance in shown in the table below.

  • Memory growth number is of course driven by the largest repeat count, 1000000 in the table below, and assumes a single repeated element. However, corresponding benefits are seen (not in table below) for fewer counts and multiple repeated elements.

  • Unlike 9.0.0 which is linear with the repeat count, the time cost of the lrepeat operation is a small constant and is significantly faster even for a few hundred elements.

  • The time cost of indexing and iteration is higher by about 7-8% and 3-4% respectively.

                                                         9.0.0 |apn-tip636-appl
---------------------------------------------------------------|---------------
Memory growth                                     :   16612 KB |     144 KB
--
(len=10)        {lrepeat $len x}                  :      0.131 |      0.135
(len=99)        {lrepeat $len x}                  :      0.323 |      0.314
(len=101)       {lrepeat $len x}                  :      0.322 |      0.113
(len=1000)      {lrepeat $len x}                  :      1.894 |      0.113
(len=10000)     {lrepeat $len x}                  :     18.327 |      0.113
(len=100000)    {lrepeat $len x}                  :    183.535 |      0.113
(len=1000000)   {lrepeat $len x}                  :   2107.770 |      0.113
(len=10)        {lindex $l $ix}                   :      0.100 |      0.099
(len=99)        {lindex $l $ix}                   :      0.100 |      0.100
(len=101)       {lindex $l $ix}                   :      0.100 |      0.109
(len=1000)      {lindex $l $ix}                   :      0.099 |      0.108
(len=10000)     {lindex $l $ix}                   :      0.101 |      0.108
(len=100000)    {lindex $l $ix}                   :      0.099 |      0.109
(len=1000000)   {lindex $l $ix}                   :      0.099 |      0.107
(len=10)        {foreach v $l {}}                 :      1.974 |      1.935
(len=99)        {foreach v $l {}}                 :     16.650 |     16.401
(len=101)       {foreach v $l {}}                 :     17.000 |     17.705
(len=1000)      {foreach v $l {}}                 :    165.182 |    170.577
(len=10000)     {foreach v $l {}}                 :   1645.150 |   1698.710
(len=100000)    {foreach v $l {}}                 :  16556.300 |  17017.000
(len=1000000)   {foreach v $l {}}                 : 164621.300 | 169603.200

The reversedList abstract list:

Generated by the lreverse command by sharing the memory in the source list. Comparison with 9.0.0 in memory use and performance in shown in the table below.

  • Memory growth is independent of list size since the original source memory is shared.

  • Time of the lreverse operations is again a small constant and not linear in list size as in 9.0.0.

  • Indexing and iteration incurs a penalty of about 6-7% and 3-4% respectively.

                                                         9.0.0 |apn-tip636-appl
---------------------------------------------------------------|---------------
Memory growth                                     :    8800 KB |     144 KB
--
(len=10)        {lreverse $l}                     :      0.177 |      0.178
(len=99)        {lreverse $l}                     :      0.444 |      0.469
(len=101)       {lreverse $l}                     :      0.458 |      0.133
(len=1000)      {lreverse $l}                     :      3.086 |      0.132
(len=10000)     {lreverse $l}                     :     29.436 |      0.132
(len=100000)    {lreverse $l}                     :    291.879 |      0.132
(len=1000000)   {lreverse $l}                     :   3263.890 |      0.133
(len=10)        {lindex $l $ix}                   :      0.100 |      0.100
(len=99)        {lindex $l $ix}                   :      0.101 |      0.100
(len=101)       {lindex $l $ix}                   :      0.101 |      0.107
(len=1000)      {lindex $l $ix}                   :      0.100 |      0.106
(len=10000)     {lindex $l $ix}                   :      0.100 |      0.107
(len=100000)    {lindex $l $ix}                   :      0.100 |      0.110
(len=1000000)   {lindex $l $ix}                   :      0.100 |      0.107
(len=10)        {foreach v $l {}}                 :      1.988 |      1.949
(len=99)        {foreach v $l {}}                 :     17.060 |     16.409
(len=101)       {foreach v $l {}}                 :     17.007 |     17.363
(len=1000)      {foreach v $l {}}                 :    166.382 |    168.623
(len=10000)     {foreach v $l {}}                 :   1651.030 |   1687.490
(len=100000)    {foreach v $l {}}                 :  16529.900 |  16880.700
(len=1000000)   {foreach v $l {}}                 : 165598.600 | 169084.200

The rangeList abstract list:

The purpose of rangeList differs from the other abstract list types above. 9.0.0 already optimizes the range operation for both existing list types, list and arithseries so as seen in the table below, comparisons fall into statistical noise.

                                                         9.0.0 |apn-tip636-appl
---------------------------------------------------------------|---------------
Memory growth                                     :     144 KB |     144 KB
--
(len=10)        {lrange $l 1 end-1}               :      0.132 |      0.142
(len=99)        {lrange $l 1 end-1}               :      0.410 |      0.419
(len=101)       {lrange $l 1 end-1}               :      0.417 |      0.428
(len=1000)      {lrange $l 1 end-1}               :      0.120 |      0.124
(len=10000)     {lrange $l 1 end-1}               :      0.118 |      0.124
(len=100000)    {lrange $l 1 end-1}               :      0.118 |      0.124
(len=1000000)   {lrange $l 1 end-1}               :      0.117 |      0.123
(len=10)        {lindex $l $ix}                   :      0.100 |      0.101
(len=99)        {lindex $l $ix}                   :      0.100 |      0.101
(len=101)       {lindex $l $ix}                   :      0.101 |      0.101
(len=1000)      {lindex $l $ix}                   :      0.101 |      0.102
(len=10000)     {lindex $l $ix}                   :      0.101 |      0.102
(len=100000)    {lindex $l $ix}                   :      0.101 |      0.102
(len=1000000)   {lindex $l $ix}                   :      0.100 |      0.102
(len=10)        {foreach v $l {}}                 :      1.618 |      1.633
(len=99)        {foreach v $l {}}                 :     16.439 |     16.110
(len=101)       {foreach v $l {}}                 :     16.724 |     16.544
(len=1000)      {foreach v $l {}}                 :    164.658 |    161.976
(len=10000)     {foreach v $l {}}                 :   1656.320 |   1617.200
(len=100000)    {foreach v $l {}}                 :  16446.800 |  16141.000
(len=1000000)   {foreach v $l {}}                 : 164901.700 | 161559.800

The motivation for rangeList is to prevent unnecessary shimmering of the newly added repeatedList and reversedList types as well as abstract lists implemented in extensions. Since these do not exist in 9.0.0, the question of comparison does not arise. However, the performance of range operations on the new abstract list types is also a consideration. The corresponding table is shown below.

As seen, range operations on abstract lists are memory efficient but incur about a 15% cost on indexing and 7-8% cost on iteration compared to range operations on non-abstract lists.

                                                   apn-tip636-appl
------------------------------------------------------------------
Memory growth                                     :     144 KB

--
(len=10)        {lrange $l 1 end-1}               :      0.141
(len=99)        {lrange $l 1 end-1}               :      0.425
(len=101)       {lrange $l 1 end-1}               :      0.431
(len=1000)      {lrange $l 1 end-1}               :      0.118
(len=10000)     {lrange $l 1 end-1}               :      0.125
(len=100000)    {lrange $l 1 end-1}               :      0.117
(len=1000000)   {lrange $l 1 end-1}               :      0.117
(len=10)        {lindex $l $ix}                   :      0.101
(len=99)        {lindex $l $ix}                   :      0.100
(len=101)       {lindex $l $ix}                   :      0.101
(len=1000)      {lindex $l $ix}                   :      0.115
(len=10000)     {lindex $l $ix}                   :      0.116
(len=100000)    {lindex $l $ix}                   :      0.118
(len=1000000)   {lindex $l $ix}                   :      0.125
(len=10)        {foreach v $l {}}                 :      1.611
(len=99)        {foreach v $l {}}                 :     16.084
(len=101)       {foreach v $l {}}                 :     16.487
(len=1000)      {foreach v $l {}}                 :    178.548
(len=10000)     {foreach v $l {}}                 :   1770.350
(len=100000)    {foreach v $l {}}                 :  19182.900
(len=1000000)   {foreach v $l {}}                 : 177549.800

Status

The apn-tip636-appl branch contains the implementation of the proposal. The listTypes.test file contains the test cases to exercise the three new internal list types as well as the existing ones. gcov shows 91% code and 100% function coverage of the new implementation. Missing coverage is for "panic blocks" and validation which is not exercised as it is already done by upper callers.