Author: Ashok P. Nadkarni <[email protected]>
State: Draft
Type: Project
Vote: Pending
Created: 14-Nov-2022
Tcl-Version: 9.1
Tcl-Branch: tip-649
Keywords: list
Abstract
Although not directly dependent on it, this TIP is not being implemented unless TIP 636 is passed.
The list implementation has functionality that is available at the script level but not in the C API. This TIP proposes to expose the same as a C API as well. The benefits include
- convenience for extension and application writers that need to manipulate lists at the C level by voiding the need to implement the same functionality themselves
- significantly better performance as the currently defined list related C API is implicitly built around the 8.6 list internal representation and does not allow callers to benefit from implementation of more efficient higher level operations.
- prevention of shimmering from the more efficient list representations made possible by TIP 636 (for example, arithmetic series) to the memory intensive 8.6 implementation.
Rationale
Efficiency of list manipulation
Many applications and extensions for Tcl as well as Tk use Tcl's C API to
manipulate Tcl list values. For example, reversing or extracting a range of
elements is accomplished through a combination of
Tcl_ListObjGetElements
/Tcl_NewListObj
or a loop involving
Tcl_ListObjIndex
/Tcl_ListObjAppendElement
. Given the internal
representation of lists in Tcl 8.6, this was also how the Tcl core would
implement these operations and thus there were no significant efficiency
gains that would result from exposing the higher level operations as a C API.
This has changed with Tcl 9.0 where list internal representation may take one of several forms (see TIP's 625, 629, 636). For many of these, some list operations may be done more efficiently than allowed by the restrictive Tcl 8.6 list C API. For example, range and list reversal can now be faster and more memory efficient by orders of magnitude for large lists.
The script level list commands benefit from these and it makes sense to extend the same to C applications and extensions as well.
Efficiency of list access
While the above pertain to list manipulation, the list accessor functions
Tcl_ListObjGetElements
and Tcl_ListObjIndex
also have their own issues
arising from an implicit assumption about the list internals.
Tcl_ListObjGetElements
works well when the list internal representation
stores elements as an array of Tcl_Obj *
pointer values.
Tcl_ListObjIndex
assumes that the list implementation holds a reference to
the returned Tcl_Obj *
element so the caller does not need to do any
explicit reference management with the returned pointer. Neither of these
assumptions necessarily hold true with newer list representations. For
example, the arithmetic series stores the list in compact (start, end, step)
form and generates a Tcl_Obj
only when required. However, because of
Tcl_ListObjIndex
semantics, the list implementation has to hold on to a
reference to the returned element Tcl_Obj
essentially shimmering to the
equivalent Tcl_Obj *
array at a cost in memory and efficiency.
For this reason, the TIP proposes an accessor function that is almost
identical to Tcl_ListObjIndex
but requires the caller to release
the returned Tcl_Obj
by calling Tcl_DecrRefCount
. The list
implementation does not then need to artificially hold on to a reference
to the returned `Tcl_Obj** just in order to free it later.
As an aside, note that the above drawbacks do not apply to the Tcl core implementation itself, including implementation of script commands, as Brian has already taken care to compensate appropriately. This is not possible for extension writers as they do not have access to internal Tcl list type information.
Specification
The following functions will be added along with corresponding entries in the stubs table. (Names all subject to change)
int Tcl_ListObjRange(
Tcl_Interp *interp, /* Interpreter for error messages */
Tcl_Obj *listPtr, /* Unshared list object */
Tcl_Size first, /* Index of first element of range */
Tcl_Size last); /* Index of last element of range */
int Tcl_ListObjReverse(
Tcl_Interp *interp, /* Interpreter for error messages */
Tcl_Obj *listPtr); /* Unshared list object */
int Tcl_ListObjRepeat(
Tcl_Interp *interp, /* Interpreter for error messages */
Tcl_Obj *srcListPtr, /* List object containing elements to repeat */
Tcl_Size repeatCount, /* Number of times to repeat */
Tcl_Obj **newListPtr) /* Where to store the new list */
int Tcl_ListObjSort(
Tcl_Interp *interp, /* Interpreter for error messages */
Tcl_Obj *listPtr, /* Unshared list object */
TBD);
int Tcl_ListObjIndexWithRef(
Tcl_Interp *interp, /* Interpreter for error messages */
Tcl_Obj *listPtr, /* Shared or unshared list object */
Tcl_Size index, /* Index of element to retrieve */
Tcl_Obj **elementObjPtr) /* Where to store the element. Caller
needs to free it */
All functions return TCL_OK on success and TCL_ERROR on failure. Functionality should be obvious.
Discussion
The functions could be slightly altered to return a new Tcl_Obj
and
allow listPtr
to be a shared object. However, that would be inconsistent
with the current Tcl_ListObjReplace
definition.
Applications should be encouraged to use Tcl_ListObjIndexWithRef
to
avoid shimmering.
Implementation
Pending until TIP 636 passes.
Copyright
This document has been placed in the public domain.