Author: Brian Griffin <[email protected]>
Author: Ashok P. Nadkarni <[email protected]>
State: Draft
Type: Project
Vote: Pending
Created: 1-Sep-2022
Tcl-Version: 9.1
Tcl-Branch: tip-636-tcl9
- Abstract
- Specification
- Alterative - Extending Tcl_ObjType
- Requirements and Caveats
- Examples
- Implementation
Abstract
The most frustrating part of implementing custom Tcl_Obj types is that they often shimmer to a built-in Tcl type when performing typical Tcl operations on the value. This removes the advantages of having a custom type. The reason for having a custom type is usually due to either computational advantage, space advantage, or both. Maintaining this advantage is a worthy goal. Having access to built-in Tcl list operations makes the custom type even more worthwhile.
This TIP proposes to add a Tcl_Obj type, or possibly extend the
Tcl_Obj type (see TIP-644), to provide a means to define new concrete
value types that behave like a List without specifically having an
internal List storage implementation. Some or all of the script level
List operations, such as lindex
, llength
, and foreach
, would be
supported for the new type without causing the obj to shimmer to the
List type.
Specification
In order to prevent a values' type from being converted to a List type, custom functions are need to perform the List operations using the existing internal representation of the value. There are 2 possible ways to support this customized behavior for an arbitrary type. The custom Tcl_AbstractList type provides a structure with a function tray that can be used to define a new concrete type. This is implemented as a wrapper around the Tcl_ObjTyp that provides additional capability. The implementor of a new Abstract List type will provide its own internal representation along with a set of functions necessary to perform the particular actions needed for the corresponding List operation. The proposed functions are:
Tcl_Obj* DupRepProc()
Tcl_Wideint LengthProc(obj)
int IndexProc(interp,obj,index,elemObjPtr)
int SliceProc(interp,obj,from,to,newObjPtr)
int ReverseProc(interp,obj,newObjPtr)
void ToStringProc(obj)
void freeRepProc()
int GetElementsProc(interp,obj,start,count,objcPtr,objvPtr)
int ReplaceProc(interp,obj,first,numToDelete,numToInsert,insertObjs)
Tcl_Obj* SetElementProc (interp, listPtr, indicies, valueObj)
The Length and Index functions are required for basic support. The rest are optional, however, most of these related List operations would result in the value shimmering to the List type if the function is not provided. Some functions may have a fallback mechanism that rely on the Length and Index functions.
Abstract List C API
The AbstractList C API calls are used internally by the List functions when the given listObj is an AbstractList type.
Tcl_Obj * Tcl_AbstractListObjNew(interp, abstractListType)
Tcl_AbstractListType *
Tcl_AbstractListGetType(objPtr)
void Tcl_AbstractListSetConcreteRep(objPtr, repPtr)
void * Tcl_AbstractListGetConcreteRep(objPtr)
Tcl_WideInt Tcl_AbstractListObjLength(objPtr)
Tcl_Obj * Tcl_AbstractListObjIndex(objPtr, index)
Tcl_Obj * Tcl_AbstractListObjRange(objPtr,fromIdx, toIdx)
int Tcl_AbstractListObjReplace(interp,obj,first,
numToDelete,numToInsert,insertObjs)
Tcl_Obj * Tcl_AbstractListObjReverse(objPtr)
int Tcl_AbstractListObjGetElements(interp,objPtr,
start,count,objcPtr,objvPtr)
int Tcl_AbstractListObjSortElements(interp,TBD)
Tcl_Obj * Tcl_AbstractListObjCopy(interp, listPtr)
Tcl_Obj * Tcl_AbstractListSetElement(interp, listPtr,
indicies, valueObj)
The Tcl_AbstractListType struct used by implementors to provide the necessary List capatability functions:
typedef struct Tcl_AbstractListType {
Tcl_AbstractListVersion version;/* Structure version */
const char *typeName; /* Custom value reference */
/* List emulation functions */
Tcl_ALNewObjProc *newObjProc; /* How to create a new Tcl_Obj of this
** custom type */
Tcl_ALDupRepProc *dupRepProc; /* How to duplicate a internal rep of this
** custom type */
Tcl_ALLengthProc *lengthProc; /* Return the [llength] of the
** AbstractList */
Tcl_ALIndexProc *indexProc; /* Return a value (Tcl_Obj) for
** [lindex $al $index] */
Tcl_ALSliceProc *sliceProc; /* Return an AbstractList for
** [lrange $al $start $end] */
Tcl_ALReverseProc *reverseProc; /* Return an AbstractList for
** [lreverse $al] */
Tcl_ALGetElements *getElementsProc; /* Return an objv[] of all elements in
** the list */
Tcl_ALReplaceProc *replaceProc; /* Replace or remove subset of
** elements. */
Tcl_ALFreeConcreteRep *freeRepProc; /* Free ConcreteRep internals if
** necessary */
Tcl_ALToStringRep *toStringProc; /* Optimized "to-string" conversion
** for updating the string rep */
Tcl_ALToListProc *toListProc; /* Convert Abstract List
** to a List, presumably more
** efficiently than creating an
** intermediate string rep */
Tcl_ALSetElement *setElementProc; /* Replace the element at the
** index with the given valueObj */
Tcl_ALSortProc *sortProc; /* Sort list in a specified ordering. */
} Tcl_AbstractListType;
Alterative - Extending Tcl_ObjType
An alternative approach is to extend the exiting Tcl_ObjType with the additional List oriented function tray. This approach can be facilitated via TIP-644. The set of functions are essentially the same for either approach.
With this approach, the Tcl_AbstractList C API is not necessary. The "work" done by this API would be handled directly within the Core. Custom types would only need to supply the necessary list functions as part of the Tcl_ObjType definition.
Requirements and Caveats
In all cases, the string representation for an Abstract List value
must always be a valid List as defined by the Tcl language syntax and
the list command. An Abstract List will always successfully
"SetListFromAny" to a List type, memory limits aside. It is not a
requirement that a string representation of a list be recognized by a
particular Abstract List Type; there are some cases where this is not
reasonably possible. For example, the ArithSeries, used in the lseq
command, has only start
, step
, and count
stored in the internal
representation and uses math to compute indexed values. Taking a list
of number and computing the start, step, and count and then validate,
although possible, may not be worth the effort if vary large lists are
involved.
Examples
A simple example of this concept is the new lseq command (TIP-629), which implements a custom ArithSeries type that contains 4 values that represent a list. The length and index values are computed on demand based on 3 of the 4 numeric values by simple arithmetic. It does not compute and fill a complete list up front. Shimmering is thus prevented in many cases. See branch tip-636-tcl9 which (re)implements the ArithSeries using AbstractLists. Branch tip-636-tcl9-644 implements the in-core approach.
Other examples of possible Abstract Lists
It turns out that [dict] already implements special treatment within the foreach command in order to avoid shimmering of a dict to a list. It would be possible to avoid many, if not all, dict to list shimmering by using the Abstract List approach.
Other suggested examples:
- matrix values (there are several existing implementations)
- bit-field arrays
- ByteArray
- strings
- polynomial equations:
[lpoly 5 2 -3] -> y=c0*x^0+c1*x^1+c2*x^2
..., wherec0=5
,c1=2
,c3=-3
, ... - database:
foreach row [lselect $db {FROM * WHERE name LIKE "Griffin"}]
Implementation
See branch tip-636-tcl9, and tip-636-tcl9-644
Note: this is work that germinated from TIP-225. Salvatore Sanfilippo and Miguel Sofer should be recognized for their contribution.
Impact
On entry to the various list operations, both in the command implementation and in the bytecode engine, an initial test is made to see if the Obj is an AbstractList. A test is already made to see if the Obj is a List before calling SetListFromAny. The impact should be minimal. Note also that a similar test already exists in some limited cases for the Dict type.
It is conceivable that the traditional List implementation be (re)implemented as an AbstractList as well, then the additional overhead is simplified and minimized.