TIP 616: Tcl lists > 2^31 elements

Author:         Jan Nijtmans
State:          Final
Type:           Project
Vote:           Done
Tcl-Version:    9.0
Tcl-Branch:     tip-616
Vote-Summary:   Accepted 3/0/2
Votes-For:      JN, KBK, KW
Votes-Against:  none
Votes-Present:  FV, SL

Abstract

This TIP proposes extending the Tcl API for lists and dicts so that they can handle more than 2^31 elements. Many functions involved in parsing are extended in the same way.

Specification

The stub table is extended with 7 new functions:

int Tcl_ListObjGetElements(Tcl_Interp *interp, Tcl_Obj *listPtr,
	Tcl_Size *objcPtr, Tcl_Obj ***objvPtr)
int Tcl_ListObjLength(Tcl_Interp *interp, Tcl_Obj *listPtr,
	Tcl_Size *lengthPtr)
int Tcl_DictObjSize(Tcl_Interp *interp, Tcl_Obj *dictPtr, Tcl_Size *sizePtr)
int Tcl_SplitList(Tcl_Interp *interp, const char *listStr, Tcl_Size *argcPtr,
	const char ***argvPtr)
void Tcl_SplitPath(const char *path, Tcl_Size *argcPtr, const char ***argvPtr)
Tcl_Obj *Tcl_FSSplitPath(Tcl_Obj *pathPtr, Tcl_Size *lenPtr)
int Tcl_ParseArgsObjv(Tcl_Interp *interp, const Tcl_ArgvInfo *argTable,
	Tcl_Size *objcPtr, Tcl_Obj *const *objv, Tcl_Obj ***remObjv)

Wrapper macros are also placed around those 7 functions so that, depending on the size of the variable the "Tcl_Size *" argument points to (either sizeof(int) or sizeof(ptrdiff_t)), the correct stub entry is called: either the original one or the new one. This way, extensions can either continue to use "int" or change the code to use "Tcl_Size" as the variable type, which supports the enlarged range for lists and dicts. This is how source compatibility is kept.
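The dispatch described above can be illustrated with a small sketch. It is not the actual Tcl header code: DemoSize, DemoLengthInt, DemoLengthWide and DemoLength are hypothetical stand-ins, and the two functions merely report which "stub entry" was reached.

```c
#include <stddef.h>

/* Hypothetical stand-in for Tcl_Size (a pointer-sized signed integer). */
typedef ptrdiff_t DemoSize;

/* Tags that tell the caller which entry was reached (illustration only). */
enum { DEMO_INT_ENTRY = 1, DEMO_WIDE_ENTRY = 2 };

/* Stand-in for the original stub entry, which writes through an "int *". */
static int DemoLengthInt(int *lengthPtr)
{
    *lengthPtr = 3;
    return DEMO_INT_ENTRY;
}

/* Stand-in for the new stub entry, which writes through a "DemoSize *". */
static int DemoLengthWide(DemoSize *lengthPtr)
{
    *lengthPtr = 3;
    return DEMO_WIDE_ENTRY;
}

/*
 * Wrapper macro: the size of the pointed-to variable selects the entry.
 * sizeof is a compile-time constant, so only one branch is ever executed
 * for a given call site; the casts go through "void *" so that both
 * branches compile whichever pointer type the caller passes.
 */
#define DemoLength(lengthPtr) \
    (sizeof(*(lengthPtr)) == sizeof(int) \
	? DemoLengthInt((int *)(void *)(lengthPtr)) \
	: DemoLengthWide((DemoSize *)(void *)(lengthPtr)))
```

A caller that declares "int n;" and calls DemoLength(&n) reaches the int entry, while a caller that declares "DemoSize n;" reaches the wide entry on platforms where ptrdiff_t is wider than int. Neither caller has to change its source.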

Currently, the functions Tcl_ListObjGetElements, Tcl_ListObjLength, Tcl_DictObjSize and Tcl_SplitList return TCL_ERROR when the list/dict has invalid syntax. Starting with Tcl 9.0, those functions can also return TCL_ERROR when objcPtr/lengthPtr/sizePtr points to a variable of type int and the list/dict has more than 2^31 elements.
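The extra error case can be pictured as a range check performed before the length is stored into the caller's int. The sketch below only illustrates that idea; DemoStoreLengthAsInt, DEMO_OK and DEMO_ERROR are hypothetical names, and the real functions would report the failure as TCL_ERROR.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical result codes standing in for TCL_OK and TCL_ERROR. */
#define DEMO_OK    0
#define DEMO_ERROR 1

/*
 * Store a pointer-sized length into a caller-supplied int.
 * If the length does not fit, the int is left untouched and an
 * error is returned instead of silently truncating the value.
 */
static int DemoStoreLengthAsInt(ptrdiff_t actualLength, int *lengthPtr)
{
    if (actualLength < 0 || actualLength > INT_MAX) {
	return DEMO_ERROR;
    }
    *lengthPtr = (int)actualLength;
    return DEMO_OK;
}
```

An extension that keeps using int therefore has to check the return value even for lists it knows to be syntactically valid, because a sufficiently large list now produces an error as well.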

In Tcl 8.7, the same functions are added to the stub table, but there the functions are just wrappers calling the original functions. The goal is to allow the new API to be used in 8.7 too, but without actually being able to use more than 2^31 elements in lists/dicts.

The full list of API changes is here

Implementation

Implementation is in Tcl branch "tip-616".

There is also a minimal Tcl 8.7 implementation in Tcl branch "tip-616-for-8.7".

Caveat

There is one change in Tcl_Token that could cause compiler warnings: the commentSize and numTokens fields are widened and are now unsigned, since they can never be negative. Comparing those values with signed numbers will produce a compiler warning, even though there is no actual problem: the code keeps functioning fine. Solution: always compare those fields with unsigned numbers. A type cast can also be used to silence this warning.
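Both remedies can be sketched as follows, assuming a field of unsigned type as described in this caveat. DemoParse is a hypothetical structure with a single numTokens field, not the real Tcl declaration, and both helper functions are invented for illustration.

```c
#include <stddef.h>

/* Hypothetical structure with an unsigned count field. */
typedef struct DemoParse {
    size_t numTokens;
} DemoParse;

/*
 * Remedy 1: a type cast. Writing "index < parsePtr->numTokens" with a
 * signed index would trigger a signed/unsigned comparison warning.
 * Rejecting negative values first makes the cast to size_t safe.
 */
static int DemoIndexInRange(const DemoParse *parsePtr, int index)
{
    return index >= 0 && (size_t)index < parsePtr->numTokens;
}

/*
 * Remedy 2: use an unsigned loop variable, so the comparison is
 * unsigned against unsigned and no cast is needed.
 */
static size_t DemoCountTokens(const DemoParse *parsePtr)
{
    size_t i, count = 0;

    for (i = 0; i < parsePtr->numTokens; i++) {
	count++;
    }
    return count;
}
```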

Even though the API changed to allow larger values, the additional range cannot yet be used everywhere internally. For example, commands still don't accept more than 2^31 elements, so lists larger than that cannot be used as commands. Work to allow that will continue in the future, but it is not done in this TIP.

Addendum

After TIP #660 was accepted, many functions changed from using size_t to ptrdiff_t parameters. To prevent confusion, the TIP text above has been updated to reflect this change.

Compatibility

The proposed changes for Tcl 9 are source compatible, but NOT binary compatible: all extensions compiled with Tcl 9 headers will have to be recompiled, but no source-code changes are necessary. The additions for Tcl 8.7 are 100% upwards compatible with Tcl 8.6.

Copyright

This document has been placed in the public domain.