TIP 616: Tcl lists > 2^31 elements

Author:         Jan Nijtmans <[email protected]>
State:          Draft
Type:           Project
Vote:           Pending
Tcl-Version:    9.0
Tcl-Branch:     tip-616

Abstract

This TIP proposes extending the Tcl API for lists and dicts so that they can handle more than 2^31 elements. Many functions involved with parsing are extended in the same way.

Specification

The stub table is extended with 7 new functions:

int Tcl_ListObjGetElements(Tcl_Interp *interp, Tcl_Obj *listPtr,
	size_t *objcPtr, Tcl_Obj ***objvPtr)
int Tcl_ListObjLength(Tcl_Interp *interp, Tcl_Obj *listPtr,
	size_t *lengthPtr)
int Tcl_DictObjSize(Tcl_Interp *interp, Tcl_Obj *dictPtr, size_t *sizePtr)
int Tcl_SplitList(Tcl_Interp *interp, const char *listStr, size_t *argcPtr,
	const char ***argvPtr)
void Tcl_SplitPath(const char *path, size_t *argcPtr, const char ***argvPtr)
Tcl_Obj *Tcl_FSSplitPath(Tcl_Obj *pathPtr, size_t *lenPtr)
int Tcl_ParseArgsObjv(Tcl_Interp *interp, const Tcl_ArgvInfo *argTable,
	size_t *objcPtr, Tcl_Obj *const *objv, Tcl_Obj ***remObjv)

In addition, wrapper macros are placed around those 7 functions so that, depending on the size of the variable passed as the "size_t *" argument (either sizeof(int) or sizeof(size_t)), the correct stub entry is called: either the original one or the new one. Extensions can therefore continue to use "int", or change their code to use "size_t" as the variable type and gain the enhanced range for lists and dicts. This is how source compatibility is kept.
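The dispatch described above can be sketched in plain C. This is only an illustration of the technique, not the macro from the tip-616 branch: DemoListLength, DemoLengthInt, DemoLengthSize and CountWords are hypothetical names, and a space-separated string stands in for a Tcl list.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: count space-separated words in a string. */
static size_t CountWords(const char *s)
{
    size_t n = 0;
    int inWord = 0;

    for (; *s != '\0'; s++) {
        if (*s == ' ') {
            inWord = 0;
        } else if (!inWord) {
            inWord = 1;
            n++;
        }
    }
    return n;
}

/* Stand-in for the original stub entry: reports the length as an int. */
static int DemoLengthInt(const char *list, int *lengthPtr)
{
    size_t n = CountWords(list);

    if (n > (size_t)INT_MAX) {
        return 1;               /* does not fit in an int */
    }
    *lengthPtr = (int)n;
    return 0;
}

/* Stand-in for the new stub entry: reports the length as a size_t. */
static int DemoLengthSize(const char *list, size_t *lengthPtr)
{
    *lengthPtr = CountWords(list);
    return 0;
}

/*
 * Wrapper macro: selects the entry from the size of the variable that
 * lengthPtr points to. Both branches are compiled, but only the one
 * matching the caller's variable size is executed.
 */
#define DemoListLength(list, lengthPtr) \
    (sizeof(*(lengthPtr)) == sizeof(int) \
        ? DemoLengthInt((list), (int *)(void *)(lengthPtr)) \
        : DemoLengthSize((list), (size_t *)(void *)(lengthPtr)))
```

A caller that declares "int n" and passes &n reaches the int-based entry, while a caller that declares "size_t n" reaches the size_t-based entry on platforms where the two types differ in size. Neither caller needs to change anything else in its source.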

In Tcl 8.7, the same functions are added to the stub table, but there the functions are just wrappers calling the original functions. The goal is to allow the new API to be used in 8.7 too, but without actually being able to use more than 2^31 elements in lists/dicts.
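A wrapper of the kind described for Tcl 8.7 might look roughly like the following sketch. DemoList, DemoOrigListLength and DemoNewListLength are hypothetical stand-ins, not the code in the tip-616-for-8.7 branch; the point is only that the size_t-based entry forwards to the int-based one and widens the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a list whose length is stored as an int. */
typedef struct DemoList {
    int length;
} DemoList;

/* Stand-in for the original, int-based function. */
static int DemoOrigListLength(const DemoList *listPtr, int *lengthPtr)
{
    *lengthPtr = listPtr->length;
    return 0;
}

/*
 * Stand-in for the new, size_t-based function as an 8.7-style wrapper:
 * it calls the original function and widens the int result. The usable
 * range is therefore still that of an int.
 */
static int DemoNewListLength(const DemoList *listPtr, size_t *lengthPtr)
{
    int length = 0;
    int code = DemoOrigListLength(listPtr, &length);

    if (code == 0) {
        *lengthPtr = (size_t)length;
    }
    return code;
}
```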

The full list of API changes is here

Implementation

Implementation is in Tcl branch "tip-616".

There is also a minimal Tcl 8.7 implementation in Tcl branch "tip-616-for-8.7".

Caveat

There is one change in Tcl_Token that could cause compiler warnings: the commentSize and numTokens fields are extended and are now unsigned, since they can never be negative. Comparing those values with signed numbers will result in a compiler warning, even though the code keeps functioning fine. The solution is to always compare those fields with unsigned numbers. A type cast can also be used to silence the warning.
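The two remedies can be illustrated with a small sketch. DemoParse is a hypothetical struct that mirrors only the numTokens field, and the use of size_t as its type is an assumption made for this example; the warning in question is of the kind GCC reports under -Wsign-compare.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct carrying only the field discussed above. */
typedef struct DemoParse {
    size_t numTokens;           /* assumed unsigned type for this sketch */
} DemoParse;

/*
 * Remedy 1: keep the signed index and cast it for the comparison.
 * Writing "index < parsePtr->numTokens" directly is what triggers the
 * signed/unsigned comparison warning. The explicit negative check is
 * an addition of this sketch, so the cast never sees a negative value.
 */
static int DemoHasTokenSigned(const DemoParse *parsePtr, int index)
{
    if (index < 0) {
        return 0;
    }
    return (size_t)index < parsePtr->numTokens;
}

/* Remedy 2: use an unsigned index, so no cast is needed at all. */
static int DemoHasTokenUnsigned(const DemoParse *parsePtr, size_t index)
{
    return index < parsePtr->numTokens;
}
```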

Even though the API has changed to allow larger values, the additional range cannot yet be used everywhere internally. For example, commands still don't accept more than 2^31 elements, so lists larger than that cannot be used as commands. Work will continue to allow that in the future, but it is not done in this TIP yet.
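Code that obtains a size_t element count and then hands the elements to an interface still limited to int may want a range check along the following lines. DemoFitsCommandLimit is a hypothetical helper, not part of the proposed API, and the limit is assumed here to be INT_MAX.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 and stores the narrowed count when
 * objc fits in an int; returns 0 and leaves *argcPtr untouched
 * otherwise.
 */
static int DemoFitsCommandLimit(size_t objc, int *argcPtr)
{
    if (objc > (size_t)INT_MAX) {
        return 0;               /* too many elements for an int count */
    }
    *argcPtr = (int)objc;
    return 1;
}
```

A caller would report an error instead of narrowing the count when the helper returns 0.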

Compatibility

The additions for Tcl 8.7 are 100% upwards compatible with Tcl 8.6. The proposed changes for Tcl 9 are source compatible, but NOT binary compatible: All extensions compiled with Tcl 9 headers will have to be recompiled, but no source-code changes are necessary.

Copyright

This document has been placed in the public domain.