TIP 660: Use signed types for lengths and indices

Author:		Ashok P. Nadkarni <[email protected]>
State:		Final
Type:		Project
Vote:		Done
Created:	30-03-2023
Tcl-Version:	8.7
Tcl-Branch:	tip-660
Vote-Summary:	Accepted 9/0/0
Votes-For:	SL, BG, DF, KK, MC, KW, JD, FV, AK
Votes-Against:	none
Votes-Present:	none

Abstract

Tcl 8.x used the signed int type for indexing and lengths both internally as well as in the public API. This was changed for 9.0, primarily via TIP 494, to the unsigned integer type size_t, typedef'ed as Tcl_Size.

This TIP proposes changing the Tcl_Size type to ptrdiff_t for Tcl 9. For practical purposes, this provides the same width but as a signed integer type. For Tcl 8.7, where Tcl_Size is already typedef'ed as int, there is no change. It is targeted in this TIP only because of the addition of a new exported stubs function Tcl_GetSizeIntFromObj and a #define TCL_SIZE_MAX.

Rationale

TL;DR Changing signed types to unsigned will impose a significant burden on extension writers porting extensions to 9.0 for no tangible benefit. This additional work is independent of the changes required for 32->64 bit migration. Furthermore, adopting unsigned integers for indices in the Tcl core necessitates some "unusual" coding patterns that are susceptible to errors and will be a continuous source of bugs even within the Tcl core.

Given that TIP 494 has already been passed, there need to be significant reasons for reverting to the use of a signed type. These are summarized here with detailed examples given in the Discussions section.

First, from a C language perspective, (paraphrasing Nathan from the chat) indexing into arrays is tightly coupled to pointer arithmetic, which in turn requires the integer type representing pointer differences to follow the semantics of the ptrdiff_t type. The size_t type, being unsigned, does not meet this criterion. Further, compilers do not guarantee correct operation of pointer arithmetic on allocations greater than PTRDIFF_MAX bytes. See C++ standard, C standard, this blog, gcc ticket for more. So in a nutshell, proper handling of allocations that do not fit in ptrdiff_t is problematic.

Second, even if the above were not the case, changing a variable's type from signed to unsigned in any code base requires careful inspection of not just the variable's use in the Tcl API but practically every single location where the variable is referenced. This includes arithmetic operations, comparisons, iterations, and even I/O. Concrete examples are given in a later section. Even worse, the compiler will not warn about most of these. The Tcl core itself is an example of the extent of changes required and manifested bugs point to the ease with which these can be overlooked despite the utmost discipline and care that has been taken. As an extension, the changes made to Tk are further evidence of the porting effort entailed by the use of unsigned indices in Tcl 9.

The question also has to be asked as to what one might lose by reverting from size_t back to a signed type like ptrdiff_t. The change to the use of size_t in lieu of int was made to permit indices and lengths beyond the int range, particularly on 64-bit platforms. However, TIP 494 does not state the motivation for the change from a signed type to an unsigned type. Based on discussion, it appears to have been at least partly motivated by the fact that an unsigned type expands the possible range from 2**31 to 2**32 or from 2**63 to 2**64.

This is a false benefit for multiple reasons:

Any single one of the above reasons precludes any potential benefit of the expanded range of size_t compared to ptrdiff_t. We thus have a situation where significant effort has to be expended, now and in the future, dealing with unsigned index values for no concrete benefit whatsoever.

This TIP is intended to remedy this situation.

Specification

The Tcl_Size typedef will be changed from size_t to ptrdiff_t.

The TCL_SIZE_MAX preprocessor constant is defined to hold the maximum value for a Tcl_Size.

The TCL_SIZE_MODIFIER preprocessor constant is defined to hold the printf family width specifier appropriate for values of type Tcl_Size. (This is similar to existing TCL_LL_MODIFIER, TCL_Z_MODIFIER etc.)
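As a rough illustration of how such a width modifier is meant to be used, here is a minimal standalone sketch. Demo_Size and DEMO_SIZE_MODIFIER are hypothetical stand-ins defined here so the fragment compiles without tcl.h; the value "t" (the C99 length modifier for ptrdiff_t) is an assumption made for the sketch, not a statement of what tcl.h defines on any given platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; real code would take these from tcl.h. */
typedef ptrdiff_t Demo_Size;
#define DEMO_SIZE_MODIFIER "t"  /* assumed: C99 length modifier for ptrdiff_t */

/*
 * Format "index I of N" into buf. The modifier is spliced into the
 * format string by string literal concatenation, in the same way as
 * the existing TCL_LL_MODIFIER style macros. Returns snprintf's result.
 */
static int
DemoFormatIndex(char *buf, size_t bufLen, Demo_Size idx, Demo_Size count)
{
    assert(buf != NULL && bufLen > 0);
    return snprintf(buf, bufLen,
            "index %" DEMO_SIZE_MODIFIER "d of %" DEMO_SIZE_MODIFIER "d",
            idx, count);
}
```

Because the type is signed, a value such as -1 formats as "-1" with the "d" conversion rather than as a very large positive number.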

All parameters to public APIs that pertain to indices and lengths will be changed to Tcl_Size if they are not so already (most are). Internal APIs are not generally specified in TIPs but as a point of information, the same applies there.

As an exception to the above, any parameters that were specified as size_t in Tcl 8.x APIs will remain so.

Functions that accepted negative lengths or indices (for example to indicate nul termination) will revert to their 8.x compatible behavior instead of only accepting -1 as a special value.

A new function is exported with the following signature

int Tcl_GetSizeIntFromObj(Tcl_Interp *interp, Tcl_Obj *objPtr, Tcl_Size *sizePtr);

to extract a Tcl_Size value from a Tcl_Obj. This is analogous to Tcl_GetIntFromObj and Tcl_GetWideIntFromObj.

Implementation

Implementation for Tcl 9 is in branch tip-660 and builds with no errors or test failures. Implementation for Tcl 8.7 is in branch tip-660-tcl8, which simply exports Tcl_GetSizeIntFromObj.

Discussion

ChatGPT says

Overall, changing a signed type to unsigned in C can have significant ramifications for your program, and it's important to carefully consider the implications before making the change.

Surely no one needs further convincing having heard from ChatGPT :-) but nevertheless below are specific examples of the kind of additional burden on developers precipitated by the change to unsigned types.

Note with respect to the examples that the issue is not that individual changes are major. Rather, it is (a) the number of such occurrences, (b) the fact that every use of indices and lengths has to be examined with no compiler diagnostics to help and (c) that the atypical usage patterns required are not natural in C programming, leading to subtle bugs in further development.

Looping

Consider the following simplistic but plausible 8.6 extension to reverse a list.

static int
Sandbox_Cmd(
    ClientData dummy,	/* Not used. */
    Tcl_Interp *interp,		/* Current interpreter */
    int objc,			/* Number of arguments */
    Tcl_Obj *const objv[]	/* Argument strings */
    )
{
    int i, len;
    Tcl_Obj *listObj, *objPtr;

    Tcl_ListObjLength(interp, objv[1], &len);
    listObj = Tcl_NewListObj(len, NULL);
    for (i = len-1; i >= 0; --i) {
        Tcl_ListObjIndex(interp, objv[1], i, &objPtr);
        Tcl_ListObjAppendElement(interp, listObj, objPtr);
    }
    Tcl_SetObjResult(interp, listObj);
    return TCL_OK;
}

Porting this extension to TIP 660 with Tcl_Size defined as ptrdiff_t would simply require changing the int declaration to Tcl_Size. No other changes are needed and the code would run correctly as before. With TIP 494 (current 9.0 implementation) on the other hand, changing the variable type to Tcl_Size defined as size_t results in a crash (loop never terminates) requiring the loop condition to be rewritten.
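To make the loop condition rewrite concrete, the sketch below reverses a plain int array instead of a Tcl list so that it compiles without tcl.h. The function names are hypothetical, and the "test before decrementing" form shown for the unsigned case is one common idiom, not necessarily the one used in the Tcl core.

```c
#include <assert.h>
#include <stddef.h>

/* Signed index: the natural countdown loop stops when i reaches -1. */
static void
ReverseSigned(const int *src, int *dst, ptrdiff_t len)
{
    ptrdiff_t i, j = 0;

    assert(len >= 0);
    for (i = len - 1; i >= 0; --i) {
        dst[j++] = src[i];
    }
}

/*
 * Unsigned index: "i >= 0" is always true and "len - 1" wraps for an
 * empty input, so the test has to happen before the decrement.
 */
static void
ReverseUnsigned(const int *src, int *dst, size_t len)
{
    size_t i, j = 0;

    assert(src != NULL || len == 0);
    for (i = len; i-- > 0; ) {
        dst[j++] = src[i];
    }
}
```

Both functions produce the same result, but only the signed version keeps the loop in the shape it already had in the 8.6 code.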

This required condition rewrite is not even the real problem. The real problem is that every line of code using variables i or len has to be manually examined to determine if it needs to be adapted to the change to unsigned types. Compilers do not always warn and, unlike the above example, which results in a crash, other cases may silently corrupt data or produce invalid results.

These issues arise from the differences between signed and unsigned types in comparisons, arithmetic and type promotion. Here is another innocuous command implementation just to illustrate the problem.

static int
Sandbox_Cmd(
    ClientData dummy,	/* Not used. */
    Tcl_Interp *interp,		/* Current interpreter */
    int objc,			/* Number of arguments */
    Tcl_Obj *const objv[]	/* Argument strings */
    )
{
    int i, len;
    const char *s = Tcl_GetStringFromObj(objv[1], &len);
    for (i = 0; i < len-1; ++i) {
        printf("%c", s[i]);
    }
    return TCL_OK;
}

This prints all but the last character of a string in Tcl 8.6. Changing the int declaration to size_t for Tcl 9 results in uninitialized memory being accessed and junk printed, but only when an empty string is passed, making the problem easy to miss. A ptrdiff_t type on the other hand continues to work correctly in all cases. No source code change is required other than the type declaration.
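The failure for the empty string comes down to what len-1 evaluates to. The following sketch isolates that expression; the function names are hypothetical, and the i + 1 < len form is just one way to write a bound that is safe for unsigned lengths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Upper bound of "i < len-1" with a signed length: -1 for an empty string. */
static ptrdiff_t
LoopBoundSigned(ptrdiff_t len)
{
    assert(len >= 0);
    return len - 1;
}

/* Same expression with an unsigned length: wraps to SIZE_MAX when len is 0. */
static size_t
LoopBoundUnsigned(size_t len)
{
    return len - 1;
}

/*
 * Counts the iterations of a loop over all but the last element, with
 * the bound moved to the left hand side so no subtraction can wrap.
 */
static size_t
CountAllButLast(size_t len)
{
    size_t i, n = 0;

    for (i = 0; i + 1 < len; ++i) {
        n++;
    }
    return n;
}
```

With a signed length the loop body never runs for an empty string because 0 < -1 is false. With size_t the bound becomes SIZE_MAX, so the original loop reads far past the end of the buffer.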

Expressions

The problems caused by the switch to an unsigned type are further exacerbated by the fact that the value -1 is used as an index in several contexts in the Tcl core such as an index indicating "before the first element", length indicator for nul terminated strings, etc. To deal with this, the macro TCL_INDEX_NONE is #defined as (effectively) (size_t) -1.

The following pattern that is pervasive in the Tcl core

    if (len < 0) {
        len = {strlen, Tcl_GetCharLen etc.}(...)
    }

has then to be replaced by the pattern

    if (len == TCL_INDEX_NONE) {
        len = ....
    }

So an extension writer has to look for these cases and fix them. For an idea of the pervasiveness of this idiom and number of changes required, between Tcl and Tk, there are close to a couple of hundred such instances.

There is a bigger issue though which is that the semantics are now changed. While previously an extension could call Tcl_NewStringObj (for example) with any negative value (which can happen for computed values) for the length parameter to indicate nul terminated arguments, this is no longer the case. An incompatibility between 8.x and 9.0 is acceptable but not one that offers no benefit in return.
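The change in semantics can be sketched with two small length-normalizing helpers. These are hypothetical functions written for illustration, not Tcl API; DEMO_INDEX_NONE is a stand-in that mirrors the (size_t) -1 definition described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_INDEX_NONE ((size_t)-1)  /* stand-in mirroring TCL_INDEX_NONE */

/* 8.x style: any negative length means "use the nul terminated length". */
static ptrdiff_t
EffectiveLenSigned(const char *s, ptrdiff_t len)
{
    assert(s != NULL);
    return (len < 0) ? (ptrdiff_t)strlen(s) : len;
}

/* Unsigned style: only the single sentinel value is recognized. */
static size_t
EffectiveLenUnsigned(const char *s, size_t len)
{
    assert(s != NULL);
    return (len == DEMO_INDEX_NONE) ? strlen(s) : len;
}
```

A computed length of -2 is treated as "nul terminated" by the signed helper. Converted to size_t, the same value passes through the unsigned helper unchanged as an enormous length.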

Along similar lines, simple comparisons of the type

if (idx < 0)...
if (last < first)...
if (last + 1 < stringPtr->numChars)

where the operands are of type size_t have to be rewritten as

if (idx + 1 < 1)
if (last + 1 < first + 1)
if (last + 2 < stringPtr->numChars + 1)

and so on. The reason is left as an exercise for the reader. If not obvious, that should itself indicate the "unnaturalness" of this idiom. And again, as an indication of the amount of work required on the part of developers porting extensions, Tk has around a hundred occurrences that needed such modification.
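For readers who would rather not do the exercise, here is one way to see it, sketched with hypothetical helper names and a stand-in for TCL_INDEX_NONE. Adding 1 to both sides wraps the sentinel (size_t) -1 around to 0, so it once again orders below every real index, as -1 did with a signed type.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_INDEX_NONE ((size_t)-1)  /* stand-in mirroring TCL_INDEX_NONE */

/* Unsigned replacement for "idx < 0": only the sentinel wraps to 0. */
static int
IsBeforeFirst(size_t idx)
{
    return idx + 1 < 1;
}

/* The direct comparison, which misorders the sentinel. */
static int
IsEmptyRangeNaive(size_t first, size_t last)
{
    return last < first;
}

/* Unsigned replacement for "last < first" that tolerates the sentinel. */
static int
IsEmptyRange(size_t first, size_t last)
{
    return last + 1 < first + 1;
}

/* Number of elements in first..last; first must be a real index. */
static size_t
RangeLength(size_t first, size_t last)
{
    assert(first != DEMO_INDEX_NONE);
    return IsEmptyRange(first, last) ? 0 : last - first + 1;
}
```

With first equal to 0 and last equal to the sentinel, the naive comparison reports a non-empty range because the sentinel is the largest size_t value, while the shifted comparison reports it as empty.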

Additional casts

Some cases are even more subtle in that they need an additional cast to work correctly when the type widths differ. For instance, when objc is an int, as is common for functions implementing Tcl commands, a cast is necessary where it was not previously required.

toIdx + 1 >= (size_t)objc + 1

or this line

Tcl_SetObjResult(interp, Tcl_NewWideIntObj((Tcl_WideInt)((Tcl_WideUInt)(lineLen + 1U)) - 1));

where with a signed type, a simple Tcl_NewWideIntObj(lineLen) would suffice.
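One plausible reading of why these casts are needed is sketched below. It uses uint32_t as a stand-in for a 32-bit size_t and int64_t for Tcl_WideInt, which are assumptions made so that the width mismatch shows up on any platform; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Widening the sentinel directly yields 4294967295 rather than -1. */
static int64_t
WidenNaive(uint32_t len)
{
    return (int64_t)len;
}

/*
 * Add 1 in the narrow unsigned type so the sentinel wraps to 0, widen,
 * then subtract 1 in the wide signed type to recover -1.
 */
static int64_t
WidenWithOffset(uint32_t len)
{
    return (int64_t)(uint64_t)(uint32_t)(len + 1U) - 1;
}

/* "toIdx >= objc" with a sentinel-aware unsigned toIdx and an int objc. */
static int
IndexAtOrPastEnd(uint32_t toIdx, int objc)
{
    assert(objc >= 0);  /* the cast below assumes a non-negative count */
    return (uint32_t)(toIdx + 1U) >= (uint32_t)((uint32_t)objc + 1U);
}
```

With a signed index type of matching width, WidenWithOffset collapses to a plain conversion and IndexAtOrPastEnd to toIdx >= objc.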

Generating Tcl_Objs containing indices

For extensions that implement collections (VecTcl, tarray, BLT, RBC etc.), returning a Tcl_Obj * containing an index resulting from a search can use a simple Tcl_NewWideIntObj call when indices are signed. With unsigned size_t indices, this no longer suffices. The extension has to deal with the mp_int internal representation, as seen in this internal-use macro.

Although an extension may copy and adapt this internal macro or perhaps it could be exported as a stubs function, it illustrates the additional complexities wrought by unsigned indices.

I/O and formatting

Even I/O or string formatting statements (as in error messages) are not exempt from needing examination. Specifiers like "%d" now have to be changed to an unsigned form such as "%u" (or "%zu" when the argument is a size_t).

All the issues above disappear if the Tcl_Size type is reverted back to a signed integer type of the appropriate width such as ptrdiff_t.

Copyright

This document has been placed in the public domain.