Tcl Source Code

Artifact [d6ff4a080d]

Artifact d6ff4a080d8f9e876f1e5fcb7149d459221ca8b7ee8e6a2903e0abb6e8385714:

Wiki page [Migrating C extensions to Tcl 9] by oehhar 2025-04-25 17:23:39.
D 2025-04-25T17:23:39.444
L Migrating\sC\sextensions\sto\sTcl\s9
N text/x-markdown
P e16612341c1416c5f45431641c105af5b0769864f52c7e1ab4db1202f654738e
U oehhar
W 17608
Also see [Porting extensions to Tcl 9](https://wiki.tcl-lang.org/page/Porting+extensions+to+Tcl+9)
for another post on the topic.

## Init stubs and package require with open version

The version range passed to `Tcl_InitStubs` should include Tcl 9. The following example accepts Tcl 8.5 up to any Tcl 9 version.
Note that the upper bound "10" is exclusive.

~~~
    if (Tcl_InitStubs(interp, "8.5-10", 0) == NULL) {
        return TCL_ERROR;
    }
~~~

The `Tcl_InitStubs` function changed in the following respects:

   *   The type of the strict parameter changed from a boolean to a bit field. Only the values 0 and 1 are supported.
   *   If the preprocessor symbol `USE_TCL_STUBS` is not defined, a direct link to the Tcl library is assumed. In this case, no stubs table is initialized. The given version specification is verified against the calling interpreter.

## 64-bit data support

On 64-bit platforms, Tcl 9 no longer limits data size (lengths of strings, lists
etc.) and indices to 32 bits. For portability of extensions between 32- and
64-bit platforms, a new typedef `Tcl_Size` has been defined. This maps to a
64-bit or 32-bit signed integer type on the respective platforms. This affects
all public structures and parameter types as appropriate. Variables that hold
lengths or indices that were typed as `int` for Tcl 8 should be changed to be of
type `Tcl_Size`. Care must be taken that related variables, for example loop
iterators, are also modified accordingly.

For example, in the following fragment

```
int len;
int i;
Tcl_ListObjLength(interp, objPtr, &len);
for (i = 0; i < len; ++i) {
   ....
}
```

the declarations of both `i` and `len` should be changed to

```
Tcl_Size len;
Tcl_Size i;
```

When converting sizes and indices to and from string representation,
replace the use of `Tcl_GetIntFromObj` and `Tcl_NewIntObj` with the
`Tcl_Size` analogues `Tcl_GetSizeIntFromObj` and `Tcl_NewSizeIntObj`.

When checking string limits etc. appropriately for both platform widths,
the constant `TCL_SIZE_MAX` should be used in lieu of `INT_MAX`.
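
As a minimal sketch of such a limit check, the helper below tests whether two lengths can be added without exceeding `TCL_SIZE_MAX`. The name `SizeAddFits` is hypothetical. The `#ifndef` fallback is only a stand-in (an assumption, not Tcl's own definition) so that the fragment compiles without `tcl.h`; in an extension, include `tcl.h` and drop the fallback.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so this sketch compiles without tcl.h (assumption). */
#ifndef TCL_SIZE_MAX
typedef ptrdiff_t Tcl_Size;
# define TCL_SIZE_MAX PTRDIFF_MAX
#endif

/*
 * Hypothetical helper: returns 1 if a + b can be computed without
 * exceeding TCL_SIZE_MAX, 0 otherwise. Negative lengths are rejected.
 */
static int
SizeAddFits(Tcl_Size a, Tcl_Size b)
{
    if (a < 0 || b < 0) {
        return 0;
    }
    /* Rearranged so the comparison itself cannot overflow. */
    return a <= TCL_SIZE_MAX - b;
}
```

Because the check is written against `TCL_SIZE_MAX` rather than `INT_MAX`, the same source should give the right limit on both 32-bit and 64-bit builds.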

When formatting as a string through a `printf` style varargs function, a size
modifier appropriate for the platform width needs to be used. This is defined
as `TCL_SIZE_MODIFIER`. For example,

```
Tcl_Size len;
Tcl_Obj *objP = Tcl_ObjPrintf("The length is %" TCL_SIZE_MODIFIER "d.", len);
```

If the ability to build against Tcl 8 is desired after making the above
changes required for Tcl 9, add the following fragment to the appropriate
header for the extension.

```
#ifndef TCL_SIZE_MAX
# define Tcl_GetSizeIntFromObj Tcl_GetIntFromObj
# define TCL_SIZE_MAX      INT_MAX
# define TCL_SIZE_MODIFIER ""
#endif

```

"Tcl_Size" itself doesn't need to be provided as long as you update the extension to the latest tclconfig release. TEA (latest tcl.m4/rules.vc files) takes care of providing an appropriate preprocessor #define for `Tcl_Size`.

Note there is no complementary `Tcl_NewSizeIntFromObj`. Use `Tcl_NewWideIntObj` instead to wrap `Tcl_Size` values. Do **not** use `Tcl_NewIntObj`, as it truncates values that do not fit in an `int`.

Paul Obermeier's very useful post [Tcl 9 functions using Tcl_Size](https://wiki.tcl-lang.org/page/Tcl+9+functions+using+Tcl%5FSize)
enumerates the functions in the Tcl API affected by this change
and provides a script to locate the use of these functions within
extensions and applications.

See [TIP 660](https://core.tcl-lang.org/tips/doc/trunk/tip/660.md)

## Unicode support

### `Tcl_UniChar` and `TCL_UTF_MAX`

As part of the changes to support the full range of Unicode code points, the
`Tcl_UniChar` type has changed to a 32-bit integer type. Correspondingly, the
default value of `TCL_UTF_MAX` is now 4.

This means that encoding APIs such as `Tcl_GetUnicodeFromObj` that work with
`Tcl_UniChar` strings can no longer be used to pass values to external APIs
that expect UTF-16 encoded strings. This is particularly relevant on Windows
platforms where it was common in Tcl 8 to use these functions to transform
strings to the UTF-16 / UCS-2 form expected by the Win32 API. Instead, use either

* the UTF-16 specific `Tcl_Char16ToUtfDString`, `Tcl_UtfToChar16DString`,
`Tcl_UtfToChar16` functions, or

* the generic functions `Tcl_UtfToExternalDStringEx`, `Tcl_ExternalToUtfDStringEx`,
`Tcl_UtfToExternal`, `Tcl_ExternalToUtf` with the `utf-16` encoding.
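
To illustrate why an array of 32-bit `Tcl_UniChar` values is not a UTF-16 string, the sketch below encodes a single code point as UTF-16 code units. It is plain C that does not use the Tcl API, and the helper name `EncodeUtf16` is hypothetical. A code point above U+FFFF fits in one 32-bit `Tcl_UniChar` but needs two 16-bit units (a surrogate pair) in UTF-16.

```c
#include <stdint.h>

/*
 * Hypothetical helper: encodes the code point cp as UTF-16 into out.
 * Returns the number of 16-bit units written (1 or 2), or 0 if cp is
 * a surrogate or lies outside the Unicode range.
 */
static int
EncodeUtf16(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        return 0;
    }
    if (cp < 0x10000) {
        /* Basic Multilingual Plane: a single unit. */
        out[0] = (uint16_t) cp;
        return 1;
    }
    /* Supplementary planes: split 20 bits into a surrogate pair. */
    cp -= 0x10000;
    out[0] = (uint16_t) (0xD800 + (cp >> 10));
    out[1] = (uint16_t) (0xDC00 + (cp & 0x3FF));
    return 2;
}
```

In an extension, the Tcl functions listed above perform this conversion for whole strings; the sketch only shows what changes at the level of individual code units.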

### Encoding transforms

Tcl 9 supports *encoding profiles* that allow control of how errors are handled
in encoding transforms. At the C API level, profiles are passed by or-ing one of
the `TCL_ENCODING_PROFILE_*` values into the `flags` parameter of encoding
transform functions like `Tcl_ExternalToUtf` and `Tcl_UtfToExternal`.

Extensions that or-ed `TCL_ENCODING_STOPONERROR` in the `flags` parameter
to these functions in Tcl 8 should now use `TCL_ENCODING_PROFILE_STRICT`
instead or not specify a profile at all as `strict` is the default in any case.

Extensions that did not make use of `TCL_ENCODING_STOPONERROR` will see
a change in that the functions default to the strict profile and will return
an error on invalid encoding bytes. To revert to Tcl 8 behavior, the
`TCL_ENCODING_PROFILE_TCL8` flag must be specified. Alternatively,
`TCL_ENCODING_PROFILE_REPLACE` may be used, which is standard-compliant
but does not raise an error on invalid encodings.

See [TIP 656](https://core.tcl-lang.org/tips/doc/trunk/tip/656.md)

## Encoding errors on I/O

In Tcl 8, any invalid byte sequence encountered in channel input was silently
handled by treating the invalid bytes as iso8859-1 encoded bytes. Similarly, on
output Tcl replaced characters that could not be represented in the channel
encoding with an encoding-specific replacement character, usually `?`. In
contrast, channel I/O in Tcl 9 is by default subject to the `strict` encoding
profile and the above are treated as error conditions. Functions like
`Tcl_Read`, `Tcl_Gets` etc. on input and `Tcl_Write` etc. on output will raise
an error for these cases.

Applications and extensions that need to match Tcl 8 behavior for compatibility
reasons can set the `-profile` configuration option for the channel to `tcl8` via
the `Tcl_SetChannelOption` function. This is however non-conformant with the
Unicode standard and is not recommended. A standard-conformant alternative is
to set the profile to `replace` instead.

See [TIP 657](https://core.tcl-lang.org/tips/doc/trunk/tip/657.md)

## System encoding changed on MS-Windows

On MS-Windows, the system encoding is now fixed to `utf-8`.

Most encoding-related functions accept `NULL` as the encoding argument to select the system encoding.
For example, the following call uses the `utf-8` encoding with Tcl 9 but `cp1252` (in Western Europe) with Tcl 8.6:

```
Tcl_UtfToExternalDString( NULL, pStr, lStr, &sPar1); 
```

To use `cp1252`, the encoding must be loaded explicitly.
In Tcl 9.0.1, there is no way to obtain the encoding that Tcl 8.6 would have used as the system encoding.
TIP 716, targeted at Tcl 9.1, proposes the function `Tcl_GetEncodingNameForUser` to get the name of that encoding.

Note also that encoding conversions may now return errors. The call above may be replaced by its extended version, `Tcl_UtfToExternalDStringEx`, which is able to handle encoding errors.

## `Tcl_ObjType` versioning

The Tcl value system has been extended. The `Tcl_ObjType` structure has additional (optional) fields. Any Tcl extension, or embedded integration with Tcl, that defines a custom `Tcl_ObjType` should review these changes. Existing code should compile and execute just fine, unless the particular implementation relies on "creative" programming. In simpler terms, the `Tcl_ObjType` structure is larger. It should be initialized properly without any changes needed, but YMMV.

Please see the `Tcl_RegisterObjType` manual page for more details, as well as TIPs [636](https://core.tcl-lang.org/tips/doc/trunk/tip/636.md) and [644](https://core.tcl-lang.org/tips/doc/trunk/tip/644.md).

## Abstract lists

An abstract list is a custom Tcl value (custom `Tcl_ObjType`) that can be treated as a Tcl list at the script level without causing the value to be converted to the internal List type. To make this possible, the custom type must implement a set of accessor functions corresponding to list operations. 

See the `Tcl_RegisterObjType` manual page for more details, as well as TIPs [636](https://core.tcl-lang.org/tips/doc/trunk/tip/636.md) and [644](https://core.tcl-lang.org/tips/doc/trunk/tip/644.md).

### Memory management of list elements returned by Tcl_ListObjIndex

When dealing with list elements, a common micro-optimization practice involves
skipping `Tcl_IncrRefCount` and `Tcl_DecrRefCount` sequences for a `Tcl_Obj`
element returned by `Tcl_ListObjIndex` when the use is local and temporary. In
Tcl 8, this posed no issues because the list implementation internally held a
reference to each element, ensuring that the element would be freed (assuming no
other references) along with the parent list.

However, with the introduction of Abstract Lists in Tcl 9, this guarantee no
longer holds. Abstract lists may or may not retain a reference to `Tcl_Obj`
elements returned from `Tcl_ListObjIndex`. Specifically, it is possible for
elements to be created on demand and returned by the index operation without an
internal reference being held. In the absence of a `Tcl_DecrRefCount` call
on the element, the `Tcl_Obj` would never be freed resulting in a memory leak.
The caller needs to explicitly call `Tcl_DecrRefCount` on the element
`Tcl_Obj` to ensure it is freed.

On the flip side, an abstract list implementation may mimic Tcl 8 behavior by
returning an existing internal reference to a `Tcl_Obj` without incrementing its
reference count. In such instances, calling `Tcl_DecrRefCount` on the element
without a corresponding `Tcl_IncrRefCount` would release an internal reference
held by the abstract list implementation and prematurely free the element resulting
in use-after-free bugs.

To handle both types of implementations correctly, the caller should increment
the reference count via `Tcl_IncrRefCount` on the `Tcl_Obj` returned by
`Tcl_ListObjIndex` and decrement it using `Tcl_DecrRefCount` when it will
no longer access it.

Alternatively, if compatibility with Tcl 8.6 is not required, the same effect can
be achieved slightly more efficiently by passing the returned `Tcl_Obj` to
`Tcl_BounceRefCount` when finished with it. This function will free the passed
 `Tcl_Obj` if, and only if, there are no existing references to it. **However, care
must be taken in this case that the `Tcl_Obj` is not passed to functions that
should not be passed `Tcl_Obj` values with a zero reference count
(e.g. `Tcl_ObjSetVar2`).** It is therefore recommended to use the `Tcl_IncrRefCount`/`Tcl_DecrRefCount`
idiom instead.

### Prefer Tcl_ListObjIndex to Tcl_ListObjGetElements for large lists

The introduction of abstract list implementations has an additional consequence
related to iteration over lists. In Tcl 8, it was common practice to iterate
over list elements by calling `Tcl_ListObjGetElements` which returned a pointer
to the array of `Tcl_Obj *` elements stored in the internal representation of a
list. In Tcl 9, the internal representation may take a different form and use of
this function forces generation of an array with potentially significantly
higher memory usage. For example, the `lseq` command which is used to create a
list of integers has an extremely compact internal representation. A list of a
million consecutive integers takes up under a hundred bytes of memory. However,
if `Tcl_ListObjGetElements` is called on this list, it is forced to allocate an
array of a million pointers each pointing to a `Tcl_Obj` holding an integer --
more than 50MB of memory.

It is therefore recommended that in cases where an extension may be dealing with
large lists, iteration be done using the `Tcl_ListObjIndex` function instead
of `Tcl_ListObjGetElements`. Note however that there is a performance tradeoff here
as iterating using `Tcl_ListObjGetElements` is faster as the returned array can
be directly accessed without a function call for each element.

## Tommath

Tcl 9 optionally allows the linking of an external `libtommath` library for multiple-precision
integer arithmetic in lieu of Tcl's internal version. This has some consequences for extensions:

* (Method 1) If including `tclTomMath.h`, Tcl's stubs interface to `libtommath` is used and only a subset of
`libtommath` functions are available. (Method 2) If including `tommath.h` (external library), all `libtommath`
functions are available.

* Extensions using the definitions of `mp_digit`, `mp_word` or `mp_int` need to include either
 `tclTomMath.h` or `tommath.h` in addition to `tcl.h`.

The extension needs to call `Tcl_TomMath_InitStubs` if it included `<tclTomMath.h>` (Method 1).
It doesn't need to call `Tcl_TomMath_InitStubs` if it includes `<tommath.h>`
(Method 2, but then it has to link with the libtommath library itself, `-ltommath`).
The extension can freely choose what it wants. However, the disadvantage of Method 2 
is that the extension loses 8.6-compatibility.

See [TIP 538](https://core.tcl-lang.org/tips/doc/trunk/tip/538.md).

## Building

### Building with autotools

* Update TEA to the latest [tclconfig](https://core.tcl-lang.org/tclconfig).

* Within your `configure.ac` or `configure.in` file, change the `TEA_INIT` call to require **3.13** or whatever the latest version happens to be.

* Run `autoconf` to regenerate configure.

* When running `configure`, in addition to the `--with-tcl` option also specify the `--with-tclinclude` option. It's possible this is only required on MinGW and seems like a [bug](https://core.tcl-lang.org/tcl/tktview/6f22c7a1fc) in TEA but I am not sufficiently versed in TEA to tell for sure.

Without those options, the load command (directly or via `package require`) may raise the error 

*interpreter uses an incompatible stubs mechanism*.

This *may* happen when building against a private Tcl 9 installation with a system installed Tcl 8 version. For example, if you do

```
% path/to/extension/configure --with-tcl=/path/to/tcl9/lib
% make
```

To resolve, also specify the `--with-tclinclude` option to `configure`.

```
% path/to/extension/configure --with-tcl=/path/to/tcl9/lib --with-tclinclude=/path/to/tcl9/include
% make
```

See the [ticket](https://core.tcl-lang.org/tcl/tktview/6f22c7a1fc) for the specific circumstances when this may happen.

### Threads

Tcl 9.0 is always built with threads enabled.
The macro *TCL_THREADS* is no longer defined.
A check for a threaded build that works with both Tcl 8.x and Tcl 9 may look like this:

```
#if (TCL_MAJOR_VERSION < 9) && ! defined(TCL_THREADS)
#error "build requires TCL_THREADS"
#endif
```

See [TIP491](https://core.tcl-lang.org/tips/doc/trunk/tip/491.md).

### CONST macro

The macro *CONST* is no longer defined by *tcl.h*.
It was a helper for non-ANSI C compilers.

This was observed on the Windows platform using the MS-VC 2015 compiler and the nmake build system.
This may be different on other platforms or build systems.

Thus, the following snippet (which defines a function passed to *Tcl_CreateObjCommand*) must be changed from:

```
ZbarAsyncDecodeObjCmd(ClientData clientData, Tcl_Interp *interp, int objc,  Tcl_Obj *CONST objv[])
```

to

```
ZbarAsyncDecodeObjCmd(ClientData clientData, Tcl_Interp *interp, int objc,  Tcl_Obj *const objv[])
```
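
If many signatures still use `CONST` and the source must also build against older Tcl releases, a local fallback definition is one possible stopgap. This is a sketch of a common preprocessor idiom, not something taken from the Tcl 9 documentation; the function `CountArgs` is hypothetical and only shows that a `CONST`-qualified signature still compiles.

```c
#include <stddef.h>

/* Fallback for headers that no longer define CONST (assumed stopgap). */
#ifndef CONST
# define CONST const
#endif

/*
 * Hypothetical function with a CONST-qualified argument vector:
 * counts the non-NULL entries in argv.
 */
static int
CountArgs(int argc, char *CONST argv[])
{
    int i, n = 0;

    for (i = 0; i < argc; ++i) {
        if (argv[i] != NULL) {
            ++n;
        }
    }
    return n;
}
```

Replacing `CONST` with `const` throughout, as shown above, remains the cleaner fix.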

### Deprecation of MS-Windows TCHAR utility functions

The MS-Windows character type `TCHAR` is 8 or 16 bits wide, depending on whether the preprocessor symbol `_UNICODE` is defined.
This mechanism is outdated, and only 16-bit string types should be used to call the MS-Windows API.

Tcl 8.6 had two functions to support this, which are still present but are deprecated and no longer documented.
Consequently, the following code should be replaced (with `_UNICODE` defined):

```
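Tcl_Size lStr;
char *pStr;
Tcl_DString dPar1;
pStr = Tcl_GetStringFromObj(objv[PositionSPar], &lStr);
// Old: the function initializes dPar1 itself
Tcl_WinUtfToTChar(pStr, lStr, &dPar1);
// New: dPar1 must be initialized first, the result is appended
Tcl_DStringInit(&dPar1);
Tcl_UtfToWCharDString(pStr, lStr, &dPar1);
// both solutions: the DString length is in bytes, TextOutW expects characters
TextOutW(pdlg.hDC, X0, Y0, (WCHAR *) Tcl_DStringValue(&dPar1),
        Tcl_DStringLength(&dPar1) / sizeof(WCHAR));
// release the DString when it is no longer needed
Tcl_DStringFree(&dPar1);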
```

Note that the new function does not initialize the DString but appends to it.

And the other way around (get the default printer name):

```
DWORD BufferSize;
TCHAR *pDefaultPrinter;
Tcl_DString Printer;
GetDefaultPrinter(NULL, &BufferSize);
if (BufferSize == 0) {return NULL;}
pDefaultPrinter = (TCHAR *) Tcl_Alloc(BufferSize * sizeof(TCHAR));
GetDefaultPrinter(pDefaultPrinter, &BufferSize);
// Old
Tcl_WinTCharToUtf(pDefaultPrinter, -1, &Printer);
// New
Tcl_DStringInit(&Printer);
Tcl_WCharToUtfDString(pDefaultPrinter, -1, &Printer);
// both
Tcl_Free((char *)pDefaultPrinter);
Tcl_DStringResult(interp, &Printer);
return RET_OK;
```

Note the following differences between the old and the new functions:

   *   The DString is not initialized by the new function, which allows appending to an existing value
   *   The size argument is in wide characters, not in bytes (e.g. `/ sizeof(WCHAR)`)
   *   According to the documentation, the value `-1` was formerly not allowed as the size argument to read up to the terminating 0 word. This does not match my observation (as seen in the code above).
Z 5ded68111c9190b331419a42f510f4ce