TIP 643: Provide a public API to retrieve nul terminator length for an encoding

Author:		Ashok P. Nadkarni <[email protected]>
State:		Final
Type:		Project
Vote:		Done
Created:	09-Oct-2022
Tcl-Version:	8.7
Tcl-Branch:	tip-643
Keywords:	encoding
Vote-Summary:  Accepted 3/0/0
Votes-For:     JN, KBK, SL
Votes-Against: none
Votes-Present: none

Abstract

Add a C API to allow extensions to retrieve the length of the nul terminator for a specific encoding.

Rationale

The Tcl_UtfToExternal, Tcl_UtfToExternalDString and related APIs store strings in destination buffers in a specified encoding. While they append an appropriate nul terminator to the destination, the returned length does not include the terminator. This is a problem when the encoded data must be copied, terminator included, to another buffer, as might be required when interfacing to third-party APIs: the caller has no way to know how many nul bytes follow the data.

This TIP proposes adding a Tcl_GetEncodingNulLength C API that will return the number of nul bytes required for a specific encoding.

Specification

The following function will be added along with an entry in the stubs table.

int Tcl_GetEncodingNulLength(Tcl_Encoding encoding);

The encoding parameter specifies the encoding for which the nul terminator length is to be retrieved. If encoding is NULL, the current system encoding is assumed.

The function returns the length of the nul terminator for the encoding.
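As a minimal sketch of how the returned length might be used, the helper below copies already-encoded bytes to a new buffer and appends the terminator. The name CopyWithTerminator and its parameters are hypothetical and not part of Tcl; the sketch is deliberately independent of tcl.h and assumes the caller obtains the nul length from Tcl_GetEncodingNulLength and the data and its length from the conversion API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Tcl. Duplicates `len` encoded bytes and
 * appends `nulLen` zero bytes as the terminator.
 *
 * In an extension, `bytes` and `len` would typically be the value and length
 * of the Tcl_DString filled in by Tcl_UtfToExternalDString, and `nulLen` the
 * value returned by Tcl_GetEncodingNulLength for the same encoding.
 *
 * Returns a malloc'ed buffer of len + nulLen bytes (to be released with
 * free), or NULL if allocation fails. If totalPtr is not NULL, the total
 * size including the terminator is stored there.
 */
char *
CopyWithTerminator(const char *bytes, size_t len, size_t nulLen,
                   size_t *totalPtr)
{
    char *copy;

    assert(bytes != NULL || len == 0);
    assert(nulLen > 0);

    copy = malloc(len + nulLen);
    if (copy == NULL) {
        return NULL;
    }
    if (len > 0) {
        memcpy(copy, bytes, len);       /* encoded data, terminator excluded */
    }
    memset(copy + len, 0, nulLen);      /* encoding-specific terminator */
    if (totalPtr != NULL) {
        *totalPtr = len + nulLen;
    }
    return copy;
}
```

For instance, the four bytes of "hi" in little-endian UTF-16 copied with a nul length of 2 yield a six-byte buffer, whereas hard-coding a single nul byte would leave the wide-character string improperly terminated.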

Discussion

One could also provide an encoding nullength command at the script level. FFI extensions such as CFFI and FFIDL would find it useful for terminating binary strings produced by encoding convertto before passing them to the FFI. However, the same result can be achieved by converting "\0", so such a command is not strictly necessary, though it would be clearer and more efficient. At the moment, this is not planned. Opinions invited.

Copyright

This document has been placed in the public domain.