Author: Jan Nijtmans <[email protected]>
State: Final
Type: Project
Vote: Done
Created: 3-June-2019
Post-History:
Discussions-To: Tcl Core list
Keywords: Tcl
Tcl-Version: 8.7
Tcl-Branch: tip-548
Vote-Results: 4/2/0 accepted
Votes-For: JN, DKF, KW, KBK
Votes-Against: none
Votes-Present: DGP, SL
Abstract
This TIP proposes to add wchar_t conversion functions and deprecate Tcl_WinUtfToTChar() and Tcl_WinTCharToUtf() in favour of those new functions.
Rationale
The functions Tcl_WinUtfToTChar() and Tcl_WinTCharToUtf() originally performed one of two different conversions, depending on the runtime platform: on Windows 95/98/ME they converted between UTF-8 and the Windows default encoding (usually CP1252); on later Windows versions they convert between UTF-8 and UTF-16.
The length parameter of Tcl_WinTCharToUtf() has always been in bytes, whereas most other Unicode-related Tcl functions expect their length in Unicode characters. Since Windows 95/98/ME are no longer supported, it is time to fix this inconsistency.
Modern systems have a wchar_t type which represents a Unicode-like type, which can be either 16 bits (unsigned short) or 32 bits (int). This TIP proposes 3 additional functions which convert between wchar_t-related types and UTF-8, and 3 more which convert between unsigned short-related types (UTF-16) and UTF-8. The new functions work identically on all platforms, not only on Windows.
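The effect of the wchar_t width can be illustrated with a small sketch that does not depend on Tcl. The helper name wcharUnitsFor is hypothetical and not part of the Tcl API; it only shows that a code point above U+FFFF needs two units (a surrogate pair) when wchar_t is 16 bits, and one unit when wchar_t is 32 bits.

```c
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper, not part of Tcl: number of wchar_t units needed
 * to store one Unicode code point on the current platform.
 */
static int
wcharUnitsFor(unsigned long codePoint)
{
    if (sizeof(wchar_t) == 2 && codePoint > 0xFFFFUL) {
        return 2;   /* 16-bit wchar_t: UTF-16 surrogate pair */
    }
    return 1;       /* BMP character, or 32-bit wchar_t */
}
```

This is why a buffer sized in wchar_t units for the same UTF-8 input may differ between Windows and other platforms.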
Specification
This document proposes:
- Enhance the Tcl_UniCharToUtfDString() function such that the uniLength parameter is allowed to have the value -1. In that case, the UniChar string will be read up to the terminating \u0000 character.
- Enhance the Tcl_UniCharToUtfDString() and Tcl_UtfToUniCharDString() functions such that the src/uniStr parameters are allowed to have the value NULL. In that case, the functions return NULL, without doing anything.
- New functions:
  - Tcl_UtfToWCharDString(), similar to Tcl_UtfToUniCharDString(), but has a wchar_t pointer type in its signature.
  - Tcl_WCharToUtfDString(), similar to Tcl_UniCharToUtfDString(), but has a wchar_t pointer type in its signature.
  - Tcl_UtfToWChar(), similar to Tcl_UtfToUniChar(), but has a wchar_t pointer in its signature.
  On Windows, wchar_t is the same type as unsigned short, but on other platforms wchar_t might be a 32-bit type (usually int). These functions map to either the 32-bit or the 16-bit versions automatically, depending on the size of wchar_t.
  - Tcl_UtfToChar16DString(), similar to Tcl_UtfToUniCharDString(), but has an unsigned short pointer type in its signature.
  - Tcl_Char16ToUtfDString(), similar to Tcl_UniCharToUtfDString(), but has an unsigned short pointer type in its signature.
  - Tcl_UtfToChar16(), similar to Tcl_UtfToUniChar(), but has an unsigned short pointer in its signature.
- Deprecate the following functions:
  - Tcl_WinUtfToTChar(), in favour of Tcl_UtfToWCharDString()
  - Tcl_WinTCharToUtf(), in favour of Tcl_WCharToUtfDString()
  If Tcl is compiled with either -DTCL_UTF_MAX=6 (which is not officially supported) or -DTCL_NO_DEPRECATED, those functions will become macros which do exactly the same thing. In Tcl 9.0, the 2 deprecated functions will be removed from the stub tables, but the replacement macros will still be there. So the functions can still be used in extensions; they will be replaced with the new functions automatically.
- The Windows dde and registry extensions (tclWinDde.c and tclWinReg.c) are updated to use the new functions Tcl_UtfToWCharDString() and Tcl_WCharToUtfDString(), serving as proof/demonstration that the functions in this TIP actually work.
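The two argument rules for Tcl_UniCharToUtfDString() described in the list above (a length of -1 and a NULL source) can be sketched without Tcl as follows. This is only an illustration of the described semantics, not Tcl's implementation: the helper name resolveUniLength is hypothetical, 16-bit units are assumed, and treating every negative length like -1 is an assumption of the sketch.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not Tcl's implementation. Returns the number of
 * 16-bit units that would be converted, or -1 when src is NULL (in which
 * case the real functions are specified to return NULL and do nothing).
 */
static int
resolveUniLength(const unsigned short *src, int uniLength)
{
    int n = 0;

    if (src == NULL) {
        return -1;              /* nothing to convert */
    }
    if (uniLength >= 0) {
        return uniLength;       /* caller supplied an explicit length */
    }
    /* Negative length (the TIP only specifies -1): scan to the 0 unit. */
    while (src[n] != 0) {
        n++;
    }
    return n;
}
```

With a length of -1 the caller no longer needs to count the units of a terminated string before converting it.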
How to upgrade.
No need to do anything. In Tcl 9.0, the two deprecated functions are replaced by macros which do the same thing. But if you want to prevent a (future) deprecation warning, you can do the following:
In your extension, replace the call:
`Tcl_WinUtfToTChar(....., bufPtr);`
with the following two lines:
`Tcl_DStringInit(bufPtr);`
`Tcl_UtfToWCharDString(....., bufPtr);`
And also, replace:
`Tcl_WinTCharToUtf(....., bufPtr);`
with the following two lines:
`Tcl_DStringInit(bufPtr);`
`Tcl_WCharToUtfDString(....., bufPtr);`
If the Tcl_WinTCharToUtf() call originally had a "length" parameter not equal to -1, divide it by 2 (or ... don't multiply it by 2 any more).
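The length adjustment can be captured in a tiny helper. The name tcharBytesToWCharUnits is hypothetical and not part of Tcl; the sketch assumes the Windows case, where each wchar_t unit is 2 bytes, and leaves the special value -1 untouched.

```c
/*
 * Hypothetical helper, not part of Tcl: convert the byte length that
 * Tcl_WinTCharToUtf() expected into the unit count that
 * Tcl_WCharToUtfDString() expects. Assumes 2-byte (UTF-16) units.
 */
static int
tcharBytesToWCharUnits(int byteLength)
{
    if (byteLength == -1) {
        return -1;              /* "terminated string" marker stays as-is */
    }
    return byteLength / 2;      /* bytes -> 16-bit units */
}
```

In an extension, the result would be passed as the length argument of Tcl_WCharToUtfDString() after the Tcl_DStringInit(bufPtr) call shown above.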
Compatibility
This is fully upwards compatible.
Reference Implementation
A reference implementation is available in the tip-548 branch. https://core.tcl-lang.org/tcl/timeline?r=tip-548
Copyright
This document has been placed in the public domain.