Check-in [a1fb82d376]

Overview
Comment:Make TIP align with actual current implementation
SHA3-256: a1fb82d376eb7c0e17e446286f09fc346843cc8a8c34982d72605dcfb683e792
User & Date: jan.nijtmans 2019-07-11 07:14:06
Context
2019-08-03
21:19
Textual improvements for TIP #548 check-in: 691293448c user: jan.nijtmans tags: trunk
2019-07-11
07:14
Make TIP align with actual current implementation check-in: a1fb82d376 user: jan.nijtmans tags: trunk
2019-07-02
07:06
TIP 545 svg options: some refinements and EuroTCL discussion report check-in: 7c6da1bc85 user: oehhar tags: trunk
Changes

Changes to tip/548.md.

Old version (lines 15-66 and 77-94):

# Abstract

This TIP proposes to deprecate `Tcl_WinUtfToTChar()` and `Tcl_WinTCharToUtf()` and provide more flexible replacement functions.

# Rationale

The functions `Tcl_WinUtfToTChar()` and `Tcl_WinTCharToUtf()` originally performed one of two conversions,
depending on the runtime platform: on Windows 95/98/ME they converted between Utf-8 and the Windows default encoding
(usually cp1252); on later Windows versions they convert between Utf-8 and Utf-16. The length parameter of `Tcl_WinTCharToUtf()`
was always in bytes, whereas most other Unicode-related Tcl functions expect their length in Unicode characters.

Since Windows 95/98/ME are not supported any more, it's time to fix this inconsistency.

# Specification

This document proposes:

 * Deprecate the following functions:

     `Tcl_WinUtfToTChar()`

     `Tcl_WinTCharToUtf()`

   If Tcl is compiled with either -DTCL\_UTF\_MAX=6 (which is not officially supported) or -DTCL\_NO\_DEPRECATED, those functions will
   no longer be available. In Tcl 9.0, those functions will be completely removed.

 * Enhance the Tcl\_UniCharToUtfDString() function such that the uniLength parameter is allowed to
   have the value -1. In that case, the UniChar string will be read up to the terminating `\u0000` character.

 * New replacement functions:

     `Tcl_UtfToUtf16DString()`, replaces `Tcl_WinUtfToTChar()`

     `Tcl_Utf16ToUtfDString()`, replaces `Tcl_WinTCharToUtf()`

     Those are the same as the already existing _UniChar_ variants (`Tcl_UniCharToUtfDString`/`Tcl_UtfToUniCharDString`), but they use an "unsigned short"
     pointer type, which is always 16 bits, in their signature instead of a "Tcl\_UniChar" pointer type.
     `Tcl_Utf16ToUtfDString()` accepts - just as `Tcl_UniCharToUtfDString()` - the value -1 as length parameter.

     Those functions can be used if you want your extension to compile with `-DTCL_UTF_MAX=3`, `-DTCL_UTF_MAX=4` or `-DTCL_UTF_MAX=6`,
     but still want to use the 16-bit conversions independent of the `TCL_UTF_MAX` setting or Tcl\_UniChar type.

     Those functions are available on all platforms, not only Windows.


# How to upgrade.

In your extension, replace the call:

     `Tcl_WinUtfToTChar(....., bufPtr);`
     
with the following two lines:

................................................................................
     `Tcl_Utf16ToUtfDString(....., bufPtr);`

If the original `Tcl_WinTCharToUtf()` call passed a "length" argument other than -1, divide it by 2 (or simply stop multiplying it by 2), because the replacement function expects the length in 16-bit units rather than bytes.


# Compatibility

This is fully upwards compatible in Tcl 8.x, except if Tcl is compiled with `-DTCL_UTF_MAX=6` (not officially supported) or
`-DTCL_NO_DEPRECATED`. Starting with Tcl 9.0, the replacement functions should be used instead.

# Reference Implementation

A reference implementation is available in the **tip-548** branch.
<https://core.tcl-lang.org/tcl/timeline?r=tip-548>

# Copyright

This document has been placed in the public domain.

New version (lines 15-83 and 94-110):

# Abstract

This TIP proposes to deprecate `Tcl_WinUtfToTChar()` and `Tcl_WinTCharToUtf()` and provide more flexible replacement functions.

# Rationale

The functions `Tcl_WinUtfToTChar()` and `Tcl_WinTCharToUtf()` originally performed one of two conversions,
depending on the runtime platform: on Windows 95/98/ME they converted between UTF-8 and the Windows default encoding
(usually CP1252); on later Windows versions they convert between UTF-8 and UTF-16. The length parameter of `Tcl_WinTCharToUtf()`
was always in bytes, whereas most other Unicode-related Tcl functions expect their length in Unicode characters.

Since Windows 95/98/ME are not supported any more, it's time to fix this inconsistency.

Modern systems provide a `wchar_t` type for Unicode-like characters, which is either 16 bits wide (`unsigned short`) or
32 bits wide (`int`). This TIP proposes 3 additional functions to convert between `wchar_t`-related types and UTF-8.

# Specification

This document proposes:

 * Deprecate the following functions:

     `Tcl_WinUtfToTChar()`

     `Tcl_WinTCharToUtf()`

   If Tcl is compiled with either -DTCL\_UTF\_MAX=6 (which is not officially supported) or -DTCL\_NO\_DEPRECATED, those functions will
   become macros that do exactly the same thing.

 * Enhance the Tcl\_UniCharToUtfDString() function such that the uniLength parameter is allowed to
   have the value -1. In that case, the UniChar string will be read up to the terminating `\u0000` character.

 * New replacement functions:

     `Tcl_UtfToChar16DString()`, replaces `Tcl_WinUtfToTChar()`

     `Tcl_Char16ToUtfDString()`, replaces `Tcl_WinTCharToUtf()`

     Those are the same as the already existing _UniChar_ variants (`Tcl_UniCharToUtfDString`/`Tcl_UtfToUniCharDString`), but they use an `unsigned short`
     pointer type, which is always 16 bits, in their signature instead of a `Tcl_UniChar` pointer type.
     `Tcl_Char16ToUtfDString()` accepts - just as `Tcl_UniCharToUtfDString()` - the value -1 as length parameter.

     Those functions can be used if you want your extension to compile with `-DTCL_UTF_MAX=3`, `-DTCL_UTF_MAX=4` or `-DTCL_UTF_MAX=6`,
     but still want to use the 16-bit conversions independent of the `TCL_UTF_MAX` setting or Tcl\_UniChar type.

     Those functions are available on all platforms, not only Windows.

 * New functions:

     `Tcl_UtfToWCharDString()`, similar to `Tcl_UtfToUniCharDString()`, but with a `wchar_t` pointer type in its signature.

     `Tcl_WCharToUtfDString()`, similar to `Tcl_UniCharToUtfDString()`, but with a `wchar_t` pointer type in its signature.

     `Tcl_UtfToChar16()` and `Tcl_UtfToWChar()`, similar to `Tcl_UtfToUniChar()`, but with an `unsigned short` and a `wchar_t`, respectively, in their signatures.

     On Windows, `wchar_t` is the same type as `unsigned short`, but on other platforms `wchar_t` might be a 32-bit type (normally `int`).
     These functions automatically map to either the 32-bit or the 16-bit versions, depending on the size of `wchar_t`.
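
The size-dependent mapping described above can be illustrated with a small standalone sketch. It does not use the Tcl API and is not how Tcl implements the mapping; the helper names `wchar_unit_bits`, `wchar_uses_16bit_path` and `wchar_units_for_codepoint` are hypothetical and exist only for this illustration. It assumes `wchar_t` is either 16 or 32 bits wide, as stated in the text.

```c
#include <assert.h>
#include <limits.h>
#include <wchar.h>

/* Hypothetical helper (not a Tcl function): width of one wchar_t unit in bits. */
static int wchar_unit_bits(void)
{
    return (int)(sizeof(wchar_t) * CHAR_BIT);
}

/* Hypothetical helper: nonzero when wchar_t is 16 bits wide, so that a
 * wchar_t-based wrapper would pick the 16-bit conversion; zero otherwise. */
static int wchar_uses_16bit_path(void)
{
    return wchar_unit_bits() == 16;
}

/* Hypothetical helper: number of wchar_t units needed for one Unicode code
 * point. With a 16-bit wchar_t, code points above U+FFFF need a surrogate
 * pair (2 units); with a 32-bit wchar_t every code point fits in 1 unit. */
static int wchar_units_for_codepoint(unsigned long codePoint)
{
    assert(codePoint <= 0x10FFFFUL);
    if (wchar_uses_16bit_path() && codePoint > 0xFFFFUL) {
        return 2;
    }
    return 1;
}
```

On a platform with a 16-bit `wchar_t` the last helper returns 2 for U+1F600, while on a platform with a 32-bit `wchar_t` it returns 1, which is why extension code should not hard-code either width.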

# How to upgrade

No action is required. In Tcl 9.0, the two deprecated functions are replaced by macros that do the same thing.
If you want to avoid a (future) deprecation warning, you can do the following:

In your extension, replace the call:

     `Tcl_WinUtfToTChar(....., bufPtr);`
     
with the following two lines:

................................................................................
     `Tcl_Char16ToUtfDString(....., bufPtr);`

If the original `Tcl_WinTCharToUtf()` call passed a "length" argument other than -1, divide it by 2 (or simply stop multiplying it by 2), because the replacement function expects the length in 16-bit units rather than bytes.
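
The length adjustment can be sketched as a small standalone helper. This is an illustration only: `units_from_byte_length` and `char16_length` are hypothetical names, not Tcl functions, and the sketch assumes that `unsigned short` is 2 bytes wide and that a length of -1 means "read up to the terminating zero", as described in the specification above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: convert the byte length that was passed to the old
 * call into the number of 16-bit units expected by the replacement.
 * The special value -1 ("up to the terminator") is passed through unchanged. */
static int units_from_byte_length(int byteLength)
{
    assert(byteLength >= -1);
    if (byteLength == -1) {
        return -1;
    }
    /* A whole number of 16-bit units is expected here. */
    assert(byteLength % (int)sizeof(unsigned short) == 0);
    return byteLength / (int)sizeof(unsigned short);
}

/* Hypothetical helper: the unit count that a length of -1 stands for,
 * i.e. the number of 16-bit units before the terminating zero. */
static int char16_length(const unsigned short *str)
{
    int count = 0;

    assert(str != NULL);
    while (str[count] != 0) {
        count++;
    }
    return count;
}
```

A call site that previously computed its length as `n * 2` bytes would then pass `n` directly, and one that passed -1 needs no change.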


# Compatibility

This is fully upwards compatible.


# Reference Implementation

A reference implementation is available in the **tip-548** branch.
<https://core.tcl-lang.org/tcl/timeline?r=tip-548>

# Copyright

This document has been placed in the public domain.