Ticket UUID: 5fca83d78cf37b70d3e958897b542f985f2ce56a
Title: [encoding system] is wrong in an ISO-8859-1 locale
Type: Patch | Version: 9.0b2
Submitter: bhaible | Created on: 2024-07-01 18:34:20
Subsystem: 38. Init - Library - Autoload | Assigned To: jan.nijtmans
Priority: 5 Medium | Severity: Important
Status: Closed | Last Modified: 2024-07-01 19:56:44
Resolution: Fixed | Closed By: jan.nijtmans
Closed on: 2024-07-01 19:56:44
Description:
On a Linux/glibc system, I have two French locales:

    $ LC_ALL=fr_FR.UTF-8 locale charmap
    UTF-8
    $ LC_ALL=fr_FR.ISO-8859-1 locale charmap
    ISO-8859-1

('locale charmap' is the command-line equivalent of nl_langinfo(CODESET).)

In Tcl, [encoding system] should be set to an equivalent value; otherwise, strings with non-ASCII characters are printed incorrectly to standard output.

This works in Tcl 8.6:

    $ LC_ALL=fr_FR.UTF-8 tclsh8.6
    % encoding system
    utf-8
    $ LC_ALL=fr_FR.ISO-8859-1 tclsh8.6
    % encoding system
    iso8859-1

But it no longer works for the ISO-8859-1 locale in Tcl 9.0 beta2:

    $ LC_ALL=fr_FR.UTF-8 tclsh9.0
    % encoding system
    utf-8
    $ LC_ALL=fr_FR.ISO-8859-1 tclsh9.0
    % encoding system
    utf-8

I debugged it. nl_langinfo(CODESET) returns "ISO-8859-1". This string is lowercased, producing "iso-8859-1", and then looked up in LocaleTable by the function SearchKnownEncodings. The table happens to have 174 elements, and "iso-8859-1" is at index 80. Due to a logic bug in SearchKnownEncodings, this index is not found, so SearchKnownEncodings returns NULL.

The attached patch fixes the bug. The bug has been present since 2005, but it was triggered by different encodings, because whether a lookup fails depends on the table length and on the index of the searched entry.
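To illustrate how a table lookup can miss one particular index depending on the table length, here is a minimal sketch in C. It is not the Tcl source and not the attached patch: the function names `buggy_search` and `fixed_search`, the use of an int table instead of locale names, and the specific off-by-one shown (shrinking a half-open range with `test - 1`) are all assumptions made for illustration. The ticket only states that SearchKnownEncodings has a logic bug.

```c
#include <stddef.h>

/*
 * Hypothetical illustration only (not the Tcl source).
 * Binary search over a sorted int table using the half-open
 * range [left, right). Returns the index of key, or -1.
 */
int buggy_search(const int *table, int count, int key)
{
    int left = 0;
    int right = count;              /* one past the last element */

    while (left < right) {
        int test = (left + right) / 2;

        if (table[test] == key) {
            return test;
        }
        if (table[test] < key) {
            left = test + 1;
        } else {
            /* Assumed bug: with a half-open range this also drops
             * table[test - 1], which was never examined. */
            right = test - 1;
        }
    }
    return -1;
}

/* Same search with the upper bound kept consistent with [left, right). */
int fixed_search(const int *table, int count, int key)
{
    int left = 0;
    int right = count;

    while (left < right) {
        int test = (left + right) / 2;

        if (table[test] == key) {
            return test;
        }
        if (table[test] < key) {
            left = test + 1;
        } else {
            right = test;           /* table[test] excluded, test - 1 kept */
        }
    }
    return -1;
}
```

With a 174-element table whose entry at index i holds the value i, `buggy_search(table, 174, 80)` returns -1 while `fixed_search(table, 174, 80)` returns 80. Other keys, such as 87, are still found by the buggy variant, which would be consistent with the report that the failure depends on the table length and the entry's index.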
User Comments:
jan.nijtmans added on 2024-07-01 19:56:44:
Thanks for this report and the patch! Your explanation makes 100% sense! Following the code, I see what's happening, and why this wasn't noticed in Tcl 8.x. Many thanks! Fixed in all branches now. Will be part of Tcl 9.0b3.
Attachments:
- 0001-unix-Fix-encoding-system-in-ISO-8859-1-locales.patch added by bhaible on 2024-07-01 18:35:40.