Tcl Source Code

Ticket UUID: ecafd8611dd7665a6376c9c35b092d7d1a8e61c8
Title: Euro/Tail-sign missing from cp864 encoding
Type: Bug
Version: 8.6
Submitter: jan.nijtmans
Created on: 2025-06-24 14:51:15
Subsystem: 16. Commands A-H
Assigned To: jan.nijtmans
Priority: 5 Medium
Severity: Minor
Status: Closed
Last Modified: 2025-06-25 11:58:06
Resolution: Fixed
Closed By: jan.nijtmans
Closed on: 2025-06-25 11:58:06
Description:

Example:

$ tclsh9.0
% scan [encoding convertfrom cp864 \xA7] %c
unexpected byte sequence starting at index 0: '\xA7'

According to Wikipedia, the euro sign (€) was assigned to this code point (0xA7) in 1999.
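A minimal sketch for checking how a given Tcl build decodes this byte. The helper name `probeCp864` is hypothetical and not part of Tcl. On a build that includes the fix, byte 0xA7 would be expected to decode to U+20AC. On a build without it, the result depends on the version: either an error, as in the tclsh9.0 session above, or some other code point.

```tcl
# Hypothetical helper (not part of Tcl): report how this build
# decodes a single cp864 byte, given as an integer 0..255.
proc probeCp864 {byteValue} {
    # Build a one-byte binary string from the integer value.
    set byte [binary format c $byteValue]
    if {[catch {encoding convertfrom cp864 $byte} decoded]} {
        # An unmapped byte can raise an error, as in the session above.
        return [format "0x%02X: not decodable (%s)" $byteValue $decoded]
    }
    # scan ... %c yields the code point of the decoded character.
    return [format "0x%02X -> U+%04X" $byteValue [scan $decoded %c]]
}

puts [probeCp864 0xA7]
```

Catching the error rather than letting it propagate keeps the probe usable on builds with and without the fix.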

User Comments:

jan.nijtmans added on 2025-06-25 11:58:06:

Fixed [089e9b68cdb23540|here]


jan.nijtmans added on 2025-06-25 11:11:55:

I just discovered that the Arabic tail sign (ﹳ) was missing as well. The reason is that this code point was introduced in Unicode 3.2, which didn't exist yet at the time. Fixed that too.


jan.nijtmans added on 2025-06-24 16:14:37:

The latest encoding source file for cp864 is here; it hasn't been updated for the euro character.

Therefore, it's just as good to hand-edit the .enc file: it will never be regenerated. And if someone does regenerate it, the failing test case will detect the problem.

Some other .enc files were also hand-edited after generation.


apnadkarni added on 2025-06-24 16:03:38:

Are the .enc files supposed to be generated using the txt2enc.c tool or manually edited? I was under the assumption that it was the former.

jan.nijtmans added on 2025-06-24 15:01:18:

Proposed fix [7dd57ba178b39a9d|here]

While at it, also add cp165, which is very similar to cp864.