Tcl Source Code

View Ticket
Ticket UUID: 8ffd8cabd10e936228f73e082d8d72eb874a2fbe
Title: "encoding system": wrong result without manifest
Type: Bug Version: 9.0
Submitter: jan.nijtmans Created on: 2025-05-05 08:37:16
Subsystem: 52. Portability Support Assigned To: jan.nijtmans
Priority: 5 Medium Severity: Severe
Status: Closed Last Modified: 2025-05-06 08:06:13
Resolution: Fixed Closed By: jan.nijtmans
    Closed on: 2025-05-06 08:06:13
Description:

The Windows "tcl90.dll" can be linked into applications other than tclsh and wish. In that case, if the other application doesn't use a manifest, "encoding system" gives the wrong answer.

Example: people using the DB2 database do this, because the DB2 DLL cannot handle the "utf-8" encoding. Removing the "UTF-8" setting from the manifest is a workaround, but it can cause encoding problems throughout Tcl. The wrong answer from "encoding system" is the worst of those problems, and it can easily be solved. That's the purpose of this ticket.

This can be demonstrated by building Tk, but removing the UTF-8 setting from the manifest. Demo:

    >.\tclsh90.exe
    % encoding system
    utf-8
    % exit
    >.\wish90.exe
    % encoding system
    cp1252
    % exit

Note that the "tcl90.dll" file used by tclsh90 and wish90 is identical in this example.

Since the default encoding on Windows starting with Tcl 9.0 is supposed to be UTF-8, this can result in encoding problems when Tcl and Tk (in this example) exchange files.
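Independent of this ticket, a script can avoid relying on "encoding system" for file exchange by setting the channel encoding explicitly. The following is only a minimal sketch: the proc names writeUtf8 and readUtf8 and the file name demo-utf8.txt are hypothetical and chosen for illustration; they are not part of Tcl or of the fix.

```tcl
# Report the value this executable sees (the value this ticket is about).
if {[encoding system] ne "utf-8"} {
    puts stderr "note: encoding system is [encoding system], not utf-8"
}

# Hypothetical helper: write text as UTF-8, regardless of "encoding system".
proc writeUtf8 {path text} {
    set f [open $path w]
    try {
        fconfigure $f -encoding utf-8
        puts -nonewline $f $text
    } finally {
        close $f
    }
}

# Hypothetical helper: read a file as UTF-8, regardless of "encoding system".
proc readUtf8 {path} {
    set f [open $path r]
    try {
        fconfigure $f -encoding utf-8
        return [read $f]
    } finally {
        close $f
    }
}

# Usage: the writer and the reader agree on UTF-8 even if they run in
# executables that report different system encodings.
set path [file join [pwd] demo-utf8.txt]
writeUtf8 $path "Gr\u00fc\u00dfe \u4e16\u754c"
puts [string equal [readUtf8 $path] "Gr\u00fc\u00dfe \u4e16\u754c"]
file delete $path
```

This only covers file contents handled by the script itself; it does not change how file names or other system calls are encoded.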

User Comments: jan.nijtmans added on 2025-05-06 08:06:13:

Fixed [cd024d90d96deefe|here].

Thanks, Ashok, for the code!


oehhar added on 2025-05-05 16:50:22:

Yes, please merge. And thanks to Ashok for the code, I suppose ;-)

Harald


jan.nijtmans added on 2025-05-05 16:37:34:

@Harald, thanks for your support!

Any objection to merging this into core-9-0-branch and trunk?


oehhar added on 2025-05-05 10:09:12:

OK, thanks for the info. I have always tested Tcl 8.6's Unicode support on Windows only in a limited way, e.g. file names with Chinese characters and Tk entry and label widgets. I don't remember when there were issues. Everybody understands that non-BMP characters are an issue; I avoid using them anyway. All of this has worked on Windows 8 for a long time now. I don't have access to such a system every day, but there is one in the office.

Thanks for all, Harald


jan.nijtmans added on 2025-05-05 10:01:20:
> So, on Windows 8, we still have "encoding system=cp1252"?

Yes, that's true. I wouldn't be surprised if Windows 8 has incomplete support for utf-8, so I'm afraid that changing the "encoding system" on machines < Windows10 2019H1 to "UTF-8" would be problematic anyway.

It's the same on UNIX: when Tcl 9.0 runs on a machine that doesn't have UTF-8 as its default encoding, it will still use the ISO-8859-1 encoding.

I hope that for Tcl 9.1, Windows10 2019H1 will be set as the minimum supported version. Then we don't have to worry about this any more.

oehhar added on 2025-05-05 09:49:54:

Hi Jan, thanks, great. May I ask why utf-8 is only reported as the system encoding for Windows versions from Windows10 2019H1 onwards?

Was this the first version where the manifest magic became active and thus "encoding system" changed to utf-8?

So, on Windows 8, we still have "encoding system=cp1252"?

Harald


jan.nijtmans added on 2025-05-05 09:26:13:

Proposed fix [4f085c414109aaf0|here]. It's extracted from the TIP #716 implementation, a part that I agree should be done.

I have some problems with some parts of TIP #716. They can be addressed, but the current TIP still contains too many questions. So it will still take time.

This ticket is meant to accept, at least, a small part of TIP #716 which is not controversial. It already helps the DB2 people: if they build their own tclsh, it will at least use the utf-8 encoding as default, as Tcl 9.0.0/9.0.1 do.


oehhar added on 2025-05-05 09:06:58:

Jan, thanks for looking into this so deeply.

I second that the Tcl library should be agnostic about the manifest. "encoding system = utf-8" and "encoding user = cp1252" should not depend on the manifest.

Even for tcl or wish, we may have two alternate builds, one with the UTF-8 manifest, one without it.

The utf-8 in the manifest:

  • is good to make some 3rd party software magically utf-8 aware
  • is bad as it magically breaks some 3rd party software

Tcl and Tk should not depend on it at all.

Harald