TIP 714: encoding compatibility for GDI/HammerDB/TPC/DB2

Author:         Jan Nijtmans <[email protected]>
State:          Draft
Type:           Project
Tcl-Version:    9.0.2
Tcl-Branch:     encoding-user

Abstract

Not all applications on Windows are able to make the move to utf-8. For example, using a GDI function in Tcl 9.0 doesn't work as expected:

    pStr = Tcl_GetStringFromObj(objv[PositionSPar],&lStr);
    Tcl_UtfToExternalDString(NULL, pStr, lStr, &sPar1);
    TextOut(pdlg.hDC, X0, Y0, Tcl_DStringValue(&sPar1), Tcl_DStringLength(&sPar1));

The TclWinGetUserEncoding() function is proposed to bridge that gap.

Also a Tcl counterpart "encoding user" is proposed.

Rationale

There are three ways an application can translate UTF-8 to an external encoding when interfacing with a Windows (or other) API.

  1. Use the Unicode API. Example:

    pStr = Tcl_GetStringFromObj(objv[PositionSPar], &lStr);
    Tcl_WinUtfToTChar(pStr, lStr, &sPar1);
    FILE *f = _wfopen((WCHAR *)Tcl_DStringValue(&sPar1), L"r");
    

  2. Use the ANSI API. Example (Tcl 8.6):

    pStr = Tcl_GetStringFromObj(objv[PositionSPar], &lStr);
    Tcl_UtfToExternalDString(NULL, pStr, lStr, &sPar1);
    FILE *f = fopen(Tcl_DStringValue(&sPar1), "r");
    

  3. Don't care about characters outside the ASCII set. Example:

    pStr = Tcl_GetStringFromObj(objv[PositionSPar], &lStr);
    FILE *f = fopen(pStr, "r");
    

These three approaches all work in both Tcl 8.6 and 9.0. In Tcl 9.0, approach 3 even does the right thing for all characters, not only ASCII (but do not rely on that if your code is still expected to run in Tcl 8.6).

However, in Tcl 9.0, method 2 no longer works as expected for the GDI API: unlike other APIs, GDI does not use the UTF-8 encoding in Tcl 9.0.
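
To make the mismatch concrete: the UTF-8 form of "é" is the two bytes 0xC3 0xA9, while an ANSI API running under code page 1252 expects the single byte 0xE9. Handed the UTF-8 bytes unchanged, it draws "Ã©". The following is a minimal, self-contained sketch of such a conversion, limited to the characters on which cp1252 and ISO 8859-1 agree. The name utf8_to_cp1252_subset is hypothetical, and this is not how Tcl implements its encodings; real code should call Tcl_UtfToExternalDString() with a suitable encoding.

```c
#include <stddef.h>

/*
 * Illustrative only (hypothetical helper, not part of Tcl).
 * Converts UTF-8 to single bytes for U+0000..U+007F and U+00A0..U+00FF,
 * the ranges where cp1252 and ISO 8859-1 agree. Every other character
 * is replaced by '?'. Returns the number of bytes written, not counting
 * the terminating NUL.
 */
size_t utf8_to_cp1252_subset(const char *src, size_t srcLen,
                             char *dst, size_t dstSize)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t i = 0, n = 0;

    if (dstSize == 0) {
        return 0;
    }
    while (i < srcLen && n + 1 < dstSize) {
        unsigned char c = s[i];

        if (c < 0x80) {
            /* ASCII: identical in UTF-8 and cp1252. */
            dst[n++] = (char)c;
            i += 1;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < srcLen
                && (s[i + 1] & 0xC0) == 0x80) {
            /* Two-byte sequence encoding U+0080..U+00FF. */
            unsigned cp = ((c & 0x1Fu) << 6) | (s[i + 1] & 0x3Fu);

            dst[n++] = (cp >= 0xA0) ? (char)cp : '?';
            i += 2;
        } else {
            /* Outside the subset: skip the lead byte and its trail bytes. */
            dst[n++] = '?';
            i += 1;
            while (i < srcLen && (s[i] & 0xC0) == 0x80) {
                i += 1;
            }
        }
    }
    dst[n] = '\0';
    return n;
}
```

Passing the five UTF-8 bytes of "café" yields four output bytes, the last being 0xE9. The euro sign (U+20AC) falls outside the subset and becomes '?', which is one reason a full encoding table is needed in practice.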

The TclWinGetUserEncoding() function can help here:

    pStr = Tcl_GetStringFromObj(objv[PositionSPar],&lStr);
    Tcl_UtfToExternalDString(TclWinGetUserEncoding(interp), pStr, lStr, &sPar1);
    TextOut(pdlg.hDC, X0, Y0, Tcl_DStringValue(&sPar1), Tcl_DStringLength(&sPar1));

Specification

Implement a new command:

    encoding user

This command returns the same value as "encoding system" would return in Tcl 8.6. It can be used for compatibility if, for some reason, you cannot convert the source code of some Tcl 8.6 script to UTF-8 but still want to run it in Tcl 9.
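
For orientation, Tcl 8.6 on Windows named its system encoding after the process's ANSI code page, for example cp1252 for code page 1252. The sketch below shows only that naming step, on the assumption that "encoding user" follows the same convention. The helper cp_encoding_name is hypothetical and not part of Tcl. On Windows the code page number would come from GetACP(); it is a plain parameter here so the sketch compiles on any platform.

```c
#include <stdio.h>

/*
 * Hypothetical helper (an assumption, not Tcl's actual code): format a
 * Tcl-style encoding name such as "cp1252" from an ANSI code page number.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int cp_encoding_name(unsigned codePage, char *buf, size_t bufSize)
{
    int n = snprintf(buf, bufSize, "cp%u", codePage);

    /* snprintf reports the length it needed; reject truncated results. */
    return (n > 0 && (size_t)n < bufSize) ? 0 : -1;
}
```

With code page 1252 and a sufficiently large buffer, the helper produces the string "cp1252", the encoding name used in the cp1252 example in this section.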

Example (derived from Harald's example):

    source -encoding [encoding user] oldTclScriptInCp1252.tcl

(Of course, it would be better to specify cp1252 instead of [encoding user].)

Declare and implement a new function, TclWinGetUserEncoding().

Example (derived from Harald's example):

    pStr = Tcl_GetStringFromObj(objv[PositionSPar],&lStr);
    Tcl_UtfToExternalDString(TclWinGetUserEncoding(interp), pStr, lStr, &sPar1);
    TextOut(pdlg.hDC, X0, Y0, Tcl_DStringValue(&sPar1), Tcl_DStringLength(&sPar1));

Implementation

See the encoding-user branch.

Copyright

This document has been placed in the public domain.