TIP 691: Setting -profile for tclsh/wish and the "source"/"open" commands.

Author:		Jan Nijtmans <[email protected]>
State:		Draft
Type:		Project
Vote:		Pending
Created:	21-03-2024
Tcl-Version:	8.7 and 9.0
Tcl-Branch:	tip-691
Vote-Summary:	
Votes-For:	
Votes-Against:	
Votes-Present:	

Abstract

Tclsh and Wish, and the "source" and "open" commands, currently offer no way to set the profile of the channel used to read a file.

Older UNIX platforms used the ISO8859-1 encoding, while many older Windows systems used the CP1252 encoding. Starting with Tcl 9.0 (TIP #587), the default encoding for the source command is UTF-8. That creates problems for old Tcl scripts that were originally written in ISO8859-1 or CP1252: because of the strict profile, such scripts will start throwing exceptions in Tcl 9. As a quick workaround, the profile can be changed back to "tcl8", as it was in Tcl 8.x.

This TIP is meant to provide a syntax for that.

On systems where the system encoding was big5 or shiftjis, it is best to provide the known encoding explicitly, e.g.:

$ tclsh -encoding shiftjis

Automatic detection of the encoding is out of scope for this TIP.

Rationale

Example:

$ tclsh8.6 my_script.tcl

Assume this script is written in the ISO8859-1 encoding, but the system encoding is UTF-8. In Tcl 8.6 this runs fine. In Tcl 9.0 it no longer runs if the script contains bytes > 0x7F (unless they happen to form valid UTF-8 sequences, which is not very likely).
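The difference between the two profiles can be reproduced without a script file. The following is a minimal sketch, assuming Tcl 9.0 (the -profile option of "encoding convertfrom" does not exist in Tcl 8.6); the variable names and the sample text are illustrative only.

```tcl
# Requires Tcl 9.0: -profile is not available in Tcl 8.6.
# These are the ISO8859-1 bytes for "café au lait"; the byte 0xE9
# followed by a space is not a valid UTF-8 sequence.
set latin1Bytes "caf\xE9 au lait"

# strict profile: decoding the invalid byte raises an error.
set strictFailed [catch {
    encoding convertfrom -profile strict utf-8 $latin1Bytes
} strictMessage]

# tcl8 profile: the invalid byte is passed through as a character,
# matching the lenient behaviour of Tcl 8.x.
set lenient [encoding convertfrom -profile tcl8 utf-8 $latin1Bytes]

puts "strict failed: $strictFailed"
puts "tcl8 result:   $lenient"
```

The same distinction applies when the bytes come from a file read by "source": under the strict profile the read fails, while under "tcl8" the script is read as it was in Tcl 8.x.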

The best solution would be to convert "my_script.tcl" to UTF-8; then it runs fine in both Tcl 8.6 and 9.0. But maybe that is not possible (because the script is on a CD-ROM, for example).
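Where the script can be rewritten, the conversion itself is short. The following is a sketch only, assuming the original encoding is known (ISO8859-1 by default here); the procedure name convertScriptToUtf8 and its parameters are hypothetical and not part of this TIP. It uses only commands available in both Tcl 8.6 and 9.0.

```tcl
# Hypothetical helper: re-encode a legacy script as UTF-8.
# fromEncoding must be the encoding the file was really written in.
proc convertScriptToUtf8 {inPath outPath {fromEncoding iso8859-1}} {
    # Read the legacy file, decoding it with its original encoding.
    # -translation lf keeps the original line endings untouched.
    set in [open $inPath r]
    try {
        fconfigure $in -encoding $fromEncoding -translation lf
        set content [read $in]
    } finally {
        close $in
    }

    # Write the same characters back out, encoded as UTF-8.
    set out [open $outPath w]
    try {
        fconfigure $out -encoding utf-8 -translation lf
        puts -nonewline $out $content
    } finally {
        close $out
    }
}
```

For example, `convertScriptToUtf8 my_script.tcl my_script_utf8.tcl` writes a UTF-8 copy next to the original; for a CP1252 script, pass cp1252 as the third argument.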

Another example:

$ tclsh8.6
source my_script.tcl

Same problem. Currently, the only way to remedy this is:

$ tclsh9.0
set f [open my_script.tcl]
fconfigure $f -profile tcl8
eval [read $f]

This TIP proposes a new syntax:

$ tclsh9.0
source -profile tcl8 my_script.tcl
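Until the proposed option is available, the open/fconfigure/read workaround can be wrapped in a procedure. This is a minimal sketch, assuming Tcl 9.0; the name sourceWithTcl8Profile is hypothetical. It sets the encoding to UTF-8 explicitly (so it does not depend on the system encoding), closes the channel, and evaluates the script in the caller's scope. Unlike the real "source" command, it does not update "info script".

```tcl
# Hypothetical helper, requires Tcl 9.0 (fconfigure -profile).
# Reads a script as UTF-8 with the lenient tcl8 profile and
# evaluates it in the scope of the caller.
proc sourceWithTcl8Profile {path} {
    set f [open $path r]
    try {
        fconfigure $f -encoding utf-8 -profile tcl8
        set script [read $f]
    } finally {
        # Close the channel even if reading fails.
        close $f
    }
    uplevel 1 $script
}
```

Usage would then be `sourceWithTcl8Profile my_script.tcl` in place of `source my_script.tcl`.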

Finally:

$ tclsh9.0
set f [open my_script.tcl]
fconfigure $f -profile tcl8
eval [read $f]

In the new syntax this could be shortened to:

$ tclsh9.0
set f [open my_script.tcl TCL8]
eval [read $f]

Specification

The functions Tcl_FSEvalFileEx, Tcl_GetStartupScript and Tcl_SetStartupScript are modified so that the special values TCL_ENCODING_UTF8_STRICT, TCL_ENCODING_UTF8_REPLACE and TCL_ENCODING_UTF8_TCL8 are accepted as encoding names as well.

tclsh/wish will get a new -profile option, which sets the profile to one of the available profiles. This option cannot be combined with the -encoding option; it can only be used with the UTF-8 encoding (which is implicit and the default).

In addition, the source command also gets a new -profile option, which works exactly the same as the tclsh/wish command-line option.

Finally, the open command gets three new access options (second form only): STRICT, TCL8 and REPLACE.

Currently, one of RDWR, RDONLY or WRONLY is mandatory in the second form of the access options. To avoid making it mandatory to specify {RDONLY TCL8}, a further change makes RDONLY the default.

Implementation

The implementation is in the Tcl branch "tip-691".

There is also a simplified implementation for 8.6 in the Tcl branch "tip-691-for-8.6", where -profile accepts any value and is just a dummy option that does nothing. It won't be documented in 8.6.

Copyright

This document has been placed in the public domain.