Author: Jan Nijtmans <[email protected]>
Author: Nathan Coulter <[email protected]>
State: Final
Type: Project
Vote: Done
Tcl-Version: 9.0
Tcl-Branch: tip-657
Vote-Summary: Accepted 6/0/1
Votes-For: AF, AK, JN, KW, MC, SL
Votes-Against: none
Votes-Present: DKF

This TIP proposes making "-profile strict" the default. It is intended as a replacement for TIP #601 and builds on TIP #656 ("A revised proposal for encodings").
The tcl8 profile is a legacy profile that doesn't conform
to any recommended behavior, which leaves the two other profiles,
strict and replace, as candidates. Since strict is the recommended
profile in most situations, it becomes the default in Tcl 9.0,
with a few exceptions. That has some implications
at the script level.
Many scripts will have to be adapted, either by handling exceptions for encoding errors or by setting the channel profile to "tcl8" or "replace". Commands like "fcopy", "read" and "gets" will now throw an exception when they encounter encoding errors, which external applications and extensions might not expect.
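The two adaptation strategies can be sketched as follows. This is a minimal illustration, assuming Tcl 9.0 and the "-profile" channel option from TIP #656; the scratch file name is hypothetical and the byte 0xFF is simply one value that is never valid in UTF-8.

```tcl
# Requires Tcl 9.0. The scratch file name is hypothetical.
set path [file join [pwd] tip657-demo.bin]

# Write raw bytes, including 0xFF, which is never valid in UTF-8.
set out [open $path wb]
puts -nonewline $out "ab\xFFyz"
close $out

# Strategy 1: keep the strict default and handle the exception.
set in [open $path r]
fconfigure $in -encoding utf-8
try {
    set data [read $in]
    set outcome ok
} trap {POSIX EILSEQ} {msg opts} {
    # An encoding error surfaces as a POSIX EILSEQ error.
    set outcome eilseq
} finally {
    close $in
}

# Strategy 2: opt out for this channel by selecting another profile.
set in [open $path r]
fconfigure $in -encoding utf-8 -profile replace
set replaced [read $in]
close $in

file delete $path
puts "strict read outcome: $outcome"
puts "replace read length: [string length $replaced]"
```

Which strategy fits depends on whether the script can do something useful with the error; selecting "replace" or "tcl8" keeps the script running on invalid input but hides it.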
New channels are assigned the strict profile by default, and both
encoding convertfrom and encoding convertto use the
strict profile by default. The exception for this is the
stderr channel, which
will default to the
Tcl_FSEvalFileEx() uses the
strict profile, and therefore
the strict profile. All commands except
glob use the
Tcl_UtfToExternal(), support operation in a mode
where any encoding error that occurs results in an
EILSEQ POSIX error. That
mode is now the default. Other modes can be explicitly configured by the
caller (TIP #656) to specify how these functions behave when invalid data are encountered.
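At the script level, the same default and the explicit alternatives are visible through encoding convertfrom. The following is a small sketch, assuming Tcl 9.0 and the "-profile" option described in TIP #656; the input byte 0xFF is chosen only because it is never valid in UTF-8.

```tcl
# Requires Tcl 9.0 (the -profile option comes from TIP #656).
set bytes "\xFF"   ;# a single byte that is never valid in UTF-8

# Default (strict) profile: the conversion raises an error.
set strictFailed [catch {encoding convertfrom utf-8 $bytes} msg]

# Explicitly chosen profiles: the conversion returns a result instead.
set viaReplace [encoding convertfrom -profile replace utf-8 $bytes]
set viaTcl8    [encoding convertfrom -profile tcl8 utf-8 $bytes]

# Print code points rather than the characters themselves, so that the
# output does not depend on the encoding of stdout.
puts "strict failed: $strictFailed"
puts [format "replace: U+%04X" [scan $viaReplace %c]]
puts [format "tcl8:    U+%04X" [scan $viaTcl8 %c]]
```

Passing the profile per call keeps the choice local to the conversion that needs it, instead of changing behavior for the whole script.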
Handling of environment variables (syncing between the ::env array and the
native environment) still uses the tcl8 profile, as does the
glob command. The reason is that in those situations many
applications won't expect exceptions when illegal byte sequences
occur in (disk) filenames or in environment variables. That's why
this is out of scope for this TIP. TIP #671 is an attempt
to solve this problem for environment variables and the glob command.
Since this is an incompatible change wherever channels, files or sockets are used, it potentially has a big effect on extensions. All extensions that could be confronted with encoding errors now have to handle the possibility of exceptions being thrown when such errors occur.
Also, opening a file whose name contains surrogate characters (or any code point missing from the system encoding) will fail in Tcl 9.0, while it might have succeeded in Tcl 8.x. For example:

```tcl
set f [open \U1F91D w]
close $f
set f [open \uD83E\uDD1D r]
```

This will succeed in Tcl 8.7, but fail in Tcl 9.0, because surrogate pairs are no longer equal to the combined character.
The 'http' package is modified because of this change: Since the 'http' package is not prepared to handle exceptions, it can easily be left in an inconsistent state, as shown by test-case errors when the default profile was changed to 'strict'. Therefore, the 'http' package, when run in Tcl 9.0, will use the 'replace' profile. This makes the package conformant to the W3C recommendations.
The 'tcltest' package is modified to use the 'tcl8' profile for its internal channels. For this package, we don't want exceptions to disturb test output. If a test case wants to handle a surrogate, so be it; this should not disturb the test case.
Implementation is available in the tip-657 branch of the Tcl repository.
This document has been placed in the public domain.