Author: Nathan Coulter <[email protected]>
State: Draft
Type: Project
Vote: Pending
Tcl-Version: 9.0
Tcl-Branch: tip-667
Obsoleted-By: 657
Abstract
Although TIP #657 purported to be about making the strict profile the default, it also specified other things that were out of scope, specified unnecessary implementation details, and included a partial alternative to TIP #653 in its Compatibility section (those changes have since been incorporated into TIP #653). This TIP proposes that "strict" become the default encoding profile for all operations.
Rationale
The tcl8 profile was until recently the only option for handling encoding errors in channel content. Now there are two additional profiles available, strict and replace.
The most common use case for encoded data is to expect that if the operation completed without error, the data were correctly encoded and no data were lost in the result. This corresponds to the strict encoding profile, so it makes sense to make this profile the default. Where it is not the default, data may be silently corrupted, with the corruption being discovered only at some later date, after collateral damage, possibly including exploitation by bad actors, has already occurred.
It is expected that scripts that must be adapted due to this change in default behaviour will fail early and before real damage is done, making it easy to detect where change is necessary and leading to a more secure and correct scripting environment overall. Commands like fcopy, read and gets throw exceptions as soon as bad data is detected. Where this is not desired it is easy to remedy through trivial mechanical changes to existing scripts.
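As an illustration of the kind of mechanical change meant here, the following is a minimal sketch, assuming a Tcl build that supports the -profile channel option. The scratch file name and the variable names are hypothetical and chosen only for this example. It writes a file containing a byte that is not valid UTF-8, reads it once under the strict profile, and then reads it again with the tcl8 profile named explicitly.

```tcl
# Create a scratch file (hypothetical name) holding "A", the invalid
# UTF-8 byte 0xFF, "B" and a newline.
set path [file join [pwd] tip667-demo.bin]
set out [open $path wb]
puts -nonewline $out [binary format H* 41ff420a]
close $out

# Strict profile: the read is expected to raise an error on the bad byte.
set in [open $path r]
chan configure $in -encoding utf-8 -profile strict
set strictFailed [catch {read $in} strictResult]
close $in

# Remedy for a script that wants the old lenient behaviour:
# name the tcl8 profile explicitly on the channel.
set in [open $path r]
chan configure $in -encoding utf-8 -profile tcl8
set lenientFailed [catch {read $in} lenientResult]
close $in

file delete $path
```

With strict as the default, the first chan configure call would no longer need the -profile option, and only scripts that depend on lenient decoding would need the one extra option shown in the second call.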
Specification
New channels are by default assigned the strict profile, and both encoding convertfrom and encoding convertto use the strict profile by default.
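The effect on encoding convertfrom can be sketched as follows, assuming a Tcl build in which the command accepts the -profile option. The profile is spelled out in every call so that the example does not depend on which default is in force, and the variable names are illustrative only.

```tcl
# Three bytes: "A", the invalid UTF-8 byte 0xFF, "B".
set bytes [binary format H* 41ff42]

# Strict: invalid input is reported as an error instead of being
# silently converted.
set strictFailed [catch {
    encoding convertfrom -profile strict utf-8 $bytes
} strictMessage]

# Callers that prefer lossy but error-free decoding opt in explicitly.
set replaced [encoding convertfrom -profile replace utf-8 $bytes]
set lenient  [encoding convertfrom -profile tcl8 utf-8 $bytes]
```

Under this proposal a call without -profile would behave like the strict call above, so code relying on either of the last two behaviours would need to request it by name.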
Tcl_FSEvalFileEx() uses the strict profile, and therefore source uses the strict profile. The http package leaves any channels it opens in their default strict configuration, so it too uses the strict profile.
Tcl_ExternalToUtfDStringEx(), Tcl_UtfToExternalDStringEx(), Tcl_ExternalToUtf() and Tcl_UtfToExternal() support operation in a mode where any encoding error that occurs results in an EILSEQ POSIX error. That mode is now the default. Other modes can be explicitly configured by the caller to specify how these functions behave when invalid data are encountered.
Any test in the Tcl test suite that requires a channel that is not configured for strict encoding explicitly configures the channel according to its needs.
Further explanation
Compatibility
This is an incompatible change for Tcl_ExternalToUtf()/Tcl_UtfToExternal(), but since those functions are often called to operate in strict mode, it will have little effect.
This is an incompatible change for Tcl_Read(), Tcl_Write() and Tcl_Gets(). See TIP #653 for details.
Implementation
The branch trunk-encodingdefaultstrict implements this TIP.
Copyright
This document has been placed in the public domain.