Author: Nathan Coulter <[email protected]>
State: Draft
Type: Project
Vote: Pending
Created: 08-Jan-2023
Tcl-Version: 8.7
Tcl-branch: tip-653
Vote-Summary:
Votes-For:
Votes-Against:
Votes-Present:
In recent versions of Tcl it is possible to configure a channel to treat data
that does not conform to the encoding specification as an error.
gets must pass along such an error. In order to maintain expected
semantics for a blocking channel, and to maximize utility,
gets can use the return options dictionary to communicate additional
information when an error occurs.
When gets must return an encoding error on a blocking channel, the
error is returned when it is encountered, not on some subsequent call. The
data successfully decoded up to the point of the error is available in the
return options dictionary, either under the path
-result read or under the key
-data (to be determined). The advantage of
-result read is that
additional information could be added under the same key. For example,
-result bytes might contain the original bytes prior to decoding.
After such an error, the current access position in the channel is the position
of the first byte of the data that caused the error.
[tell] provides that position.
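A minimal sketch of how a script might consume this information, assuming the proposal is implemented as described. The helper name `partialData` is hypothetical, and because the final location of the data is still to be determined, it checks both `-result read` and `-data`.

```tcl
# Hypothetical helper: extract the data decoded before an encoding error
# from a return options dictionary. Both candidate locations named in this
# TIP are checked, since the final choice is still to be determined.
proc partialData {opts} {
    # Candidate 1: the nested path "-result read". The catch also covers
    # the case where -result is absent or is not a dictionary.
    if {![catch {dict get $opts -result read} data]} {
        return $data
    }
    # Candidate 2: the top-level key "-data".
    if {[dict exists $opts -data]} {
        return [dict get $opts -data]
    }
    # Neither location is present: nothing was decoded, or the
    # interpreter does not implement this proposal.
    return {}
}

# Hypothetical usage on a blocking channel $chan:
#
#   if {[catch {gets $chan line} msg opts]} {
#       set decoded [partialData $opts]   ;# data decoded before the error
#       set badPos  [tell $chan]          ;# position of the offending byte
#   }
```

On an interpreter without this proposal the helper simply returns an empty string, so the calling code can be written once and degrade gracefully.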
Channels that use the (default)
strict profile now return the POSIX error
EILSEQ when an encoding error occurs. For maximum compatibility with current
behavior, a distinction is made between 'blocking' and 'non-blocking' mode.
In 'blocking' mode, the functions
Tcl_GetsObj() set the POSIX error
EILSEQ whenever an encoding
error occurs. If
Tcl_GetsObj() encounter an encoding error, the
file-pointer is left at the original position, and the functions return -1.
Tcl_ReadObj() store the data as received so far in the return
options dictionary, and the file pointer is left where the encoding error
occurred.
In 'non-blocking' mode, all data prior to the first byte that resulted in an
encoding error is returned, and the POSIX error is not yet set. On the next
Tcl_ReadObj(), which normally happens in a loop or as a
readable event, no data is returned and the POSIX error
EILSEQ is set.
This makes it possible to handle all data up to the point of the error.
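At the script level, the non-blocking behavior described above might be consumed from a readable handler roughly as follows. This is a sketch only: `drain` and the layout of the state array are hypothetical names chosen for illustration, and the comment about when the error branch is reached assumes the two-step behavior described in this section.

```tcl
# Hypothetical readable-event handler for a non-blocking channel.
# stateVar names a global array with the elements "data" and "status".
proc drain {chan stateVar} {
    upvar #0 $stateVar state
    if {[catch {read $chan} chunk]} {
        # Assumption: under the behavior described above, this branch is
        # reached on the call after the one that delivered the data
        # preceding the encoding error.
        set state(status) error
        chan event $chan readable {}
        return
    }
    # Normal case: accumulate whatever was decoded.
    append state(data) $chunk
    if {[eof $chan]} {
        set state(status) eof
        chan event $chan readable {}
    }
}

# Hypothetical registration:
#
#   array set ::st {data {} status open}
#   chan configure $chan -blocking 0
#   chan event $chan readable [list drain $chan ::st]
#   vwait ::st(status)
```

Because the data and the error arrive on separate calls, the handler needs no access to the return options dictionary in this mode.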
Tcl_Eof() don't depend on
Tcl_Write() always writes out as many characters as it can, and
always sets the POSIX error
EILSEQ when it cannot write more due to an encoding error.
Tcl_Eof() only returns true when the channel is at an EOF condition,
not when the channel is at an encoding error position.
The primary intent is to preserve the current semantics of gets
for a blocking channel: an error occurs immediately when non-conforming data
is encountered, not on the next call to
gets, as was proposed
in some other approaches. The second goal is to make the position of the
non-conforming data available to the caller. One natural way to do this is to
make it the current position so that
[tell] can provide it. The question
then arises: What to do with the data that has been successfully decoded so
far? The simplest and probably best answer is to make it available to the
caller in case something useful can be done with it.
In Tcl the return value in case of an error is normally an error message, so
the return value is not available for passing other information related to
the error to the caller.
-errorcode could be used, but it is typically used for
classification of the error, and mixing in other types of additional
information does not seem like a particularly good idea.
The data successfully decoded so far is stored under the path -result read
rather than just
-result so that if there later arises a need to return other
information, it can be assigned to another key under
-result. For example,
one idea is that the original undecoded bytes should also be returned.
-result could become a common pattern for returning rich data in exceptional
cases.
Under this proposal the caller of
gets can handle each
occurrence of non-conforming data and then continue to read data from the
channel.
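The recovery loop this enables could look roughly like the sketch below. It rests on several assumptions that are not settled by this text: that the error surfaces with an -errorcode beginning with `POSIX EILSEQ`, that the partial data is stored under one of the two candidate locations, and that skipping a single byte is an acceptable recovery policy. The name `readLinesLenient` is hypothetical.

```tcl
# Hypothetical helper: read every line from a blocking channel, stepping
# over bytes that cannot be decoded instead of aborting.
proc readLinesLenient {chan} {
    set lines {}
    set pending {}   ;# text decoded before an error, awaiting the rest of its line
    while 1 {
        try {
            set n [gets $chan line]
        } trap {POSIX EILSEQ} {msg opts} {
            # Assumption: the partial data is stored under one of the two
            # locations proposed in this TIP.
            if {![catch {dict get $opts -result read} part]} {
                append pending $part
            } elseif {[dict exists $opts -data]} {
                append pending [dict get $opts -data]
            }
            # The channel is positioned at the offending byte; skip it
            # and try again. Any other kind of error propagates.
            seek $chan 1 current
            continue
        }
        if {$n < 0} {
            # End of file: keep any text that preceded a trailing error.
            if {$pending ne {}} {lappend lines $pending}
            break
        }
        # Join text recovered before an error with the rest of its line.
        lappend lines $pending$line
        set pending {}
    }
    return $lines
}
```

Only errors classified as `POSIX EILSEQ` are handled, so an unrelated failure, such as a closed channel, still reaches the caller instead of being silently retried.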
The py-b8f575aa23 branch contains a complete implementation under which the entire test suite passes.
Copyright © 2023, Nathan Coulter. All rights reserved.
The author of this TIP requests financial support for this and other free software work. Contact and payment information available at: