TIP 646: Change -eofchar handling

Author:		Jan Nijtmans <[email protected]>
State:		Final
Type:		Project
Vote:		Done
Created:	17-Oct-2022
Tcl-Version:	8.7
Tcl-branch:	tip-646
Vote-Summary:	Accepted 4/0/0
Votes-For:	JN, KBK, KW, SL
Votes-Against:	none
Votes-Present:	none

Abstract

This TIP is inspired by Bug #5bfe3de008, which suggests removing the -eofchar handling for writable channels. So, instead of (assuming $chan is a writable channel):

    fconfigure $chan -eofchar \x1A
    close $chan

this should now be done:

    puts -nonewline $chan \x1A
    close $chan

That is, the close command no longer writes any byte to the channel as part of closing it. If a byte should be written, this must be done explicitly just before the channel is closed.

This TIP also proposes changing the default -eofchar for channels on Windows in Tcl 9.0 to be the same as on UNIX: -eofchar {}.

The only feature that currently relies on -eofchar "\x1A {}" is starpacks, where a script file and an archive are glued together into a single file, with \x1A as the separator. That functionality will be kept as-is in Tcl 9.0.
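
As a rough illustration of the starpack-style layout described above, the following sketch builds a scratch file with a script part, a \x1A separator and a few stand-in bytes for an archive, then reads only the script part. The file contents and variable names are hypothetical. The single-character form of -eofchar is used because only the input side matters on a read-only channel.

```tcl
# Build a scratch file: script text, then \x1A, then stand-in "archive" bytes.
set out [file tempfile path]
fconfigure $out -translation binary
puts -nonewline $out "set greeting hello\n\x1APK\x03\x04payload"
close $out

# The source command reads a file only up to the first \x1A.
source $path

# The same split done by hand: \x1A acts as end of file on input.
set in [open $path r]
fconfigure $in -eofchar \x1A
set script [read $in]
close $in
file delete $path

puts "greeting=$greeting, script part is [string length $script] characters"
```

Everything after the separator is left for whatever code handles the archive part.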

Background & Rationale

Tcl channels have the configuration option -eofchar {$inChar $outChar}. For readable channels, inChar acts as an end-of-file marker: as soon as this character is encountered in the input stream, the channel behaves as if EOF had been reached. For writable channels, outChar is written to the output stream when the channel is closed. This last behavior is problematic: the position of the file cursor at the time of closing is unknown, no check is done, and no seek to the end of the channel is performed. For this reason, no extension or application is known to use this feature. Whenever -eofchar is set, it is usually set either to {} or to "\x1A {}".
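
The inChar behavior can be observed with a small experiment. This is a minimal sketch using a scratch file; the file contents and variable names are illustrative only.

```tcl
# Create a scratch file containing \x1A in the middle of the data.
set out [file tempfile path]
fconfigure $out -translation binary   ;# raw bytes, no end-of-file character
puts -nonewline $out "before\x1Aafter"
close $out

# Read it back with \x1A configured as the end-of-file character.
set in [open $path r]
fconfigure $in -translation binary -eofchar \x1A
set data [read $in]
set sawEof [eof $in]
close $in
file delete $path

puts "read [string length $data] bytes: $data"
```

Here data holds only the text in front of the \x1A byte; the rest of the file is never delivered to the script.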

The default for channels on Windows is -eofchar "\x1A {}", which differs from the default on UNIX. This causes problems for many extensions, which must set -eofchar {} explicitly on their channels to prevent a \x1A character somewhere in the input stream from being treated as end of file. The original motivation was that old DOS files sometimes end with a \x1A character, but this is highly unlikely now: current editors on Windows no longer write it. Therefore, the default for channels on Windows will be changed to -eofchar {} in Tcl 9.0.
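
The kind of workaround extensions need today might look like the following sketch. The helper name readAllBytes is hypothetical. Note that -translation binary already clears the end-of-file character, so the explicit -eofchar {} here only makes the intent visible.

```tcl
# Hypothetical helper: read a whole file as bytes, never stopping at \x1A.
proc readAllBytes {path} {
    set chan [open $path r]
    try {
        # Disable the end-of-file character regardless of the platform default.
        fconfigure $chan -translation binary -eofchar {}
        return [read $chan]
    } finally {
        close $chan
    }
}

# Scratch file with \x1A in the middle of the data.
set out [file tempfile path]
fconfigure $out -translation binary
puts -nonewline $out "before\x1Aafter"
close $out

set bytes [readAllBytes $path]
file delete $path
puts "read [string length $bytes] bytes"
```

With the end-of-file character disabled, all 12 bytes are returned on every platform.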

Specification

For Tcl 9.0, remove the outChar behavior of -eofchar completely. Any attempt to set outChar to anything other than the empty string will result in an error.
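
Code that must run on several Tcl versions could probe for this behavior rather than assume it. The sketch below is one possible way to do so; the variable names are illustrative, and the wording of the error message is deliberately not relied upon because this TIP does not specify it.

```tcl
set chan [file tempfile path]

# Ask for \x1A as both inChar and outChar; catch reports whether this is rejected.
set rejected [catch {fconfigure $chan -eofchar [list \x1A \x1A]} msg]
if {$rejected} {
    puts "outChar rejected: $msg"
} else {
    puts "outChar accepted by Tcl [info patchlevel]"
    # Clear the setting again so that close writes nothing extra.
    fconfigure $chan -eofchar {}
}

close $chan
file delete $path
```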

For Tcl 9.0, change the default -eofchar value on Windows to be the same as it is on UNIX.

For Tcl 8.7, a small change can reduce the chance of errors. Currently, setting a single character:

    fconfigure $chan -eofchar \x1A

is equivalent to:

    fconfigure $chan -eofchar "\x1A \x1A"

For Tcl 8.7, this will be changed to be equivalent to:

    fconfigure $chan -eofchar "\x1A {}"

This prevents the (most likely unintended) behavior of writing a \x1A character to the channel (if it is writable) whenever it is closed.
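
To see how a given interpreter expands the single-character form, the setting can simply be read back from a read-write channel. This is a minimal sketch with illustrative variable names; the reported value depends on the Tcl version in use, so no particular output is shown.

```tcl
# file tempfile returns a channel that is open for both reading and writing.
set chan [file tempfile path]

fconfigure $chan -eofchar \x1A
set setting [fconfigure $chan -eofchar]

# Clear the setting before closing so that no byte is appended on close.
fconfigure $chan -eofchar {}
close $chan
file delete $path

puts "Tcl [info patchlevel] reports [llength $setting] element(s) for -eofchar"
```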

Implementation

An implementation is available in the tip-646 branch of the Tcl repository.

Copyright

This document has been placed in the public domain.