Tcl Source Code

Ticket UUID: 17a1cb8d6e2a51bde1d5d4a6e55f7d42e408cbbc
Title: Tcl 9: "illegal byte sequence" ?!
Type: Bug
Version: 9.0a4, trunk
Submitter: pointsman
Created on: 2022-12-12 23:44:57
Subsystem: 25. Channel System
Assigned To: jan.nijtmans
Priority: 5 Medium
Severity: Minor
Status: Closed
Last Modified: 2022-12-20 09:46:27
Resolution: Fixed
Closed By: jan.nijtmans
Closed on: 2022-12-20 09:46:27
Description: (text/x-fossil-plain)
With 252e282cd21b (2022-12-12 06:11:22 UTC)

# Create the test data
package require Tcl 9
set fd [open data.txt w+]
fconfigure $fd -encoding utf-8
puts $fd "A\U10FFFFBCDE"
close $fd


# The data is a valid UTF-8 encoded Unicode string
# in the understanding of Tcl.

# Try to read it
package require Tcl 9
set fd [open "data.txt"]
fconfigure $fd -encoding utf-8 -strictencoding 1 -nocomplainencoding 0
set data ""
while {![eof $fd]} {
    append data [read $fd]
}
# Not reached with this test data
puts [string length $data]
close $fd

I get:

error reading "file6": illegal byte sequence
    while executing
"read $fd"
    ("while" body line 2)
etc.


Expected behaviour:

Read the data without raising an error.
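
One way to double-check the claim that the test data is valid UTF-8 is to look at the encoded bytes. U+10FFFF is the last Unicode code point (a noncharacter, but neither a surrogate nor out of range), and its UTF-8 form is the four bytes F4 8F BF BF. The following is only a minimal sketch, assuming a Tcl 9 build in which `encoding convertto` accepts this code point; the variable names `s`, `bytes` and `hex` are illustrative and not part of the original report.

```tcl
package require Tcl 9

# Same string as the test data. The escape is written with all eight
# hex digits so it is obvious where the code point ends and "BCDE" begins.
set s "A\U0010FFFFBCDE"

# Encode to UTF-8 and dump the resulting bytes as lowercase hex.
set bytes [encoding convertto utf-8 $s]
binary scan $bytes H* hex

puts "characters:  [string length $s]"
puts "utf-8 bytes: $hex"
```

If the dump shows f48fbfbf between 41 (A) and 42434445 (BCDE), the bytes written to data.txt are well-formed UTF-8, which suggests the read error is caused by how the strict mode treats this code point rather than by the data itself.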
User Comments:

jan.nijtmans added on 2022-12-20 09:46:27: (text/x-fossil-wiki)
Branch now merged to core-8-branch and trunk. Closing

jan.nijtmans added on 2022-12-19 22:21:28: (text/x-fossil-wiki)
As there seems to be no interest in checking for noncharacters, those checks have been removed. See [cbaa5e70167db75b]

jan.nijtmans added on 2022-12-13 11:47:09: (text/x-fossil-wiki)
B.T.W., I know that "illegal byte sequence" is not the best error message we can think of, but that is a limitation of the channel code: we are restricted to POSIX errors. This one is the closest to the truth.

jan.nijtmans added on 2022-12-13 07:59:45: (text/x-fossil-wiki)
Does [05414ca78462df68|this] clarify the situation?

Referring to ticket [084ab982fe], you asked there:

> Why do we have the fconfigure -strict mode at all? 

Apparently, -strict is not the right option for you.