Tcl Source Code

Ticket UUID: a7a89d422a4f5dd372e61a75acf5381142dde80
Title: Under strict encoding, [gets] returns an error instead of returning the second line
Type: Bug
Version:
Submitter: pooryorick
Created on: 2023-04-14 18:25:42
Subsystem: - New Builtin Commands
Assigned To: pooryorick
Priority: 5 Medium
Severity: Important
Status: Closed
Last Modified: 2023-04-21 23:04:00
Resolution: Fixed
Closed By: pooryorick
Closed on: 2023-04-21 23:04:00
Description:

In the following script, the third line of data in the channel contains a byte sequence that is not valid utf-8. [gets] should successfully return two lines before returning an error, but it returns an error instead of returning the second line:

set chan [file tempfile]
chan configure $chan -translation binary
# This is not valid UTF-8
puts $chan hello\nAB\nCD\xc0\x40EF\nGHI
flush $chan
seek $chan 0

chan configure $chan -encoding utf-8 -profile strict
gets $chan
gets $chan
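
The expected behavior can be sketched as follows. This is a minimal illustration rather than the test added for this ticket: it assumes a Tcl build that supports -profile (8.7 or 9.0) and already contains the fix, and the variable names line1, line2, status and message are chosen here for illustration only.

```tcl
# Requires Tcl 8.7 or 9.0: -profile is not available in Tcl 8.6.
set chan [file tempfile]
chan configure $chan -translation binary
# The third line contains \xc0\x40, which is not valid UTF-8.
puts $chan hello\nAB\nCD\xc0\x40EF\nGHI
flush $chan
seek $chan 0
chan configure $chan -encoding utf-8 -profile strict

# The first two lines are valid UTF-8 and should be returned normally.
set line1 [gets $chan]
set line2 [gets $chan]
# Only the third call is expected to fail, so only it is wrapped in [catch].
set status [catch {gets $chan} message]
close $chan
puts [list $line1 $line2 $status $message]
```

On a build affected by this bug, the second [gets] raises the error instead, so the script stops before reaching the [catch].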

User Comments:

jan.nijtmans added on 2023-04-18 06:14:27:

> @pooryorick wrote:

I am opposed to the release of Tcl 8.7 ...

Well, I'm opposed to the release of Tcl 9.0 while it still contains testcase failures .... ;-) (So, many thanks, @ashok!)


apnadkarni added on 2023-04-18 05:20:19:
Fixed, I think, by [079d2d5162]. Please review.

apnadkarni added on 2023-04-18 02:53:42:
Oops, the commit link in my last comment should have been [5995ca9b6c].

apnadkarni added on 2023-04-18 02:52:46:

As of commit [] on trunk, io-75.{14,15} are still failing on Windows.

==== io-75.14 [gets] succesfully returns lines prior to error

        invalid utf-8 encoding [gets] continues in non-strict mode after error FAILED
==== Contents of test case:

    lappend res [gets $chan]
    lappend res [gets $chan]
    set status [catch {gets $chan} cres copts]
    lappend res $status $cres
    chan configure $chan -profile tcl8
    lappend res [gets $chan]
    lappend res [gets $chan]
    close $chan
    return $res

---- Result was:
{a\x0D} {b\x0D} 1 {error reading "file1a8aec7b500": invalid or incomplete multibyte or wide character} {c\xC0\x0D} {d\x0D}
---- Result should have been (glob matching):
a b 1 {error reading "*": invalid or incomplete multibyte or wide character} c\xC0 d
==== io-75.14 FAILED



==== io-75.15 invalid utf-8 encoding strict
    gets does not hang
    gets succeeds for the first two lines FAILED
==== Contents of test case:

    #Now try to read it with [gets]
    fconfigure $chan -encoding utf-8 -profile strict
    lappend res [gets $chan]
    lappend res [gets $chan]
    set status [catch {gets $chan} cres copts]
    lappend res $status $cres
    set status [catch {gets $chan} cres copts]
    lappend res $status $cres
        chan configure $chan -translation binary
        set data [read $chan 4]
        foreach char [split $data {}] {
                scan $char %c ord
                lappend res [format %x $ord]
        }
    fconfigure $chan -encoding utf-8 -profile strict
        lappend res [gets $chan]
        lappend res [gets $chan]
    return $res

---- Result was:
hello AB 1 {error reading "file1a8aecf4570": invalid or incomplete multibyte or wide character} 1 {error reading "file1a8aecf4570": invalid or incomplete multibyte or wide character} 43 44 c0 40 {EF\x0D} {GHI\x0D}
---- Result should have been (glob matching):
hello AB 1 {error reading "*": invalid or incomplete multibyte or wide character} 1 {error reading "*": invalid or incomplete multibyte or wide character} 43 44 c0 40 EF GHI
==== io-75.15 FAILED


pooryorick added on 2023-04-16 23:24:54:

I am opposed to the release of Tcl 8.7 because it has a different alphabet for string values than Tcl 9 does, and releasing it will do more harm than good. Therefore I'm unlikely to spend much time maintaining it. I'd rather see it die an immediate death.


pooryorick added on 2023-04-16 13:14:12:

I believe commit [7015a9d04911e483] fixes the issue with 75.14 on Windows.


jan.nijtmans added on 2023-04-16 09:31:31:

This change broke the build on Windows, see: https://github.com/tcltk/tcl/actions/runs/4706656812/jobs/8348035149

Second remark: Tcl 8.7 also has "-profile strict", so this should be backported to 8.7 as well. Even better: develop the fix on 8.7 and then forward-merge it to 9.0.


pooryorick added on 2023-04-15 11:57:23:

Fixed in [d481d08ed9]. This commit also ensures that the position in the file does not change when [gets] returns an error.
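
One way to observe the unchanged file position is to compare [tell] before and after the failing call. This is a hedged sketch, not code from the commit: it assumes a Tcl build with -profile support and the fix applied, it assumes [tell] reflects the logical read position on a buffered channel, and the names posBefore and posAfter are illustrative.

```tcl
# Requires Tcl 8.7 or 9.0: -profile is not available in Tcl 8.6.
set chan [file tempfile]
chan configure $chan -translation binary
# The third line contains \xc0\x40, which is not valid UTF-8.
puts $chan hello\nAB\nCD\xc0\x40EF\nGHI
flush $chan
seek $chan 0
chan configure $chan -encoding utf-8 -profile strict

# Consume the two valid lines.
gets $chan
gets $chan

# Record the position, attempt the invalid line, and record it again.
set posBefore [tell $chan]
set status [catch {gets $chan} message]
set posAfter [tell $chan]
close $chan

if {$status == 1 && $posBefore == $posAfter} {
    puts "gets failed and the position was left unchanged"
} else {
    puts "unexpected: status=$status before=$posBefore after=$posAfter"
}
```

Because the position is left at the start of the offending line, a caller can switch the channel to a different profile or to binary translation and read the same bytes again, as test io-75.15 does.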