Tcl Source Code

View Ticket
Ticket UUID: a7a89d422a4f5dd372e61a75acf5381142dde80
Title: Under strict encoding, [gets] returns an error instead of returning the second line
Type: Bug
Version:
Submitter: pooryorick
Created on: 2023-04-14 18:25:42
Subsystem: - New Builtin Commands
Assigned To: pooryorick
Priority: 5 Medium
Severity: Important
Status: Closed
Last Modified: 2023-04-21 23:04:00
Resolution: Fixed
Closed By: pooryorick
Closed on: 2023-04-21 23:04:00
Description: (text/x-fossil-wiki)
In the following script, the third line of data in the channel contains a byte
sequence that is not valid utf-8.  [gets] should successfully return two lines
before returning an error, but it returns an error instead of returning the
second line:

<blockquote><code><verbatim>
set chan [file tempfile]
chan configure $chan -translation binary
# This is not valid UTF-8
puts $chan hello\nAB\nCD\xc0\x40EF\nGHI
flush $chan
seek $chan 0

chan configure $chan -encoding utf-8 -profile strict
gets $chan
gets $chan
</verbatim></code></blockquote>
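The expected behavior can be made explicit by capturing each result and wrapping the third call in <code>catch</code>. This is only a minimal sketch, assuming a Tcl build that supports <code>-profile strict</code> and includes a fix for this ticket; the variable names <code>first</code>, <code>second</code>, <code>status</code> and <code>msg</code> are illustrative and not part of the original report.

```tcl
# Sketch only: same data as the report; variable names are illustrative.
set chan [file tempfile]
chan configure $chan -translation binary
# The third line contains \xc0\x40, which is not valid UTF-8.
puts $chan hello\nAB\nCD\xc0\x40EF\nGHI
flush $chan
seek $chan 0

chan configure $chan -encoding utf-8 -profile strict
set first  [gets $chan]   ;# first line decodes cleanly
set second [gets $chan]   ;# second line should also be returned, per this report
# Only the third call should fail, because only the third line is invalid.
set status [catch {gets $chan} msg]
puts [list $first $second $status $msg]
close $chan
```

With the behavior requested here, the first two elements of the printed list would be the first two lines and <code>status</code> would be 1; before the fix, the error was raised one call too early.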
User Comments:
jan.nijtmans added on 2023-04-18 06:14:27: (text/x-fossil-wiki)
> @pooryorick wrote:
<pre>
I am opposed to the release of Tcl 8.7 ...
</pre>

Well, I'm opposed to the release of Tcl 9.0 while it still contains testcase failures ... ;-) (So, many thanks, @ashok!)

apnadkarni added on 2023-04-18 05:20:19: (text/x-fossil-plain)
Fixed, I think, by [079d2d5162]. Please review.

apnadkarni added on 2023-04-18 02:53:42: (text/x-fossil-plain)
Oops, the commit link in my last comment should have been [5995ca9b6c]

apnadkarni added on 2023-04-18 02:52:46: (text/x-markdown)
As of commit [] on trunk, io-75.{14,15} are still failing on Windows.


```
==== io-75.14 [gets] succesfully returns lines prior to error

        invalid utf-8 encoding [gets] continues in non-strict mode after error FAILED
==== Contents of test case:

    lappend res [gets $chan]
    lappend res [gets $chan]
    set status [catch {gets $chan} cres copts]
    lappend res $status $cres
    chan configure $chan -profile tcl8
    lappend res [gets $chan]
    lappend res [gets $chan]
    close $chan
    return $res

---- Result was:
{a\x0D} {b\x0D} 1 {error reading "file1a8aec7b500": invalid or incomplete multibyte or wide character} {c\xC0\x0D} {d\x0D}
---- Result should have been (glob matching):
a b 1 {error reading "*": invalid or incomplete multibyte or wide character} c\xC0 d
==== io-75.14 FAILED



==== io-75.15 invalid utf-8 encoding strict
    gets does not hang
    gets succeeds for the first two lines FAILED
==== Contents of test case:

    #Now try to read it with [gets]
    fconfigure $chan -encoding utf-8 -profile strict
    lappend res [gets $chan]
    lappend res [gets $chan]
    set status [catch {gets $chan} cres copts]
    lappend res $status $cres
    set status [catch {gets $chan} cres copts]
    lappend res $status $cres
        chan configure $chan -translation binary
        set data [read $chan 4]
        foreach char [split $data {}] {
                scan $char %c ord
                lappend res [format %x $ord]
        }
    fconfigure $chan -encoding utf-8 -profile strict
        lappend res [gets $chan]
        lappend res [gets $chan]
    return $res

---- Result was:
hello AB 1 {error reading "file1a8aecf4570": invalid or incomplete multibyte or wide character} 1 {error reading "file1a8aecf4570": invalid or incomplete multibyte or wide character} 43 44 c0 40 {EF\x0D} {GHI\x0D}
---- Result should have been (glob matching):
hello AB 1 {error reading "*": invalid or incomplete multibyte or wide character} 1 {error reading "*": invalid or incomplete multibyte or wide character} 43 44 c0 40 EF GHI
==== io-75.15 FAILED

```

pooryorick added on 2023-04-16 23:24:54: (text/x-fossil-wiki)
I am opposed to the release of Tcl 8.7 because it has a different alphabet for
string values than Tcl 9 does, and releasing it will do more harm than good.
Therefore I'm unlikely to spend much time maintaining it.  I'd rather see it
die an immediate death.

pooryorick added on 2023-04-16 13:14:12: (text/x-fossil-wiki)
I believe commit [7015a9d04911e483] fixes the issue with 75.14 on Windows.

jan.nijtmans added on 2023-04-16 09:31:31: (text/x-fossil-wiki)
This change broke the build on Windows, see: [https://github.com/tcltk/tcl/actions/runs/4706656812/jobs/8348035149]

2nd remark: Tcl 8.7 has "-profile strict" too, so this should be backported to 8.7 as well. Even better: develop the fix on 8.7 and then forward-merge it to 9.0.

pooryorick added on 2023-04-15 11:57:23: (text/x-fossil-wiki)
Fixed in [d481d08ed9].  This commit also ensures that the position in the file does not change when <code>gets</code> returns an error.
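
The position guarantee mentioned in this comment could be checked roughly as follows. This is a hedged sketch, not the test added by the commit: it assumes a build that includes the fix, reuses the data from the ticket description, and the names <code>before</code> and <code>after</code> are illustrative.

```tcl
# Sketch only: checks that a failed [gets] leaves the channel position alone.
set chan [file tempfile]
chan configure $chan -translation binary
puts $chan hello\nAB\nCD\xc0\x40EF\nGHI
flush $chan
seek $chan 0

chan configure $chan -encoding utf-8 -profile strict
gets $chan                 ;# consume the first valid line
gets $chan                 ;# consume the second valid line
set before [tell $chan]    ;# position at the start of the invalid line
set status [catch {gets $chan} msg]
set after [tell $chan]     ;# should equal $before if the position is preserved
puts [list $status $before $after]
close $chan
```

If the position is preserved, a caller can switch the channel to another profile or to binary translation after the error and continue reading from the start of the offending line, which is what the io-75.14 and io-75.15 test cases quoted above do.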