Tcl Source Code

2023-04-21
23:04 Closed ticket [a7a89d422a]: Under strict encoding, gets returns an error instead of returning the second line plus 4 other changes artifact: a629dee7d2 user: pooryorick
2023-04-19
07:15 Ticket [a7a89d422a]: 2 changes artifact: 89ad33fccb user: pooryorick
07:14 Ticket [a7a89d422a]: 3 changes artifact: a5634f9359 user: pooryorick
2023-04-18
06:14 Ticket [a7a89d422a]: 3 changes artifact: ef144bfe75 user: jan.nijtmans
05:20 Ticket [a7a89d422a]: 3 changes artifact: 8803f06fe5 user: apnadkarni
05:18 Fix io-75.1{4,5} on Windows. Ticket [a7a89d422a] check-in: 079d2d5162 user: apnadkarni tags: trunk, main
02:53 Ticket [a7a89d422a] Under strict encoding, gets returns an error instead of returning the second line status still Open with 3 other changes artifact: dbf90cb72e user: apnadkarni
02:52 Ticket [a7a89d422a]: 3 changes artifact: da30571786 user: apnadkarni
2023-04-16
23:24 Ticket [a7a89d422a]: 3 changes artifact: bb0c9f2ff9 user: pooryorick
13:14 Ticket [a7a89d422a]: 3 changes artifact: 0704959a8c user: pooryorick
09:31 Open ticket [a7a89d422a]. artifact: 1a85560c66 user: jan.nijtmans
2023-04-15
11:57 Pending ticket [a7a89d422a]. artifact: 5b53efe494 user: pooryorick
11:54 Fix for [a7a89d422a4f5dd3], Under strict encoding, [gets] returns an error instead of returning the ... check-in: d481d08ed9 user: pooryorick tags: trunk, main
2023-04-14
18:25 New ticket [a7a89d422a] Under strict encoding, gets returns an error instead of returning the second line. artifact: 95a0384550 user: pooryorick

Ticket UUID: a7a89d422a4f5dd372e61a75acf5381142dde80
Title: Under strict encoding, [gets] returns an error instead of returning the second line
Type: Bug
Version:
Submitter: pooryorick
Created on: 2023-04-14 18:25:42
Subsystem: - New Builtin Commands
Assigned To: pooryorick
Priority: 5 Medium
Severity: Important
Status: Closed
Last Modified: 2023-04-21 23:04:00
Resolution: Fixed
Closed By: pooryorick
Closed on: 2023-04-21 23:04:00
Description:

In the following script, the third line of data in the channel contains a byte sequence that is not valid utf-8. [gets] should successfully return two lines before returning an error, but it returns an error instead of returning the second line:

set chan [file tempfile]
chan configure $chan -translation binary
# This is not valid UTF-8
puts $chan hello\nAB\nCD\xc0\x40EF\nGHI
flush $chan
seek $chan 0

chan configure $chan -encoding utf-8 -profile strict
gets $chan
gets $chan
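
The expected behaviour described above can be made visible by collecting each result and catching the error. This is only a sketch: it assumes a Tcl build that supports -profile strict and includes the fix, and the variable names res, status and msg are illustrative. Going by the expected result of io-75.15 quoted in the comments, the output should start with hello AB 1.

```tcl
# Write two valid lines followed by a line containing invalid UTF-8.
set chan [file tempfile]
chan configure $chan -translation binary
puts $chan hello\nAB\nCD\xc0\x40EF\nGHI
flush $chan
seek $chan 0

# Switch to strict UTF-8 decoding before reading.
chan configure $chan -encoding utf-8 -profile strict

set res {}
lappend res [gets $chan]              ;# first valid line
lappend res [gets $chan]              ;# second valid line
set status [catch {gets $chan} msg]   ;# third line holds the invalid bytes
lappend res $status
close $chan

puts $res
puts $msg
```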

User Comments:

jan.nijtmans added on 2023-04-18 06:14:27:

@pooryorick wrote:

> I am opposed to the release of Tcl 8.7 ...

Well, I'm opposed to the release of Tcl 9.0 while it still has failing test cases ... ;-) (So, many thanks, @ashok!)


apnadkarni added on 2023-04-18 05:20:19:
Fixed, I think, by [079d2d5162]. Please review.

apnadkarni added on 2023-04-18 02:53:42:
Oops, the commit link in my last comment should have been [5995ca9b6c].

apnadkarni added on 2023-04-18 02:52:46:

As of commit [] on trunk, io-75.{14,15} are still failing on Windows.

==== io-75.14 [gets] succesfully returns lines prior to error

        invalid utf-8 encoding [gets] continues in non-strict mode after error FAILED
==== Contents of test case:

    lappend res [gets $chan]
    lappend res [gets $chan]
    set status [catch {gets $chan} cres copts]
    lappend res $status $cres
    chan configure $chan -profile tcl8
    lappend res [gets $chan]
    lappend res [gets $chan]
    close $chan
    return $res

---- Result was:
{a\x0D} {b\x0D} 1 {error reading "file1a8aec7b500": invalid or incomplete multibyte or wide character} {c\xC0\x0D} {d\x0D}
---- Result should have been (glob matching):
a b 1 {error reading "*": invalid or incomplete multibyte or wide character} c\xC0 d
==== io-75.14 FAILED



==== io-75.15 invalid utf-8 encoding strict
    gets does not hang
    gets succeeds for the first two lines FAILED
==== Contents of test case:

    #Now try to read it with [gets]
    fconfigure $chan -encoding utf-8 -profile strict
    lappend res [gets $chan]
    lappend res [gets $chan]
    set status [catch {gets $chan} cres copts]
    lappend res $status $cres
    set status [catch {gets $chan} cres copts]
    lappend res $status $cres
    chan configure $chan -translation binary
    set data [read $chan 4]
    foreach char [split $data {}] {
        scan $char %c ord
        lappend res [format %x $ord]
    }
    fconfigure $chan -encoding utf-8 -profile strict
    lappend res [gets $chan]
    lappend res [gets $chan]
    return $res

---- Result was:
hello AB 1 {error reading "file1a8aecf4570": invalid or incomplete multibyte or wide character} 1 {error reading "file1a8aecf4570": invalid or incomplete multibyte or wide character} 43 44 c0 40 {EF\x0D} {GHI\x0D}
---- Result should have been (glob matching):
hello AB 1 {error reading "*": invalid or incomplete multibyte or wide character} 1 {error reading "*": invalid or incomplete multibyte or wide character} 43 44 c0 40 EF GHI
==== io-75.15 FAILED


pooryorick added on 2023-04-16 23:24:54:

I am opposed to the release of Tcl 8.7 because it has a different alphabet for string values than Tcl 9 does, and releasing it will do more harm than good. Therefore I'm unlikely to spend much time maintaining it. I'd rather see it die an immediate death.


pooryorick added on 2023-04-16 13:14:12:

I believe commit [7015a9d04911e483] fixes the issue with 75.14 on Windows.


jan.nijtmans added on 2023-04-16 09:31:31:

This change broke the build on Windows, see: https://github.com/tcltk/tcl/actions/runs/4706656812/jobs/8348035149

Second remark: Tcl 8.7 has "-profile strict" as well, so this should be backported to 8.7. Even better: develop the fix on 8.7 and then forward-merge it to 9.0.


pooryorick added on 2023-04-15 11:57:23:

Fixed in [d481d08ed9]. This commit also ensures that the position in the file does not change when [gets] returns an error.
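
One way to check the claim about the file position is to compare [tell] before and after the failing [gets]. This is a rough sketch, not the test from the commit; it assumes a build that supports -profile strict and includes the fix, and the names before, after and status are illustrative. With the fix, the two positions would be expected to be equal.

```tcl
# Same data as in the ticket description: the third line is invalid UTF-8.
set chan [file tempfile]
chan configure $chan -translation binary
puts $chan hello\nAB\nCD\xc0\x40EF\nGHI
flush $chan
seek $chan 0
chan configure $chan -encoding utf-8 -profile strict

# Consume the two valid lines.
gets $chan
gets $chan

# Record the position, attempt the failing read, then record it again.
set before [tell $chan]
set status [catch {gets $chan} msg]
set after [tell $chan]
close $chan

puts "status=$status before=$before after=$after"
```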