Ticket Change Details
Overview

Artifact ID: 9218f6577647a42cae4abb3c99055397d713a120619f4c5e3d5fce09cd54a66c
Ticket: 94c6a431fee47acdb590ee3963704ef1d756a5cf
Buffering until timeout
User & Date: anonymous on 2018-04-01 14:06:53
Changes

  1. foundin changed to: "1.7.16"
  2. icomment:
    This bug might not be in Tcltls: it might be in Tcl stacked channels, the Tcl
    [read] command, fileevent generation or buffering.
    
    The bug occurs when using Tcltls 1.7.16 with the stock Tcl 8.6.8 and
    LibreSSL 2.6.4, but has existed for a long time: it also occurs in older
    versions at least as far back as Tcltls 1.6.7, Tcl 8.5.13 and OpenSSL
    1.0.2d.
    
    When a channel is opened with tls::socket, a non-blocking [read] does not
    always return the available data; sometimes (if -blocksize is 4096) the
    data withheld even exceeds the size requested.  Instead, [read] returns
    nothing until the remote server closes the connection.
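
    A minimal sketch of the failing pattern (www.example.com is a placeholder;
    any HTTPS server that returns more than -buffersize bytes on a kept-alive
    connection should do):

    package require tls

    set sock [tls::socket www.example.com 443]
    fconfigure $sock -blocking 0 -translation binary -buffersize 4096

    proc Readable {sock} {
        # In non-blocking mode [read] should return whatever is available,
        # but over Tcltls it can return nothing until the peer closes.
        set data [read $sock 4096]
        puts stderr "read [string length $data] bytes"
        if {[eof $sock]} {
            close $sock
            set ::done 1
        }
    }
    fileevent $sock readable [list Readable $sock]

    # Keep the connection alive so the server does not close it promptly.
    puts -nonewline $sock \
        "GET / HTTP/1.1\r\nHost: www.example.com\r\nConnection: keep-alive\r\n\r\n"
    flush $sock
    vwait ::done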
    
    When using http::geturl with tls::socket, an HTTPS request for an entity
    larger than the -blocksize hangs if the server response uses a persistent
    connection and does not use chunked encoding (but does supply an accurate
    Content-Length header).  The client performs some successful [read]s, but
    then does nothing until the server times out, even though the full
    response has been received and is buffered somewhere.
    
    The relevant part of http is the command http::Event, handling unchunked
    data of known size $state(totalsize), obtained from a Content-Length
    header.  The code can be abstracted as:
    
    fconfigure $sock -blocking off -translation binary -buffersize $state(-blocksize)
    fileevent $sock readable [list http::Event $sock $token]
    
    proc http::Event {sock token} {
        variable $token
        upvar 0 $token state
    
        # 1. How many bytes to ask for.
        set reqSize $state(-blocksize)
    
        # 2. Read.
        set block [read $sock $reqSize]
        set n [string length $block]
        if {$n >= 0} {
            append state(body) $block
            incr state(currentsize) $n
        }
    
        # 3. Check for end of data.
        if {$state(currentsize) >= $state(totalsize)} {
            Eof $token
        }
        return
    }
    
    The bug can be worked around by precisely specifying the size of the
    untransferred response body.
    
        # 1. How many bytes to ask for.
        set reqSize [expr {$state(totalsize) - $state(currentsize)}]
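
    Applied to the abstracted handler above, the change is confined to step 1
    (a sketch of the workaround, not the actual http source):

    proc http::Event {sock token} {
        variable $token
        upvar 0 $token state

        # 1. Ask for exactly the bytes still outstanding, using the size
        #    taken from the Content-Length header.
        set reqSize [expr {$state(totalsize) - $state(currentsize)}]

        # 2. Read.
        set block [read $sock $reqSize]
        set n [string length $block]
        if {$n >= 0} {
            append state(body) $block
            incr state(currentsize) $n
        }

        # 3. Check for end of data.
        if {$state(currentsize) >= $state(totalsize)} {
            Eof $token
        }
        return
    }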
    
    This workaround fails if reqSize is capped at $state(-blocksize):
    
        # 1. How many bytes to ask for.
        set reqSize [expr {min($reqSize, $state(-blocksize))}]
    
    When the hanging occurs, more than $state(-blocksize) bytes can be
    available (if -blocksize is 4096), but they are not returned by [read]
    until the timeout.
    
    The channel properties reported by [fconfigure $sock] (when that command
    is added to http::Event) are:
      -blocking 0
      -buffering full
      -buffersize 8192
      -encoding binary
      -eofchar {{} {}}
      -translation {lf crlf}
    The last three properties correspond (for reading) to the results of the command
      fconfigure $sock -translation binary
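
    The list above can be reproduced by logging the channel configuration from
    inside the event handler, e.g. (a one-line diagnostic added for this
    report):

        # Dump all channel options for the socket under investigation.
        puts stderr [fconfigure $sock]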
    
    read(n) describes non-blocking mode with two arguments:

          read channelId numChars

      If channelId is in nonblocking mode, the command may not read as many
      characters as requested: once all available input has been read, the
      command will return the data that is available rather than blocking for
      more input.  If the channel is configured to use a multi-byte encoding,
      then there may actually be some bytes remaining in the internal buffers
      that do not form a complete character.  These bytes will not be returned
      until a complete character is available or end-of-file is reached.  The
      -nonewline switch is ignored if the command returns before reaching the
      end of the file.
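
    For comparison, a conformant channel honours this: in the sketch below a
    plain OS pipe ([chan pipe], Tcl 8.6) returns the five available bytes
    immediately, even though 4096 were requested:

    package require Tcl 8.6

    lassign [chan pipe] rd wr
    fconfigure $rd -blocking 0 -translation binary
    fconfigure $wr -translation binary

    proc ReadAvail {rd} {
        # Returns at once with whatever is buffered, per read(n).
        set ::got [read $rd 4096]
    }
    fileevent $rd readable [list ReadAvail $rd]

    puts -nonewline $wr "hello"
    flush $wr
    vwait ::got
    puts "read [string length $::got] bytes"   ;# prints: read 5 bytes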
    
    The hanging occurs even with a binary resource, fetched with -encoding binary,
    so the multi-byte issue does not arise.  There is no problem with http, only
    with https where [tls::socket] is used in place of [socket]. read(n) makes
    clear that in nonblocking mode, the command will not wait to receive the number
    of characters requested.
    
    Example code that demonstrates the delay:
    
    # The http package sets both the -buffersize for the socket, and the number of
    # bytes requested with [read], to the http::geturl option -blocksize, which has
    # default value 8192.
    
    package require http
    
    set start [clock milliseconds]
    
    proc http::Log {args} {
        set time [expr {[clock milliseconds] - $::start}]
        puts stderr [list $time {*}$args]
        flush stderr
        return
    }
    
    package require tls
    http::register https 443 [list ::tls::socket -ssl2 0 -ssl3 0 -tls1 1 -tls1.1 1 -tls1.2 1]
    
    proc GetFromWeb {blocksize url} {
        set ::prevSize 0
        set token    [::http::geturl $url     \
                       -blocksize $blocksize  \
                       -keepalive 1           \
                       -progress httpProgress \
                       -headers {accept-encoding unsupported} \
        ]
        array get $token
        # N.B. [array get] supplies the proc's implicit return value.
    }
    
    proc httpProgress {token total current} {
        upvar 0 $token state
        set msg {and therefore will not demonstrate the bug.}

        if {    [info exists state(transfer)]
             && ($state(transfer) eq "chunked")
        } {
            puts "The response uses chunked transfer encoding, $msg"
        } elseif {$total == 0} {
            puts "The server has not reported the size of the entity, $msg"
        }
        if {[info exists state(connection)] && ($state(connection) eq {close})} {
            puts "The server will close the connection, $msg"
        }
        set time [expr {[clock milliseconds] - $::start}]
        set extra [expr {$current - $::prevSize}]
        set ::prevSize $current
        puts "$time Fetched $extra bytes - accumulated $current bytes of total $total"
        flush stdout
    }
    
    
    set blocksize 8192
    
    # This URL serves (at least in the UK) a resource of about 300kB, unchunked, and
    # times out after approx 13s.
    
    set url https://www.bbc.co.uk/
    
    # These URLs demonstrate that http sites (using [socket]) do not
    # trigger the bug.
    
    # set url http://news.mit.edu/sites/mit.edu.newsoffice/files/styles/news_article_image_top_slideshow/public/images/2018/MIT-Debris-Disks_0.jpg?itok=kGim01Xz
    # set url http://static01.nyt.com/images/2018/03/30/theater/30threetallwomen2/merlin_135911256_e2911f33-8b35-445d-90ee-d15457c19fae-superJumbo.jpg
    
    # The corresponding https URLs do trigger the bug, but are inconvenient to use
    # because the server does not time out for about 10 minutes.
    
    # set url https://news.mit.edu/sites/mit.edu.newsoffice/files/styles/news_article_image_top_slideshow/public/images/2018/MIT-Debris-Disks_0.jpg?itok=kGim01Xz
    # set url https://static01.nyt.com/images/2018/03/30/theater/30threetallwomen2/merlin_135911256_e2911f33-8b35-445d-90ee-d15457c19fae-superJumbo.jpg
    
    set start [clock milliseconds]
    set img [GetFromWeb $blocksize $url]
    set NUL {}   ;# avoid echoing the large response when run interactively
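
    When the script is run (e.g. tclsh demo.tcl, any file name), httpProgress
    prints one timed line per read; with the https URL above the output stalls
    after the first few reads and resumes only when the server times out,
    roughly 13s later, at which point the remaining buffered body is returned.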
    
  3. login: "anonymous"
  4. mimetype: "text/plain"
  5. private_contact changed to: "a774b1f5653dc3360e29ea7d635a64e94c9f1dab"
  6. severity changed to: "Important"
  7. status changed to: "Open"
  8. title changed to: "Buffering until timeout"
  9. type changed to: "Incident"