History Of Ticket 88c0c8496999c48f513eb4f97aaa0ac9829b35d3

Artifacts Associated With Ticket 88c0c8496999c48f513eb4f97aaa0ac9829b35d3

  1. Ticket change [639511f397] (rid 1593) by gustafn3 on 2023-10-22 11:55:59:

    1. foundin initialized to: "tcltls-1.7.22"
    2. icomment:
      The EOF handling has changed in OpenSSL 1.1.1e, where it changed from SSL_ERROR_SYSCALL with errno 0 to SSL_ERROR_SSL with reason code SSL_R_UNEXPECTED_EOF_WHILE_READING [1]. This change in OpenSSL requires also adjustments in applications using OpenSSL (see, e.g., [2]), including tcltls.
      
      We noticed the problem when upgrading a machine from CentOS 7 to Rocky
      Linux 9, where after the upgrade a script like the following stopped
      working:
      
      ````
      $ /usr/local/ns/bin/tclsh8.6
      % package require tls                 
      % set f [tls::socket localhost 8443]
      % puts $f "GET / HTTP/1.0\n"
      % flush $f
      % set content [read $f]
      % close $f
      ````
      
      The problem manifests itself in the "read" operation, where first, it
      transfers the full content, and then it reports "software caused
      connection abort". See below the output from the debug macros of
      tcltls.
      
      ````
      ./tlsIO.c:385:TlsInputProc():BIO_read(4096)
      ...
      ./tlsIO.c:422:TlsInputProc():BIO_read -> 465
      ./tlsIO.c:425:TlsInputProc():BIO_read returned err 0
      ...
      ./tlsIO.c:422:TlsInputProc():BIO_read -> 207
      ./tlsIO.c:425:TlsInputProc():BIO_read returned err 0
      ...
      ./tlsBIO.c:262:BioCtrl():Got BIO_CTRL_EOF
      ./tlsBIO.c:127:BioWrite():[chan=0x1438a7990] BioWrite(24) -> 24 [tclEof=1; tclErrno=0]
      ./tlsBIO.c:148:BioWrite():Successfully wrote some data
      ...
      ./tls.c:180:InfoCallback():Called
      ./tlsIO.c:422:TlsInputProc():BIO_read -> 0
      ./tlsIO.c:425:TlsInputProc():BIO_read returned err 1
      ./tlsIO.c:460:TlsInputProc():SSL negotiation error, indicating that the connection has been aborted
      ./tls.c:367:Tls_Error():Called
      ./tlsIO.c:502:TlsInputProc():Input(4096) -> -1 [53]
      ./tlsIO.c:719:TlsWatchProc():TlsWatchProc(0x0)
      ./tlsIO.c:728:TlsWatchProc():statePtr->flags=0
      ./tlsIO.c:992:Tls_GetParent():Requested to get parent of channel 0x1438a0790
      ./tlsIO.c:754:TlsWatchProc():Registering our interest in the lower channel (chan=0x1438a7990)
      error reading "sock144076990": software caused connection abort
      ````
      
      The problem exists not only on Linux, but as well on macOS (13.6)
      Below is a patch that fixes the problem without going into the (version
      dependent) error code / error reason handling of OpenSSL, since this approach makes the issue more transparent. This patch below was tested with Tcl 8.6.13, tcltls-1.7.22 and OpenSSL 3.1.3 (19 Sep 2023).
      
      ````
      $ diff -wu tlsIO.c-orig tlsIO.c
      --- tlsIO.c-orig	2020-10-12 22:39:22
      +++ tlsIO.c	2023-10-22 12:33:11
      @@ -420,6 +420,18 @@
       	ERR_clear_error();
       	bytesRead = BIO_read(statePtr->bio, buf, bufSize);
       	dprintf("BIO_read -> %d", bytesRead);
      +
      +	if (bytesRead == 0 && Tcl_Eof(statePtr->self)) {
      +            /* 
      +             * We know through BIO_CTRL_EOF that we are already at
      +             * EOF (determined during BIO_read()). There is no need to
      +             * try to handle this situation via error and reason codes
      +             * from OpenSSL.
      +             */
      +             dprintf("tried to read while channel is already at EOF");
      +             *errorCodePtr = 0;
      +             return(bytesRead);
      ````
      
      
      [1] https://mta.openssl.org/pipermail/openssl-project/2020-May/001975.html   
      [2] https://groups.google.com/g/mailing.openssl.users/c/9C2rT9WVqW8/m/1F-8JWnzAQAJ
      
    3. login: "gustafn3"
    4. mimetype: "text/x-markdown"
    5. private_contact initialized to: "ef1993a98c1daa778fe0b246a7af12b3076f2240"
    6. severity initialized to: "Critical"
    7. status initialized to: "Open"
    8. title initialized to:
      EOF handling potentially broken with OpenSSL 1.1.1e or newer
      
    9. type initialized to: "Code Defect"
  2. Ticket change [515aa59d81] (rid 1596) by azazel on 2023-11-10 21:15:10:

    1. icomment:
      There is an SSL option governing this behaviour: `SSL_OP_IGNORE_UNEXPECTED_EOF`.  There are reasons why OpenSSL made this change.  See:
      
      * https://github.com/openssl/openssl/issues/10880
      * https://github.com/openssl/openssl/pull/10882
      
      My initial thought was unconditionally to set the option:
      
      	--- tls.c
      	+++ tls.c
      	@@ -1212,10 +1212,13 @@
      	 #endif
      	     
      	     SSL_CTX_set_app_data( ctx, (VOID*)interp); /* remember the interpreter */
      	     SSL_CTX_set_options( ctx, SSL_OP_ALL);     /* all SSL bug workarounds */
      	     SSL_CTX_set_options( ctx, off);    /* all SSL bug workarounds */
      	+#ifdef SSL_OP_IGNORE_UNEXPECTED_EOF
      	+    SSL_CTX_set_options( ctx, SSL_OP_IGNORE_UNEXPECTED_EOF);
      	+#endif
      	     SSL_CTX_sess_set_cache_size( ctx, 128);
      	 
      	     if (ciphers != NULL)
      	        SSL_CTX_set_cipher_list(ctx, ciphers);
      	 
      
      However, given that this change in behaviour was intended to reveal errors that were previously unreported, I think it may make sense to add an option (this is what openssl did for s_client): _e.g._, `-ignoreunexpectedeof bool`.  The next question is whether to default it to true or false.  To preserve backwards compatibility and be liberal in what we accept, my inclination would be to default to true.
      
    2. login: "azazel"
    3. mimetype: "text/x-markdown"
    4. priority changed to: "Immediate"
    5. resolution changed to: "Open"
  3. Ticket change [2f9a729234] (rid 1597) by gustafn3 on 2023-11-14 09:58:13:

    1. icomment:
      I am not arguing about the changed behavior in OpenSSL about of the unexpected EOF handling (which hit many projects). No matter how the default should be, the old EOF handling of tcltls (relying on error states) is not ok and lead to a breaking behavior for us when upgrading OS versions. My suggested fix makes EOF handling clear, more robust, and avoids the error situation. 
      
      The decision to set SSL_OP_IGNORE_UNEXPECTED_EOF or not (and maybe making it configurable) is independent of this.
      
    2. login: "gustafn3"
    3. mimetype: "text/x-markdown"
  4. Ticket change [9bf50203b7] (rid 1603) by Avalanchee on 2024-01-01 12:32:52:

    1. icomment:
      Is this project dead?
      I mean I'm very thankful for the active participants here, so I was able to compile the patch into my deployment.
      However this is a breaking bug that should be handled with utmost urgency.
      
      From looking at the timeline it doesn't seem like anyone is maintaining this project, am I wrong?
      
      I would like to kindly ask to promote this fix into release, and also the "feature-dump-keys" branch if possible.
      Thanks.
      
    2. login: "Avalanchee"
    3. mimetype: "text/x-markdown"
  5. Ticket change [28d32ca3e2] (rid 3555) by bohagan on 2024-05-28 00:30:50:

    1. foundin changed to: "1.8"
    2. icomment:
      This should be fixed by the changes in [5a33efb87bd3750f]. I tested with a few
      files that were reported to have the issue, and they worked. Please test with
      the trunk version and let me know if you you have any URLs that still fail due
      to the unexpected EOF. Thanks.
      
    3. login: "bohagan"
    4. mimetype: "text/x-fossil-plain"