Tcl Library Source Code

View Ticket
Ticket UUID: ced089d5fec86a1b4722ffbd93810820ccc06845
Title: Multiplexer test continues to fail on FreeBSD
Type: Bug
Version: 1.17
Submitter: mi
Created on: 2015-05-27 20:58:46
Subsystem: multiplexer
Assigned To: aku
Priority: 5 Medium
Severity: Important
Status: Closed
Last Modified: 2015-06-08 20:31:27
Resolution: Fixed
Closed By: aku
Closed on: 2015-06-08 20:31:27
Description:

The problem was originally reported as SourceForge Bug #1212 and closed as unreproducible on Ubuntu. Well, it remains a problem on FreeBSD today, five years after the original report.

Please, do the needful.

User Comments:

aku added on 2015-06-08 20:31:27:
Merged to trunk [debee3c876].
Pushed.

aku added on 2015-06-04 06:42:07:
Patch applied, into branch "tkt-ced089d5fe-multiplexer".
Commit [9bfb503d18].
Pushed.

Mikhail, can you confirm that this fixes the testsuite in your FreeBSD environment?

anonymous (claiming to be aspect) added on 2015-06-04 00:54:38:

multiplexer-5.2 ensures that an access filter can deny (immediately close) inbound connections correctly, by checking that the first write from the client fails. Adding a second write after 200ms seems to do the right thing:

$ uname -r
10.1-RELEASE-p6
$ ~/bin/tclkit multiplexer.test
- tcllib::testutils 1.2
* logger 0.9.4
* multiplexer 0.2
multiplexer.test:       Total   9       Passed  9       Skipped 0       Failed  0

Thanks for the suggestion - I wasn't thinking clearly about Nagle and thought a delay before the first write should be sufficient. Patch inline below:

Index: modules/multiplexer/multiplexer.test
==================================================================
--- modules/multiplexer/multiplexer.test
+++ modules/multiplexer/multiplexer.test
@@ -193,22 +193,26 @@
     set ::forever {}
     set mp [multiplexer::create]
     ${mp}::Init 37465
     ${mp}::AddAccessFilter DenyAccessFilter
     set sk1 [socket localhost 37465]
-    set sk2 [socket localhost 37465]
-    update
-    fconfigure $sk1 -buffering none
-    if { [catch {
-	puts $sk1 "boom"
-    } err] } {
-	set result "socket blocked"
-    } else {
-	set result "socket not blocked"
+    after idle {
+	update
+	fconfigure $sk1 -buffering none
+	if { [catch {
+	    puts $sk1 "boom"
+	    after 200	;# delay to overcome nagle - see ticket [ced089d5fe]
+	    puts $sk1 "tish"
+	} err] } {
+	    set ::forever "socket blocked"
+	} else {
+	    set ::forever "socket not blocked"
+	}
     }
+    vwait ::forever
     ${mp}::destroy
-    set result
+    set forever
 } {socket blocked}
 
 
 testsuiteCleanup
 return


aku added on 2015-06-03 18:01:19:
  > multiplexer-5.2 such that it accurately tests what it claims to.

As a non-maintainer/non-author, what does multiplexer-5.2 claim to test?

For the record, having now read both the example and the reference, I agree with aspect that the test is apparently sensitive to OS differences in the TCP stack. I further agree with mi that using two puts more than 200 ms apart might be enough to overcome Nagle. aspect, could you test this for us?

mi added on 2015-06-03 13:38:49:

I don't know how best to alter multiplexer-5.2 such that it accurately tests what it claims to.

How about writing a longer text and/or making two writes with an interval between them that is longer than 0.2 seconds? Even if the first write succeeds because of Nagle's algorithm, the second one ought to fail...

Of course, it would have been best if Tcl allowed manipulating the socket's parameters (such as setting TCP_NODELAY).


anonymous (claiming to be aspect) added on 2015-06-03 08:53:29:
I've investigated this a little, and come to the conclusion that the test failure is benign:  as [http://paste.tclers.tk/3523] illustrates and [http://www.unixguide.net/network/socketfaq/2.11.shtml] explains, the assumption that puts will fail on a blocking, unbuffered socket whose remote has closed is not valid.  It seems to be mostly true on Linux, and often false on FreeBSD.

I don't know how best to alter multiplexer-5.2 such that it accurately tests what it claims to.
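
For readers who want to see the behaviour on their own system, here is a minimal sketch in Tcl. It is not part of tcllib or of the multiplexer testsuite; the proc name probeClosedPeer and the variable ::probeTick are made up for illustration. It opens a loopback server that closes every inbound connection at once, then performs two unbuffered writes from the client and reports which of them, if any, raised an error. The result is expected to vary with the operating system, which is the point of this ticket, so no expected value is given.

```tcl
# Hypothetical helper (not part of tcllib): returns 1 or 2 for the write
# that failed against an already-closed peer, or 0 if neither write failed.
proc probeClosedPeer {{delayMs 200}} {
    # Server that closes each inbound connection immediately, roughly
    # what a denying access filter does. Port 0 lets the OS pick a port.
    set server [socket -server {apply {{chan addr port} { close $chan }}} \
                    -myaddr 127.0.0.1 0]
    set port [lindex [fconfigure $server -sockname] 2]

    set client [socket 127.0.0.1 $port]
    fconfigure $client -buffering none

    # Run the event loop so the accept callback fires and closes its end.
    after $delayMs {set ::probeTick 1}
    vwait ::probeTick

    set failedAt 0
    if {[catch {puts $client "boom"}]} {
        set failedAt 1
    } else {
        # The first write may be accepted by the local kernel; wait so the
        # peer's reset can arrive before the second write.
        after $delayMs
        if {[catch {puts $client "tish"}]} {
            set failedAt 2
        }
    }

    catch {close $client}
    close $server
    return $failedAt
}

puts "write that failed: [probeClosedPeer]"
```

A result of 2 means the first write was accepted locally and only the second one saw the error, which would make a single-write check report "socket not blocked".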

mi added on 2015-05-27 21:22:08:
"Needful" is whatever is needed to fix the problem. The current stance, that "it is not a problem because it works on Ubuntu" seems unsustainable.

aku added on 2015-05-27 21:18:04:
Forgot to ask: what do you consider to be "the needful"?
This is a term made to mean 3 different things to any 2 people.

aku added on 2015-05-27 21:16:27:
We do not seem to have a patch for this.
Could this be a core issue with (intra-process) sockets on FreeBSD, or with the Tcl core event loop?
This looks to require more instrumentation in the testsuite to see what is going on under Linux, and then a comparison with FreeBSD.

aku added on 2015-05-27 21:10:11:

The local ticket in question is [3053446fffffffffffff].