Tcl package Thread source code

View Ticket
Login
Ticket UUID: b5709ea9060d17f58ba48110351c964b3408e362
Title: [::thread::send -async] posting order not respected when sending to current thread
Type: Bug Version: 2.8.0
Submitter: adrianmedranocalvo Created on: 2018-07-31 11:27:59
Subsystem: 80. Thread Package Assigned To: nobody
Priority: 5 Medium Severity: Important
Status: Closed Last Modified: 2022-01-14 15:16:10
Resolution: Fixed Closed By: jan.nijtmans
    Closed on: 2022-01-14 15:16:10
Description:
if {0} {
When the target thread of [::thread::send -async] is the current thread, the
async script is not evaluated in order with respect to other [::thread::send
-async] scripts.

Let's call a [::thread::send -async] invoked from the current thread and
directed to the current thread a local [::thread::send -async], and a
[::thread::send -async] invoked from any other thread and directed to the former
thread a remote [::thread::send -async].

Current behaviour is that all remote [::thread::send -async] scripts will be
evaluated before any local [::thread::send -asinc] scripts are, even in the case
where the local scripts have been posted before the remote ones.  See the script
below for a concise example.

Intuitively, [::thread::send -async] scripts should be queued and evaluated in
the order they were added to the queue (that is, except if the -head argument
was given).

The script below demostrates this issue.  We append to a global variable in
the main thread in four different ways, in order:
- An after 0 script
- An after idle script
- A local [thread::send -async]
- A remote [thread::send -async]
- Another after 0 script
- Another after idle script

The printed trace of the script is reproduced below.  Contrary to the posting
order, the remote [thread::send -async] is evaluated before the local
[thread::send -async].

~~~
REMOTE-THREAD-SEND
AFTER0-BEG
AFTER0-END
IDLE-BEG
LOCAL-THREAD-SEND
IDLE-END
~~~

Without knowing the details of the Thread package, one would expect the local
and remote [thread::send -async] to be evaluated in the order they were
posted.  Instead, the remote [thread::send -async] are invariably evaluated
before the local ones; and the local ones are evaluated as [after idle] commands.

The cause for this behaviour is an optimization in threadCmd.c, commented as
"Short circuit sends to ourself", where local [thread::send -async] are
converted to [after idle], implying that a different queue is used for local
[thread::send -async] events as oposed to remote [thread::send -async] events.

How important is this optimization?  I can't see much performance being lost if
we would disable the optimization and would let local [::thread::send -async] be
added to the queue along with all others.  The benefit from this change would be
a more consistent behaviour.
    
The underlying use-case is a server with connection-specific threads.  The
connection-specific thread uses local [thread::send -async] in order to delay
the flushing of responses, so that the application's locks are grabbed for a
minimum amount of time.  Other threads use remote [thread::send -async] to the
connection thread in order to send messages through the connection.  The
optimization above difficults ensuring the order the responses are flushed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
}

package require Thread;

set t(main) [::thread::id]
set t(other) [::thread::create]

# Let t(other) know the main thread's id, t(main).
::thread::send $t(other) [list array set t [array get t]]

after 0 {lappend ::trace AFTER0-BEG;}
after idle {lappend ::trace IDLE-BEG;}

::thread::send -async $t(main) {
    lappend ::trace LOCAL-THREAD-SEND;
}
::thread::send $t(other) {
    ::thread::send -async $t(main) {
	lappend ::trace REMOTE-THREAD-SEND;
    }
}

after 0 {lappend ::trace AFTER0-END;}
after idle {lappend ::trace IDLE-END;}

after 100 {set forever now};
vwait forever;
puts [join $::trace "\n"];

# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
User Comments: jan.nijtmans added on 2022-01-14 15:16:10:

Backported to thread-2-8-branch now. Will be in Thread 2.8.8


adrianmedranocalvo added on 2019-12-19 11:04:04:
Hello Sebres,

first of all, thank you for the well thought change you implemented.  I have tested it and it works perfectly.

I'd like to request two additional, related changes:

1. In order to satisfy older compilers, variables must be declared at the beginning of a block.  Could you change the code as follows?

<code>
    if (thrId == Tcl_GetCurrentThread() && (flags & THREAD_SEND_WAIT)) {
        int code = TCL_OK;
        Tcl_MutexUnlock(&threadMutex);

        if (!(flags & THREAD_SEND_HEAD)) {
            /*
             * Be sure all already queued events are processed before this event
             */
            while ( Tcl_DoOneEvent((TCL_ALL_EVENTS & ~TCL_IDLE_EVENTS)|TCL_DONT_WAIT) ) {};
        }
        /* call it synchronously right now */
        code = (*send->execProc)(interp, (ClientData)send);
        ThreadFreeProc((ClientData)send);
        return code;
    }
</code>

2. The patch is installed in trunk, but not in the thread-8.2-branch, so it was not included in the recently released Tcl 8.6.10.  Would it be possible to back-port it to the thread-8.2-branch?

Thank you, and please let me know if I can help in any way.

Best regards,
Adrián.

sebres added on 2019-07-26 15:42:24:

Also note that basically thread::send -head [thread::id] ... is nothing else as namespace inscope :: ... and co (note the global scope), so is more for the consistency (e. g. compatibility) reasons there...

Just even slower a bit. PoC:

% set ev [list set v 1]; 
- timerate {thread::send -head [thread::id] $ev}
+ timerate {namespace inscope :: $ev}
- 1.627392 µs/# 599904 # 614480 #/sec 976.279 net-ms
+ 0.477992 µs/# 1932230 # 2092084 #/sec 923.591 net-ms

But opposed to that, the thread::send [thread::id] ... makes real sense, so to be interpreted as meaning that: add an event to tail of the event queue and wait for completion (also process all other events up to this one).
The fact, that the event has not been really added to the queue, changes nothing on the whole handling. This way the new behavior (circuit implementation) is more similar to synchronous send to any other threads (no matter the head or tail of queue it'd target).


sebres added on 2019-07-26 14:38:16:

    after 0                                {lappend res TMR-0}
    after idle                             {lappend res IDLE}
    thread::send -async       [thread::id] {lappend res ASYNC}
    thread::send -async -head [thread::id] {lappend res AHEAD}
    thread::send        -head [thread::id] {lappend res HEAD}
    thread::send              [thread::id] {lappend res SYNC}
generates either:
    HEAD TMR-0 AHEAD ASYNC SYNC IDLE
    # ... or (timer event received later up-to following) ...
    HEAD AHEAD ASYNC SYNC TMR-0 IDLE
Note that due to internal handling the time point of TMR-0 event is rather undefined and may deviate on several tcl-versions or forks (e. g. some NRT- resp. RTS-capable), so it can move between HEAD and IDLE.


adrianmedranocalvo added on 2019-07-26 14:27:44:
Dear sebres,

thank you for your response.

If I understand correctly, you are saying that synchronous sends should behave just like asynchronous, only causing the caller to wait for the result.  That is, they should not be evaluated at the earliest opportunity, instead being evaluated after all other ::thread::sends to the target thread.

In my opinion that is a desirable feature, specially given that it can be "disabled" by using -head.  On the other hand, I think it would be understandable that -async sends and synchronous ones are not ordered.

Please post a comment once you push your changes; I'll have a look and test them.

Thank you,
Adrián.

sebres added on 2019-07-26 14:27:08:

Fixed in [f0bd14f53f], this close.
Thank you, Adrian.


sebres added on 2019-07-26 14:15:59:

The subject is rather complicated...

I found and fixed this already in my fork (which is unfortunately incompatible and it'd be complex to back-port directly), but the test-case covering that is there, so taking a closer look I noticed the fix is not so simple as assumed, because of:

  • the option -head which can be also supplied to the
  • synchronous send (if not -head) should consider all previously dispatched events, so they should be processed earlier.
  • (test-only) there are tcl-forks, which timer-generation may be retarded (or time point where setup would generate it, is raher undefined), so timer event can be delayed.
I fixed it now and will check-in (and rebase the test) in trunk soon.


adrianmedranocalvo (claiming to be [email protected]) added on 2019-07-24 17:22:11:
Hello everyone,

could you help me get this merged?  I have a local branch implementing it, but am not allowed to push it to the repository.

adrianmedranocalvo (claiming to be [email protected]) added on 2018-08-10 10:37:55:
Hello,

Implemented the suggested change, please see patch below.  Tests pass:

~~~
Only running tests that match:  *
Skipping test files that match:  l.*.test
Only sourcing test files that match:  *.test
Tests began at Fri Aug 10 11:57:28 CEST 2018
Thread 2.8.0
Mainthread id is tid0x7fff8e159380
thread.test
tkt-84be1b5a73.test

tpool.test
tsv.test
ttrace.test

Tests ended at Fri Aug 10 11:57:43 CEST 2018
all.tcl:        Total   136     Passed  120     Skipped 16      Failed  0
Sourced 0 Test Files.
Number of tests skipped for each constraint:
        8       have_gdbm
        8       have_lmdb
~~~

~~~
diff --git c/src/tcl8.6.6/pkgs/thread2.8.0/generic/threadCmd.c i/src/tcl8.6.6/pkgs/thread2.8.0/generic/threadCmd.c
index 9744230c3..b9af96475 100644
--- c/src/tcl8.6.6/pkgs/thread2.8.0/generic/threadCmd.c
+++ i/src/tcl8.6.6/pkgs/thread2.8.0/generic/threadCmd.c
@@ -343,9 +343,6 @@ ThreadErrorProc(Tcl_Interp *interp);
 static void
 ThreadFreeProc(ClientData clientData);
 
-static void
-ThreadIdleProc(ClientData clientData);
-
 static void
 ThreadExitProc(ClientData clientData);
 
@@ -2697,21 +2694,13 @@ ThreadSend(interp, thrId, send, clbk, flags)
     }
 
     /*
-     * Short circuit sends to ourself.
+     * Short circuit synchronous sends to ourself.
      */
-
-    if (thrId == Tcl_GetCurrentThread()) {
+    if (thrId == Tcl_GetCurrentThread() && (flags & THREAD_SEND_WAIT)) {
         Tcl_MutexUnlock(&threadMutex);
-        if ((flags & THREAD_SEND_WAIT)) {
-            int code = (*send->execProc)(interp, (ClientData)send);
-            ThreadFreeProc((ClientData)send);
-            return code;
-        } else {
-            send->interp = interp;
-            Tcl_Preserve((ClientData)send->interp);
-            Tcl_DoWhenIdle((Tcl_IdleProc*)ThreadIdleProc, (ClientData)send);
-            return TCL_OK;
-        }
+        int code = (*send->execProc)(interp, (ClientData)send);
+        ThreadFreeProc((ClientData)send);
+        return code;
     }
 
     /*
@@ -3462,34 +3451,6 @@ ThreadSetOption(interp, thrId, option, value)
     return TCL_OK;
 }
 
-/*
- *----------------------------------------------------------------------
- *
- * ThreadIdleProc --
- *
- * Results:
- *
- * Side effects.
- *
- *----------------------------------------------------------------------
- */
-
-static void
-ThreadIdleProc(clientData)
-    ClientData clientData;
-{
-    int ret;
-    ThreadSendData *sendPtr = (ThreadSendData*)clientData;
-
-    ret = (*sendPtr->execProc)(sendPtr->interp, (ClientData)sendPtr);
-    if (ret != TCL_OK) {
-        ThreadErrorProc(sendPtr->interp);
-    }
-
-    Tcl_Release((ClientData)sendPtr->interp);
-    ThreadFreeProc(clientData);
-}
-
 /*
  *----------------------------------------------------------------------
  *
~~~

adrianmedranocalvo (claiming to be [email protected]) added on 2018-08-01 07:53:26:
Hello sebres,

great!

Indeed, the code you selected is correct, and mine was wrong.  Sorry for the confusion.

Let me know if you need any help.

Best regards,
Adrián.

sebres added on 2018-07-31 16:55:21:

Now I understand your objection (I was just confused with both after(timer) events from your example and your description).

Indeed it is possible..., but not what you provided as block "to remove", this one would be enough to remove:

61003585d6458178(lines 2752-2756)

Of course with move of `Tcl_MutexUnlock(&threadMutex);` (line 2747) 2 lines down, into block `if ((flags & THREAD_SEND_WAIT))`.

I'll take a look later as regards the test-cases and some other dependencies and behavior etc.


adrianmedranocalvo (claiming to be [email protected]) added on 2018-07-31 16:22:35:
Hello sebres,

there's definitely some misunderstanding.  English is not my mother language, it's not the first time I cause a mess...

I'll try to explain with code.

One important point is: I don't care about the order of [after idle] events nor [after 0] events with respect to [thread::send -async] events.  I only care about the order of [thread::send -async] events.

Please, try the following script:

~~~
package require Thread;

set t(main) [::thread::id]
set t(other) [::thread::create]

# Let t(other) know the main thread's id, t(main).
::thread::send $t(other) [list array set t [array get t]]

# Async thread::send to current thread (the main thread).
# I expect this script to be inserted at the end of the thread event queue.
::thread::send -async $t(main) {
    lappend ::trace LOCAL-THREAD-SEND;
}

::thread::send $t(other) {
    # Async thread::send to other thread (the main thread).
    # I expect this script to be inserted at the end of the
    # thread event queue, after the one above.
    ::thread::send -async $t(main) {
	  lappend ::trace REMOTE-THREAD-SEND;
    }
}

after 100 {set forever 1};
vwait forever;
puts [join $::trace "\n"];
~~~

This is the output:

~~~
REMOTE-THREAD-SEND
LOCAL-THREAD-SEND
~~~

This is not expected, because the [thread::send -async] lappending LOCAL-THREAD-SEND ran before the one lappending REMOTE-THREAD-SEND.

When I remove the following code:
  https://core.tcl-lang.org/thread/artifact?udc=1&ln=2742-2759&name=61003585d6458178&udc=1

I obtain the expected output:

~~~
LOCAL-THREAD-SEND
REMOTE-THREAD-SEND
~~~

Are there any downsides in removing the following code?
  https://core.tcl-lang.org/thread/artifact?udc=1&ln=2742-2759&name=61003585d6458178&udc=1

----------------------------------------

I'll now try to respond to your points.

> if you'll do that, the event will be appended directly into the queue of the thread, so in this case it will be not LAST (as by idle) but FIRST event (because timer events coming later via processing of event-sources).
>
> So again, it cannot be "on the order they were posted", because they are another types of events. It will be not between your timer-events, but either hereafter (like now) or before (if direct appended to event-queue). But definitely not between.

I think that's what I want and expect, that the [thread::send -async] is appended to the queue of the thread.  I don't expect them to be handled between other types of events.

> For this case "thread::send -async [thread::id]" should be quasi rewritten as "after 0", and I'm definitely against this, because it produces unexpected overhead.

I'm also against this.  In fact that's the precise problem: "thread::send -async [thread::id]" is being rewritten into "after idle".  See the part of the code where that happens (https://core.tcl-lang.org/thread/artifact?udc=1&ln=2742-2759&name=61003585d6458178&udc=1):

  2742      /*
  2743       * Short circuit sends to ourself.
  2744       */
  2745  
  2746      if (thrId == Tcl_GetCurrentThread()) {
  2747          Tcl_MutexUnlock(&threadMutex);
  2748          if ((flags & THREAD_SEND_WAIT)) {
  2749              int code = (*send->execProc)(interp, (ClientData)send);
  2750              ThreadFreeProc((ClientData)send);
  2751              return code;
  2752          } else {
  2753              send->interp = interp;
  2754              Tcl_Preserve((ClientData)send->interp);
  2755              Tcl_DoWhenIdle((Tcl_IdleProc*)ThreadIdleProc, (ClientData)send);
  2756              return TCL_OK;
  2757          }
  2758      }

> Possibly for my another event-branch (that I wrote in tcl-core-repo), where "after 0" have O(1) by changing of timer-queue, does not generate extra overhead as regards time-compare at all (because creates an immediate timer-event), etc.
If this will be accepted sometimes... 

More performant event handling is always welcome... thank you for your efforts, I hope it gets merged soon.

sebres added on 2018-07-31 15:44:54:

But you did still not understand...

if you'll do that, the event will be appended directly into the queue of the thread, so in this case it will be not LAST (as by idle) but FIRST event (because timer events coming later via processing of event-sources).

So again, it cannot be "on the order they were posted", because they are another types of events. It will be not between your timer-events, but either hereafter (like now) or before (if direct appended to event-queue). But definitely not between.

For this case "thread::send -async [thread::id]" should be quasi rewritten as "after 0", and I'm definitely against this, because it produces unexpected overhead.

Possibly for my another event-branch (that I wrote in tcl-core-repo), where "after 0" have O(1) by changing of timer-queue, does not generate extra overhead as regards time-compare at all (because creates an immediate timer-event), etc.
If this will be accepted sometimes...


adrianmedranocalvo (claiming to be [email protected]) added on 2018-07-31 14:35:32:
Hello sebres,

thank you for your comments.

Indeed that's the issue: some [thread::send -async] are being transformed into "after" events, which implies all points you detailed, while other [thread::send -async] do not.

Mine was a lengthy text, I'm not sure my proposal it's clear.  Let me summarize it:

- I think that each "thread::send -async $tid $script" should be appended to the same queue, so that events for thread $tid are evaluated on the order they were posted.

- Other events can be handled in whatever order.

- I think that transforming some "thread::send -async $tid $script" into "after idle" (as it's currently done in Thread 2.8.0) is a misfeature, because it makes "thread::send -async" no longer work as a useful thread-safe queue.

- I propose removing the code selected in the following link:
  https://core.tcl-lang.org/thread/artifact?udc=1&ln=2742-2759&name=61003585d6458178&udc=1

What's your opinion on doing that?

sebres added on 2018-07-31 12:22:41:

Although idle-event is indeed not really good strategy for async send to self.

But I don't see this as issue.

Please note:

  • "after" events are timer-events and will be executed as one chunk (if the timer was reached, and in case of after 0 it is always the case);
  • this means - you can not insert in-between 0-timer events, unless you don't generate a timer-event also (I don't really see any advantages by generation of 0-timer event in case of async send).
  • events are cumulated with event source handlers (setupProc/checkProc), so the time where events are reached the evaluation phase is undefined;
  • if the event will be appended directly to thread-queue, then all your timer-events (after 0) will be executed hereafter (because those will reach the event-queue first after sourcing via setupProc/checkProc);
  • asynchronous send to self is not the primary usage in thread-module, so I don't think that "much performance being lost" here.

Just as loud thoughts.