Tcl Source Code

Ticket UUID: 61c01e0edb08a9ed905aefdf17ce113da303c77b
Title: likely race condition with package require in several threads causing error
Type: Bug
Version: any
Submitter: anonymous
Created on: 2025-05-04 06:28:58
Subsystem: 37. File System
Assigned To: nobody
Priority: 9 Immediate
Severity: Critical
Status: Closed
Last Modified: 2025-07-22 12:34:06
Resolution: Fixed
Closed By: oehhar
Closed on: 2025-07-22 12:34:06
Description:
Summary: 

There appears to be a race condition that occurs intermittently when packages are loaded from several threads in Tcl 9. It can take several program executions for a failure to occur, but usually less than a minute at about 1 run per second. The failure presents as a "can't find package" error.

Detail:

In Tcl/Tk 9.0.1 I am seeing what is very likely a race condition with threads and packages that did not occur in 8.6. I have only run this on Windows 10. The error, a failure to find a package, occurs only inside a thread, and typically 3 or more threads need to be started before it occurs. I have had the error with several packages, but the math package is usually the fastest to fail, so it is used in the test script below. The error occurs randomly. Some distros see the error more frequently. One, a bawt tclkit, results in an access violation crash. The magicsplat 8.6 did not fail at all in over 1000 runs.

Enclosed below are a Windows batch script and a Tcl script. The batch script runs the Tcl program in a loop; each run takes about 1 second if it succeeds. When a run fails to load the math package, it displays a Tk dialog box, which suspends the program. The console stays open, and both the failing and the working threads remain active, so they can receive thread::send requests for additional debugging.

When it fails, the auto_path variable (in the failing thread) does not have an entry for math. In looking at the pkgIndex script for math, there is the item:

package ifneeded math  1.2.6 [list source [file join $dir math.tcl]]

Instrumenting these scripts showed that this line was indeed executed (in all the threads), and the script is correct. However, instrumenting math.tcl showed that when the error "can't find package math" occurs, the math.tcl script had not (yet?) been executed.

However, doing a [package ifneeded math 1.2.6] does return the script, in all threads, after the error occurs. 

If the command [package require math] is run a second time, it works. Another workaround is to do a bogus package require on a non-existent package, after which it will (usually) work. Reducing to two threads, or adding a delay before starting the third thread, also avoided the failure (at least, it did not fail in 1500 or so runs).
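The retry workaround can be wrapped in a small helper. This is only a sketch of the workaround, not a fix for the underlying problem; the proc name require_with_retry and the default of three attempts are assumptions made for illustration.

```tcl
# Hypothetical helper: retry [package require] a few times before giving up.
# It only papers over the intermittent failure; it does not address the cause.
proc require_with_retry {name {attempts 3}} {
    set msg "no attempts made"
    for {set i 1} {$i <= $attempts} {incr i} {
        if {![catch {package require $name} msg]} {
            # On success, msg holds the version that was loaded.
            return $msg
        }
    }
    return -code error "package require $name failed after $attempts attempts: $msg"
}
```

Since each thread has its own interpreter, the proc would have to be defined inside the thread script itself before calling, for example, require_with_retry math.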

Although somewhat lengthy, this is the smallest script I could use to reliably get the error in a timeframe of seconds to minutes.

The script will dump out the auto_path (from the main thread) on startup. Then there is a proc called tdump which is used to view which of the several threads (aka tasks) is in error and the tid's for each. One can then use thread::send in the console to get info back from the thread whose package require failed, as shown below. 

Note that once it works (i.e. after say doing the package require a second time) the math entry is then found in the auto_path variable. This is set in the math.tcl file with this code:

		variable home [file join [pwd] [file dirname [info script]]]
		if {[lsearch -exact $::auto_path $home] == -1} {
			lappend ::auto_path $home
		}


Here's what the output looks like on a bawt distro when it fails, followed by two commands entered into the console to get the value of auto_path for 2 of the threads:


//zipfs:/lib/tcl/tcl_library
//zipfs:/lib/tcl
C:/bawTcl/lib
C:/bawTcl/lib/cawt
C:/bawTcl/lib/tcl3d
C:/bawTcl/lib/tcllib
C:/bawTcl/lib/tclvfs/template
//zipfs:/lib/tk/tk_library
//zipfs:/lib/tk/tk_library/ttk
                 (taskname1,error)           = |can't find package math| 
                 (taskname1,tid)             = |tid0000000000003078| 
                 (taskname2,tid)             = |tid0000000000002F60| 
                 (taskname3,tid)             = |tid0000000000003E08|
 
() 1 % thread::send tid0000000000003078 {join [set auto_path] \n}
//zipfs:/lib/tcl/tcl_library
//zipfs:/lib/tcl
C:/bawTcl/lib
C:/bawTcl/lib/cawt
C:/bawTcl/lib/tcl3d
C:/bawTcl/lib/tcllib
C:/bawTcl/lib/tclvfs/template
//zipfs:/lib/tk/tk_library
//zipfs:/lib/tk/tk_library/ttk

() 2 % thread::send tid0000000000002F60 {join [set auto_path] \n}
//zipfs:/lib/tcl/tcl_library
//zipfs:/lib/tcl
C:/bawTcl/lib
C:/bawTcl/lib/cawt
C:/bawTcl/lib/tcl3d
C:/bawTcl/lib/tcllib
C:/bawTcl/lib/tclvfs/template
C:/bawTcl/lib/tcllib/math

(This 2nd thread did not fail, so Tk was not loaded to display the dialog box)

Here are the similar results on a magicsplat distro, which typically takes many more runs to fail, likely because the timing is different (maybe something to do with the use of zipfs in bawt, but not in magicsplat).

D:/podcasts/Tcl90/lib/tcl9.0
D:/podcasts/Tcl90/lib
D:/podcasts/Tcl90/lib/cawt3.0.0
D:/podcasts/Tcl90/lib/tcl3d1.0.0
D:/podcasts/Tcl90/lib/tcllib2.0
D:/podcasts/Tcl90/lib/vfs1.4.2/template
D:/podcasts/Tcl90/lib/tk9.0
D:/podcasts/Tcl90/lib/tk9.0/ttk
                 (taskname1,error)           = |can't find package math| 
                 (taskname1,tid)             = |tid00000000000031DC| 
                 (taskname2,tid)             = |tid0000000000002490| 
                 (taskname3,tid)             = |tid0000000000003B64| 
() 1 % thread::send tid00000000000031DC {join [set auto_path] \n}
D:/podcasts/Tcl90/lib/tcl9.0
D:/podcasts/Tcl90/lib
D:/podcasts/Tcl90/lib/cawt3.0.0
D:/podcasts/Tcl90/lib/tcl3d1.0.0
D:/podcasts/Tcl90/lib/tcllib2.0
D:/podcasts/Tcl90/lib/vfs1.4.2/template
D:/podcasts/Tcl90/lib/tk9.0
D:/podcasts/Tcl90/lib/tk9.0/ttk
() 2 % thread::send tid0000000000002490 {join [set auto_path] \n}
D:/podcasts/Tcl90/lib/tcl9.0
D:/podcasts/Tcl90/lib
D:/podcasts/Tcl90/lib/cawt3.0.0
D:/podcasts/Tcl90/lib/tcl3d1.0.0
D:/podcasts/Tcl90/lib/tcllib2.0
D:/podcasts/Tcl90/lib/vfs1.4.2/template
D:/podcasts/Tcl90/lib/tcllib2.0/math
() 3 % 

Here is the Windows batch script I use to run the program repeatedly. It takes one argument, which selects one of my 4 distributions. You will need to adjust the paths for your system.

For me, the 4 distros are, in order 1-4,
 
a bawt 9.0.1, 
magicsplat 9.0.1, 
a bawt tclkit in a starpack, 
magicsplat 8.6

It never fails with the 8.6 magicsplat distro. With the bawt tclkit, it actually crashes with an access violation, and the program address is always the same. Unfortunately, I don't have any symbols to use for more info.



bugx.bat script
#######################################################


@echo off
setlocal

set "loopCounter=0"
set "distro=%1" &REM Get the distro parameter

if "%distro%"=="1" (
    set "commandToRun=C:\\bawTcl\\bin\\wish90.exe A:\\bug.tcl"
) else if "%distro%"=="2" (
    set "commandToRun=D:\\podcasts\\Tcl90\\bin\\wish90.exe A:\\bug.tcl"
) else if "%distro%"=="3" (
    set "commandToRun=C:\\tclf\\kits\\mytcl901.exe A:\\bug.tcl"
) else if "%distro%"=="4" (
    set "commandToRun=C:\\Users\\core5\\AppData\\Local\\Apps\\Tcl86\\bin\\wish.exe  A:\\bug.tcl"
) else (
    echo Invalid distro parameter.  Please specify 1, 2, 3, or 4.
    exit /b 1
)
echo Command to run: %commandToRun%
:loopStart
echo Loop Counter: %loopCounter%
%commandToRun%
rem timeout /t 1 >nul  &REM Add a 1-second delay
set /a loopCounter+=1
goto loopStart

endlocal



bug.tcl
#################################################################

console show
wm withdraw .   
package require Thread
    
tsv::set  tids [thread::id] mainthread  ;# for reverse lookup 
tsv::set  main mainthread [thread::id]  ;# for reverse lookup 
################################################# Tasks version 1.13h
namespace eval tasks {  

proc wait { ms } {              ;# non busy wait
    set uniq [incr ::__sleep__tmp__counter]
    set ::__sleep__tmp__$uniq 0
    after $ms set ::__sleep__tmp__$uniq 1
    vwait ::__sleep__tmp__$uniq
    unset ::__sleep__tmp__$uniq
}

    
################################################# dump all task shared variables
proc tdump {{pat .*} {max 90}} {         ;# dump all the shared Task variables
    set all 1
    set doputz 1
    set out {}
    if { [string index $pat 0] eq "-" } { ;# a leading - reduces output to just the variables
        set all 0
        set pat [string range $pat 1 end]
    } elseif { [string index $pat 0] eq "+" } { ;# a leading + no output puts either AND return results in $out
        set all 0
        set doputz 0
        set pat [string range $pat 1 end]
    }
    if { $all } {
        puts "\n------ Task(s) dump -----------------------------------------"
        puts "tsv::names  = |[tsv::names *]|"
        puts "tsv::tids   = |[tsv::array names tids *]|"
        puts "---------------------------------------------------------------"
    }
    set tvarnames [lsort -stride 2 -index 1 [tsv::array get tids]]
    
    if { $all } {
        puts "tid/names   = |$tvarnames|"
        puts "---------------------------------------------------------------"
    }
    foreach {var val}  [lsort -dictionary -stride 2 -index 1 $tvarnames ] {
        if { $all } {
            puts "[format %-10s $val] tid: $var  exists: [thread::exists $var]"
        }
        
        set tidnames [tsv::array names tvar $val,*]
        foreach tname [lsort $tidnames] {
            if { $tname eq "$val,queue" } {
                set value [::tsv::lrange tvar $tname 0 end] ;# tsv::get can cause a crash now in threads 2.8.8, so use this way
            } else {
                set value [tsv::get tvar $tname]
            }
            set value [string map {\n \u2936 \t \u02eb} $value]
            if { [regexp .*${pat}.* "$tname\t[string range $value 0 $max]"] } {
                if { $doputz } {
                    puts "                 [format %-27s ($tname)] = |[string range $value 0 $max]| "
                } else {
                    lappend out [list $tname $value]
                }
            }
        }
    }
    if { $all } {
        puts "---------------------------------------------------------------"
    }
    return $out ;# will be empty unless +pat was used - to avoid dumping it all in interactive mode or windows console
}
#proc - main Task procs         -----------------------------------------------------------
#################################################
proc Task {name args} {        ;# create a Task

set script {
    if [catch {
        package require math
        if [catch {while 1 {
            thread::wait
            
        }} thread_err_code thread_err_dict] {
            tsv::set tvar taskname1,error $thread_err_dict  
            catch {package require Tk}; package require Tk; wm withdraw . ; set zzz [tk_messageBox -type yesno -detail {Select yes to exit, no to suspend} -title {task error} -message "$thread_err_code\n\n$thread_err_dict" -title "tid [thread::id]" ]; 
            if {$zzz eq "yes"} exit else {vwait ::forever1}
        }
    } err_code_Task_Create err_code_Task_Create_details] { 
            tsv::set tvar taskname1,error $err_code_Task_Create
            catch {package require Tk}; package require Tk; wm withdraw . ; set zzz [tk_messageBox -type yesno -detail {Select yes to exit, no to suspend} -title {Task create error} -message "$err_code_Task_Create \n$err_code_Task_Create_details"]
            if {$zzz eq "yes"} exit else {vwait ::forever2}
    }
}

    
    set tid [thread::create $script]
    
    tsv::set  tvar $name,tid        $tid
    tsv::set  tvar $name,script     $script
    tsv::set  tids $tid             $name       ;# for reverse lookup
    return $tid
}

            
namespace export tdump

}
# end of tasks namespace eval   

    puts [join [set auto_path] \n]
    namespace import tasks::*   
    update
    
    
    tasks::Task taskname1 
    tasks::Task taskname2 
    if { 1 } { ;# when 0, it does not fail on my systems
#       tasks::wait 10 ;# also sufficient to no longer fail
        tasks::Task taskname3
    }
    
    
    set ok 1
    tasks::wait 100
    tasks::tdump -,error|,tid
    
    foreach t [tasks::tdump +,error] {
        lassign $t a b;
        if {$b ne ""} {bell;set ok 0}
    }
    if { $ok } {
        wm withdraw .
        puts "*****ok******* [info nameof]\n[info patch]"
        after 250 exit
    }
User Comments:

oehhar added on 2025-07-22 12:34:06:
Thanks for considering my comment, I appreciate it.
All is ok,
Harald

sebres added on 2025-07-22 12:05:12:

> I would appreciate some source code comments with brief description and a reference to this ticket.

Well, basically it is a commonly known "issue" of Tcl_FSGetNormalizedPath and affects only a few percent of its usages, so I don't think we need a comment on every increment. A comment is only necessary for something unexpected or unusual, like in [88fef0563f33ac19] (where the same object is obtained by two functions, and one increments the reference count but the other does not). Regarding the ticket reference, fossil blame would normally do the job, so in my opinion it would be superfluous too.

And last but not least: the path API (and especially its caching framework) has historically become a mess and thus has many conceptual problems, so it would be better to rewrite all of it (and deprecate the current API). I'm also not convinced that half of the Tcl_FSGetNormalizedPath calls are needed at all and could not be replaced with Tcl_FSGetTranslatedPath, or with some new function combining the two as in the mentioned ZipFSPathInFilesystemProc, e.g. because normalization is rarely needed for absolute paths.


oehhar added on 2025-07-22 05:20:23:

Great work, Sergey and Eric!

I would appreciate some source code comments with brief description and a reference to this ticket.

Thanks for everything, great! Harald


anonymous added on 2025-07-22 01:26:54:
Success!

I downloaded tcl and tk from https://core.tcl-lang.org/tcl/zip/trunk and built them from source with Visual Studio 2022. I copied the .exe and .dll files into a magicsplat 9.0.2 distro (as wish91.exe etc.).

I verified the source code I downloaded included Sergey's fixes.

It ran flawlessly. 

I ran with 25 threads each run, 13,000 runs over a few hours, and NO failures. I think this is a huge success with the package require in a thread problem.

I double checked by running the old version, and with 25 threads, it would on average get 4 thread failures in pretty much every run.

Many Thanks to Sergey!

sebres added on 2025-07-21 19:08:13:

Review is ready; more cases are fixed now in all branches (incrementing the ref-count to prevent use-after-free of the interim normalized path object)...

The handling is slightly modified (the trick is to increment/decrement only if the normalized path is not equal to the originally given path) in order to keep the fix minimally invasive. Otherwise, if the API is used with a freshly created path object, the decrement could destroy that object in the case where the normalized path is equal to the given object.


sebres added on 2025-07-19 17:30:40:

I found more during my review and will try to provide the fixes soon. The ticket will remain pending until then.


oehhar added on 2025-07-19 07:27:57:
Mega kudos to Wizard Sergey!
We all appreciate it!
Harald

sebres added on 2025-07-18 17:53:00:

Fixed in all branches ([0d48bb289c1b9e0a] .. [34180ec93e99745a]), thus I'll close it here.

If further review reveals something similar later, I will extend this ticket or open a new one.


sebres added on 2025-07-18 15:24:00:

Weird. The FS-API of Tcl (at least for Windows) doesn't look thread-safe to me now.

Debugging a bit, I see the following picture in TclpMatchInDirectory:

# correct behaviour of glob:
==68b0==B0== D:/Projects/tcllib/build_9.x/lib => D:/Projects/tcllib/build_9.x/lib (32)
==68b0==B1== D:/Projects/tcllib/build_9.x/lib => D:/Projects/tcllib/build_9.x/lib (32)
==68b0==B2== D:/Projects/tcllib/build_9.x/lib => D:/Projects/tcllib/build_9.x/lib/
==68b0==BE== D:/Projects/tcllib/build_9.x/lib/*
...

# wrong behaviour of glob:
==68b0==B0== D:/Projects/tcllib/build_9.x/lib => D:/Projects/tcllib/build_9.x/lib (32)
==68b0==B1== D:/Projects/tcllib/build_9.x/lib => D:/ (-1)
==68b0==B2== D:/Projects/tcllib/build_9.x/lib => D://
==68b0==BE== D://*
...

B0 shows fileNamePtr as result of fileNamePtr = Tcl_FSGetNormalizedPath(interp, pathPtr);
B1 shows the same fileNamePtr, but after invocation of Tcl_FSGetNativePath(pathPtr).

Somewhere in between, the path object invalidates its normalized representation (pathPtr->normPathPtr), probably because of an epoch change, so fileNamePtr becomes invalid (since it is used inside the function without its reference count being incremented).

This grave bug has existed at least since 8.5; probably the epoch just changes more often since 8.7 or 9.0 for some reason, so the object is invalidated more often and the behaviour becomes noticeable in a threaded environment.

But after all it is a typical use-after-free and can cause a crash with SF or BO, memory corruption etc.

I'll fix it now for all branches.

I guess that, besides a complete code review (to find similar issues), we will have to re-release several versions afterwards, or at least provide hotfixes for that.


anonymous added on 2025-07-18 04:25:47:
Good to hear the progress. For completeness, I did go ahead and instrument the pkgIndex file as such:

if {![package vsatisfies [package provide Tcl] 8.5 9]} {return}
proc _mysource {file} {
    set ::math_error "ok"
    if [catch {
        source $file
    } err] {
        set ::math_error $err
        return -code error $err
    }
}
package ifneeded math                    1.2.6 [list _mysource [file join $dir math.tcl]]
incr ::in_pkgIndex_tcl
...

In the failing thread, it simply never executes the source command, while in the threads that work, I see the ok value. I also modified the script to actually use a math function and tested for the value being correct:

set script {
    if [catch {
        package require math
        set ::math_fib_result_55 [math::fibonacci 10] 
        if { $::math_fib_result_55 != 55} {
        	error "error with math::fibonacci, not 55, was $::math_fib_result_55"
        }


Whenever the package require itself did not fail, this always gave the correct value from the fibonacci function.

Thanks for looking at this!

sebres added on 2025-07-18 00:35:00:

OK, I reproduced it with stock tcl and latest tcllib (using package require in new interpreter, 8x threaded).

It looks like it is tclPkgUnknown, particularly the sub-dir glob in
glob -directory $dir -join -nocomplain * pkgIndex.tcl
which sporadically returns an empty list for the tcllib directory, where normally it is something like ${dir}/tcllib2.0/pkgIndex.tcl.
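One way to watch for the sporadic empty result is to run that same glob repeatedly from several threads and count how often it comes back empty. The following is a rough sketch, not the procedure used in this ticket: the proc name count_empty_globs, the tsv names, and the thread and round counts are all made up for illustration, and the probed directory is assumed to contain at least one sub-directory with a pkgIndex.tcl (otherwise every result is legitimately empty).

```tcl
package require Thread

# Hypothetical probe: run the sub-directory glob quoted above $rounds times
# in each of $nthreads threads and count how many results were empty.
proc count_empty_globs {dir nthreads rounds} {
    tsv::set globprobe empty 0
    # Build the per-thread script, substituting the directory and loop count.
    set body [string map [list @DIR@ [list $dir] @ROUNDS@ $rounds] {
        for {set i 0} {$i < @ROUNDS@} {incr i} {
            set hits [glob -directory @DIR@ -join -nocomplain * pkgIndex.tcl]
            if {[llength $hits] == 0} {
                tsv::incr globprobe empty
            }
        }
    }]
    set tids {}
    for {set t 0} {$t < $nthreads} {incr t} {
        lappend tids [thread::create -joinable $body]
    }
    foreach tid $tids {
        thread::join $tid
    }
    return [tsv::get globprobe empty]
}

# Assumption: the parent of [info library] holds the package directories.
puts "empty glob results: [count_empty_globs [file dirname [info library]] 8 200]"
```

A non-zero count on a directory that is known to contain package sub-directories would point at the glob itself rather than at the package scripts.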

Digging deeper.


anonymous added on 2025-07-17 19:33:38:
The magicsplat distro doesn't use the //zipfs and it can also fail.

Note: I can get the failure with other packages, not just math. I tried to see if the other packages with errors had something in common with the math package, but was unable to find a correlation. I am just using package math in my posted code since that seems to fail more often.

If I change the code to do,

catch {package require math}
catch {package require math}
catch {package require math}
package require math

then the error rate is lowered, but if I increase the number of threads, from say 3, to 25, the error rate is increased.
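To measure how the failure rate changes with the number of threads without the Tk dialog, a stripped-down counter can be used. This is a sketch under assumptions, not the posted test script: the proc name count_require_failures and the tsv names are invented here, and the math package is assumed to be installed (via tcllib) where the threads can find it.

```tcl
package require Thread

# Hypothetical helper: start $nthreads threads that each try to load $pkg
# once, wait for all of them, and return how many of them failed.
proc count_require_failures {pkg nthreads} {
    tsv::set stress failures 0
    # Per-thread script; the package name is substituted in before creation.
    set body [string map [list @PKG@ [list $pkg]] {
        if {[catch {package require @PKG@}]} {
            tsv::incr stress failures
        }
    }]
    set tids {}
    for {set i 0} {$i < $nthreads} {incr i} {
        lappend tids [thread::create -joinable $body]
    }
    foreach tid $tids {
        thread::join $tid
    }
    return [tsv::get stress failures]
}

puts "failures: [count_require_failures math 25]"
```

Running it with 3 and then 25 threads gives a direct comparison of the two cases described above.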

The access violation only happens with the tclkit from bawt. Since I don't build that, I don't know how to get symbols for it. What I do see comes from using Visual Studio 2022 to inspect the crash dump file that Windows creates.

In order to display a Tk message box, I also do a package require Tk (after an error on package math), and on occasion in the tclkit version this would itself fail with something like a permission error. I attribute that to a timing issue: in a tclkit, the Tk DLL has to be written out to the temp directory, and it is possibly not fully written (or closed) by one thread while another thread is trying to do the same thing. To mitigate that, so I can get the message out, I do the same thing,

catch {package require Tk}; package require Tk

Since it mostly works, it seems unlikely that the file encoding is an issue. The file math.tcl is on disk .../lib/tcllib2.0/math.tcl with the magicsplat distro, and my text editor says it is windows 1252 encoding; it can tell me if it is BOM encoding and also if there were any characters that are not standard ascii.

The reason I use magicsplat and bawt is that I don't know how to get all the packages on a self built tcl/tk from source on linux. There I have in the past built tcl/tk with symbols and run using gdb.

In my test code, I do a package require math at the top of the main script thinking if it was loaded there, it might then work inside the threads, but that didn't stop the failures.

oehhar added on 2025-07-17 13:26:32:
Thanks, Sergey, for looking into this and sharing your wisdom.
I really appreciate it,
Harald

sebres added on 2025-07-17 12:40:11:

It doesn't look like an encoding issue to me... however, when I see "access violation", I am reminded of [f2ff05fc84], so the question is whether something (C or Tcl) sets the system encoding in threads (even if it is set to the same value).

As for the difference from 8.6, it could be the library in zipfs.

Anyway, it looks more like a timing issue to me and not necessarily a race condition. And it definitely has to do with the expansion of auto_path, because for some reason the path to tcllib/math is not there when the error occurs, although the path to tcllib was set (it is also interesting who adds the path to tcllib).

The first question is: can you try a Tcl without zipfs?
Either build it with `configure --disable-zipfs`,
or copy the tcl-library together with all packages into some common directory, set the env variable "TCL_LIBRARY" to this root path and start your test from that shell.

Another question is: who adds `C:/bawTcl/lib` or `D:/podcasts/Tcl90/lib` to auto_path initially? Are they added by Tcl itself, or by you (e.g. somewhere in init.tcl or by some package)? I guess the init.tcl may be customized there (e.g. with enhancements of auto_load or auto_load_index), because by default Tcl doesn't search for pkgIndex.tcl recursively (but only in the first level of each directory in auto_path), so if something adds the path to tcllib "delayed", it would work exactly as described: the first time fails and the 2nd attempt to require it loads it successfully. I also assume it would not fail anymore if you added the path to tcllib yourself, e.g. somewhere in init.tcl or with `lappend auto_path C:/bawTcl/lib/tcllib`.
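To check from inside a thread which index files such a one-level search could reach, a small diagnostic like the following may help. It is a sketch based on the description above (each auto_path directory plus its immediate sub-directories), not a copy of the actual tclPkgUnknown code, and the proc name visible_pkg_indexes is hypothetical.

```tcl
# Hypothetical diagnostic: list the pkgIndex.tcl files reachable from
# auto_path when only each directory and its immediate sub-directories
# are examined (no deeper recursion).
proc visible_pkg_indexes {} {
    set found {}
    foreach dir $::auto_path {
        # Index file directly inside the auto_path directory.
        set top [file join $dir pkgIndex.tcl]
        if {[file exists $top]} {
            lappend found $top
        }
        # Index files exactly one level below it.
        lappend found {*}[glob -nocomplain -directory $dir -join * pkgIndex.tcl]
    }
    return $found
}

puts [join [visible_pkg_indexes] \n]
```

After defining the proc in a worker thread, something like `thread::send $tid {join [visible_pkg_indexes] \n}` from the console would show whether the tcllib index is visible there.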

So to clarify the issue, one needs to know which changes both Tcl editions have, how they are customized locally, and how the auto_load mechanisms work (particularly who adds tcllib to auto_path).


oehhar added on 2025-07-17 06:55:31:

Hi Eric, I have read your new posting. New in Tcl 9.x is that source may fail with encoding errors, which are hidden in Tcl 8.6. Could you change the pkgIndex.tcl file to catch the source command and log an error if the source fails?

The issue might be in the file i/o code.

Just an idea... Harald


anonymous added on 2025-07-16 20:36:14:
Updated test scripts; failure in 9.0.2 also

Here are a new batch script that counts errors and a new Tcl script that exits with 0 or 1 depending on error status, so the batch can be run for long periods. On an error, a dialog box is shown; if "no" is clicked, the program suspends, otherwise it will exit with an error after 10 seconds.

bugx.bat:

@echo off
setlocal
set "loopCounter=0"
set "errorCounter=0"
set "distro=%1" &REM Get the distro parameter
if "%distro%"=="1" (
    set "commandToRun=C:\\bawTcl\\bin\\wish90.exe A:\\bug.tcl"
) else if "%distro%"=="2" (
    set "commandToRun=D:\\podcasts\\Tcl902\\bin\\wish90.exe A:\\bug.tcl"
) else if "%distro%"=="3" (
    set "commandToRun=C:\\tclf\\kits\\mytcl901.exe A:\\bug.tcl"
) else if "%distro%"=="4" (
    set "commandToRun=C:\\Users\\core5\\AppData\\Local\\Apps\\Tcl86\\bin\\wish.exe  A:\\bug.tcl"
) else (
    echo Invalid distro parameter.  Please specify 1, 2, 3, or 4.
    exit /b 1
)
echo Command to run: %commandToRun%
:loopStart
echo Loop Counter: %loopCounter% ^| Error Counter: %errorCounter%
%commandToRun%
if errorlevel 1 (
    rem echo Error detected! Exit code: %errorlevel%
    set /a errorCounter+=1
) else (
    rem echo Command completed successfully.
)
rem timeout /t 1 >nul  &REM Add a 1-second delay
set /a loopCounter+=1
goto loopStart
endlocal





bug.tcl:



console show
wm withdraw .   
package require Thread
package require math    
tsv::set  tids [thread::id] mainthread  ;# for reverse lookup 
tsv::set  main mainthread [thread::id]  ;# for reverse lookup 
################################################# Tasks version 1.13h
namespace eval tasks {  

proc wait { ms } {              ;# non busy wait
    set uniq [incr ::__sleep__tmp__counter]
    set ::__sleep__tmp__$uniq 0
    after $ms set ::__sleep__tmp__$uniq 1
    vwait ::__sleep__tmp__$uniq
    unset ::__sleep__tmp__$uniq
}

    
################################################# dump all task shared variables
proc tdump {{pat .*} {max 90}} {         ;# dump all the shared Task variables
    set all 1
    set doputz 1
    set out {}
    if { [string index $pat 0] eq "-" } { ;# a leading - reduces output to just the variables
        set all 0
        set pat [string range $pat 1 end]
    } elseif { [string index $pat 0] eq "+" } { ;# a leading + no output puts either AND return results in $out
        set all 0
        set doputz 0
        set pat [string range $pat 1 end]
    }
    set tvarnames [lsort -stride 2 -index 1 [tsv::array get tids]]
    
    foreach {var val}  [lsort -dictionary -stride 2 -index 1 $tvarnames ] {
        
        set tidnames [tsv::array names tvar $val,*]
        foreach tname [lsort $tidnames] {
            set value [tsv::get tvar $tname]
#            set value [string map {\n \u2936 \t \u02eb} $value]
            if { [regexp .*${pat}.* "$tname\t[string range $value 0 $max]"] } {
                if { $doputz } {
                    puts "                 [format %-27s ($tname)] = |[string range $value 0 $max]| "
                } else {
                    lappend out [list $tname $value]
                }
            }
        }
    }
    return $out ;# will be empty unless +pat was used - to avoid dumping it all in interactive mode or windows console
}
#proc - main Task procs         -----------------------------------------------------------
#################################################
proc Task {name args} {        ;# create a Task

set script {
    if [catch {
        package require math
        if [catch {while 1 {
            thread::wait
            
        }} thread_err_code thread_err_dict] {
            tsv::set tvar tasknamex,error $thread_err_dict  
            catch {package require Tk}; package require Tk; wm withdraw .  
            set zzz [tk_messageBox -type yesno -detail {Select yes to exit, no to suspend} -title {task error} \
                -message "1: $thread_err_code\n\n$thread_err_dict\nauto_path in thread: [thread::id]\n[join [set auto_path] \n]"]
            if {$zzz eq "yes"} exit else {thread::send maintid {after cancel $::exitafter} ; vwait ::forever1}
        }
    } err_code_Task_Create err_code_Task_Create_details] { 
            tsv::set tvar tasknamex,error $err_code_Task_Create
            catch {package require Tk}; package require Tk; wm withdraw . 
            set zzz [tk_messageBox -type yesno -detail {Select yes to exit, no to suspend} -title {Task create error} \
                -message "2: $err_code_Task_Create \n$err_code_Task_Create_details\nauto_path in thread: [thread::id]\n[join [set auto_path] \n]"]
            
            
            if {$zzz eq "yes"} exit else {thread::send maintid {after cancel $::exitafter} ; vwait ::forever2}
    }
}

    
    regsub -all tasknamex,error $script $name,error script;# substitute in the actual task name
    regsub -all maintid $script [thread::id] script       ;# substitute in the actual thread id of main thread
    set tid [thread::create $script]
    
    tsv::set  tvar $name,tid        $tid
    tsv::set  tvar $name,script     $script
    tsv::set  tids $tid             $name       ;# for reverse lookup
    return $tid
}

            
namespace export tdump

}
# end of tasks namespace eval   

    puts "auto_path in main thread:\n[join [set auto_path] \n]" ;# dump this in main thread
    namespace import tasks::*   
    update
    
    
    tasks::Task Taskname1 
    tasks::Task Taskname2 
    if { 1 } { ;# when 0, it does not fail on my systems
#       tasks::wait 10 ;# also sufficient to no longer fail
        tasks::Task Taskname3
    }
    
    
    set ok 1
    tasks::wait 100
    tasks::tdump -,error|,tid
    
    foreach t [tasks::tdump +,error] {
        lassign $t a b;
        if {$b ne ""} {bell;set ok 0}
    }
    if { $ok } {
        wm withdraw .
        puts "*****ok******* [info nameof]\n[info patch]"
        after 250 {exit 0}
    }
    set ::exitafter [after 10000 {exit 1}]

anonymous added on 2025-05-04 21:37:13:
There's a slight error in the reporting. The code needs to modify the thread script for each thread, just before the thread is created:

	regsub -all taskname1,error $script $name,error script;# substitute in the actual task name
	set tid [thread::create $script]

otherwise it would report that the first thread had the error in all cases. In actuality, any of the 3 threads can get the error.