# TIP 509: Implement reentrant mutexes on all platforms
Author: Frédéric Bonnet <[email protected]>
State: Final
Type: Project
Vote: Done
Created: 24-May-2018
Post-History:
Keywords: Tcl,threads
Tcl-Version: 8.7
Vote-Results: 8/0/1 accepted
Votes-For: DKF, KBK, JN, JD, DGP, FV, SL, AK
Votes-Against: none
Votes-Present: BG
Tcl-Branch: tip-509
-----
# Abstract
This TIP proposes to improve the `Tcl_Mutex` API by enforcing a consistent
behavior on all core-supported platforms regarding reentrancy.
# Context
This TIP is inspired by a [request from
FlightAware](https://github.com/flightaware/Tcl-bounties#make-tclxs-signal-trap-handlers-safe-to-use-with-threaded-tcl)
to fix deadlock issues with TclX signal handling. A specific issue has been
opened to discuss the proposed implementation [here](https://github.com/flightaware/Tcl-bounties/issues/32).
# Rationale
As of Tcl 8.6, the man page for thread support `Thread.3` states that:
> The result of locking a mutex twice from the same thread is undefined. On some
> platforms it will result in a deadlock.
On Windows platforms, mutexes are implemented using Win32 critical sections,
which are reentrant. `Tcl_Mutex` are just plain `CRITICAL_SECTION *`.
On Unix platforms (this includes MacOS X), Tcl relies on the **pthread** library
for multithreading and synchronization primitives such as mutexes. `Tcl_Mutex`
are just plain `pthread_mutex_t *`. **pthread** mutexes are not reentrant by
default, though the `PTHREAD_MUTEX_RECURSIVE` attribute can be used at creation
time to make them so, but this possibility is not available on older systems.
The Tcl philosophy has always been to erase platform-specific peculiarities in
favor of overall multi-platform consistency, sometimes to the point of
implementing or emulating commonly available features on less capable platforms.
Therefore it feels natural to pursue this goal by making `Tcl_Mutex` reentrant
on all platforms and achieving a consistent behavior on both Windows and Unix.
# Specifications
This TIP proposes to replace the following text from the `Thread.3` man page:
> The result of locking a mutex twice from the same thread is undefined. On some
> platforms it will result in a deadlock.
by the following text:
> Mutexes are reentrant: they can be locked several times from the same thread.
> However there must be exactly one call to **Tcl_MutexUnlock** for each call to
> **Tcl_MutexLock** in order for a thread to release a mutex completely.
# Portability issues
## Windows
Mutexes are naturally reentrant on Windows systems, so no special work is
required.
## Unix
On **pthread**-based Unix systems that support the `PTHREAD_MUTEX_RECURSIVE`
attribute, all `pthread_mutex_t` made available as `Tcl_Mutex` will be created
using this attribute. This includes all but the oldest variants of Unix.
On **pthread**-based Unix systems that do not support the
`PTHREAD_MUTEX_RECURSIVE` attribute, reentrancy will be achieved by combining a
regular, non-reentrant `pthread_mutex_t`, with a thread-specific lock counter
accessible through a `pthread_key_t` data key. This counter keeps track of the
number of calls to `Tcl_MutexLock` minus the number of calls to
`Tcl_Mutex_Unlock`. `Tcl_MutexLock` increments the counter, but only calls
`pthread_mutex_lock` when the initial value is zero. `Tcl_Mutex_Unlock` behaves
symmetrically: it decrements the counter, and only calls `pthread_mutex_unlock`
when it reaches zero. This ensures that a thread never calls
`pthread_mutex_lock` twice on the same mutex, and only calls
`pthread_mutex_unlock` when the thread no longer holds it.
Detection of `PTHREAD_MUTEX_RECURSIVE` availability is done at configure time
thanks to the `AC_CHECK_DECLS` autoconf macro in `tcl.m4`.
# Potential incompatibilities
Although this TIP introduces a major change to `Tcl_Mutex` behavior on Unix, it
is very unlikely that this will break any existing code:
- Windows-only code will see no change in behavior.
- Unix-only code with reentrant mutexes is fatally flawed in the current state
of the Tcl core, since this results in a deadlock. At best, this TIP will fix
such hard-to-reproduce situations (case in point: TclX signals), and the code
will transition from the nonworking state to the working state (which,
according to Dr. John Ousterhout, is the best performance improvement). At
worst, this change will trigger a new class of reentrancy-related bugs on
already broken code.
- Multiplatform code in its current form behaves either inconsistently or
consistently, depending on whether it uses reentrant mutexes or not.
Consistent code will remain consistent, inconsistent code will become
consistent as the Unix version aligns with the Windows version.
# Related Bugs
[Bug #f4f44174](https://core.tcl-lang.org/tcl/tktview/f4f44174) demonstrates the
deadlock issue with a script based on TclX. The root cause is the asynchronous
event handler's `Tcl_Mutex` being locked twice from the same thread when a
signal handler interrupts a thread in the middle of a mutex-protected section,
which on Unix platforms results in a deadlock. The proposed implementation fixes
this issue.
# Implementation
The proposed implementation is available on branch
[tip-509](https://core.tcl-lang.org/timeline?r=tip-509) in the Tcl Fossil
repository.
# Copyright
This document has been placed in the public domain.