TIP 509: Implement reentrant mutexes on all platforms

Author:         Frédéric Bonnet <[email protected]>
State:          Final
Type:           Project
Vote:           Done
Created:        24-May-2018
Post-History:   
Keywords:       Tcl,threads
Tcl-Version:    8.7
Vote-Results:   8/0/1 accepted
Votes-For:      DKF, KBK, JN, JD, DGP, FV, SL, AK
Votes-Against:  none
Votes-Present:  BG
Tcl-Branch:     tip-509

Abstract

This TIP proposes to improve the Tcl_Mutex API by enforcing a consistent behavior on all core-supported platforms regarding reentrancy.

Context

This TIP is inspired by a request from FlightAware to fix deadlock issues with TclX signal handling. A dedicated issue has been opened to discuss the proposed implementation.

Rationale

As of Tcl 8.6, the thread support man page, Thread.3, states:

The result of locking a mutex twice from the same thread is undefined. On some platforms it will result in a deadlock.

On Windows platforms, mutexes are implemented using Win32 critical sections, which are reentrant. A Tcl_Mutex is simply a CRITICAL_SECTION *.

On Unix platforms (including Mac OS X), Tcl relies on the pthread library for multithreading and for synchronization primitives such as mutexes. A Tcl_Mutex is simply a pthread_mutex_t *. pthread mutexes are not reentrant by default. The PTHREAD_MUTEX_RECURSIVE attribute can be set at creation time to make them reentrant, but it is not available on older systems.

The Tcl philosophy has always been to erase platform-specific peculiarities in favor of overall multi-platform consistency, sometimes to the point of implementing or emulating commonly available features on less capable platforms. Therefore it feels natural to pursue this goal by making Tcl_Mutex reentrant on all platforms and achieving a consistent behavior on both Windows and Unix.

Specifications

This TIP proposes to replace the following text from the Thread.3 man page:

The result of locking a mutex twice from the same thread is undefined. On some platforms it will result in a deadlock.

by the following text:

Mutexes are reentrant: they can be locked several times from the same thread. However, there must be exactly one call to Tcl_MutexUnlock for each call to Tcl_MutexLock in order for a thread to release a mutex completely.

Portability issues

Windows

Mutexes are naturally reentrant on Windows systems, so no special work is required.

Unix

On pthread-based Unix systems that support the PTHREAD_MUTEX_RECURSIVE attribute, every pthread_mutex_t made available as a Tcl_Mutex will be created with this attribute. This covers all but the oldest variants of Unix.
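As an illustration of the attribute-based approach, here is a minimal sketch of creating a recursive pthread mutex and locking it twice from one thread. The helper names RecursiveMutexInit and NestedLockDepth are hypothetical and chosen for this example only; this is not the code from the tip-509 branch.

```c
/* Request a POSIX level that exposes PTHREAD_MUTEX_RECURSIVE when the
 * compiler runs in a strict ISO C mode (assumption about the build setup). */
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif

#include <assert.h>
#include <pthread.h>

/* Hypothetical helper: initialize *m as a recursive (reentrant) mutex.
 * Returns 0 on success, or a pthread error code. */
static int RecursiveMutexInit(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0) {
        return rc;
    }
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0) {
        rc = pthread_mutex_init(m, &attr);
    }
    /* The attribute object is no longer needed once the mutex exists. */
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Hypothetical helper: lock *m twice from the calling thread, then unlock
 * it once per successful lock. Returns the nesting depth that was reached. */
static int NestedLockDepth(pthread_mutex_t *m)
{
    int depth = 0;

    if (pthread_mutex_lock(m) == 0) {
        depth++;
    }
    /* With a default (non-recursive) mutex this second call may deadlock. */
    if (pthread_mutex_lock(m) == 0) {
        depth++;
    }
    for (int i = 0; i < depth; i++) {
        int rc = pthread_mutex_unlock(m);
        assert(rc == 0);
        (void) rc;
    }
    return depth;
}
```

The unlock loop mirrors the rule stated in the Specifications section: one unlock for each lock before the mutex is released completely.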

On pthread-based Unix systems that do not support the PTHREAD_MUTEX_RECURSIVE attribute, reentrancy will be achieved by combining a regular, non-reentrant pthread_mutex_t with a thread-specific lock counter accessible through a pthread_key_t data key. This counter tracks the number of calls to Tcl_MutexLock minus the number of calls to Tcl_MutexUnlock. Tcl_MutexLock increments the counter, but only calls pthread_mutex_lock when the initial value is zero. Tcl_MutexUnlock behaves symmetrically: it decrements the counter, and only calls pthread_mutex_unlock when the counter reaches zero. This ensures that a thread never calls pthread_mutex_lock twice on the same mutex, and only calls pthread_mutex_unlock when it releases its last outstanding lock.
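The counter scheme described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the code from the tip-509 branch: the EmuMutex type and its functions are hypothetical names, and the per-thread counter is stored directly in the key's pointer-sized value, which starts out as NULL (zero) in every thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical type: a non-reentrant mutex paired with a per-thread
 * nesting counter kept in thread-specific data. */
typedef struct EmuMutex {
    pthread_mutex_t lock;     /* regular, non-reentrant mutex */
    pthread_key_t depthKey;   /* per-thread count of outstanding locks */
} EmuMutex;

static int EmuMutexInit(EmuMutex *m)
{
    int rc = pthread_mutex_init(&m->lock, NULL);

    if (rc != 0) {
        return rc;
    }
    rc = pthread_key_create(&m->depthKey, NULL);
    if (rc != 0) {
        pthread_mutex_destroy(&m->lock);
    }
    return rc;
}

/* Number of locks the calling thread currently holds on *m. */
static intptr_t EmuMutexDepth(EmuMutex *m)
{
    return (intptr_t) pthread_getspecific(m->depthKey);
}

static void EmuMutexLock(EmuMutex *m)
{
    intptr_t depth = EmuMutexDepth(m);

    /* Only the outermost lock touches the underlying mutex. */
    if (depth == 0) {
        pthread_mutex_lock(&m->lock);
    }
    pthread_setspecific(m->depthKey, (void *) (depth + 1));
}

static void EmuMutexUnlock(EmuMutex *m)
{
    intptr_t depth = EmuMutexDepth(m);

    assert(depth > 0);   /* unlock without a matching lock is a caller bug */
    depth--;
    pthread_setspecific(m->depthKey, (void *) depth);
    /* Only the outermost unlock releases the underlying mutex. */
    if (depth == 0) {
        pthread_mutex_unlock(&m->lock);
    }
}
```

Because the counter is thread-specific, a second thread sees a depth of zero and blocks in pthread_mutex_lock as usual, while the owning thread only adjusts its own counter on nested calls.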

Availability of PTHREAD_MUTEX_RECURSIVE is detected at configure time using the AC_CHECK_DECLS autoconf macro in tcl.m4.

Potential incompatibilities

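Although this TIP introduces a major change to Tcl_Mutex behavior on Unix, it is very unlikely that this will break any existing code. Locking a mutex twice from the same thread was previously documented as undefined and results in a deadlock on Unix, so working code cannot rely on the old behavior. Behavior on Windows is unchanged.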

Related Bugs

Bug #f4f44174 demonstrates the deadlock issue with a script based on TclX. The root cause is the asynchronous event handler's Tcl_Mutex being locked twice from the same thread when a signal handler interrupts a thread in the middle of a mutex-protected section, which on Unix platforms results in a deadlock. The proposed implementation fixes this issue.

Implementation

The proposed implementation is available on branch tip-509 in the Tcl Fossil repository.

Copyright

This document has been placed in the public domain.