TIP 385: Functional Traces On Variables

Author:         Alexandre Ferrieux <[email protected]>
State:          Draft
Type:           Project
Vote:           Pending
Created:        13-Feb-2011
Keywords:       Tcl, traces
Tcl-Version:    9.1


Functional traces are new variants of variable traces that rely purely on value passing rather than variable updating. Applied to array traces, they allow a much more efficient implementation of various storage backends.


The current variable trace API does not focus on the read or written value; instead, only the variable (and array key) name is made available to the callback. Actual manipulation of the value thus involves updating the variable within the trace callback: the new, "synthetic" value must be written for real.

While this approach is acceptable for a scalar variable, in the case of arrays it forces the "memoize" pattern: whenever reads are done on an array with a read trace, the read value is actually stored in the array, with no obvious spot to unset it afterwards. Hence, although we are backing the array with a function which is called on every read, the memory usage is that of a memoized function; this does not scale up at all.
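
The memoizing effect can be observed with the existing trace API. The following is a minimal sketch; the names sq and computeSquare are illustrative only and are not part of this proposal.

     # Classic read trace: the callback has no way to return a value,
     # so it must store the computed value in the traced array itself.
     proc computeSquare {n1 n2 op} {
         upvar 1 $n1 arr
         set arr($n2) [expr {$n2 * $n2}]
     }
     array set sq {}
     trace add variable sq read computeSquare
     set total 0
     for {set i 1} {$i <= 1000} {incr i} {
         incr total $sq($i)
     }
     # Every key that was read is now a stored element of the array.
     puts "total=$total stored=[array size sq]"

After the loop the array holds one element per key that was read, even though each value could have been recomputed on demand.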

A typical situation where this matters is the use of traces to wrap the convenient array syntax around another kind of storage (like a pre-existing dictionary or database): what is intended is that $a($x) becomes syntactic sugar for [dict get $d $x], but without duplicating the storage in the array. This is currently impossible.

The proposed extension to the trace API allows for "functional" traces, in the usual computing sense of "no side effect": here the value of interest is passed as an argument to a write trace callback, and taken from the return value of a read trace callback. This way, no access to the underlying variable is needed. The array syntax can become a frontend to just about anything, without any additional memory consumption in the array, and with adequate performance since values are passed as Tcl_Obj values. Thus functional traces could also be named "virtual variables".

As an extra bonus, "read-only variables" can be implemented efficiently, which is notably impossible with the current API (since the write trace is called after the fact, it must rely on another mechanism to retrieve the pre-write value).
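
For comparison, here is a sketch of how a read-only scalar has to be emulated with the current API. The names rejectWrite and limit are illustrative; the saved copy carried in the callback prefix is the "other mechanism" mentioned above.

     # Current API: the write trace runs after the assignment, so the
     # original value has to be saved elsewhere (here, in the callback prefix).
     proc rejectWrite {saved n1 n2 op} {
         upvar 1 $n1 v
         set v $saved                  ;# undo the write that already happened
         error "variable is read-only"
     }
     set limit 10
     trace add variable limit write [list rejectWrite $limit]
     catch {set limit 99} msg
     puts "limit=$limit msg=$msg"

The attempted assignment fails and the variable keeps its original value, but only because a second copy of that value is kept outside the variable.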


The new subcommands

    trace add variable name readf cmdprefix
    trace add variable name writef cmdprefix
    trace add variable name unsetf cmdprefix
    trace add variable name existsf cmdprefix

are the functional variants of the existing read, write, and unset variable traces. The existsf trace has no counterpart in the current API; it is made possible by the functional style, since its callback can return the answer directly. The cmdprefix callback API is close to that of current traces, with two variations:

  1. For a readf or existsf trace, the callback's returned value is used as the result of the read/exists operation.

  2. For a writef trace, the value to be written is passed as an extra argument to the callback:

    cmdprefix name1 name2 writef value

In both cases, the value is passed efficiently as a Tcl_Obj.


Here is the full code exposing a dict as an array:

     # Reads and writes on the array a are redirected to the dict ::d
     trace add variable a readf [list trdictget ::d]
     trace add variable a writef [list trdictset ::d]
     proc trdictget {vv n1 n2 op} {
         upvar $vv v
         return [dict get $v $n2]
     }
     proc trdictset {vv n1 n2 op x} {
         upvar $vv v
         dict set v $n2 $x
     }
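
The remaining two trace types could be handled for the same dict backend along the following lines. This is only a sketch under stated assumptions: the names trdictexists and trdictunset are hypothetical, the argument list of the unsetf callback is assumed to match that of readf, and since functional traces are not available in released Tcl the callbacks are invoked by hand here rather than through trace add.

     proc trdictexists {vv n1 n2 op} {
         upvar $vv v
         return [dict exists $v $n2]
     }
     proc trdictunset {vv n1 n2 op} {
         upvar $vv v
         dict unset v $n2
         return
     }
     set ::d [dict create colour blue]
     # Proposed registration (assumed syntax, not runnable today):
     #   trace add variable a existsf [list trdictexists ::d]
     #   trace add variable a unsetf  [list trdictunset ::d]
     # Direct calls stand in for what the trace machinery would do.
     set before [trdictexists ::d a colour existsf]
     trdictunset ::d a colour unsetf
     set after [trdictexists ::d a colour existsf]
     puts "before=$before after=$after"

With such callbacks in place, an existence test on a(colour) would presumably consult the dict rather than the array's own storage.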

Another example with Pascal-like 1-based numeric indices into a list, with bound checking, disguised as array keys:

     # Array keys 1..N are mapped onto list indices 0..N-1 of ::l
     trace add variable a readf [list trplistget ::l]
     trace add variable a writef [list trplistset ::l]
     proc trplistget {vv n1 n2 op} {
         upvar $vv v
         if {($n2<1) || ($n2>[llength $v])} {
             error "Index $n2 out of range for list $vv"
         }
         return [lindex $v [expr {$n2-1}]]
     }
     proc trplistset {vv n1 n2 op x} {
         upvar $vv v
         if {($n2<1) || ($n2>[llength $v])} {
             error "Index $n2 out of range for list $vv"
         }
         lset v [expr {$n2-1}] $x
     }

Obviously, the same principle can be applied to an SQLite or TDBC database; the net result is an instant-loadable "array".

Possible Extensions

A natural generalization would be to hook into array-level operations like size, names, get, unset, startSearch, etc. But then, maybe an ensemble/oo API would be preferred (hook one single command prefix for all operations concerning array a). To be discussed.
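
To make the "one single command prefix" idea concrete, here is a purely illustrative sketch in today's Tcl. Nothing in it is specified by this TIP: the handler name dictHandler, the operation names, and the calling convention (operation first, then its arguments) are all assumptions chosen for the example.

     # Hypothetical single handler: the operation name is passed first,
     # followed by operation-specific arguments.
     proc dictHandler {dictVar op args} {
         upvar #0 $dictVar d
         switch -- $op {
             get     { return [dict get $d [lindex $args 0]] }
             set     { lassign $args key value; dict set d $key $value; return $value }
             exists  { return [dict exists $d [lindex $args 0]] }
             unset   { dict unset d [lindex $args 0]; return }
             names   { return [dict keys $d] }
             size    { return [dict size $d] }
             default { error "unsupported operation \"$op\"" }
         }
     }
     set ::d [dict create]
     set handler [list dictHandler ::d]
     {*}$handler set colour blue
     {*}$handler set shape round
     puts [{*}$handler size]
     puts [{*}$handler names]

A hook of this shape would let a single registration cover element-level and array-level operations alike; whether that is preferable to separate trace types is part of the open discussion.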

Reference Implementation

Coming soon.


This document has been placed in the public domain.