Author: Donal Fellows
State: Voting
Type: Project
Created: 09-May-2025
Tcl-Version: 9.1
Tcl-Branch: no-variable-width-instruction-issue
Abstract
This TIP proposes to change the set of bytecodes used in the Tcl bytecode engine. The primary goal is simplification, so that the compiler is easier to maintain.
Rationale
Tcl bytecode is complex to issue, and quirky in quite a few places. Chief among those are:
- Some opcodes come in two variants, especially the ones relating to jumps. That makes rewriting bytecodes in the optimiser much more difficult, and more confusing too. It also requires a significantly more elaborate scheme for creating the jumps, as the fulfilment of a forward jump may result in previously issued code having to be moved. That's horrible.
- Some opcodes have arguments that take single byte lengths of things, a distinct limitation at times. This is particularly a problem for opcodes relating to incr, which only have a single byte for the variable index. Procedures with more than 256 local variables are not the most common case, but are common enough.
- The INST_RETURN_CODE_BRANCH opcode effectively does address arithmetic using the current Tcl result code, and that gives me the cold shivers.
Additionally, the instruction sequences for some commands (especially try) can be very complex. We should simplify.
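To make the single-byte limitation above concrete, here is a minimal sketch of what happens to a local variable index when it has to fit in a one-byte operand. The helper names are hypothetical and are not part of Tcl; the only point is that indices 0 to 255 survive and anything larger does not.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the value a one-byte operand slot would retain
 * if a local variable index were stored in it. */
static uint8_t as_one_byte_operand(uint32_t index)
{
    return (uint8_t)(index & 0xFF);
}

/* Returns nonzero when the index survives a one-byte encoding unchanged,
 * i.e. when it lies in the range 0..255. */
static int fits_one_byte_operand(uint32_t index)
{
    return as_one_byte_operand(index) == index;
}
```

With a four-byte operand there is no such cut-off for any realistic procedure, which is why the replacement opcodes only come in the wide form.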
Specification
Except where noted below, the tcl::unsupported::assemble command is already transparently aware of these changes.
Deprecations of old opcodes
This TIP proposes to deprecate these opcodes:
INST_PUSH1
INST_INVOKE_STK1
INST_LOAD_SCALAR1
INST_LOAD_SCALAR_STK
INST_LOAD_ARRAY1
INST_STORE_SCALAR1
INST_STORE_SCALAR_STK
INST_STORE_ARRAY1
INST_INCR_SCALAR1
INST_INCR_ARRAY1
INST_INCR_SCALAR1_IMM
INST_INCR_ARRAY1_IMM
INST_JUMP1
INST_JUMP_TRUE1
INST_JUMP_FALSE1
INST_APPEND_SCALAR1
INST_APPEND_ARRAY1
INST_LAPPEND_SCALAR1
INST_LAPPEND_ARRAY1
INST_RETURN_CODE_BRANCH
INST_TAILCALL1 (renamed from INST_TAILCALL)
INST_TCLOO_NEXT1 (renamed from INST_TCLOO_NEXT)
INST_TCLOO_NEXT_CLASS1 (renamed from INST_TCLOO_NEXT_CLASS)
Where known to be supported by the compiler, these elements of the TclInstruction enumeration will be marked with the deprecated attribute so that uses of them will result in warnings. If REMOVE_DEPRECATED_OPCODES is defined during compilation, they will be entirely elided including their bytecode engine implementations (resulting in a bytecode engine that cannot have bytecodes for Tcl 9.0 loaded into it, a non-issue without the use of tbcload).
Renaming of opcodes
The following opcodes are renamed (with no other change to them):
INST_PUSH4 to INST_PUSH
INST_INVOKE_STK4 to INST_INVOKE_STK
INST_LOAD_SCALAR4 to INST_LOAD_SCALAR
INST_LOAD_ARRAY4 to INST_LOAD_ARRAY
INST_STORE_SCALAR4 to INST_STORE_SCALAR
INST_STORE_ARRAY4 to INST_STORE_ARRAY
INST_JUMP4 to INST_JUMP
INST_JUMP_TRUE4 to INST_JUMP_TRUE
INST_JUMP_FALSE4 to INST_JUMP_FALSE
INST_BEGIN_CATCH4 to INST_BEGIN_CATCH
INST_APPEND_SCALAR4 to INST_APPEND_SCALAR
INST_APPEND_ARRAY4 to INST_APPEND_ARRAY
INST_LAPPEND_SCALAR4 to INST_LAPPEND_SCALAR
INST_LAPPEND_ARRAY4 to INST_LAPPEND_ARRAY
New replacement opcodes
These replace existing, similarly-named, opcodes with versions with wider operands.
INST_INCR_SCALAR
INST_INCR_ARRAY
INST_INCR_SCALAR_IMM
INST_INCR_ARRAY_IMM
INST_TAILCALL
INST_TCLOO_NEXT
INST_TCLOO_NEXT_CLASS
Completely new opcodes
- INST_SWAP: This swaps the two elements on the top of the stack, and is significantly more efficient than INST_REVERSE 2.
- INST_ERROR_PREFIX_EQ: This is a special comparison for handling trap clauses in try. Due to it requiring the two arguments to be different objects, this is not exposed by any other mechanism.
- INST_TCLOO_ID: This is info object creationid; it's cheaply available information.
- INST_DICT_PUT: This lets code add a key/value to a dictionary value (i.e., the guts of dict replace). It also simplifies try.
- INST_DICT_REMOVE: This lets code remove a key from a dictionary value (i.e., the guts of dict remove). It completes the set of operations, given that INST_DICT_PUT is there.
- INST_IS_EMPTY: This provides access to the new Tcl_IsEmpty() function. Typically introduced by the bytecode optimiser when presented with code like expr {$val eq ""}.
- INST_JUMP_TABLE_NUM: This is similar to INST_JUMP_TABLE except that keys are integers (up to what can be expressed in a Tcl_Size). This simplifies many cases of try, replaces INST_RETURN_CODE_BRANCH in subst, and is expected to power a proper switch -integer mode in a future TIP.
Note: INST_JUMP_TABLE_NUM introduces a new aux data type, where the internal model is a hash table with TCL_ONE_WORD_KEYS that maps Tcl_Size to Tcl_Size.
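The integer-keyed mapping described in the note can be pictured with a small stand-alone sketch. This is not the Tcl hash table implementation: the type and function names, the fixed capacity, and the use of ptrdiff_t in place of Tcl_Size are all assumptions made for illustration. A lookup that finds no entry returns a caller-supplied fall-through offset, which is assumed here to mirror how a jump table miss simply continues with the next instruction.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SLOTS 16          /* hypothetical fixed capacity, power of two */

typedef struct {
    ptrdiff_t key;              /* integer key, e.g. a Tcl result code */
    ptrdiff_t offset;           /* jump offset associated with the key */
    int used;                   /* nonzero once the slot holds an entry */
} NumJumpSlot;

typedef struct {
    NumJumpSlot slots[TABLE_SLOTS];
} NumJumpTable;

/* Mark every slot as empty. */
static void num_jump_init(NumJumpTable *t)
{
    for (size_t i = 0; i < TABLE_SLOTS; i++) {
        t->slots[i].used = 0;
    }
}

/* Map a key to its first probe position (multiplicative hash). */
static size_t slot_for(ptrdiff_t key)
{
    return ((size_t)key * 2654435761u) & (TABLE_SLOTS - 1);
}

/* Insert or update a key; returns 0 only if the table is full. */
static int num_jump_add(NumJumpTable *t, ptrdiff_t key, ptrdiff_t offset)
{
    size_t start = slot_for(key);
    for (size_t probes = 0; probes < TABLE_SLOTS; probes++) {
        NumJumpSlot *s = &t->slots[(start + probes) & (TABLE_SLOTS - 1)];
        if (!s->used || s->key == key) {
            s->key = key;
            s->offset = offset;
            s->used = 1;
            return 1;
        }
    }
    return 0;
}

/* Return the offset for key, or fallthrough when the key is absent. */
static ptrdiff_t num_jump_lookup(const NumJumpTable *t, ptrdiff_t key,
                                 ptrdiff_t fallthrough)
{
    size_t start = slot_for(key);
    for (size_t probes = 0; probes < TABLE_SLOTS; probes++) {
        const NumJumpSlot *s = &t->slots[(start + probes) & (TABLE_SLOTS - 1)];
        if (!s->used) {
            break;              /* linear probing: an empty slot ends the search */
        }
        if (s->key == key) {
            return s->offset;
        }
    }
    return fallthrough;
}
```

Dispatching on a result code then becomes one table lookup instead of arithmetic on the code itself, which is the property that lets this opcode stand in for INST_RETURN_CODE_BRANCH.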
New internal types
There are a number of new internal types in tclCompile.h. The main ones of interest are:
- Tcl_LVTIndex; an alias for Tcl_Size that specifically contains either a reference to a local variable or TCL_INDEX_NONE.
- Tcl_AuxDataRef; an alias for Tcl_Size that specifically contains an index into the auxiliary data table.
- Tcl_ExceptionRange; an alias for Tcl_Size that specifically contains a reference to an exception range.
- Tcl_BytecodeLabel; an alias for Tcl_Size that specifically is treated as if it contains the address of a jump target (replacing JumpFixup records in many cases).
New instruction issuing macros
Note that some of these were previously used in just one file. For their usage and exact definitions, see the code!
// Issue an instruction without an argument.
#define OP(name)
// Issue an instruction with a single-byte argument.
#define OP1(name,val)
// Issue an instruction with a four-byte argument.
#define OP4(name,val)
// Issue an instruction with a single-byte argument and a four-byte argument.
#define OP14(name,val1,val2)
// Issue an instruction with two four-byte arguments.
#define OP44(name,val1,val2)
// Issue an instruction with a four-byte argument and a single-byte argument.
#define OP41(name,val1,val2)
// Issue a potentially break/continue generating instruction without an argument.
#define INVOKE(name)
// Issue a potentially break/continue generating instruction with a single argument.
#define INVOKE4(name,arg1)
// Issue a potentially break/continue generating instruction with two arguments.
#define INVOKE41(name,arg1,arg2)
// Push a string literal.
#define PUSH(string)
// Push a string whose length is computed with strlen().
#define PUSH_STRING(strVar)
// Push a string from a TCL_TOKEN_SIMPLE_WORD token.
#define PUSH_SIMPLE_TOKEN(tokenPtr)
// Take a reference to a Tcl_Obj and arrange for it to be pushed.
#define PUSH_OBJ(objPtr)
// Take a reference to a Tcl_Obj and arrange for it to be pushed.
// Handles extra flags, typically used for command names.
#define PUSH_OBJ_FLAGS(objPtr, flags)
// Push a general token. Needs which index of its command it is.
#define PUSH_TOKEN(tokenPtr, index)
// Push a token that is an expression.
#define PUSH_EXPR_TOKEN(tokenPtr, index)
// Compile the body of a command (e.g., [if], [while])
#define BODY(tokenPtr, index)
// Set the label to the current address. Typically paired with BACKJUMP.
#define BACKLABEL(var)
// Jump (of given type) backwards to the label defined by BACKLABEL.
#define BACKJUMP(name, var)
// Jump (of given type) forwards to the label defined by FWDLABEL.
#define FWDJUMP(name, var)
// Set the label to the current address. MUST be paired with FWDJUMP.
#define FWDLABEL(var)
// Create an unplaced CATCH exception range.
#define MAKE_CATCH_RANGE()
// Create an unplaced LOOP exception range.
#define MAKE_LOOP_RANGE()
// Wrap the given range around a body of code, placing its start and end.
#define CATCH_RANGE(range)
// Define where caught exceptions in the CATCH range branch to.
#define CATCH_TARGET(range)
// Define where caught BREAKs in the LOOP range branch to.
#define BREAK_TARGET(range)
// Define where caught CONTINUEs in the LOOP range branch to.
#define CONTINUE_TARGET(range)
// Finalize the LOOP exception range, setting the destinations for jumps.
#define FINALIZE_LOOP(range)
// Apply a correction to the stack depth.
#define STKDELTA(delta)
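The pairing of FWDJUMP and FWDLABEL is easier to follow with a toy model. The sketch below is not the macros' real definition (see the code for that); the buffer type, opcode numbers, helper names, the big-endian operand layout, and the choice to measure the offset from the start of the jump instruction are all assumptions. What it does show is the benefit of a single operand width: the jump is issued with a four-byte placeholder and later patched in place, so no previously issued code ever has to move.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OP_JUMP = 1, OP_NOP = 2 };   /* hypothetical opcode numbers */

typedef struct {
    uint8_t code[64];               /* toy fixed-size code buffer */
    size_t len;                     /* number of bytes issued so far */
} CodeBuf;

typedef size_t Label;               /* stands in for Tcl_BytecodeLabel */

/* Store a 32-bit operand, most significant byte first. */
static void put_u32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Read back a 32-bit operand written by put_u32(). */
static uint32_t get_u32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
            | ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Issue an instruction that has no operand. */
static void emit_op(CodeBuf *b, uint8_t op)
{
    assert(b->len < sizeof b->code);
    b->code[b->len++] = op;
}

/* Issue a forward jump with a placeholder operand and remember where
 * the jump instruction starts. */
static void fwd_jump(CodeBuf *b, Label *label)
{
    assert(b->len + 5 <= sizeof b->code);
    *label = b->len;
    b->code[b->len] = OP_JUMP;
    put_u32(&b->code[b->len + 1], 0);
    b->len += 5;
}

/* Bind the label to the current address by patching the operand in
 * place; the instruction keeps its size, so nothing is shifted. */
static void fwd_label(CodeBuf *b, Label label)
{
    put_u32(&b->code[label + 1], (uint32_t)(b->len - label));
}
```

With one-byte and four-byte jump variants, the equivalent of fwd_label() might discover that the distance no longer fits and have to grow the instruction, shifting everything issued after it.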
New macros in tclCompile.c
To keep things clearer and less prone to errors, the following macros are used for building the entries in the tclInstructionTable global constant:
#define TCL_INSTRUCTION_ENTRY(name,stack) \
{name,1,stack,0,{OPERAND_NONE,OPERAND_NONE}}
#define TCL_INSTRUCTION_ENTRY1(name,size,stack,type1) \
{name,size,stack,1,{type1,OPERAND_NONE}}
#define TCL_INSTRUCTION_ENTRY2(name,size,stack,type1,type2) \
{name,size,stack,2,{type1,type2}}
These have no effect other than to make building the entries a bit less error-prone. (There are equivalent DEPRECATED_... ones for the deprecated opcodes, but they're currently otherwise defined identically; they're just visual markers when reading the source code.)
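As a rough illustration of how the entry macros read in use, the following sketch builds a tiny table with them. The macro bodies are copied from above; everything else is assumed for the example: the InstructionDesc field names, the cut-down operand type enumeration, and the table rows themselves, whose names, sizes and stack effects are made up and are not the real tclInstructionTable contents.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down operand type enumeration. */
typedef enum { OPERAND_NONE, OPERAND_UINT4, OPERAND_OFFSET4 } OperandType;

/* Assumed layout matching the initialiser order used by the macros. */
typedef struct {
    const char *name;           /* instruction name */
    int numBytes;               /* opcode byte plus operand bytes */
    int stackEffect;            /* net change in stack depth */
    int numOperands;            /* how many operands follow the opcode */
    OperandType opTypes[2];     /* operand types, OPERAND_NONE if unused */
} InstructionDesc;

#define TCL_INSTRUCTION_ENTRY(name,stack) \
    {name,1,stack,0,{OPERAND_NONE,OPERAND_NONE}}
#define TCL_INSTRUCTION_ENTRY1(name,size,stack,type1) \
    {name,size,stack,1,{type1,OPERAND_NONE}}
#define TCL_INSTRUCTION_ENTRY2(name,size,stack,type1,type2) \
    {name,size,stack,2,{type1,type2}}

/* Illustrative rows only. */
static const InstructionDesc demoTable[] = {
    TCL_INSTRUCTION_ENTRY("demoNoArg", 0),
    TCL_INSTRUCTION_ENTRY1("demoPush", 5, +1, OPERAND_UINT4),
    TCL_INSTRUCTION_ENTRY1("demoJump", 5, 0, OPERAND_OFFSET4),
    TCL_INSTRUCTION_ENTRY2("demoTwoArgs", 9, -1, OPERAND_UINT4, OPERAND_UINT4),
};

/* Find a row by name; returns NULL when there is no such row. */
static const InstructionDesc *demo_find(const char *name)
{
    size_t n = sizeof demoTable / sizeof demoTable[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(demoTable[i].name, name) == 0) {
            return &demoTable[i];
        }
    }
    return NULL;
}
```

The operand count and the padding with OPERAND_NONE come from the choice of macro, so a row cannot claim two operands while listing only one type.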
New bytecode engine macros
Mostly the changes here are small, but there is one new general macro:
NEXT_INST_F0(pcAdjustment, nCleanup)
That's a cut-down version of NEXT_INST_F() for the case when there's no result to handle, which is really quite common and means we can omit quite a bit of code that the compiler would otherwise have to work at to remove. If we're lucky, it just makes the bytecode engine faster to compile. If we're unlucky, it shrinks the size of the built code (due to removal of code that should have been unreachable).
Compatibility
There are no changes to the public Tcl C API. All API changes are strictly internal only.
If REMOVE_DEPRECATED_OPCODES is not defined, full compatibility with Tcl 9.0 is maintained, though possibly with warnings.
Code that must handle the old opcodes, such as the bytecode engine, does:
#define ALLOW_DEPRECATED_OPCODES
prior to #include "tclCompile.h" to disable the warnings.
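One way such an opt-out can work is sketched below. Only the ALLOW_DEPRECATED_OPCODES name comes from this TIP; the DEMO_ identifiers are hypothetical, and the use of the GCC/Clang enumerator attribute is an assumption about one possible mechanism, not a description of what tclCompile.h actually does.

```c
#include <assert.h>

/* Defined before the declarations below, as a file that must still
 * handle the old opcodes would do. */
#define ALLOW_DEPRECATED_OPCODES

/* Hypothetical marker: expands to a deprecation attribute only when
 * the opt-out is absent and the compiler is known to support it. */
#if !defined(ALLOW_DEPRECATED_OPCODES) && (defined(__GNUC__) || defined(__clang__))
#   define DEMO_DEPRECATED __attribute__((deprecated))
#else
#   define DEMO_DEPRECATED
#endif

enum DemoInstruction {
    DEMO_INST_DONE,
    DEMO_INST_PUSH1 DEMO_DEPRECATED,    /* old narrow form */
    DEMO_INST_PUSH                      /* wide form that replaces it */
};

/* Code like the bytecode engine still has to recognise both forms. */
static int demo_is_push(enum DemoInstruction op)
{
    return op == DEMO_INST_PUSH1 || op == DEMO_INST_PUSH;
}
```

Removing the #define at the top would, on compilers that honour the attribute, make the reference to DEMO_INST_PUSH1 in demo_is_push() produce a deprecation warning.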
Code that saves and loads bytecodes is not expected to be able to handle these new opcodes without changes; the new auxiliary record type causes that.
To Tcl scripts, there should be no visible changes, other than the lifting of some limits and new opcodes in tcl::unsupported::assemble.
Performance
The purpose of this change was to improve my sanity when reading the bytecode compilation code! However, a simple evaluation of the performance seems to indicate no substantive performance difference, and some increase in size of bytecode (to be expected as many common operations are now always issued with 4-byte operands). This is in line with expectations.
Implementation
See the no-variable-width-instruction-issue branch.
Future Directions
NB: These all lie outside the scope of this TIP.
This TIP lays the groundwork for making more commands be bytecode compiled with expansion present, though more opcodes are likely to be required for much of that project.
There are several proposed routes for removing the deprecated bytecodes:
- Do not remove the existing bytecode implementations for now.
- Branch remove-deprecated-opcodes-level1 removes the implementations but leaves the other bytecodes as they are. This is compatible with existing code so long as the deprecated bytecodes are not used.
- Branch remove-deprecated-opcodes-level2 compacts the remaining bytecodes. This is definitely not compatible with existing bytecodes... but that only matters for code that uses the TDK compiler and tbcload.
Other things examined during the development of this TIP:
- Adding bytecodes for pushing special constants.
- Adopting C23 [[deprecated]] annotations. (C23 has some other interesting goodies too.)
- Adopting the <stdint.h> and <stdbool.h> standard headers.
Copyright
This document has been placed in the public domain.