Author: Mathieu Lafon <[email protected]>
Author: Andreas Leitgeb <[email protected]>
State: Draft
Type: Project
Vote: Pending
Created: 21-Nov-2016
Post-History:
Keywords: Tcl,procedure,argument handling
Tcl-Version: 9.1
Tcl-Branch: tip-457
Abstract
This TIP proposes an enhancement of the Tcl language to support named arguments and additional features when calling a procedure.
Rationale
The naming of arguments to procedures is a computer language feature which allow developers to specify the name of an argument when calling a function. This is especially useful when dealing with arguments with default values, as this does not require to specify all previous arguments when only one argument is required to be specified.
As such, this is a commonly requested feature by Tcl developers, who have created various code snippets http://wiki.tcl.tk/10702 to simulate it. These snippets have drawbacks: not intuitive for new users, require to add extra code at the start of each procedure, no standard on the format to use, few errors handling, etc.
After discussing various possibilities with the community, it has been decided to extend the argument specification of the proc command and allow users to define options on arguments. This can be used to support named arguments but also add additional enhancements: flag arguments, pass-by-name (upvar) arguments, non-required arguments, ...
The others possibilities discussed are detailed in the Discussion section at the end of the document.
Specification
The proc documentation currently define argument specifiers as a list of one or two fields where the first field is the name of the argument and the optional second field is its default value.
The proposed modification is to support an alternate specifier format where the first field is also the name of the argument, followed by a paired list of options and their values. This format does not prevent the original format to be used as they can be easily distinguished: the new format uses an odd size list with a minimal size of three fields.
Available argument specifiers
The following argument specifiers are defined in this TIP:
-default VALUE defines the default value for the argument. It is ignored if -required 1 is also used.
% proc p { { a -default {} } } { list a $a } % p a {} % p foo a foo
-name NAME defines the argument to be a named argument. NAME defines the name of the argument when it is defined as a single string. If NAME is a list of strings, it is the list of names that can be used to refer to the argument (i.e. aliases). On the call-site, the name of the argument is prefixed by a single dash and followed by the value.
% proc p1 { { v -name val } } { list v $v } % p1 -val 1 v 1 % proc p2 { { v -name {v val value} } } { list v $v } % p2 -value 2 v 2 % p2 -v 2 v 2
-switch SWITCHES defines that the argument is defined on the call-site as a flag-only/switch parameter. SWITCHES is a list of possible switches. Each switch is defined either as a single string (switch name) or as a list of two entries (switch name and related value). On the call-site, the name of the switch is prefixed by a single dash and is not followed by any value. The value assigned to the argument is either the switch name or the related value depending on how it was defined.
% proc p { { dbg -default 0 -switch debug } } { list dbg $dbg } % p dbg 0 % p -debug dbg debug % proc p { { level -switch {{quiet 0} {verbose 9}} } { list level $level } % p -quiet level 0 % p -verbose level 9
-required BOOLEAN defines that the value is required to be set. If set to true, the argument is required and any default value is ignored. It is the default handling for non-named argument without a default value. If set to false, the argument is not required to be set and the related argument will be left unset if there is no default value. It is the default handling for named argument.
% proc p { { v -required 0 } } { if {[info exist v]} {list v $v} {return "v is unset"} } % p 5 v 5 % p v is unset
-upvar LEVEL defines, that the local argument will become an alias to the variable in the frame at level LEVEL corresponding to the parameter value. This is similar to what is achieved when using the upvar command. This specifier is incompatible with the -switch specifier.
% proc p { { v -upvar 1 } } { incr v } % set a 2 2 % p a 3 % set a 3
Further argument specifiers may be added in future TIP. Examples of new argument specifiers which may be added in the future:
type assertion (-assume TYPE)
argument documentation (-docstring DOC)
...
Named arguments
The following rules define how named arguments are expected to be specified on the call-site:
Named arguments must always be specified using their name, they can't be specified as positional arguments.
% proc p { {a -name A} } { list a $a } % p aa wrong # args: should be "p |-A a|" % p -A aa a aa
When several names (using -name or -switch options) are specified for the same argument, only one is required to be used on the call-site, unless a default value is also specified. If more than one is used, the latest value/switch is kept.
% proc p { { v -name {v val} } } { list v $v } % p -v 6 -val 8 v 8
Both -name and -switch specifiers can be used on the same argument.
% proc p { { level -name level -switch {{quiet 0} {verbose 9}} } { list level $level } % p -level 4 level 4 % p -verbose level 9
A group of contiguous named arguments are handled together and are not required to be specified in the same order as defined.
% proc p { {a -name A} {b -name B} } { list a $a b $b } % p -B bb -A aa % a aa b bb
The handling of a group of contiguous named arguments (which can be only one argument) is ended on the first argument which is either a parameter not starting with a dash or the special -- end-of-options marker. Remaining arguments will then be assigned to following positional arguments.
% proc p { {o -name opt} args } { list o $o args $args } % p -opt O 5 o O args 5 % p -opt O -1 0 wrong # args: should be "p |-opt o| ?arg ...?" % p -opt O -- -1 0 o O args {-1 0}
If there is a fixed number of non-optional positional arguments and no special args variable after the named group, the handling of a named group will also be ended when the remaining arguments to assign will be equal to the number of positional arguments after the group.
% proc p { {o -name opt} posarg } { list o $o posarg $posarg } % p -opt O -1 o O posarg -1
Generated usage description
The error message, automatically generated when the input arguments are invalid, is updated regarding new options:
Pass-by-name arguments (specified using -upvar level option) are surrounded by the '&' character.
% proc p { { v -upvar 1 } } { } % p wrong # args: should be "p &v&"
Named arguments are showed how they should be called and surounded by the '|' character. If several names have been specified, they are grouped together.
% proc p { { l -name level -switch {high low} -required 1} } {} % p wrong # args: should be "p |-level l|-high|-low|"
When an argument is optional, '?' is used.
% proc p { { v -name var } a } {} % p wrong # args: should be "p ?|-var v|? a"
Introspection
The info argspec proc command is added to get the argument specification of all arguments or of a specific argument.
% proc p { a { b 1 } { c -name c } } {}
% info argspec proc p
a { b -default 1 } { c -name c }
% info argspec proc p c
-name c
Similar info argspec subcommands are also added for lambda, object method and object constructor.
The info argspec specifiers command is added to get the specifiers supported by the current interpreter.
% info argspec specifiers
-default -name -required -switch -upvar
Other use cases
Extended argument specifiers can also be used with other proc-like functions. The following functions are supported and can use extended argument specifiers:
anonymous functions (lambda), used with apply command ;
TclOO constructor or methods.
Performance
The proposed modification has no significant performance impact:
existing code (and code not using extended argspec) is not impacted by the change as the current initialisation code is still available and used ;
code using extended argspec may be impacted because the initialisation code is different and is required to loop on each argument, but initial testing does not show a significant slowdown.
When using named arguments specifiers to replace a similar handling done in Tcl-pure code, there is however a significant increase in performance.
See https://gist.github.com/mlafon/70480877a28f3571e0377eabc0cee7be for details on performance testing done on the proposed implementation.
Implementation
This document proposes the following changes to the Tcl core:
Add ExtendedArgSpec structure which is linked from CompiledLocal and contains information about extended argument specification;
Add a flags field in the Proc structure to later identify a proc with at least one argument defined with an extended argument specification (PROC_HAS_EXT_ARG_SPEC);
Update proc creation to handle the extended argument specification and fill the ExtendedArgSpec structure;
Update InitArgsAndLocals to initialize the compiled locals using a dedicated function if the PROC_HAS_EXT_ARG_SPEC flag has been set on the proc. If unset, the original initialization code is still used.
Update ProcWrongNumArgs to generate an appropriate error message when an argument has been defined using an extended argument specification;
Add info argspec command;
Update documentation in doc/proc.n and doc/info.n;
Update impacted tests and add dedicated tests in tests/proc-enh.test.
Reference Implementation
The reference implementation is available in the tip-457 https://core.tcl-lang.org/tcl/timeline?r=tip-457 branch.
The code is licensed under the BSD license.
Discussion
This section details some of the alternate solutions for this feature or specific comments about it.
Initial approaches that tried to work with unmodified procedures are not detailed here for clarity.
Dedicated builtin command
A dedicated command can be used to handle the named arguments, using an -option value syntax, before calling the target procedures with all arguments correctly prepared.
% call -opts myproc -optC foo -optB {5 5} -- "some pos arg"
An implementation of this proposal is available at https://github.com/mlafon/tcl/tree/457-CALL-CMD . This proposal was abandoned as it was not enough intuitive for users.
Modification in how proc are defined
Tcl-pure procedures can be defined in a way which state that the procedure will automatically handle -option value arguments.
% proc -np myproc { varA { optB defB } { optC defC } { optD defD } args } { .. }
% myproc -optC foo -optB {5 5} -- "some pos arg"
An other possibility is to support options on arguments and allow name specification:
% proc myproc { varA { optB -default defB -name B } args } { .. }
% myproc a -B b zz
This is the currently proposed solution in this TIP. It requires the procedures to be modified but allow additional features.
Some people have expressed concern about the modification of the proc command, which is a core command of Tcl. A particular attention has been paid to ensure that existing code will not be impacted and that future usage could be later added by adding new specifiers.
Argument Parsing command
Cyan Ogilvie's paper from Tcl2016 https://www.tcl.tk/community/tcl2016/assets/talk33/parse_args-paper.pdf describes a C extension to provide core-like argument parsing at speed comparable to proc argument handling, in a terse and self-documenting way.
Alexandre Ferrieux has proposed http://code.activestate.com/lists/tcl-core/18447/ to use the same argument specifiers than this proposal, but with a dedicated command which can be called from the proc body. This has the advantage to not alter the proc command and could be located in an extension.
Although the proc usage will not be modified, this new command will probably have to access or modify internal proc structures, for example to support introspection.
Having to declare final local variables in the body, also seems confusing for users.
Preventing Data-dependent bugs
It has been proposed by Christian Gollwitzer http://code.activestate.com/lists/tcl-core/18457/ to make the special '--' end-of-options marker mandatory when the number of positional arguments after the named group is not fixed. This would suppress any potential Data-Dependent bugs related to the search of the initial dash and remove any unwanted object stringification, at the expense of forcing the user to explicitely use the end-of-option marker.
This proposal is currently not implemented but the documentation has been modified to list the cases for which '--' should be use.
Copyright
This document has been placed in the public domain.