TIP:		148
Title:		Correct [list]-Quoting of the '#' Character
Version:	$Revision: 1.4 $
Author:		Don Porter <[email protected]>
State:		Final
Type:		Project
Vote:		Done
Created:	08-Aug-2003
Post-History:	
Tcl-Version:	8.5

~ Abstract

This TIP proposes the correction of a long-standing bug in the
[[list]]-quoting of the ''#'' character.

~ Background

Tcl has a bug in its [[list]]-quoting function.  The bug is recorded
as Tcl Bug 489537
(http://sf.net/tracker/?func=detail&aid=489537&group_id=10894&atid=110894).

Briefly, one of the documented functions of [[list]] is quoting of
arguments so the result can be passed to [[eval]] with each list
element becoming exactly one word of the command to be evaluated.
Consider the example script:

| # FILE: demo.tcl
| set cmdName [lindex $argv 0]
| proc $cmdName {} {puts Success!}
| set command [list $cmdName]
| puts "Evaluating \[$command]..."
| eval $command

This script expects one argument on the command line, and should write
''Success!'' to stdout.  This script works correctly for most input:

| $ tclsh demo.tcl foo
| Evaluating [foo]...
| Success!
| $ tclsh demo.tcl "with space"
| Evaluating [{with space}]...
| Success!

But it fails for any argument beginning with the ''#'' character:

| $ tclsh demo.tcl '#bar'
| Evaluating [#bar]...

This is because, contrary to the documentation, [[list]] does not
quote the leading ''#'' in a manner to make the list safe for passing
to [[eval]].  The Tcl parser sees the unquoted ''#'' as the start of a
comment.
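
The comment treatment can be reproduced without [[list]] at all.  The
following minimal sketch (the variable names are illustrative and not
part of the demo script) evaluates a one-word script that begins with
''#'', then shows that adding braces, which is the kind of quoting the
documentation leads one to expect from [[list]], turns the same word
into a command name:

| # A script whose first character is '#' is parsed as a comment,
| # so [eval] does nothing and returns the empty string.
| set unquoted "#bar"
| puts "unquoted: <[eval $unquoted]>"
| # Braces make '#bar' an ordinary word; [eval] now looks up a command.
| proc #bar {} {return Success!}
| set quoted "{#bar}"
| puts "quoted: <[eval $quoted]>"

The first [[puts]] should print an empty result between the angle
brackets, and the second should print ''Success!''.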

Starting in Tcl 8.3, optimizations for evaluation of ''pure lists''
(list values that have no string representation) were added.  A pure
list is evaluated directly, without passing through the parser, so the
bug gained a new symptom: results that depend on ''Tcl_ObjType''
shimmering.  If we adapt the example script to remove the [[puts]]
(so that a ''pure list'' is maintained):

| # FILE: demo2.tcl
| set cmdName [lindex $argv 0]
| proc $cmdName {} {puts Success!}
| set command [list $cmdName]
| eval $command

We get a script that actually works with the troublesome input:

| $ tclsh demo2.tcl '#bar'
| Success!

This bug in [[list]]-quoting is present in every released version of
Tcl from 7.4 onward.  It may go back further.

There is no question that Tcl's behavior disagrees with its
documentation on this point.  I believe the documentation to be
correct.  From that viewpoint, this is a bug, and fixing it does not
require a TIP.  Because the bug has been around for so long, though,
it seems prudent to make the TIP proposal, if only as fair warning to
those who might have bug-dependent scripts to fix.  The particular fix
proposed also adds a single ''#define'' to Tcl's public header file.

A large number of tests have been added to the Tcl test suite in the
HEAD, demonstrating this bug in several ways.

~ Proposal

''Tcl_ConvertCountedElement()'' is modified to have the default
behavior of quoting any leading ''#'' character in a list element.
With this default quoting, any string representation of a list
generated by Tcl will not begin with the ''#'' character, so cannot be
mis-parsed as a comment.

''Tcl_ConvertCountedElement()'' is also modified to recognize a new
bit flag value in its ''flags'' argument, ''TCL_DONT_QUOTE_HASH'',
which is defined in Tcl's public header file so that it may be used by
extensions.  When the ''TCL_DONT_QUOTE_HASH'' bit is set in the
''flags'' argument, ''Tcl_ConvertCountedElement()'' will not quote the
leading ''#'' character.  Quoting of the leading ''#'' character is
only necessary for the first element of a list.  Those callers of
''Tcl_ConvertCountedElement()'' that can be sure they are not
generating the first element of a list can pass in the
''TCL_DONT_QUOTE_HASH'' bit to produce the simplest quoting required.
The behavior of the ''TCL_DONT_QUOTE_HASH'' bit flag is added to the
documentation.  The ''Tcl_ConvertElement()'' routine is similarly
modified (trivially, since it is just a wrapper).

All callers of ''Tcl_ConvertCountedElement()'' in the Tcl source code
are modified to use the ''TCL_DONT_QUOTE_HASH'' flag as appropriate,
so that Tcl continues to generate the simplest string representations
of lists that do not suffer from Bug 489537.
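
Assuming an interpreter with this change applied (Tcl 8.5 or later),
the effect should be visible at the script level as sketched below.
The element values and the ''#a'' procedure are arbitrary examples,
not taken from the patch.

| # Only a leading '#' in the first element needs quoting.
| puts [list #a #b]    ;# expected: {#a} #b
| puts [list a #b]     ;# expected: a #b
| # The quoted form survives a round trip through [eval].
| proc #a {arg} {return "called with $arg"}
| puts [eval [list #a #b]]    ;# expected: called with #b

The second line illustrates the minimal-quoting goal: a ''#'' that
cannot end up at the start of the string representation is left
unquoted.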

~ Prototype

A patch implementing this proposal is attached to Tcl Bug 489537 at
SourceForge.

~ Compatibility

After acceptance of this patch, the string representation of some
lists will change, though as little as possible while still fixing the
bug.  Scripts that perform string comparisons on lists may see
different results.  In particular, a test whose body generates a list
but whose expected result is written as a literal string may start to
fail.  The minimal quoting changes should keep this incompatibility
rare, but it may happen.

No such compatibility problems arise in either the Tcl or
Tk test suites.  Any such incompatibility in other test suites can be
easily remedied by using [[list]] to generate the expected result.
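
As an illustration of that remedy, the sketch below uses the
''tcltest'' package; the test names and the list value are made up for
the example.  The first test hard-codes the pre-fix string
representation and would be expected to fail once the leading ''#'' is
quoted, while the second builds its expected result with [[list]] and
so should pass both before and after the change.

| package require tcltest
| namespace import ::tcltest::test
| # Fragile: compares against a hand-written string representation.
| test hash-1.1 {hard-coded expected string} -body {
|     lrange {#a b c} 0 1
| } -result {#a b}
| # Robust: lets [list] produce whatever quoting the interpreter uses.
| test hash-1.2 {expected result built with list} -body {
|     lrange {#a b c} 0 1
| } -result [list #a b]
| tcltest::cleanupTests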

~ Scope

It has been observed by some Tcl users that [[list]] is used for two
conceptually distinct purposes.  The first is adding quoting to list
elements as required, so that element boundaries can be re-discovered
from the string representation.  The second is adding quoting so that
the string representation can be passed to [[eval]] with the original
element boundaries becoming the argument boundaries in the evaluation.
One can imagine a Tcl where these two functions were separated.
However, this TIP does not propose such a separation, and further
arguments on that point are out of scope, and should be considered in
another TIP, if at all.
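
The two roles can be sketched side by side as follows; the values used
are arbitrary.

| # Role 1: quoting so element boundaries can be recovered.
| set data [list "with space" plain]
| puts [llength $data]      ;# expected: 2
| puts [lindex $data 0]     ;# expected: with space
| # Role 2: quoting so each element becomes exactly one word
| # of the command evaluated by [eval].
| set command [list string length "with space"]
| puts [eval $command]      ;# expected: 10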

~ Acknowledgements

The author acknowledges the discovery of this bug by Joe English,
analysis by Donal Fellows, and a first draft patch from Miguel Sofer.

~ Copyright

This document has been placed in the public domain.