Check-in [53528a425f]

Overview
Comment: Formatting cleanup for TIP 461
SHA3-256: 53528a425fb74f19f741930ffef54f7369ed529155a6d6508e0555f98d579274
User & Date: dkf 2019-06-05 19:51:00.602
Context
2019-06-07 16:53 New TIP #549: Make configure --enable-64bit the default check-in: 342fda5d85 user: jan.nijtmans tags: trunk
2019-06-05 19:51 Formatting cleanup for TIP 461 check-in: 53528a425f user: dkf tags: trunk
2019-06-03 15:14 More TIP #548 explanation. check-in: 0220e14300 user: jan.nijtmans tags: trunk
Changes
Changes to tip/461.md.

Old version (lines 10-37):

	Keywords:       Tcl,expression
	Tcl-Version:    8.7
-----

# Abstract

This TIP proposes to complete the separation between string and numeric
comparison operations in [expr] and related commands \([for], [if],
[while], etc.\). It introduces new comparison operators **ge**, **gt**,
**le**, and **lt**, \(along with the corresponding commands in the
**::tcl::mathop** namespace\), and encourages programmers to restrict the six operators **==**, **>=**, **>**, **<=**, **<** and **!=** to comparisons of numeric
values.

# Rationale

Tcl throughout its history has had comparison operators that freely compare
numeric and string values. These operators behave as expected if both their
arguments are numeric: they compare values on the real number line. Hence, 15
< 0x10 < 0b10001. Similarly, if presented with non-numeric strings, they
compare the strings in lexicographic order, as a programmer might expect:
"bambam" < "barney" < "betty" < "fred".

Trouble arises, however, when numeric and non-numeric strings are compared.
The rule for comparison is that mixed-type comparisons like this are treated
as string comparisons. The result is that **<** does not induce an order.
There are inconsistent comparison results, rendering **<** and friends
worthless for sorting. 0x10 < 0y < 1 < 0x10.








New version (lines 10-38):

	Keywords:       Tcl,expression
	Tcl-Version:    8.7
-----

# Abstract

This TIP proposes to complete the separation between string and numeric
comparison operations in **expr** and related commands \(**for**, **if**,
**while**, etc.\). It introduces new comparison operators **ge**, **gt**,
**le**, and **lt**, \(along with the corresponding commands in the
**::tcl::mathop** namespace\), and encourages programmers to restrict the six
operators **==**, **>=**, **>**, **<=**, **<** and **!=** to comparisons of
numeric values.

# Rationale

Tcl throughout its history has had comparison operators that freely compare
numeric and string values. These operators behave as expected if both their
arguments are numeric: they compare values on the real number line. Hence, 15
< 0x10 < 0b10001. Similarly, if presented with non-numeric strings, they
compare the strings in lexicographic order, as a programmer might expect:
"`bambam`" < "`barney`" < "`betty`" < "`fred`".

Trouble arises, however, when numeric and non-numeric strings are compared.
The rule for comparison is that mixed-type comparisons like this are treated
as string comparisons. The result is that **<** does not induce an order.
There are inconsistent comparison results, rendering **<** and friends
worthless for sorting. 0x10 < 0y < 1 < 0x10.
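
The inconsistency described in the Rationale can be reproduced with a few lines of Tcl. This is only an illustrative sketch, not part of the TIP text; the variable names `a`, `b`, `c`, `r1`, `r2` and `r3` are arbitrary.

```tcl
# Three values: two numeric, one not.
set a 0x10   ;# numeric, equal to 16
set b 0y     ;# not a valid number, so it is always treated as a string
set c 1      ;# numeric

set r1 [expr {$a < $b}]   ;# mixed operands: string comparison of "0x10" and "0y"
set r2 [expr {$b < $c}]   ;# mixed operands: string comparison of "0y" and "1"
set r3 [expr {$c < $a}]   ;# both numeric: 1 < 16

# All three comparisons are true, so a < b < c < a and no consistent order exists.
puts [list $r1 $r2 $r3]   ;# prints: 1 1 1
```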

Old version (lines 61-84):

and should go in as soon as possible - no later than the next point
release, but ideally even in a patchlevel - so that programmers can
begin conversion as soon as possible. Use of the **==**, **>=**,
**>**, **<=**, **<**, and **!=** for comparing non-numeric
values shall immediately be deprecated.

The six string compare operators shall be declared to function so that
their results are the same as the results of [string compare]:

	    {$a lt $b}  <=> {[string compare $a $b] <  0}
	    {$a le $b}  <=> {[string compare $a $b] <= 0}
	    {$a eq $b}  <=> {[string compare $a $b] == 0}
	    {$a ne $b}  <=> {[string compare $a $b] != 0}
	    {$a gt $b}  <=> {[string compare $a $b] >  0}
	    {$a ge $b}  <=> {[string compare $a $b] >= 0}

It is also intended that any future changes to [string compare]
\(for example, a hypothetical change to make it follow Unicode collation
semantics\) will have the corresponding effect on these six operators.

Unlike what was specified in an earlier version of this TIP, no
changes are to  be made to the semantics of the comparison operators
 **==**, **>=**, **>**, **<=**, **<**, and **!=**.








New version (lines 62-85):

and should go in as soon as possible - no later than the next point
release, but ideally even in a patchlevel - so that programmers can
begin conversion as soon as possible. Use of the **==**, **>=**,
**>**, **<=**, **<**, and **!=** for comparing non-numeric
values shall immediately be deprecated.

The six string compare operators shall be declared to function so that
their results are the same as the results of **string compare**:

	    {$a lt $b}  <=> {[string compare $a $b] <  0}
	    {$a le $b}  <=> {[string compare $a $b] <= 0}
	    {$a eq $b}  <=> {[string compare $a $b] == 0}
	    {$a ne $b}  <=> {[string compare $a $b] != 0}
	    {$a gt $b}  <=> {[string compare $a $b] >  0}
	    {$a ge $b}  <=> {[string compare $a $b] >= 0}

It is also intended that any future changes to **string compare**
\(for example, a hypothetical change to make it follow Unicode collation
semantics\) will have the corresponding effect on these six operators.
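
On an interpreter that predates the new operators, the equivalences above can be approximated with small wrappers around **string compare**. The procedure names `str_lt`, `str_le`, `str_gt` and `str_ge` are hypothetical and exist only for this sketch; they are not part of Tcl or of this TIP.

```tcl
# Hypothetical helpers (illustrative names) that follow the equivalences
# listed above by delegating to [string compare].
proc str_lt {a b} { expr {[string compare $a $b] <  0} }
proc str_le {a b} { expr {[string compare $a $b] <= 0} }
proc str_gt {a b} { expr {[string compare $a $b] >  0} }
proc str_ge {a b} { expr {[string compare $a $b] >= 0} }

# The values from the Rationale now compare consistently, as strings.
set r1 [str_lt 0x10 0y]   ;# "0x10" sorts before "0y"
set r2 [str_lt 0y 1]      ;# "0y" sorts before "1"
set r3 [str_lt 1 0x10]    ;# as strings, "1" does not sort before "0x10"
puts [list $r1 $r2 $r3]   ;# prints: 1 1 0
```

Because every comparison here is a string comparison, the cycle seen with **<** does not occur. On a Tcl version that implements this TIP, `expr {$a lt $b}` should be usable directly in place of `str_lt`.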

Unlike what was specified in an earlier version of this TIP, no
changes are to  be made to the semantics of the comparison operators
 **==**, **>=**, **>**, **<=**, **<**, and **!=**.

Old version (lines 128-222):

attempts to address them.

   1. _Tcl's expression parser has a hard limit of 64 different binary
      operators. This proposal consumes four of them, leaving only 28. There
      is a concern that this is a less-than-effective use of a limited
      resource._

	    > The limit is self-imposed, in an effort to make the nodes of an
      expression parse tree fit in exactly 16 bytes \(or four int's\). It is far
      from obvious that this pretty size is actually useful. Few expressions
      are more than a few dozen parse nodes, and typical expressions are not
      parsed multiple times. It appears that neither the speed of the parse
      nor the size of the tree will be critical issues in most applications.
      In any case, we still have nearly half the operators left.

   2. _There is some concern that using barewords for operators was a bad
      idea in the first place._ The fact that

		 expr {"foo"}

	    > and

		 set x foo; expr {$x}

	     > both work, while

		 expr {foo}

	     > is an invalid bareword is arguably surprising.

	    > Nevertheless, we have committed to the approach with the **eq**,
      **ne**, **in** and **ni** operators. These are unlikely to go
      away. Adding **lt**, **le**, **gt** and **ge** will make this
      problem no better nor worse.

	    > Moreover, the language of [expr] is not the same as Tcl. It does not
      strip comments, parse into words, and apply Tcl's precise substitution
      rules - and it would be surprising if it did!  There are other "little
      languages" throughout Tcl - regular expressions, glob patterns, assembly
      code, and so on. [expr] is one among many.

   3. _There is concern that [expr], which was originally intended almost
      exclusively for numeric calculations, is being abused with string
      arguments and possibly string results._

	    > The author of this TIP contends that we introduced string values to
      [expr] a long time ago, certainly by the time that the **eq**,
      **ne**, **in** and **ni** operations were introduced.  It is true
      that the use of numeric conversions in [expr] is incoherent, as seen
      in:

		   % proc tcl::mathfunc::cat {args} { join $args {} }
		   % expr {cat(0x1,0x2,"a")}
		   0x10x2a
		   % expr {cat(0x1)}
		   1

	    > \(Bug [e7c21ed678] is another manifestation of this general
      problem.\) Once again, adding additional string operations
      that behave, with respect to data types, exactly the same
      as ones that are already there will neither fix nor exacerbate
      the general problem.

   4. _Because [expr] has no interpreted form, the operations must have
      bytecode representations. The space of available bytecodes is under even
      more pressure than the space of available operators, and must not be
      squandered on operations that are duplicative of already-available
      functionality such as [string compare]._

	    > The obvious rebuttal is that [string compare] is already bytecoded.
      There are no new operations required, merely a compiler that is smart
      enough to emit a short codeburst rather than a single bytecode. As an
      example, the code for the expression

		   {$x lt $y}

	    > could
      be:

		   (0) loadScalar1 %v0        # var "x"
		   (2) loadScalar1 %v1        # var "y"
		   (4) strcmp 
		   (5) push1 0        # "0"
		   (7) lt 

	    > For the other string operators, only the last bytecode in the burst
      would change.  No new bytecode operations are needed. In fact, this
      codeburst is identical code to that generated for the expression

		   {[string compare $x $y] < 0}

# Copyright

This document has been placed in the public domain.








New version (lines 129-221):

attempts to address them.

   1. _Tcl's expression parser has a hard limit of 64 different binary
      operators. This proposal consumes four of them, leaving only 28. There
      is a concern that this is a less-than-effective use of a limited
      resource._

      The limit is self-imposed, in an effort to make the nodes of an
      expression parse tree fit in exactly 16 bytes \(or four int's\). It is far
      from obvious that this pretty size is actually useful. Few expressions
      are more than a few dozen parse nodes, and typical expressions are not
      parsed multiple times. It appears that neither the speed of the parse
      nor the size of the tree will be critical issues in most applications.
      In any case, we still have nearly half the operators left.

   2. _There is some concern that using barewords for operators was a bad
      idea in the first place._ The fact that

		 expr {"foo"}

      and

		 set x foo; expr {$x}

      both work, while

		 expr {foo}

      is an invalid bareword is arguably surprising.

      Nevertheless, we have committed to the approach with the **eq**,
      **ne**, **in** and **ni** operators. These are unlikely to go
      away. Adding **lt**, **le**, **gt** and **ge** will make this
      problem no better nor worse.

      Moreover, the language of **expr** is not the same as Tcl. It does not
      strip comments, parse into words, and apply Tcl's precise substitution
      rules - and it would be surprising if it did!  There are other "little
      languages" throughout Tcl - regular expressions, glob patterns, assembly
      code, and so on. **expr** is one among many.

   3. _There is concern that **expr**, which was originally intended almost
      exclusively for numeric calculations, is being abused with string
      arguments and possibly string results._

      The author of this TIP contends that we introduced string values to
      **expr** a long time ago, certainly by the time that the **eq**,
      **ne**, **in** and **ni** operations were introduced.  It is true
      that the use of numeric conversions in **expr** is incoherent, as seen
      in:

		   % proc tcl::mathfunc::cat {args} { join $args {} }
		   % expr {cat(0x1,0x2,"a")}
		   0x10x2a
		   % expr {cat(0x1)}
		   1

      \(Bug [e7c21ed678](/tcl/tktview?name=e7c21ed678) is another
      manifestation of this general problem.\) Once again, adding additional
      string operations that behave, with respect to data types, exactly the
      same as ones that are already there will neither fix nor exacerbate the
      general problem.

   4. _Because **expr** has no interpreted form, the operations must have
      bytecode representations. The space of available bytecodes is under even
      more pressure than the space of available operators, and must not be
      squandered on operations that are duplicative of already-available
      functionality such as **string compare**._

      The obvious rebuttal is that **string compare** is already bytecoded.
      There are no new operations required, merely a compiler that is smart
      enough to emit a short codeburst rather than a single bytecode. As an
      example, the code for the expression

		   {$x lt $y}

      could be:


		   (0) loadScalar1 %v0        # var "x"
		   (2) loadScalar1 %v1        # var "y"
		   (4) strcmp 
		   (5) push1 0        # "0"
		   (7) lt 

      For the other string operators, only the last bytecode in the burst
      would change.  No new bytecode operations are needed. In fact, this
      codeburst is identical code to that generated for the expression

		   {[string compare $x $y] < 0}
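
One way to look at the bytecode actually generated for the **string compare** form is the `::tcl::unsupported::disassemble` command. This is a sketch only: the command is unsupported, and the exact listing it prints depends on the Tcl version, so no particular output is assumed here. The variable name `listing` is arbitrary.

```tcl
# A lambda is used so that x and y are compiled as local variables
# rather than being looked up by name.
set listing [::tcl::unsupported::disassemble lambda {{x y} {
    expr {[string compare $x $y] < 0}
}}]

# Print the bytecode listing for inspection.
puts $listing
```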

# Copyright

This document has been placed in the public domain.