Tcl Source Code

View Ticket
Ticket UUID: 0ee626dfb24ee37f6ef1e4955b114ff09ea350f3
Title: lseq numeric overflow
Type: Bug Version: 9.0
Submitter: apnadkarni Created on: 2025-06-08 06:01:59
Subsystem: - New Builtin Commands Assigned To: apnadkarni
Priority: 5 Medium Severity: Minor
Status: Open Last Modified: 2025-06-10 16:33:26
Resolution: None Closed By: nobody
    Closed on:
Description:
% lseq 0x7fffffffffffffff count 2
9223372036854775807 -9223372036854775808

The above should raise an error instead.

User Comments: apnadkarni added on 2025-06-10 16:33:26:
I think there are multiple efforts now ongoing in arithSeries, so I'm going to hold off doing anything more until the dust settles. I have no appetite for m-way merges, and it would be duplicated effort in any case.

jan.nijtmans added on 2025-06-10 15:02:23:

Fixed [1a6c24c637cc413e|here]

> Still working in this area...

I'm aware of that, but 9.0.2 won't take long. I prefer to keep 9.0.2 in sync with 9.1 (regarding bug fixes). More test cases or fixes for other corner cases can always be added later.

I'll leave it to you whether you want to make a new ticket for other corner-cases, or re-use this one.


apnadkarni added on 2025-06-10 10:59:11:
I didn't know we tested with -ftrapv, but I knew about the overflow wrapping issue, and that was my reason for starting the apn-lseq-tests branch for boundary conditions. Unfortunately, it made it into the trunk: although I had tagged the bug branch with core-, it never got to run because of the workflow issues, and after a few days I lost patience :-) having tested successfully on Win and Linux (without -ftrapv, of course).

Still working in this area...

dkf added on 2025-06-10 09:06:36:

Now when we build with -ftrapv (one of the configurations used on the trunk) we get an abort on line 762 of tclArithSeries.c during test lseq-bug-0ee626dfb2-0:

	    end = start + (step * (len - 1));

(Technically it's in code called from that line due to how trapping arithmetic is implemented on this platform, but that's the line at fault.)

On the other hand, I can't see anywhere where end (or its trivial derivative, dend, computed on the next line) is used at all after that point in that function. Deleting the whole innermost branch containing that failing location makes the test pass.
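
If the last element is needed somewhere, the expression start + (step * (len - 1)) can be evaluated without signed overflow. The following is only a minimal sketch, not the code in tclArithSeries.c: the name series_last_element and its signature are hypothetical, and it assumes 64-bit signed elements. It rejects only the cases where the final value is unrepresentable, and it avoids overflowing an intermediate product when the final value does fit.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of tclArithSeries.c): computes
 * start + step * (len - 1) without overflowing a signed type.
 * Returns false if len < 1 or if the last element does not fit in
 * int64_t; otherwise stores it in *end and returns true.
 */
static bool
series_last_element(int64_t start, int64_t step, int64_t len, int64_t *end)
{
    uint64_t n, room, mag;
    int64_t half;

    if (len < 1) {
        return false;
    }
    n = (uint64_t)(len - 1);            /* number of steps to take */
    if (n == 0 || step == 0) {
        *end = start;
        return true;
    }
    if (step > 0) {
        /* Distance from start up to INT64_MAX, exact in unsigned math. */
        room = (uint64_t)INT64_MAX - (uint64_t)start;
        mag = (uint64_t)step;
    } else {
        /* Distance from start down to INT64_MIN; mag is |step|. */
        room = (uint64_t)start - (uint64_t)INT64_MIN;
        mag = (uint64_t)0 - (uint64_t)step;
    }
    if (mag > room / n) {
        return false;                   /* |step| * n exceeds the room */
    }
    /*
     * The result fits, but step * n alone might not (for example
     * start = -1, step = 2^62, n = 2). Apply the offset in two halves so
     * that every intermediate value lies between start and the result.
     */
    half = step * (int64_t)(n / 2);
    *end = start + half;
    *end += half;
    if (n % 2 != 0) {
        *end += step;
    }
    return true;
}
```

With this sketch, the case from the ticket description (start 0x7fffffffffffffff, step 1, len 2) returns false instead of wrapping or trapping.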


griffin added on 2025-06-09 17:26:04:
I've come around.

I can agree that a step of 0 is, in essence, a divide by zero condition.  (length = (end - start)/step)

I can also agree that any numeric value that cannot be represented should be an error. 
-0x8000_0000_0000_0000 is an overflow condition.

apnadkarni added on 2025-06-09 11:53:54:

@griffin

I still have discomfort with the return of an empty list under conditions where we should generate errors. For example,

% llength [lseq -0x8000000000000000 to -1 by 1]
0

which is clearly wrong. Since Tcl cannot handle counts that large, it should generate an error, not return a bogus result.

Also, with regards to step size of 0, consider

% lseq 42 to 42 by 0
%

The return value I would expect is {42}, not an empty string. The documentation states that a step size of 0 returns an empty list, so this behaves as documented, but it is still unexpected in my opinion and should be changed. Conversely, if start != end, a step size of 0 is an infinite series of "start" and not an empty sequence. Given that Tcl cannot represent that, again an error should be raised. If we fix the former and not the latter, we have an inconsistency like

[llength [lseq 42 to 42 by 0]] > [llength [lseq 42 to 43 by 0]]

If you disagree with the above and no one else has a problem with the current behavior, I'll leave it as it is.
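
The behavior proposed in this comment can be sketched as a length computation for the "start to end by step" form. This is an illustration under stated assumptions, not the existing implementation: the names SeriesStatus and series_length are hypothetical, the largest permitted element count is assumed to be INT64_MAX, and a step that points away from end is assumed to yield an empty series.

```c
#include <stdint.h>

typedef enum {
    SERIES_OK,
    SERIES_ZERO_STEP,   /* step 0 with start != end: unbounded series */
    SERIES_TOO_LONG     /* element count is not representable */
} SeriesStatus;

/*
 * Hypothetical helper: number of elements in "start to end by step".
 * The subtraction is done in unsigned arithmetic so that end - start
 * cannot overflow even when the operands are at opposite extremes.
 */
static SeriesStatus
series_length(int64_t start, int64_t end, int64_t step, int64_t *len)
{
    uint64_t dist, mag, q;

    if (step == 0) {
        if (start != end) {
            return SERIES_ZERO_STEP;
        }
        *len = 1;                       /* the single element {start} */
        return SERIES_OK;
    }
    if ((step > 0 && end < start) || (step < 0 && end > start)) {
        *len = 0;                       /* assumed: empty series */
        return SERIES_OK;
    }
    if (step > 0) {
        dist = (uint64_t)end - (uint64_t)start;
        mag = (uint64_t)step;
    } else {
        dist = (uint64_t)start - (uint64_t)end;
        mag = (uint64_t)0 - (uint64_t)step;     /* |step| */
    }
    q = dist / mag;                     /* steps that fit in the range */
    if (q >= (uint64_t)INT64_MAX) {
        return SERIES_TOO_LONG;         /* q + 1 would exceed INT64_MAX */
    }
    *len = (int64_t)q + 1;
    return SERIES_OK;
}
```

Under these assumptions, the range -0x8000000000000000 to -1 by 1 reports SERIES_TOO_LONG rather than a length of 0, 42 to 42 by 0 has length 1, and 42 to 43 by 0 reports SERIES_ZERO_STEP.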


apnadkarni added on 2025-06-08 13:57:34:

The empty error message seems to have been fixed already.

I would like to add the following additional checks:

  • a negative count argument should generate an error, not return an empty list.

  • a step size of 0 should generate an error if the passed length is greater than 1 and not return an empty list.

Comments?


apnadkarni added on 2025-06-08 11:25:05:

Along the same lines (I think same root cause),

% lseq 0x7fffffffffffffff count 3 by -0x7fffffffffffffff
9223372036854775807 0 -9223372036854775807
% lseq 0x7fffffffffffffff count 3 by -0x8000000000000000
9223372036854775807 -1 9223372036854775807

The first line above is correct, the second is not.
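
The wrong third element is consistent with 64-bit wraparound: the exact value 9223372036854775807 + 2 * (-9223372036854775808) is -9223372036854775809, one below the smallest 64-bit signed integer, and reducing it modulo 2^64 gives 9223372036854775807. The sketch below models that wraparound with unsigned arithmetic, which is well defined in C (signed overflow is not, which is why a -ftrapv build can abort). The function name wrapped_element is hypothetical and is not taken from the Tcl sources.

```c
#include <stdint.h>

/*
 * Illustrative only: the bit pattern of start + step * index when the
 * arithmetic is reduced modulo 2^64, as two's-complement wraparound does.
 */
static uint64_t
wrapped_element(int64_t start, int64_t step, uint64_t index)
{
    return (uint64_t)start + (uint64_t)step * index;
}
```

For start 0x7fffffffffffffff and step -0x8000000000000000, index 2 yields the same bit pattern as 0x7fffffffffffffff, matching the incorrect output above. With step -0x7fffffffffffffff, index 2 yields the bit pattern of -9223372036854775807, which is the correct element.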

Proposed fix on branch [bug-0ee626dfb2].

Also, not related but recording as a reminder to be fixed.

% catch {lseq -0x7f .. 0x7f count 2} message
1
% set message
% set errorCode
NONE

Above correctly generates a (syntax) error but with an empty error message.