Author: Jan Nijtmans <[email protected]>
State: Final
Type: Project
Vote: Done
Created: 20-Aug-2018
Post-History:
Keywords: Tcl
Tcl-Version: 8.7
Tcl-Branch: tip-514
Vote-Results: 3/0/2 accepted
Votes-For: DKF, JN, KBK
Votes-Against: none
Votes-Present: JD, SL
Abstract
This TIP proposes to resolve the platform differences between int/wide/entier math functions and commands like "string is integer"/"string is wide"/"string is entier". At the script level it should not be relevant whether the platform is 32-bit or 64-bit any more.
Most Tcl commands already accept unlimited integers, so hardly any command is left whose arguments need to be checked for correct range.
Rationale
Some examples:
    % string is int -4294967296
    0
    % string is int -4294967295
    1
    % string is int 4294967295
    1
    % string is int 4294967296
    0
So valid integers appear to range from -(2^32-1) to +2^32-1. Most people learn in school that 32-bit integers range from -2^31 to 2^31-1. Are Tcl's integers 33-bit, but then excluding -4294967296?
    % string is wide -18446744073709551616
    0
    % string is wide -18446744073709551615
    1
    % string is wide 18446744073709551615
    1
    % string is wide 18446744073709551616
    0
So valid wide integers appear to range from -(2^64-1) to +2^64-1. Most people learn in school that 64-bit integers range from -2^63 to 2^63-1. Are Tcl's wide integers 65-bit, but then excluding -18446744073709551616?
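One way to read the two transcripts above is that the old checks looked only at the magnitude of the number and ignored its sign. The following C sketch is a model of that reading, not Tcl's actual source: the helper names magnitude_fits(), old_is_int() and old_is_wide() are hypothetical, only plain decimal input is handled, and "int" is assumed to be 32 bits wide.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical model: accept an optionally signed decimal string when
 * its magnitude is at most max_magnitude. The sign plays no part in
 * the range check, which is what makes the accepted range symmetric.
 */
static bool magnitude_fits(const char *text, uint64_t max_magnitude)
{
    const char *p = text;
    char *end;
    unsigned long long magnitude;

    if (*p == '+' || *p == '-') {
        p++;                        /* skip the sign, it is not range-checked */
    }
    if (!isdigit((unsigned char)*p)) {
        return false;               /* no digits at all */
    }
    errno = 0;
    magnitude = strtoull(p, &end, 10);
    if (errno == ERANGE || *end != '\0') {
        return false;               /* magnitude too large, or trailing junk */
    }
    return magnitude <= max_magnitude;
}

/* Model of the old "string is int" result shown above (32-bit case). */
static bool old_is_int(const char *text)
{
    return magnitude_fits(text, UINT32_MAX);
}

/* Model of the old "string is wide" result shown above. */
static bool old_is_wide(const char *text)
{
    return magnitude_fits(text, UINT64_MAX);
}
```

Under this model the accepted ranges come out as -(2^32-1)..2^32-1 and -(2^64-1)..2^64-1, matching the transcripts.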
    % expr int(2147483648) ;# on LP64/ILP64 platforms
    2147483648
    % expr int(2147483648) ;# on other platforms
    -2147483648
    % expr int(9223372036854775808) ;# on LP64/ILP64 platforms
    -9223372036854775808
    % expr int(9223372036854775808) ;# on other platforms
    0
    % expr wide(9223372036854775808) ;# all platforms
    -9223372036854775808
So, int() does either 32-bit or 64-bit truncation, but the highest left-over bit becomes the sign-bit. wide() does 64-bit truncation.
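The truncation described above can be modelled in a few lines of C. This is an illustrative sketch rather than Tcl's implementation: truncate_to_int32() and truncate_to_int64() are hypothetical names, and the input is limited to values that fit in a uint64_t, which is enough to cover the examples shown.

```c
#include <stdint.h>

/*
 * Hypothetical model of the old int() on platforms with a 32-bit long:
 * keep the low 32 bits and treat bit 31 as the sign bit.
 */
static int64_t truncate_to_int32(uint64_t value)
{
    uint64_t low = value & 0xFFFFFFFFu;

    if (low & 0x80000000u) {
        /* Bit 31 is set: the result is low - 2^32, a negative number. */
        return (int64_t)low - INT64_C(4294967296);
    }
    return (int64_t)low;
}

/*
 * Hypothetical model of wide(), and of the old int() on LP64/ILP64
 * platforms: keep the low 64 bits and treat bit 63 as the sign bit.
 */
static int64_t truncate_to_int64(uint64_t value)
{
    if (value > (uint64_t)INT64_MAX) {
        /* Bit 63 is set: compute value - 2^64 without signed overflow. */
        return (int64_t)(value - (uint64_t)INT64_MAX - 1) + INT64_MIN;
    }
    return (int64_t)value;
}
```

With these helpers, 2147483648 maps to -2147483648 under 32-bit truncation and stays 2147483648 under 64-bit truncation, while 9223372036854775808 maps to 0 and -9223372036854775808 respectively, as in the transcript.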
Specification
On all platforms, the int() math function is modified not to do truncation any more. int() will thus become a synonym for entier(). The wide() and entier() functions will become deprecated in Tcl 9.0, but they will not be removed yet.
On all platforms, "string is integer" will become a synonym for "string is entier". Both commands will therefore report true (1) if the value looks like an integer of unlimited range.
On all platforms, "string is wideinteger" will start checking for the proper 64-bit signed range. This command will therefore report true (1) only if the value is in the range -9223372036854775808..9223372036854775807.
The "string is wideinteger" and "string is entier" commands will become deprecated in Tcl 9.0, but they will not be removed yet.
The C function Tcl_GetIntFromObj() is changed to return TCL_OK if the Tcl_Obj contains a value in the range of -2147483648..4294967295. So, it succeeds if the number fits in either a platform "int" or a platform "unsigned int" type.
The C function Tcl_GetWideIntFromObj() is changed to return TCL_OK if the Tcl_Obj contains a value in the range of -9223372036854775808..9223372036854775807. So, it succeeds if the number fits in a Tcl_WideInt.
The C function Tcl_GetLongFromObj() is changed to behave like Tcl_GetIntFromObj() if sizeof(long) == sizeof(int), and to behave like Tcl_GetWideIntFromObj() if sizeof(long) == sizeof(Tcl_WideInt).
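The ranges specified above can be summarised as simple predicates. The sketch below is a stand-alone model and does not call the Tcl API: parse_wide(), fits_get_int() and fits_get_wide() are hypothetical names, input is restricted to decimal strings, and both a 32-bit "int" and a 64-bit "long long" are assumed.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a decimal string into a signed 64-bit
 * value. Returns false on empty input, trailing junk or overflow.
 * Assumes long long is 64 bits wide.
 */
static bool parse_wide(const char *text, int64_t *out)
{
    char *end;
    long long value;

    errno = 0;
    value = strtoll(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE) {
        return false;
    }
    *out = (int64_t)value;
    return true;
}

/* Models the range in which Tcl_GetWideIntFromObj() returns TCL_OK. */
static bool fits_get_wide(const char *text)
{
    int64_t value;

    return parse_wide(text, &value);
}

/*
 * Models the range in which Tcl_GetIntFromObj() returns TCL_OK:
 * anything that fits in a 32-bit int or a 32-bit unsigned int.
 */
static bool fits_get_int(const char *text)
{
    int64_t value;

    return parse_wide(text, &value)
            && value >= INT32_MIN
            && value <= (int64_t)UINT32_MAX;
}
```

In this model, Tcl_GetLongFromObj() corresponds to fits_get_int() when sizeof(long) == sizeof(int) and to fits_get_wide() when sizeof(long) == sizeof(Tcl_WideInt). The fits_get_wide() predicate also matches the range that "string is wideinteger" accepts under this proposal.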
Implications
"string is integer" can no longer be used to check for a specific range. That doesn't matter any more: since the introduction of bignums, the commands whose arguments were being protected this way no longer throw an exception on large values.
int() can no longer be used for 32-bit/64-bit (platform-dependent) truncation.
If you still really want to protect some command argument from overflowing, use Tcl_GetWideIntFromObj() in that command, and use "string is wide" to check for the proper range. Better still, use Tcl_GetWideIntFromObj() and fall back to Tcl_GetBignumFromObj() if the range requires it. That's what Tcl itself does almost everywhere to prevent underflow and overflow errors.
Implementation
Currently, the proposed implementation is available in the tip-514 branch.
Copyright
This document has been placed in the public domain.