Tcl Source Code

View Ticket
2024-11-22
19:08 Ticket [f2b5f89c0d] clock scan of leapsecond: wrong result in 8.6 and 9.0 status still Open with 3 other changes artifact: b251962eae user: jan.nijtmans
15:18 Ticket [f2b5f89c0d]: 3 changes artifact: e16b55ce42 user: sebres
13:48 Ticket [f2b5f89c0d]: 3 changes artifact: 7bc3647888 user: jan.nijtmans
12:37 Ticket [f2b5f89c0d]: 3 changes artifact: 3746fd5d0e user: sebres
10:27 Ticket [f2b5f89c0d]: 3 changes artifact: a384dc4f16 user: jan.nijtmans
08:14 Ticket [f2b5f89c0d]: 3 changes artifact: 8c3afb2a96 user: pooryorick
2024-11-21
22:09 Ticket [f2b5f89c0d]: 3 changes artifact: 8ecefd73a4 user: sebres
21:14
fix leap seccond scan [f2b5f89c0d8520ea] (and [aee9f2b916afd976]): recognize leap second (estimate a... Leaf check-in: d4b4648db7 user: sebres tags: bug-f2b5f89c0d-alt-leapsec
21:09
test coverage illustrating the proper leep second behaviour for [f2b5f89c0d8520ea] (and [aee9f2b916a... check-in: 377156e138 user: sebres tags: bug-f2b5f89c0d-alt-leapsec
20:28 Ticket [f2b5f89c0d] clock scan of leapsecond: wrong result in 8.6 and 9.0 status still Open with 3 other changes artifact: 659e3165f2 user: pooryorick
19:01 Ticket [f2b5f89c0d]: 3 changes artifact: d424ee580a user: sebres
18:03 Ticket [f2b5f89c0d]: 3 changes artifact: 9f948e8c31 user: jan.nijtmans
17:19 Ticket [f2b5f89c0d]: 3 changes artifact: b4328a7412 user: sebres
14:16 Ticket [f2b5f89c0d]: 3 changes artifact: 84a2e09626 user: jan.nijtmans
14:05 Ticket [f2b5f89c0d]: 3 changes artifact: 88996a4c65 user: jan.nijtmans
13:29 Ticket [f2b5f89c0d]: 3 changes artifact: 90ac968696 user: sebres
13:29 Ticket [f2b5f89c0d]: 3 changes artifact: b10b508f27 user: sebres
13:27 Open ticket [f2b5f89c0d]. artifact: c76709701b user: sebres
12:31 Closed ticket [f2b5f89c0d]. artifact: d93f523e71 user: jan.nijtmans
11:27
Fix [aee9f2b916] and [f2b5f89c0d]: clock scan of leapsecond/24:00, ISO-8601-compatibility. check-in: 5b5f4e8334 user: jan.nijtmans tags: core-8-branch
11:14
Fix [f2b5f89c0d]: clock scan of leapsecond: wrong result in 8.6 and 9.0. Update changes.md check-in: 853f0295ff user: jan.nijtmans tags: trunk, main
2024-11-20
21:34 Ticket [f2b5f89c0d] clock scan of leapsecond: wrong result in 8.6 and 9.0 status still Open with 3 other changes artifact: 701774f27a user: jan.nijtmans
20:24 Ticket [f2b5f89c0d]: 3 changes artifact: 53ed7977bc user: sebres
2024-11-19
23:53
starts branch with alternate solution for [f2b5f89c0d8520ea]; tools/tclZIC.tcl: produces leap second... check-in: 877284201f user: sebres tags: bug-f2b5f89c0d-alt-leapsec
14:32 Ticket [f2b5f89c0d] clock scan of leapsecond: wrong result in 8.6 and 9.0 status still Open with 3 other changes artifact: beddfeb2b9 user: sebres
13:19 Ticket [f2b5f89c0d]: 3 changes artifact: 382abed95c user: jan.nijtmans
2024-11-15
18:03 Ticket [f2b5f89c0d]: 3 changes artifact: e174d8f70d user: sebres
17:44
fixed wrong time conversion by free-scan (-1 from ToSecond() caused day decrement) - now the result ... check-in: 595fad24d7 user: sebres tags: core-8-6-branch
17:02
fixed wrong time conversion by free-scan (-1 from ToSecond() caused day decrement) - now the result ... check-in: 83c5c578cb user: sebres tags: core-8-6-branch
11:38
Add testcases related to [f2b5f89c0d]: clock scan of leapsecond: wrong result in 8.6 and 9.0 check-in: 320e0b70b8 user: jan.nijtmans tags: trunk, main
11:23 Ticket [f2b5f89c0d] clock scan of leapsecond: wrong result in 8.6 and 9.0 status still Open with 3 other changes artifact: dd8ece7a0a user: sebres
11:18
Add testcases related to [f2b5f89c0d]: clock scan of leapsecond: wrong result in 8.6 and 9.0 check-in: e13bdc7d81 user: jan.nijtmans tags: core-8-branch
11:04
Add testcases related to [f2b5f89c0d]: clock scan of leapsecond: wrong result in 8.6 and 9.0 check-in: a199e8f76e user: jan.nijtmans tags: core-8-6-branch
07:23 Ticket [f2b5f89c0d] clock scan of leapsecond: wrong result in 8.6 and 9.0 status still Open with 3 other changes artifact: 26beeb63b6 user: jan.nijtmans
01:16 Ticket [f2b5f89c0d]: 3 changes artifact: 4d954a2262 user: sebres
2024-11-14
23:24 Ticket [f2b5f89c0d]: 4 changes artifact: 130a7c03f9 user: jan.nijtmans
22:53 Ticket [f2b5f89c0d]: 3 changes artifact: 8a3bc4f354 user: sebres
20:59 Ticket [f2b5f89c0d]: 3 changes artifact: fedad4724f user: jan.nijtmans
19:22 Ticket [f2b5f89c0d]: 3 changes artifact: 5945124dac user: sebres
17:01 Ticket [f2b5f89c0d]: 3 changes artifact: b8fce418d0 user: jan.nijtmans
16:49
Possible fix for [f2b5f89c0d]: clock scan of leapsecond: wrong result in 9.0 check-in: 4d72246575 user: jan.nijtmans tags: bug-f2b5f89c0d
16:47 New ticket [f2b5f89c0d] clock scan of leapsecond: wrong result in 8.6 and 9.0. artifact: f095a03ea5 user: jan.nijtmans

Ticket UUID: f2b5f89c0d8520ea86b929daa19ec13fc3a24d2d
Title: clock scan of leapsecond: wrong result in 8.6 and 9.0
Type: Bug
Version: 8.6, 9.0
Submitter: jan.nijtmans
Created on: 2024-11-14 16:47:34
Subsystem: 16. Commands A-H
Assigned To: jan.nijtmans
Priority: 5 Medium
Severity: Minor
Status: Open
Last Modified: 2024-11-22 19:08:42
Resolution: Later
Closed By: nobody
Closed on: 2024-11-21 12:31:01
Description:

First, have a look at Tcl 8.7:

$ tclsh8.7
% clock scan "2012-06-30 23:59:59" -gmt 1
1341100799
% clock scan "2012-06-30 23:59:60" -gmt 1
1341100800
This is correct. June 30, 2012 ended with a leap second, so its final minute did indeed have a second numbered 60.

Now Tcl 9.0:

$ tclsh9.0
% clock scan "2012-06-30 23:59:59" -gmt 1
1341100799
% clock scan "2012-06-30 23:59:60" -gmt 1
unable to convert input string: invalid time
Understandable, but incorrect: this is a valid timestamp. See: Leap_second

Finally, Tcl 8.6:

$ tclsh8.6
% clock scan "2012-06-30 23:59:59" -gmt 1
1341100799
% clock scan "2012-06-30 23:59:60" -gmt 1
1341014399
% clock format 1341014399 -gmt 1
Fri Jun 29 23:59:59 GMT 2012
So, adding a single second ends up on the previous day.

User Comments:

jan.nijtmans added on 2024-11-22 19:08:42:

> However I'm still unsure whether the leap second needs to be considered valid (and my branch needs to be merged). But I'm definitely against your solution, which you merged to 8.7/9.0 despite all objections from my side.

I was hoping to be able to come to an agreement before Tcl 9.0.1 (which will be there soon). It's not going to happen.

Given your doubts and my objections to your overly-complex solution, I'm not going through with it. Eventually there should be an --enable-leap option, which follows POSIX 'enhanced' timestamps (which follow leap-seconds and leap-minutes), but such a standard doesn't exist yet. I will revert the changes.

Still, we should document that leap-seconds are not supported. TIP #690 didn't mention that in its "compatibility" section. So, it should be documented now. Only "24:00" should be kept: It's clearly documented by the ISO 8601 2022 amendment as belonging to the next day. We should document that as well.


sebres added on 2024-11-22 15:18:44:

> There is no 'right' answer here.
> So we have to round down (your proposal) or up (my proposal, and current Tcl 8.x behavior)...

And what about:

  • all current OSs round it down (freeze the second);
  • the ISO standard definitely says it belongs to the same calendar day;
  • it is more logical and consistent to keep the same year/month/quarter/week/date/hour/minute, because that simply corresponds to the input string more sanely;
  • it'd be more consistent if some formatted tokens are passed to other processes or used in further calculations (since it is still the same day, and for the OS the second "T235960 UTC" is equal to "T235959 UTC");
  • the Tcl 8.x behavior is not even a few days old (previously it was completely wrong, at least in free-scan, until I repaired it).

To me the "right" answer is pretty clear (or there are basically two answers: round down or forbid it).

> Returning a different timestamp for -valid 1 and -valid 0 is not logical: -valid should only influence the validation, nothing else.

Who said it will?
My branch does it ALWAYS, regardless of the -validate parameter... A correct leap second will be the same as "T235959 UTC" in any case. Only a wrong leap second moves to the next day (with -valid 0), as before:

% clock scan "2012-06-30 23:59:60 UTC"
1341100799
% clock scan "2012-06-30 23:59:60 UTC" -valid 1
1341100799

% clock scan "2012-06-29 23:59:60 UTC"
1341014400
% clock scan "2012-06-29 23:59:60 UTC" -valid 1
unable to convert input string: invalid time

See also the new tests in the branch, which have no valid_off constraint (so they work the same way in both modes).

> The challenge is providing a consistent behavior.

I already mastered this challenge.
However I'm still unsure whether the leap second needs to be considered valid (and my branch needs to be merged). But I'm definitely against your solution, which you merged to 8.7/9.0 despite all objections from my side.

> This makes 100% sense to me

Not at all (the former two are today, the latter is tomorrow).
And then you seem not to have read what I wrote all the time before.

For example, what do you think would happen if some large bank transaction (occurring exactly on the leap second) moved to the next day, which would be at least the next quarter or even 1 Jan of the next year? Right, it could be booked in the wrong week, quarter or even year, which could have serious consequences for the company or even for the state, causing new debts or lowering the rank score. Let alone that every bank has reporting obligations, for instance to the ECB, and the ECB also has such reporting obligations, which could then be wrong. Therefore I emphasize again: I'll always prefer no result (BOOM) to a wrong result. But "rounding" down also looks less wrong to me than "rounding" up (a jump to the next calendar day); at least it is more consistent in every direction (see the beginning of my comment).


jan.nijtmans added on 2024-11-22 13:48:14:

> And I won't stop asking you why you keep preferring a false result to no result (or even to the right result, like in my branch)?

There is no 'right' answer here. Returning 1341100799 is as wrong as 1341100800. What UNIX does is smear it out, so we could map 23:59:59 to 1341100799, 23:59:60 to 1341100799.5 and 00:00:00 (next day) to 1341100800. But we currently don't support fractions, so we have to round down (your proposal) or up (my proposal, and current Tcl 8.x behavior). Returning a different timestamp for -valid 1 and -valid 0 is not logical: -valid should only influence the validation, nothing else.

The challenge is providing a consistent behavior. I'm doing that by assuming a second value of "60" to be valid unless the time string contains something which disallows this, such as a date which is not Jan 1, June 30, July 1 or Dec 31, or a minute for which (minute % 15) != 14, or (recently) a year < 1972 or > 2017.
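
As a rough illustration, the conditions listed above could be written in Tcl as follows. This is only a sketch of the stated rule, not the code in tclClock.c: the proc name leapSecondPlausible and its argument order are invented for this example, and the arguments are assumed to be the scanned local-time fields as plain integers without leading zeros.

```tcl
# Hypothetical helper: returns 1 if the scanned second is acceptable
# under the rule described above, 0 otherwise.
proc leapSecondPlausible {year month day minute second} {
    if {$second != 60} {
        # Ordinary seconds need no leap-second check.
        return [expr {$second >= 0 && $second <= 59}]
    }
    # Per the comment above, years before 1972 or after 2017 are rejected.
    if {$year < 1972 || $year > 2017} { return 0 }
    # In local time the date must be Jan 1, Jun 30, Jul 1 or Dec 31.
    if {"$month-$day" ni {1-1 6-30 7-1 12-31}} { return 0 }
    # The local minute must be 14, 29, 44 or 59.
    if {$minute % 15 != 14} { return 0 }
    return 1
}

puts [leapSecondPlausible 2012 6 30 59 60]   ;# 1
puts [leapSecondPlausible 2012 6 29 59 60]   ;# 0: not a candidate date
```

Note that such a check can only rule candidates out. It cannot confirm that a leap second really occurred at that instant.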

% clock scan "23:59:59"
1732316399
% clock scan "23:59:60"
1732316400
% clock scan "00:00:00 tomorrow"
1732316400

This makes 100% sense to me, even though I know today we don't have a leap-second.


sebres added on 2024-11-22 12:37:34:

> If a script is using clock scan with -validate in order to validate a time representation, and after validation stores the original representation into a persistent database

An artificial scenario, IMHO, because nobody would store the original timestamps in this case, but rather the resulting unix-time (valid) or some formatted tokens (e.g. in the case of sqlite, the astronomical JD with time fraction, like %Ej). But if so, that is the problem of the software concerned (you can't really expect everything to work the same everywhere).

> Is the new option -strict-leap-sec really necessary?

No. I just wanted to provide the possibility to match leap seconds as precisely as possible; with -strict-leap-sec 0 (default) it will also accept other potential leap seconds that are not in the list (which is bad for accuracy, but good for avoiding errors caused by a missing iana-db update).

> I still think that leap seconds are something for a system that calculates clock time to be aware of, but not to represent.

Yes. Moreover, you cannot represent it with unix-time: in my branch the string inputs T235960 and T235959 produce the same scan result, and it is impossible to get T235960 back from format (it can only generate T235959).

> That means the current list of 27 entries is final

Fluctuations in the Earth's rotation are hardly predictable, so one can only speak of probabilities. Yes, the Earth is now spinning faster than at any time in the last 50 years, but that can change very quickly before 2035.

> Do we really need to generate this list with Tcl-code, and then import it? Wouldn't a simple static array in C suffice?

I'd prefer to have everything from iana-db in the same place, and load this list on demand (like TZ). At least till 2035, as long as it is still dynamic.

> I really think it doesn't need to be much more complicated than that.

I really want to emphasize that if you decide to support leap seconds and consider the scan valid, you have to produce the correct result. And the timestamp "2012-06-30 23:59:60 UTC" in unix-time is then 1341100799 and not 1341100800 (which would lie chronologically in the next calendar day).
And I won't stop asking you why you keep preferring a false result to no result (or even to the right result, like in my branch)?


jan.nijtmans added on 2024-11-22 10:27:44:

In 2035, the current leap-second system will be abandoned, maybe replaced by leap minutes. Until then no positive leap-seconds are expected, only a negative leap-second. That means the current list of 27 entries is final: there won't be any more of them. Do we really need to generate this list with Tcl-code, and then import it? Wouldn't a simple static array in C suffice?

It also means that the risk Nathan is describing is non-existent: Leap seconds only existed between 1972 and 2016. For current timestamps they are illegal already. What's the risk? Tcl 8.6 already accepted leap-seconds (even though the POSIX timestamp was wrong, which is now corrected).

I now added a year-check to the code in core-8-branch and trunk. I really think it doesn't need to be much more complicated than that.


pooryorick added on 2024-11-22 08:14:47:

Nice patch. However, this might still be introducing an extra complexity for no good reason. If a script is using clock scan with -validate in order to validate a time representation, and after validation stores the original representation into a persistent database, all other consumers of the data must then be prepared to deal with leap second representations. Sqlite, for example, is not:

sqlite> select date('2012-06-30 23:59:00');
2012-06-30
sqlite> select date('2012-06-30 23:59:60');

sqlite>

So in this case another consumer must be prepared for a time representation that was already validated by Tcl to evaluate to an empty string in an SQL script. This seems like a footgun. Sebres, despite the time that you invested in preparing this patch, is it really a good idea? If the answer is yes, and this is going to go into Tcl, then given that most applications don't need and don't want to deal with leap seconds, should an option like -leapseconds be added to activate this behaviour?

Is the new option -strict-leap-sec really necessary? Given that parsing leap second representations is somewhat niche, maybe anyone that wants that functionality should just accept the cost of calculating strict leap seconds.

I still think that leap seconds are something for a system that calculates clock time to be aware of, but not to represent.


sebres added on 2024-11-21 22:09:29:

This was my first assumption too, but it was based on:

  • the complexity (or rather the relation between the complexity and the need for such a scan);
  • the effort to implement it properly, that is, to comply with ISO 8601 and OS behaviour (freeze the second) at the same time;
  • common sense on top (a leap second is not really representable in the result of scan, at least as long as it remains unix-time) and the fact that it has been implemented almost nowhere.

However, I think I have managed it in the meantime; here is the branch that should satisfy all of them: bug-f2b5f89c0d-alt-leapsec.

The pros:

  • the leap second is detected more precisely, because it is estimated from UTC tokens and/or checked against the real leap-second table from iana;
  • the leap second remains in the same calendar day as the previous second;
  • almost no performance loss (except for the single load of the leap-sec table once per process); times other than Thh:mm:60 are not affected at all; and it uses a very efficient algorithm (interpolation binary search, IBS) to check whether the second is a leap second, needing a single iteration over the table in the best case, mostly 2, and at most 3, so the cost is almost negligible;
  • besides following the ISO 8601 standard more sanely and being closer to OS behaviour (a long, frozen second Thh:mm:59), the scan result of a leap second also matches the input string better (e.g. all tokens except the second remain the same: calendar year, month, week of year, day, weekday, hour and minute are as specified in the input, or in accordance with the timestamp of the leap second);
  • with the table it is able to capture every leap second (after an update with tclZIC from the iana database), even one that does not follow the current rules, e.g. one that belongs to a date other than "31 Dec" or "30 Jun";
  • apart from the table, it can also estimate the leap second ("31 Dec" or "30 Jun", "T23:59:60 UTC"), which will detect a possible future leap second without an update (of "tzdata/.leapsec");
  • on demand, the estimation of leap seconds can be disabled, so that only the leap table is inspected.

The cons:

  • unknown, except for the deviation from 8.6 (but before the fix [595fad24d70e1693] it never worked properly, so it does not matter, in my opinion);
  • a slightly larger code base.
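
To make the table-lookup idea concrete, here is a minimal Tcl sketch. It is not the code from the branch: it uses a plain binary search rather than the interpolation search mentioned above, the names leapDates and isLeapSecondDate are invented for the example, and the table is hard-coded from the published leap-second history instead of being generated from the iana database.

```tcl
# UTC dates (YYYYMMDD) whose final minute carried a positive leap second.
# Hard-coded for illustration; sorted in ascending order.
set leapDates {
    19720630 19721231 19731231 19741231 19751231 19761231 19771231
    19781231 19791231 19810630 19820630 19830630 19850630 19871231
    19891231 19901231 19920630 19930630 19940630 19951231 19970630
    19981231 20051231 20081231 20120630 20150630 20161231
}

# Hypothetical helper: plain binary search over the sorted table.
proc isLeapSecondDate {yyyymmdd} {
    set lo 0
    set hi [expr {[llength $::leapDates] - 1}]
    while {$lo <= $hi} {
        set mid [expr {($lo + $hi) / 2}]
        set v [lindex $::leapDates $mid]
        if {$v == $yyyymmdd} { return 1 }
        if {$v < $yyyymmdd} {
            set lo [expr {$mid + 1}]
        } else {
            set hi [expr {$mid - 1}]
        }
    }
    return 0
}

puts [isLeapSecondDate 20120630]   ;# 1
puts [isLeapSecondDate 20120629]   ;# 0
```

The lookup expects the date already converted to UTC, which is the point made in the discussion: local-time tokens alone are not enough to decide.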


pooryorick added on 2024-11-21 20:28:49:

I believe it is safest not to provide for the representation of leap seconds.

At https://www.cl.cam.ac.uk/~mgk25/posix-clocks.html, Markus Kuhn states, "Later I came to the conclusion that attempting to present inserted leap seconds as an out-of-scale timestamp of the form 23:59:60 to applications at an operating-system-API or network-protocol level is not very useful for most applications and that a standardized smoothed form of UTC, such as my UTC-SLS proposal (2006), is far more practical in almost all applications, except for those concerned with precisely tracking the motion of physical masses, where TAI is a more useful timebase."

This, along with the position of PostgreSQL that sebres pointed out, is further evidence that the best policy is not to try to accept and handle representations of leap seconds. I.e. Tcl should behave as follows:

% clock scan "2012-06-30 23:59:60" -gmt 1
unable to convert input string: invalid time


sebres added on 2024-11-21 19:01:16:

Formatted or not, your variant produces a result that lands in the next day, month or even year, which is definitely wrong. According to ISO 8601 the leap second shall remain in the same calendar day! Your leap second moves to the next calendar day, month or even year. This is neither the expected result, nor does it agree with the standard.

All this time I have been telling you about the problem of its representation, and you ignored it. Now you just say "don't bring format into play" to justify your mistake. The problem is not the format, but the fact that the unix-time you generated is in the next calendar day. That's it.

How can one ignore format (or any other operation on the resulting unix-time), if that is the target of the conversion? Your approach may work only if someone just checks that the given time is not rejected, but it will not work if the result of the scan is used in further calculations.

Imagine: someone needs to obtain the day of week, week of year, day, month or year from the result of the scan operation. All of this will be incorrect for this input. And, again, it conforms neither to the standard nor to how the OSs handle it.

Either this must be fixed (so that it remains the same second as "23:59:59 UTC", which is also what the OSs do with such seconds), or it must be reverted, and the leap second is then basically invalid for scan operations, as before.

And you continue to prefer a false result to no result.


jan.nijtmans added on 2024-11-21 18:03:34:

> No, you did not

Yes, I did. Please don't bring "clock format" into play, because it doesn't produce ISO-8601 dates but POSIX dates, starting with POSIX timestamps. POSIX doesn't know anything about leap-seconds, so this is expected.

POSIX is a subset of ISO-8601. Since we are on UNIX, and POSIX timestamps may be used when interfacing with the OS, we cannot change the behavior of "clock format". We cannot re-define POSIX timestamps such that days are not a multiple of SECONDS_PER_DAY any more.


sebres added on 2024-11-21 17:19:16:

> No, it doesn't do that any more. I found a solution for that.
> without jumping to the next day any more.

No, you did not:

% clock format [clock scan "Jun 30 23:59:60 GMT 2012"] -gmt 1
Sun Jul 01 00:00:00 GMT 2012
% clock format [clock scan "Jul 01 01:59:60 CEST 2012"] -gmt 1
Sun Jul 01 00:00:00 GMT 2012

% clock format [clock scan "Jan 01 00:59:60 CET 2017" -gmt 1] -gmt 1
Sun Jan 01 00:00:00 GMT 2017
% clock format [clock scan "Dec 31 23:59:60 GMT 2016" -gmt 1] -gmt 1
Sun Jan 01 00:00:00 GMT 2017

This is definitely the next day and month (and the next year for the latter two).

> % clock scan "Fri Jun 29 23:59:60 GMT 2012"
> unable to convert input string: invalid time
> As expected

Well, judging by your code, you allowed only Jan 01 and Jul 01... That would then cause other issues, I guess...
And how about this (which should fail in GMT, since it is not a leap second there):

% clock scan "Jan 01 23:59:60 GMT 2013"
1357084800
The major issue is that the validation happens in 2 stages, one before the conversion of local time to UTC and another after. And new tokens are obtained on demand, depending on several parameters (like CLF_ASSEMBLE_*) and input tokens (like CLF_JULIANDAY), which may be missing in some constellations, so the check as you organized it may be wrong. Especially if you fix the above-mentioned issue with the switch to the next day/month/year.


jan.nijtmans added on 2024-11-21 14:16:39:

> There are many corner cases where it'd not work, e. g. try "Fri Jun 29 23:59:60 GMT 2012" (shall fail, not a leap second), or where the relative time scanned together with leap second.

    $ tclsh9.0
    % clock scan "Fri Jun 29 23:59:60 GMT 2012"
    unable to convert input string: invalid time

As expected

jan.nijtmans added on 2024-11-21 14:05:06:

> And your implementation, as already said, switches to the next calendar day (and next calendar month or even still worse to next calendar year)

No, it doesn't do that any more. I found a solution for that. ;-) https://core.tcl-lang.org/tcl/file?name=generic/tclClock.c&ci=d3744ffffc620ecc&ln=3736-3737

In case of a leap-second or "24:00", yySecondOfDay can become equal to SECONDS_PER_DAY, without jumping to the next day any more. That already solves all corner-cases related to jumping to the next calendar day/month/year.


sebres added on 2024-11-21 13:27:49:

> Let's try to fool it:
> % clock scan "Sun Jun 30 23:59:60 GMT 2012"
> unable to convert input string: invalid day of week

There are many corner cases where it'd not work, e. g. try "Fri Jun 29 23:59:60 GMT 2012" (shall fail, not a leap second), or where the relative time scanned together with leap second.

> The "ACST" timezone is - apparently - unknown to the parser

Yes, for TZs scanned from the input string it is still a bit inconsistent (let alone that they are hardcoded). At some point I prepared a solution for that, removing the special-cased handling of TZs and consulting tzdata instead, but it resulted in many further issues; one of them is https://github.com/sebres/tclclockmod/issues/22#issuecomment-1183083552 I'll try to reopen it later.

> Thanks for all your input!

You are welcome... Unfortunately, it seemed like you read only half of it. Just kidding...
I'll provide my fix soon; it closes many corner cases and, importantly, the result of the scan remains in the same day for a leap second (which is how all OSs handle it with unix-time, so this will reflect it more consistently).

> I don't think the ISO 8601 standard changes that often. Now we are conformant to the 2022 version.

The emphasis was not on changes to ISO, but on the fact that a leap second can hardly be represented in unix-time. But if we are talking about standards, here is a quote from it:

‘T235960’ represents the final second of a calendar day with a positive leap second in UTC.

And your implementation, as already said, switches to the next calendar day (and next calendar month or even still worse to next calendar year).


jan.nijtmans added on 2024-11-21 12:31:01:

Closing, since valid leap-second timestamps are now handled as being valid.

> I'm trying to implement this properly (see my new branch bug-f2b5f89c0d-alt-leapsec) by consulting the leap seconds table with converted UTC time

Sure, some non-valid timestamps are accepted as valid now. But - still - the current approach is useful even for an additional table-based check: If yyMinute % 15 != 14 (for example) or the date is not January 1st, June 30th, July 1st or December 31st, there is no need even to consult the table. That's already handling 99.93% of all cases!

> ISO can change something every year, and one can and probably must reflect it on the scanning of the input side, but not if the result of conversion is impossible in principle.

I don't think the ISO 8601 standard changes that often. Now we are conformant to the 2022 version. If - in the future - "leap minutes" are introduced, we will need to change the implementation again. For now, the 2022 version is the best we can do.

Thanks!


jan.nijtmans added on 2024-11-20 21:34:15:

Thanks for all your examples! I know it's complicated, but it appears to be working now (I always like a challenge ...):

    $ tclsh9.0
    % clock scan "Sat Jun 30 23:59:60 GMT 2012"
    1341100800
    % clock scan "Sun Jul 01 01:59:60 CEST 2012"
    1341100800
    % clock scan "Sat Jun 30 19:59:60 EDT 2012"
    1341100800
    % clock scan "Sun Jul 01 09:29:60 ACST 2012"
    unable to convert date-time string "Sun Jul 01 09:29:60 ACST 2012": syntax error (characters 20-23)
    % clock scan "Sun Jul 01 08:44:60 +0845 2012"
    1341100800

Let's try to fool it:

    % clock scan "Sun Jun 30 23:59:60 GMT 2012"
    unable to convert input string: invalid day of week

The "ACST" timezone is - apparently - unknown to the parser, but all the others give the right answer now.

Thanks for all your input!


sebres added on 2024-11-20 20:24:54:

Maybe it is not clear, so for the record, just to emphasize the complexity of the problem: the time "23:59:60", as well as the dates "31 Dec" and "30 Jun", are only valid as timestamps in UTC; in other time zones they are completely different. For instance, the leap second "Sat Jun 30 23:59:60 GMT 2012" corresponds to "Sun Jul 01 01:59:60 CEST 2012" and "Sat Jun 30 19:59:60 EDT 2012", but also to "Sun Jul 01 09:29:60 ACST 2012" and, especially nice, "Sun Jul 01 08:44:60 +0845 2012" (:Australia/Eucla). All these timestamps represent the same unix time and the same leap second.
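
The correspondence can be checked with plain arithmetic. The sketch below takes the second just before the 2012 leap second (23:59:59 UTC, which does have an unambiguous unix time) and prints its wall-clock time at the fixed offsets of the zones named above. The helper name wallClock is made up for this example, and the offsets are entered by hand in seconds east of UTC rather than looked up in tzdata.

```tcl
# The second just before the 2012 leap second: 2012-06-30 23:59:59 UTC.
set utc 1341100799

# Hypothetical helper: wall-clock hh:mm:ss at a fixed offset
# (in seconds east of UTC).
proc wallClock {epoch offset} {
    set sod [expr {($epoch + $offset) % 86400}]
    format %02d:%02d:%02d \
        [expr {$sod / 3600}] [expr {$sod % 3600 / 60}] [expr {$sod % 60}]
}

foreach {zone offset} {GMT 0 CEST 7200 EDT -14400 ACST 34200 +0845 31500} {
    puts "$zone [wallClock $utc $offset]"
}
# GMT 23:59:59
# CEST 01:59:59
# EDT 19:59:59
# ACST 09:29:59
# +0845 08:44:59
```

In every zone the leap second follows a local second 59, but only in UTC does it follow 23:59:59.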

This also explains why the use of "24:00:00" or "23:60:00" is not practicable (and I never saw that before), because it makes sense only for the +0000/GMT/UTC time zone; for any other zone (depending on its offset) it may be a completely "valid" time that doesn't reflect the leap.

After all, validating it the way branch bug-f2b5f89c0d does is hardly possible, because the scan tokens are always local time, so the check would be (pseudo) correct only for GMT/UTC and wrong for any other TZ.

I'm trying to implement this properly (see my new branch bug-f2b5f89c0d-alt-leapsec) by consulting the leap seconds table with converted UTC time.


sebres added on 2024-11-19 14:32:56:

Again, this ISO amendment only affects the input side; the output of clock scan is and remains unix time, which simply doesn't know leap seconds and so is unable to represent one correctly at all.
ISO can change something every year, and one can and probably must reflect it on the scanning of the input side, but not if the result of conversion is impossible in principle.

I don't understand why you prefer a wrong result to the impossible result in case of -validate 1.

Validation is basically a new mechanism to check that the input is correct, but also, and probably more importantly, that the result is correct and the input was converted properly. Initially I implemented it so that the result was converted back and compared with the input tokens, and validation failed if the tokens deviated (which would happen in this particular case)... Later I reimplemented it more efficiently by direct comparison, but that should not change the principle: if the back conversion generates different tokens than the input, the simplest math doesn't work and the validation must fail. In my opinion the leap second is not so important that we should make exceptions to this simple mechanism.
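
The round-trip principle described above can be sketched at script level as follows. This is only an illustration of the idea, not how -validate is implemented (per the comment, the current implementation compares tokens directly); the proc name roundTripValid is invented for the example, and it compares whole formatted strings rather than individual tokens.

```tcl
# Hypothetical helper: 1 if formatting the scanned result reproduces
# the input string, 0 otherwise.
proc roundTripValid {input {fmt "%Y-%m-%d %H:%M:%S"}} {
    if {[catch {clock scan $input -format $fmt -gmt 1} epoch]} {
        return 0   ;# already rejected by the scanner itself
    }
    set back [clock format $epoch -format $fmt -gmt 1]
    return [expr {$back eq $input}]
}

puts [roundTripValid "2012-06-30 23:59:59"]   ;# 1
puts [roundTripValid "2012-06-31 12:00:00"]   ;# 0: there is no June 31
```

For "2012-06-30 23:59:60" the outcome depends on the Tcl version, as the ticket description shows. With unix time as the result, the formatted value cannot contain a second 60, so a round trip of this kind cannot reproduce that input.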

I guess there are also other corner cases that your branch doesn't consider at the moment and that could fail; for instance, the check of the scanned weekday would probably fail (because of the day change, e. g. Sat to Sun):

% clock scan "Sat Jun 30 23:59:60 2012" -gmt 1 -valid 1
unable to convert input string: invalid day of week
% clock scan "Sat Jun 30 23:59:60 2012" -format "%a %b %d %T %Y" -gmt 1 -valid 1
unable to convert input string: invalid day of week
(untested, but I guess it will)

In my opinion, a note in the documentation like this would be entirely sufficient:
Since a leap second is not representable in the result of scan (unix time), scanning a time like 23:59:60 is generally impossible with -valid 1 (default in 9.0).
That's it. Not to mention that all programming languages do the same.

However, if the decision is nevertheless "let it be", I would recommend another way to fix it: freeze the second within the same date. E.g. if the scanned second is 60 and the date matches the leap conditions, decrement the time, so that we remain in the same date and all validity rules covering the date tokens stay correct.
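
A minimal sketch of this "freeze" variant, for UTC input only. The proc name scanFrozen and its argument layout are invented for the example, and it skips the check that the date actually matches the leap conditions, which the proposal above requires.

```tcl
# Hypothetical helper: unix time for a UTC time of day on a given UTC date,
# with second 60 frozen onto the preceding second.
proc scanFrozen {date hh mm ss} {
    # Midnight UTC at the start of the given date.
    set dayStart [clock scan $date -format %Y-%m-%d -gmt 1]
    if {$ss == 60} {
        set ss 59   ;# stay within the same calendar day
    }
    expr {$dayStart + $hh * 3600 + $mm * 60 + $ss}
}

puts [scanFrozen 2012-06-30 23 59 60]   ;# 1341100799
puts [scanFrozen 2012-06-30 23 59 59]   ;# 1341100799
```

Both inputs yield the same unix time, so formatting the result always produces 23:59:59 on the same date.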


jan.nijtmans added on 2024-11-19 13:19:08:

I found a Preview of the latest amendment (2022) to the ISO-8601 specification. It states:

NOTE 6 The last instant of the day exists within the last second of the day, which can be one of the following:
      with no leap second 23:59:59
      with a negative leap second 23:59:58
      with a positive leap second 23:59:60

So, seconds can be "60", but only in case of a positive leap second. Negative leap seconds have never occurred so far.

EXAMPLE 2 The following are valid reduced precision expressions of the ending of the day:
—   ‘T24’ (time of day expression with omission of minutes and seconds);
—   ‘24’ (time of day expression with omission of minutes, seconds and time designator);
—   ‘T24:00’ (time of day expression with omission of seconds);
—   ‘24:00’ (time of day expression with omission of seconds and time designator)

This means that an hour value of "24" is possible, as indication of the end of the day, but only when the minutes, seconds and milliseconds are all zero.
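
The rule for hour "24" can be expressed as a small check. This is a sketch under the reading given above, with an invented proc name secondOfDay; it deliberately ignores leap seconds (a second value of 60 is rejected here) and fractions of a second.

```tcl
# Hypothetical helper: seconds since the start of the day for hh:mm:ss,
# accepting hour 24 only as the exact end of the day.
proc secondOfDay {hh mm ss} {
    if {$hh == 24 && ($mm != 0 || $ss != 0)} {
        error "hour 24 requires zero minutes and seconds"
    }
    if {$hh < 0 || $hh > 24 || $mm < 0 || $mm > 59 || $ss < 0 || $ss > 59} {
        error "time of day out of range"
    }
    expr {$hh * 3600 + $mm * 60 + $ss}
}

puts [secondOfDay 24 0 0]     ;# 86400, the same instant as 00:00:00 of the next day
puts [secondOfDay 23 59 59]   ;# 86399
```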


sebres added on 2024-11-15 18:03:49:

Since free-scan of 8.6 was indeed wrong (regardless the conclusion of this issue with leap second), because of day decrement caused by -1 from ToSecond(), I fixed it now by backporting the fix from tclclockmod (8.7) in [595fad24d70e1693], additionally it covers now formatted scan too. So now it works with the same result as formatted scan (also the adjustment of test clock-46.6 was necessary, so it is also cherry-picked from tclclockmod and it is equal now to 8.7+).
And [92493234f7da30f0] illustrates the coverage change for 8.7/9.0 - the same tests as 8.6 with a constraint valid_off and additional tests (leap in wrong days) with constraint !valid_off, inclusive formatted scan for all tests too.

However (just for the record), none of these changes is affected by the decision on this issue (they are pure fixes and regression tests); I'm still against considering the leap second as valid input.


sebres added on 2024-11-15 11:23:10:

The problem is that we cannot resolve that. What you constantly forget is that [clock scan] converts string input to unix time, which in turn simply doesn't know about leap seconds at all. Programmers solved this in a pragmatic way: a leap second is simply a time adjustment, or else a short time drift (one second in the year lasts longer or switches later). It can neither be represented somehow, nor is it expected to be converted from input, in any language on earth. Moreover, it would be mathematically wrong (and yes, clock scan and clock format are inverse functions in this regard)... So let's also handle this pragmatically.

The comparison with 140 km/h doesn't fit either, because it represents not only the speed but also the distance one would cover at that speed, which is a mathematically correct representation in either direction. The leap second is not such a thing, at least in the current state of the art, given how it is implemented in Tcl's target systems (OS).

Regarding ISO 8601: well, 0 is also a valid number in math, but you would not expect Tcl to start supporting division by zero.

And the fundamental question, IMHO, is how you would represent this leap second in a unit of measure that has never had this ability: unix time always reflects 60 seconds per minute, 3600 per hour, 86400 per day. Without exception. It knows the leap day (in a leap year), but not an extra second.
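
The fixed 86400 seconds per day can be checked with a little arithmetic. This is an illustrative Python sketch (not Tcl); `calendar.timegm` is used only as a reference implementation of the POSIX-style conversion:

```python
import calendar
from datetime import date

SECONDS_PER_DAY = 86400

# Unix time of 2017-01-01 00:00:00 UTC, the instant right after the
# leap second at the end of 2016.
stamp = calendar.timegm((2017, 1, 1, 0, 0, 0))
days_since_epoch = (date(2017, 1, 1) - date(1970, 1, 1)).days

print(stamp)                                        # 1483228800
print(days_since_epoch)                             # 17167
print(stamp == days_since_epoch * SECONDS_PER_DAY)  # True
```

The timestamp is an exact multiple of 86400 even though leap seconds were inserted in that period, so there is no spare value left over for 23:59:60.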

Regarding the migration notes (you mentioned them somewhere): one doesn't even need to mention it there, because it never worked correctly before (so it was very probably never used). But perhaps it is something for the documentation. However, the migration notes should cover the switch of the validation mode, and that is fully sufficient in my opinion.

In case of a possible future improvement, things can surely change, but then the clock facilities would need to use a different "base" value than seconds since the epoch, e. g. something like a new clock object containing an integer or tuple representation, which can take leap seconds fully into account and whose value is "incremented" by each leap second. Something like a new option, e. g. -ut 1 or whatever, that would expect a different base format for [clock add] and [clock format], as well as a different return value from [clock scan].

However, as much documentation says, such a feature is questionable... Here is PostgreSQL's point of view, for example: There's been very little user demand for leap-second-aware date arithmetic, and the difficulties with extrapolating such arithmetic into the future mean that we're not likely ever to try to support it.


jan.nijtmans added on 2024-11-15 07:23:57:

Noting that you are constantly referring to UNIX or POSIX time, where leap seconds don't exist. I'm referring to ISO 8601 time, where "60" is a valid value for seconds and "24:00" is a valid time (referring to 00:00 the next day). Both are well-defined in ISO 8601. How can we resolve that? We can discuss for hours whether 140 km/h is a valid speed on highways. For you (Germany) it is, for me (Netherlands) it isn't ;-)

So, should the "clock" command implement/accept POSIX time or ISO 8601 time? That's the fundamental question.


sebres added on 2024-11-15 01:16:24:

No matter how it is done (-valid 0, [string map {23:59:60 23:59:59}] or some other pre-processor), the result may be wrong or not (but it is and remains the user's decision)...
For instance, because the thing you want to implement is simply not supported in unix time at all. It is almost like undefined behavior: basically you don't know whether 2016-12-31 23:59:60 is 1483228800 or 1483228799, because it simply doesn't exist in terms of unix time (or nobody knows which second since the epoch it really means, the first or the next). Maybe it is not important, but perhaps it is, therefore: UB.

In other words, it is like a DST gap; for instance, the times "2024-03-31 02:00:00 CET" or "2024-03-31 02:59:59 CEST" simply don't exist in that timezone, because "01:59:59" is followed by "03:00:00":

% clock format 1711846799 -format "%Y-%m-%d %H:%M:%S" -timezone :Europe/Berlin
2024-03-31 01:59:59
% clock format 1711846800 -format "%Y-%m-%d %H:%M:%S" -timezone :Europe/Berlin
2024-03-31 03:00:00
% clock scan "2024-03-31 02:00:00" -format "%Y-%m-%d %H:%M:%S" -timezone :Europe/Berlin
unable to convert input string: invalid time (does not exist in this time-zone)
% clock scan "2024-03-31 02:59:59" -format "%Y-%m-%d %H:%M:%S" -timezone :Europe/Berlin
unable to convert input string: invalid time (does not exist in this time-zone)

And you seem to be implementing a similar thing for the leap second, which also doesn't exist in terms of clock and POSIX time. I know that the leap second cannot be compared with DST, but the similarity here is that it is a non-existent thing, a phantom so to speak, and it should remain so (at least with the default clock facilities). By the way, it is also affected by the -validate parameter, because both of the above-mentioned scans will work with -valid 0 (and produce a 1 hour deviation when formatted back), so here too there is a certain similarity, in the sense of a wrong unix time.

Anyway, I definitely see this as an enhancement and not as a bug fix. But if it is an improvement, then it should be done properly (with real UT time values, without phantom behavior, with correct leap tables etc.).


jan.nijtmans added on 2024-11-14 23:24:49:

> And if someone really needs to parse the textual form "23:56:60" (e. g. in the way you are trying to do that), it'd be possible either using your own pre-processor or just with a `clock scan "23:56:60" -valid 0`. That's it.

Yes, that's the only thing I want. You are making it much too big an issue. Do you really want to extend the migration notes with this information? Using "-valid 0" would allow 23:99:99 too, it's not a solution I'm recommending.


sebres added on 2024-11-14 22:53:42:

It is neither reasonable (see the output of the conversion by scan and format back, the last command from my example), nor is it OK in 8.7... Or rather, it is "OK" only because 8.7 uses -valid 0 by default; with -valid 1 it would behave the same way as 9.0 (produce an "invalid time" error too).

As for a possible later fix of the "imperfections/limitations": regardless of the complexity, it would also introduce a deviation from several other systems, e. g. on Linux (or POSIX/GNU/other languages/whatever):

$ TZ=GMT date -d @1483228800 '+%Y-%m-%d %H:%M:%S'
2017-01-01 00:00:00
$ tclsh <<<'puts [clock format 1483228800 -f {%Y-%m-%d %H:%M:%S} -gmt 1]'
2017-01-01 00:00:00
$ python -c 'from datetime import datetime; print(datetime.utcfromtimestamp(1483228800).strftime("%Y-%m-%d %H:%M:%S"))'
2017-01-01 00:00:00

Scanning a time like 23:59:60 is also not supported at all (basically anywhere):

$ TZ=GMT date -d '2017-01-01 00:00:00' '+%s'
1483228800
$ TZ=GMT date -d '2016-12-31 23:59:60' '+%s'
date: invalid date ‘2016-12-31 23:59:60’
$ python -c 'from datetime import datetime; print(datetime.fromisoformat("2017-01-01 00:00:00+00:00").timestamp())'
1483228800.0
$ python -c 'from datetime import datetime; print(datetime.fromisoformat("2016-12-31 23:59:60+00:00").timestamp())'
ValueError: second must be in 0..59

Why must Tcl do things (especially in the wrong way) that are basically not accepted by any other tool or programming language?

Again, handling of leap seconds is not supported by unix time (and POSIX) at all, and introducing it would:

  • allow values that are not allowed anywhere else and are not standard;
  • deviate from all other tools and programming languages in its values (e. g. the value of seconds since the epoch would then be larger by 27 seconds) or behavior.

Also see stackoverflow: Unix time and leap seconds:

The number of seconds per day are fixed with Unix timestamps.

The Unix time number is zero at the Unix epoch, and increases by exactly 86400 per day since the epoch.

So it cannot represent leap seconds. The OS will slow down the clock to accommodate for this. The leap seconds is simply not existent as far a Unix timestamps are concerned.

So it would be not just wrong, but also different from any other clock/date/timestamp handling.

Just treat the leap second like everyone else does: it is just a small clock slowdown (or a short time jump), so the same timestamp is used for two real-world seconds (one real and one leap second).
And if someone really needs to parse the textual form "23:56:60" (e. g. in the way you are trying to do that), it'd be possible either using your own pre-processor or just with a clock scan "23:56:60" -valid 0. That's it.
It is needed anyway only in really special cases or for near-real-time programming (but then one mostly uses monotonic time, with micro- or nanosecond precision, and/or completely different facilities).

The same applies to other artificial forms, for instance those sometimes used by banks, such as 32 Jan or 30 Feb (value dates in electronic bank statements). Nobody else will really accept them.


jan.nijtmans added on 2024-11-14 20:59:39:

Let's correct the distinction between "valid" and "invalid" first. I'm OK (for now) with the behavior of 8.7, I'm aware of the imperfections/limitations of that.

My goal is to let valid string representations go through the conversions, and to detect as many invalid situations as possible. The actual conversion output as in 8.7 is reasonable, and can be handled separately from this ticket.


sebres added on 2024-11-14 19:22:32:

Nope, a fix done that way would be incorrect:

  1. leap seconds affect both the POSIX time and its string representation (textual form) equally; the fix (done this way) would only affect the latter and correlate it incorrectly with the former, which simply makes the timestamps 2016-12-31 23:59:60 and 2017-01-01 00:00:00 completely equivalent, which in turn has nothing to do with the leap second. It looks to me like a (faulty) analogy between leap hour and DST, which is definitely wrong mathematically. Moreover, after a scan you would never get the time 23:59:60 back, so it would introduce an additional inconsistency:
    % clock scan "2016-12-31 23:59:60" -gmt 1 -valid 0
    1483228800
    % clock scan "2017-01-01 00:00:00" -gmt 1
    1483228800
    % clock format [clock scan "2016-12-31 23:59:60" -gmt 1 -valid 0] -gmt 1 -format "%Y-%m-%d %T"
    2017-01-01 00:00:00
    
  2. therefore a correct implementation should definitely treat a leap second like a leap year; however, in contrast to the latter, computing the textual representation or the elapsed time in seconds between two given UTC dates would require consulting a table of leap seconds (it is nowhere near as mathematical as leap years);
  3. many clocks implement leap seconds in different ways: some ignore them completely (e. g. accept a possible time drift through time adjustment), some use different adjustment procedures, e. g. leap seconds in Unix time are commonly implemented by repeating 23:59:59 (rarely by adding the time-stamp 23:59:60), while SNTP simply freezes time during the leap second;
  4. the textual representation of a leap second is defined by BIPM as "23:59:60"; moreover, a leap second is always scheduled for UTC midnight at the end of the last day of a month (Jun or Dec), so even if we decide to ignore the leap second but still accept such times in textual form, this should be reflected in the validity rules for "...:60" times (allowed only there, ideally checking the aforementioned table of leap seconds).
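
The validity rule from point 4 could be sketched as follows. This is an illustrative Python sketch with hypothetical names, not Tcl code. The table is only a short excerpt of leap second dates (2012-06-30, 2015-06-30, 2016-12-31), so a real check would need the complete, maintained table; the `strict` switch is an assumption added to show both the calendar-only rule and the table lookup:

```python
from datetime import date

# Excerpt only: UTC days that ended with a positive leap second.
KNOWN_LEAP_SECOND_DAYS = {
    date(2012, 6, 30),
    date(2015, 6, 30),
    date(2016, 12, 31),
}

def second_60_allowed(year, month, day, hour, minute, strict=True):
    """Decide whether a ':60' second is acceptable for this UTC date and time."""
    if (hour, minute) != (23, 59):
        return False  # only the last minute of the day
    if (month, day) not in ((6, 30), (12, 31)):
        return False  # only the last day of June or December
    if strict:
        # Ideally consult the table of leap seconds.
        return date(year, month, day) in KNOWN_LEAP_SECOND_DAYS
    return True

print(second_60_allowed(2016, 12, 31, 23, 59))                # True
print(second_60_allowed(2017, 1, 1, 23, 59))                  # False
print(second_60_allowed(2024, 12, 31, 23, 59))                # False
print(second_60_allowed(2024, 12, 31, 23, 59, strict=False))  # True
```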

Accordingly, your example for 8.7 is not correct either... basically the subject of leap seconds was always "ignored" in Tcl's clock. The validity feature (introduced in 8.7 and enabled by default in 9.0) just brought the matter to light.


jan.nijtmans added on 2024-11-14 17:01:42:

Possible fix [4d72246575|here]. Since "leap minutes" could be introduced in the future (even leap hours, but that looks far away), we could prepare for that already.

The ISO 8601 standard allows leap seconds and leap hours, but no leap minutes.