Ticket UUID: | e02798626dfbcd7b33db19e0fc3a9fb9fa0714f4 | |||
Title: | clock command is 8 times slower with Tcl 9.0b1 | |||
Type: | Bug | Version: | 8.7+ | |
Submitter: | anonymous | Created on: | 2024-02-24 18:36:48 | |
Subsystem: | 16. Commands A-H | Assigned To: | nobody | |
Priority: | 5 Medium | Severity: | Important | |
Status: | Closed | Last Modified: | 2024-02-28 18:51:25 | |
Resolution: | Fixed | Closed By: | anonymous | |
Closed on: | 2024-02-28 18:51:25 | |||
Description: |
tcl_platform(machine) = x86_64 tcl_platform(os) = Darwin tcl_platform(osVersion) = 23.3.0 Here is my procedure : proc testclock {} { set base [clock scan {03/09/1968} -format {%d/%m/%Y}] for {set i 1} {$i < 20000} {incr i} { set now [clock scan {+1 days} -base $base] set date [clock format [clock scan {+1 months} -base $now] -format {%d/%m/%Y}] set base $now } } testclock time tclsh8.6.13 testclock.tcl > 1,09s user 0,01s system 99% cpu 1,105 total time tclsh9.0 testclock.tcl > 7,92s user 0,02s system 99% cpu 7,977 total | |||
User Comments: |
anonymous added on 2024-02-28 18:51:25:
+8 Thanks to all! sebres added on 2024-02-28 10:03:56: merged to 8.7/9.0, closed jan.nijtmans added on 2024-02-27 23:19:35: Fix in branch "fix-e02798626d--msgcat-regress" branch looks perfect to me. Many thanks! sebres added on 2024-02-27 18:17:51: > The shimmering between utf8 and UNICODE is quite new to me and is an important fact. Well, it is rather not the shimmering of utf-8 to unicode (utf-8 is always there as objPtr->bytes). I'm even unsure it shimmers. Anyway, since string match looks indeed better, I'd rewrite it so. oehhar added on 2024-02-27 17:47:58: Wow, I would never have thought this. The shimmering between utf8 and UNICODE is quite new to me and is an important fact. You learn each day, thank you so much. Here are the results on my tcl 8.6.13: % proc test {ns} { timerate { regexp -- {^::oo::} $ns } }; test ::tcl::clock 0.105722 µs/# 5368601 # 9458756 #/sec 567.580 net-ms % proc test {ns} { timerate { string equal -length 6 {::oo::} $ns } }; test ::tcl::clock 0.642979 µs/# 1382123 # 1555260 #/sec 888.676 net-ms % proc test {ns} { timerate { regexp -- {^::oo::} $ns } }; test ::oo::test 0.188651 µs/# 3714746 # 5300782 #/sec 700.792 net-ms 5 % proc test {ns} { timerate { string equal -length 6 {::oo::} $ns } }; test ::oo::test 0.659135 µs/# 1351934 # 1517139 #/sec 891.107 net-ms % expr 0.642979/0.105722 6.081789977488129 % expr 0.659135/0.188651 3.4939385425998273 So factor 6 and 3.5 faster for regexp, incredibble. I also tried "string match {::oo::*}" which is close to regexp. TIP 688 is started and may be improved : https://core.tcl-lang.org/tips/doc/trunk/tip/688.md Thank you, you guys rock! Harald sebres added on 2024-02-27 16:44:41:
As for You can try it by yourself (I guess slowing down by factor 10):
sebres added on 2024-02-27 16:31:59: This is fixed now (see branch fix-e02798626d--msgcat-regress). There is no regression anymore (it looks even faster than 8.6 now)... As already mentioned, it also fixes unintentional set of Although better would be to rewrite ::msgcat::PackageNamespaceGet completely, without regexp and without attempts to detect a namespace by self-invocation, if tclOO would implement one of this facilities: - something like
- something like
- something like
No idea, whether there is better way to retrieve it at tcl-level now. May be Donal (dkf) know? Anyway, it works as expected now, tests passed too. Please test and review. If the fix is OK, I'll merge it in 8.7/9.0. oehhar added on 2024-02-27 16:22:40: Thanks for commit [29eaa2a82fed26b7], great work! I did all this 7 years ago for René together with Donal Fellows... May I just propose to replace if {![regexp -- {^::oo::} $ns]} { by if {![string equal -length 6 {::oo::} $ns]} { Take care, Harald oehhar added on 2024-02-27 16:06:15: That is great news, thanks a lot ! Harald sebres added on 2024-02-27 15:46:23: The issue also affects 8.7 (thus version updated). And yes, it is definitely ::msgcat::PackageNamespaceGet (so something shimmers there, but it's not the catalog dict). BTW, previously ::msgcat::PackageNamespaceGet did change frame variable oehhar added on 2024-02-27 11:08:17: Great Sergey, I really appreciate ! Take care, Harald sebres added on 2024-02-27 10:43:22:
> This could be avoided using the new "mcn" command with a namespace as first argument. I don't think so: sure it'd speed-up a bit the mc for non-class NS (this evident +1µs/# illustrated below), but it doesn't explain +28µs/# in both blocks (as well as loss of this extra costs in other block with "EnterLocale" immediately after I replaced mc-call with I'm trying to find out what happens here if I get a free minute. oehhar added on 2024-02-27 06:51:52: I have scanned the msgcat code. The main additional task is to check, if it is called by an oo class. See function proc ::msgcat::PackageNamespaceGet, which is now called for each mc command. This could be avoided using the new "mcn" command with a namespace as first argument. Might that be a path? Thank you and take care, Harald oehhar added on 2024-02-27 06:28:57: Sergey, great analysis, thank you. I will double-check the msgcat module. There might be checks for a dict like "string is list and llength %2 == 0" which will shimmer to strings... The new clock branch is also great, thank you. Any improvement is welcomed. Take care, Harald sebres added on 2024-02-26 22:28:04:
Back to the issue...
Ultimately, I found what the problem is: it is not clock at all (or not directly clock). My profiler module doesn't work for 9.0 (it segfaults), so I wrote a small simple tcl-level profiler tool and injection for How one can see it drastically increased times in % timerate { namespace inscope ::tcl::clock { mc GREGORIAN_CHANGE_DATE } } -3.939602 µs/# 253833 # 253832 #/sec 1000.001 net-ms +2.451155 µs/# 407971 # 407971 #/sec 1000.000 net-ms it'd show marginal deviation (it's surely slower than 8.6, but doesn't cost 30µs/# and basically 10 times faster than if it is executed within Because it also affect To confirm that, I modified the gist like here - GIST/sebres/4da0d76792111(v1). And the result is as expected many times faster, and now the profiler blocks are sorted similar to 8.6 ( My conclusion: this is rather an issue of Thus, as already assumed, this worsening must be fixed regardless the clock-speedup merge (or at least documented like "don't use ... anymore"), because otherwise people using sebres added on 2024-02-26 18:22:10: Here is the first attempt to show all the differences... The complete description is the RFE [ddc948cff9], see "Known small incompatibilities" section and in following comment (2018-05-29 18:21:07). The main differences in tcl tests (change of precedence of some clock-formats by There are also few new formats, for instance The main enhancements of command are: - clock configure with possibility to configure main default options per interpreter; - new option
- clock accept new base value By the way, periodically I rebased the fixes made for tclclockmod to clock-speedup branch, but the last rebase was 2020-06-23. Since this last rebase, tclclockmod implementation got several fixes (many of them also affecting tcl-core too), that are not back-ported yet to the tcl-core branch, for the full list see GH/tclclockmod/973aba1084b2274a..master. IIRC, Flightaware uses also this variant of tclclockmod (possibly excepting few of the last commits) many years. sebres added on 2024-02-26 15:29:16: I already saw your mail and preparing an answer (must find old mail about changes affecting test cases, as well as designate all the corner changes). Also the current tclclockmod implementation exhibit certain differences (by continued further enhancements). Regarding the regression - it'd be good to know what exactly it is, anyway.
Just because such huge effect may be caused by some other enhancement of 9.0, and in this case could affect also another code pieces in similar way.
Sure, the clock engine would be "fixed" with merge of clock-speedup branch, but it'd be putting things too simply for whole 9.0 core, at least until one can't completely exclude further regressions by similar handling somewhere. oehhar added on 2024-02-26 14:45:16: Sergey, great analysis, thank you ! Ashok, Jan and myself discussed this within a telco today. Ashok and Jan both searched for the reason and tested for example zip file system compared to physical file system, but with no findings. The proposal is as follows:
Any help apprciated. If you have a list of corner case changes, I may author a TIP. Thank you, you guys rock ! Harald sebres added on 2024-02-26 13:22:48: Heavy regression seems to be only for
The second last illustrates that this occurs not in yacc/bison parser or lex transformation, but rather by internal calculation somewhere around relative time adjustment. This confirms that too:
And yes, tclclockmod seems to be not affected (works for 9.0 with exactly the same speed as for 8.6/8.7). oehhar added on 2024-02-25 18:49:10: Yes, that is quite sad. And we have the patch by Sergey which accellerates the clock command by a factor of 20. And it was never merged... Any work here appreciated: Here are the branches:
But we are happy, if we get 9.0 in a certain suboptimal way out of the door. Thank you and take care, Harald |
