Ticket UUID: | 28cc67a606a7ab771af4c7b542b34ec1a377edc7 | |||
Title: | array set from dict: set values multi times on the same array is slower with Tcl9.0. | |||
Type: | Bug | Version: | 9.0 | |
Submitter: | anonymous | Created on: | 2024-04-15 15:59:55 | |
Subsystem: | 16. Commands A-H | Assigned To: | griffin | |
Priority: | 5 Medium | Severity: | Minor | |
Status: | Closed | Last Modified: | 2024-04-25 19:55:10 | |
Resolution: | Fixed | Closed By: | griffin | |
Closed on: | 2024-04-25 19:55:10 | |||
Description: |
proc SimplePERM {} { set permmod [dict create] for {set i 0} {$i < 512} {incr i} { dict set permmod $i [expr {$i + int(rand() * (256 - $i))}] } return $permmod } proc generateData {} { array set val0 {} ; array set val1 {} ; array set val2 {} for {set i 0} {$i <= 20} {incr i} { for {set j 0} {$j <= 20} {incr j} { for {set k 0} {$k <= 20} {incr k} { set v [SimplePERM] array set val0 $v array set val1 $v array set val2 $v } } } } generateData time tclsh9.0b2 setarray.tcl > 8,23s user 0,02s system 99% cpu 8,281 total time tclsh9.0b1 setarray.tcl > 7,93s user 0,02s system 99% cpu 7,962 total time tclsh8.6.14 setarray.tcl > 2,71s user 0,01s system 99% cpu 2,726 total | |||
User Comments: |
griffin added on 2024-04-25 19:55:10:
Removed abstract list extension from dict objtype. The implementation is far from optimal. Handling non-pure dict values will require a complex solution. sebres added on 2024-04-23 16:49:09:
Well, this is even one of the additional issues - once the string representation created (for whatever reason), it is not possible anymore to know whether dict is "pure" dict or not. The lists have a "canonical" flag, so the same must be implemented for dict (it is canonical unless obtained from string representation), so TclIsPureDict could be replaced by TclIsCanonicalDict and in that case it is no matter whether string representation is there or not. dkf added on 2024-04-23 15:16:33: I was glancing at this earlier, and wondered why we were always generating the string representation of the dictionary, even in cases where only the internal representation exists (when we can guarantee that the list length is exactly twice the dictionary size). oehhar added on 2024-04-22 07:08:12: Thanks, Brian, for looking into this, I appreciate. Your personal message indicates that the issue is easy to fix but requires further investigation. Thanks for all the constant work. It is often sad (but normal procedure) that a great achevement gets tainted by small side-issues. But that is normal, thank you all for making TCL even better ! Harald griffin added on 2024-04-22 01:58:41: Removing abstractlist extension from dict objtype. This implementation was naive and dict is probably already optimized w.r.t. list operations. Further analysis is needed to determine if any better approaches are possible. Changes on branch [bug28cc67a606a7] for review. oehhar added on 2024-04-17 18:51:45: Hi Sergey, thank you for finding the issue. I think, the sense of the beta phase is just to find those issues. We always have the possibility to not merge and nobody will try, or by merging and get feed-back. I prefer to merge. I think, the abstract list is a great feature. And it is sure, that it is not perfect. Lets work on it and give Brian and others some time to comment. Thank you again, Harald sebres added on 2024-04-16 22:17:45: The assumption that the regression is related to abstract list (type enhancement) is confirmed: [f525e7ad92ff63c8] doesn't show it at all (neither by iteration over dict, nor by llength or array set). Sure the dict will shimmer to list there, but only once (where in current trunk due to internal duplication by iteration, new dict object will shimmer to list every time). Still worse it'd do that over the freshly created string-representation, where previously it did that over dict internals (for pure-dicts). I have also a small question, by the way: since when exactly half-finished solutions (with large effect or such affecting almost everything) get merged now in main branch?
If I take a closer look at that implementation (not particularly dict, but complete type-extention of TIP 636) it looks not really well thought out and rather as some quick (and even more or less dirty) partly-backed solution to me. sebres added on 2024-04-16 20:27:31:
It doesn't "return an array", because it is currently impossible in tcl - [array get] returns a dict:
Tcl cannot (and probably never will) directly return an array - array is just a complex variable (hash table in var in scope/NS/whatever), so [array get] generates a dict at the time of execution to be able to return it (basically for the same reason why one cannot return directly $env (that's as implemented in core-tcl, there are another tcl-engines which able to do that and even such where array and dict are the same stuff). anonymous added on 2024-04-16 17:45:33: The case is also present if SimplePERM proc returns an array. array set permmod {} for {set i 0} {$i < 512} {incr i} { set permmod($i) 1 } return [array get permmod] anonymous added on 2024-04-16 17:33:28: Indeed, your analysis is much more advanced than mine. Which made me think that it was the array command that was the problem, was that if I created one array for each iteration, the time was identical between tcl9 and tcl8.6.14. sebres added on 2024-04-16 17:22:57: This definitely belongs to abstract-lists implementation of 9.0 and although it was supposed to be better (avoid shimmer of dict to list), in the reality it is rather worse now:
All that makes difficult to implement it with reasonable performance for all corner cases. However I'll take a closer look to this later. sebres added on 2024-04-16 14:51:37: It is only affected if the value is set from dict. Basically as already correct assumed, it is not array set - it is foreach over dict content (foreach_start/foreach_step instructions compiled internally) that cause performance regression there. Here is an example without array set, that shows the same regression with 9.0 (also llength $v does it):
sebres added on 2024-04-16 14:02:24: This is not platform-related, I see this on both windows and linux pretty well:
Trying this on both versions:
doesn't explain that 3x difference in the "granulated" times (it is slower but few percent only, e. g. array set shows in 9.0 ca. 5% degradation only, and all others are similar, if not even less). The same happens if I put timerate into generateData to measure whole code once. However if timerate is outside of original generateData (measures single or multiple invocations of generateData), there is indeed 3x difference between 8.6 and 9.0 (8.7 seems to be not affected). Here is the original example rewrote as much simpler version with 10x performance degradation (however still O(n) just 10x larger), which also illustrates difference with inlined in proc timerate (without degradation):
Result as diff between 9.0 (bad -) and 8.6/8.7 (good +): - 2302136 µs/# 1 # 0.434 #/sec 2302.136 net-ms + 200457.0 µs/# 1 # 4.989 #/sec 200.457 net-ms - 4640549 µs/# 1 # 0.215 #/sec 4640.549 net-ms + 475257.0 µs/# 1 # 2.104 #/sec 475.257 net-ms # timerate inside proc: - 157615.0 µs/# 1 # 6.345 #/sec 157.615 net-ms + 136238.0 µs/# 1 # 7.340 #/sec 136.238 net-ms - 320462.0 µs/# 1 # 3.120 #/sec 320.462 net-ms + 286145.0 µs/# 1 # 3.495 #/sec 286.145 net-ms Anyway I don't think it is array set command alone. anonymous added on 2024-04-15 16:29:09: Harald, Thank you for testing, I too am surprised to see so much difference. My platform is macOS. Maybe someone else can test on the same OS (if that's where the issue comes from). oehhar added on 2024-04-15 16:10:16: Nicolas, thanks for the ticket. I have tried your command with wish on windows with timerate "cmd". Here are the results: TCL 8.6.14 % timerate generateData 7586957 µs/# 1 # 0.132 #/sec 7586.957 net-ms TCL 9.0b2rc0 % timerate generateData 9272919 µs/# 1 # 0.108 #/sec 9272.919 net-ms So, the difference is not so huge. Take care, Harald |