Ticket UUID: | bcd1004104651998d1ae99aa6869e249c0ed0019 | |||
Title: | A change to the system encoding, a stale cached file FsPath object, and an incorrect result | |||
Type: | Bug | Version: | ||
Submitter: | pooryorick | Created on: | 2019-10-22 14:27:51 | |
Subsystem: | 37. File System | Assigned To: | pooryorick | |
Priority: | 5 Medium | Severity: | Important | |
Status: | Closed | Last Modified: | 2019-10-28 19:26:14 | |
Resolution: | Fixed | Closed By: | pooryorick | |
Closed on: | 2019-10-28 19:26:14 | |||
Description: |
(text/x-fossil-wiki)
The ext file system has no concept of "encoding": It names a file exactly the sequence of octets it is handed. In the following script, iso8859-1 is used to create a file, and then that same sequence of octets is obtained using <code>glob</code>. Then an FsPath object created under a different system encoding is created and cached. Later on that FsPath object is used under the original system encoding when it shouldn't be. <code><verbatim> set name 登鸛鵲樓 encoding system iso8859-1 # create an iso8859-1-compatible sequence of octets to name the file by. # They happen to be a string encoded in utf-8, but that's irrelevant. set utf8 [encoding convertto utf-8 $name] set chan [open $utf8 w] puts $chan hello close $chan # lassign [glob -directory [pwd] *$utf8*] fname encoding system utf-8 # The sequence of octets is now itself converted to utf-8. Such a file does not # exist, and a FsPath object has been cached. So far so good. puts [list utf-8 [file exists $fname]] ;#-> 0 encoding system iso8859-1 # Here the answer here should be 1, but the FsPath object created under a # previous system encoding is used when it shouldn't be. puts [list iso8859-1 [file exists $fname]] ;# -> 0 # Discard the cached FsPath object, causing a new FsPath to be created using # the (currently correct) system encoding. puts [list iso8859-1 {string range} [file exists [string range $fname 0 end]]] ;# -> 1 </verbatim></code> | |||
User Comments: |
jan.nijtmans added on 2019-10-28 16:41:56:
Well, I put back the test, so is this one fixed now? jan.nijtmans added on 2019-10-28 16:25:29: (text/x-fossil-wiki) > Or ... just change the expected test result for UNIX ... Of course, I meant Windows here .... jan.nijtmans added on 2019-10-28 16:21:10: Or ... just change the expected test result for UNIX ... pooryorick added on 2019-10-28 16:11:46: (text/x-fossil-wiki) The bug can not be reproduced on NTFS filesystems in the first place. That's why with an NTFS filesystem the first result of filesystemEncoding-1.0 is <code>1</code> rather than <code>0</code>. pooryorick added on 2019-10-28 16:08:06: (text/x-fossil-wiki) Maybe I'm wrong, but because NTFS has a native encoding whereas common *NIX filsystems do not, this test can not be adapted for NTFS. Therefore it could be constrained to *NIX systems rather than removed. jan.nijtmans added on 2019-10-28 08:58:50: I just removed the filesystemEncoding test-case now, knowing that this bug is still not solved for Windows. That means, it's not critical any more now. jan.nijtmans added on 2019-10-28 08:25:03: (text/x-fossil-wiki) On UNIX, it appears to work, but all travis Windows builds are failing: <pre> ==== filesystemEncoding-1.0 issue bcd100410465 FAILED ==== Contents of test case: set dir [tcltests::tempdir] set saved [encoding system] encoding system iso8859-1 set fname1a $dir/$fname1 set utf8name [encoding convertto utf-8 $fname1a] makeFile {} $utf8name set globbed [lindex [glob -directory $dir *] 0] encoding system utf-8 lappend res [file exists $globbed] encoding system iso8859-1 lappend res [file exists $globbed] return $res ---- Result was: 1 1 ---- Result should have been (exact matching): 0 1 ==== filesystemEncoding-1.0 FAILED </pre> This way Donald Porter cannot make his RC, so please have a look at this! pooryorick added on 2019-10-26 15:42:06: (text/x-fossil-wiki) Fixed in [440efbd556ce7041] for core-8-6-branch. In contrast with the proposed fix for the core-8-branch, this is a one-line fix. It should probably also be applied to core-8-branch, and additional work done there to refine the record keeping when system encoding changes. bll added on 2019-10-25 00:43:34: Ok, reproduced. bll added on 2019-10-24 23:05:35: Unable to reproduce: 8.6.9 on MX Linux 19 8.6.5 on Ubuntu 16.04 bll added on 2019-10-24 23:01:10: Unable to reproduce with tcl 8.6.9 on Linux. pooryorick added on 2019-10-24 21:19:03: (text/x-fossil-wiki) The initially-provided example behaves as described in core-8-6-branch, but in core-8-branch only behaves as described prior to commit [165586b224f85f10], " TIP #430 implementation", 2018-09-12. This is because even when built with <code>--disable-zipfs</code>, in core-8-branch there are calls to <code>ZipFSPathInFilesystemProc</code> from <code>Tcl_FSGetFileSystemForPath</code>, and <code>ZipFSPathInFilesystemProc</code> uses <code>Tcl_GetString(pathPtr)</code> instead of working with the <code>fsPathType</code> internal representation. jan.nijtmans added on 2019-10-24 21:04:02: Hm ... Added the Tcl_FreeEncoding() now, it doesn't seem to work ... Anyway, I'll leave the branch to you, in case you want to work on it further. Please, next time, don't merge it to core-8-branch as long as you are not reasonable confident that all test-cases work. Thanks! pooryorick added on 2019-10-24 20:33:51: (text/x-fossil-wiki) The failue of encoding-2.2 was fixed by adding a corresponding <code>Tcl_FreeEncoding</code> for the newly-added <code>Tcl_GetEncoding</code>. There's more work to be done on this issue though. There might be undesired interactions between <code>FsPath.nativePathPtr</code> and <code>FsPath.filesystemEpoch</code>. These are older issues, not something introduced by the current changes. jan.nijtmans added on 2019-10-24 08:29:22: (text/x-fossil-wiki) Since Travis already fails for 2 days, I put the work on this ticket in a separate [bug-bcd1004104] branch. Please have a look: Something still wrong with the code, or should the test-case be revised? jan.nijtmans added on 2019-10-22 22:11:24: (text/x-fossil-wiki) Starting with [0f2870649c804dd8]: <pre> encoding.test ==== encoding-2.2 Tcl_FreeEncoding: refcount != 0 FAILED ==== Contents of test case: encoding system shiftjis ;# incr ref count encoding dirs [list [pwd]] set x [encoding convertto shiftjis \u4e4e] ;# old one found encoding system iso8859-1 llength shiftjis ;# Shimmer away any cache of Tcl_Encoding lappend x [catch {encoding convertto shiftjis \u4e4e} msg] $msg ---- Result was: Á 0 Á ---- Result should have been (exact matching): Á 1 {unknown encoding "shiftjis"} ==== encoding-2.2 FAILED </pre> |