Ticket UUID: | fa3d9fd818fa0072b60dcd14614ebf33d90e4d6b
Title: | [fcopy $chan1 $chan2 -size $size] is not [puts -nonewline $chan2 [read $chan1 -size $size]
Type: | Bug | Version: |
Submitter: | pooryorick | Created on: | 2023-04-03 20:55:33
Subsystem: | 25. Channel System | Assigned To: | pooryorick
Priority: | 1 Zero | Severity: | Important
Status: | Closed | Last Modified: | 2023-04-07 21:10:24
Resolution: | Fixed | Closed By: | pooryorick
Closed on: | 2023-04-07 21:10:24
Description: |
Even if two channels have the same encoding, [fcopy $chan1 $chan2 -size $size] might not be the same as [puts -nonewline $chan2 [read $chan1 $size]]. This makes [fcopy -size] difficult to use correctly. In the following script, copying with a size of 1 between two channels, both using utf-8 encoding, does not copy a full character. Although the behaviour is currently documented to be what it is, it's still bad, and should be fixed.
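A minimal sketch of such a comparison (not the submitter's original script): it writes a UTF-8 file whose first character takes two bytes, then copies "1 unit" once with fcopy and once with read/puts, and reports how many bytes reached the destination. The proc names makeSource and copiedBytes are hypothetical helpers introduced only for this illustration. Going by the description above, builds that predate the fix would be expected to write a single byte in the fcopy case, while read/puts writes the whole character.

```tcl
# Create a UTF-8 file whose first character (e-acute, U+00E9) occupies two bytes.
proc makeSource {} {
    set chan [file tempfile path]
    fconfigure $chan -encoding utf-8 -translation lf
    puts -nonewline $chan "\u00e9abc"
    close $chan
    return $path
}

# Copy $size units from the source into a fresh temporary file using the
# given method, and return the number of bytes that were written.
proc copiedBytes {srcPath size method} {
    set in [open $srcPath r]
    set out [file tempfile outPath]
    foreach chan [list $in $out] {
        # Same encoding and translation on both ends, as in the ticket.
        fconfigure $chan -encoding utf-8 -translation lf
    }
    if {$method eq "fcopy"} {
        fcopy $in $out -size $size
    } else {
        puts -nonewline $out [read $in $size]
    }
    close $in
    close $out
    set n [file size $outPath]
    file delete $outPath
    return $n
}

set srcPath [makeSource]
puts "fcopy -size 1 wrote [copiedBytes $srcPath 1 fcopy] byte(s)"
puts "read/puts 1   wrote [copiedBytes $srcPath 1 readputs] byte(s)"
file delete $srcPath
```

If the two reported counts differ, the destination written by fcopy holds an incomplete UTF-8 sequence, which is the mismatch this ticket is about.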
User Comments: |
pooryorick added on 2023-04-06 10:51:52:
Fixed in [39a45eb8ff]. I think the proper fix is to stop using NULL to
represent the binary encoding in

jan.nijtmans added on 2023-04-05 22:49:02:
There's a setting "Beta: Use Unicode UTF-8 for worldwide language support" in Windows. So, if we wait long enough, UTF-8 will eventually become the default encoding in Windows. The problem will automatically disappear. So we could simply ignore it. Not high prio to fix.

jan.nijtmans added on 2023-04-05 22:23:41:
> I don't know why the tests fail only with the msvc build though

It's because the gcc environment uses the bash shell, which reports to its clients that the system encoding is utf-8. The Visual Studio build environment uses the normal CMD shell, which has the normal system encoding on Windows. So, yes, this explains the difference.

So, this leads to 3 different solutions.
* Implementation change: Whenever ChannelState.encoding is NULL, make sure that Tcl_UtfToExternal() is explicitly given the iso8859-1 encoding.
* Don't set ChannelState.encoding to NULL when we want iso8859-1.
* Change Tcl_UtfToExternal() so NULL means iso8859-1.

I think the last one is not desirable (would need a TIP, but utf-8 would be more reasonable), but the other two should be possible.

pooryorick added on 2023-04-05 17:10:51:
I believe the test failures on Windows are related to the fact that

jan.nijtmans added on 2023-04-04 22:08:31:
Further info: In the "bug-fa3d9fd818fa0072" branch, the same failure can be seen, so it's sure that this is the trigger.

jan.nijtmans added on 2023-04-04 22:06:23:
Unfortunately, with this fix two testcases started failing on Windows, see: https://github.com/tcltk/tcl/actions/runs/4605104184/jobs/8152506769 For now, I disabled those 2 test-cases, but please take a look.

pooryorick added on 2023-04-03 21:28:37:
Fixed in [51d813943bcaf835].
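
The 2023-04-05 22:23:41 comment attributes the msvc-only failures to the system encoding differing between build environments. A script-level sketch of that effect, assuming only the standard [encoding] command: the same character yields a different number of bytes depending on which encoding performs the conversion, so any code path that silently falls back to the system encoding can behave differently under bash and under CMD. The proc name encodedLength is a hypothetical helper for this illustration, and the script does not exercise the C-level Tcl_UtfToExternal() path itself.

```tcl
# Return the number of bytes produced when $s is converted to encoding $enc.
proc encodedLength {enc s} {
    return [string length [encoding convertto $enc $s]]
}

set s "\u00e9"   ;# e-acute: one byte in iso8859-1, two bytes in utf-8

puts "system encoding: [encoding system]"
puts "bytes via iso8859-1: [encodedLength iso8859-1 $s]"
puts "bytes via utf-8:     [encodedLength utf-8 $s]"
# The system-encoding result depends on the environment the interpreter runs in.
puts "bytes via system:    [encodedLength [encoding system] $s]"
```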
