Ticket UUID: d433c0e0add0496ef4a7d73c8580149605dadff6
Title: TCL_UTF_MAX == 4 problems
Type: Bug
Version: core-8-6-10-rc
Submitter: chw
Created on: 2019-11-12 23:43:27
Subsystem: 44. UTF-8 Strings
Assigned To: jan.nijtmans
Priority: 5 Medium
Severity: Minor
Status: Closed
Last Modified: 2019-11-13 14:25:53
Resolution: Fixed
Closed By: jan.nijtmans
Closed on: 2019-11-13 14:25:53
Description:
On Debian 9 using

    $ cd .../unix ; CC="cc -DTCL_UTF_MAX=4" ./configure --prefix=/tmp/mytcl --disable-static ; make
    $ ./tclsh
    % set X "\U1F602"

freezes when run in a gnome-terminal. If an xterm is used instead, more chars than expected are output from the tclsh, as shown in this strace dump:

    write(1, "\360\302\230\302\230\302\202\r\n", 9) = 9

When rebuilding with

    $ cd .../unix ; CC="cc -DTCL_UTF_MAX=6" ./configure --prefix=/tmp/mytcl --disable-static ; make
    $ ./tclsh
    % set X "\U1F602"

the emoji is properly displayed in gnome-terminal and strace yields the expected

    write(1, "\360\237\230\202\r\n", 6) = 6

Right now, this ticket is a showstopper for undroidwish/vanillawish (not for AndroidWish).
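For readers who want to check the two strace dumps without a Tcl build, here is a minimal Python sketch. It is not part of the original report, and the helper names `octal_escape` and `is_valid_utf8` are made up for illustration. It encodes U+1F602 as UTF-8, prints the bytes in the octal notation strace uses, and compares them with the bytes observed under TCL_UTF_MAX=4.

```python
# Illustrative check only; helper names are hypothetical, not Tcl APIs.

def octal_escape(data: bytes) -> str:
    """Format bytes as backslash-octal escapes, as strace does for non-ASCII bytes."""
    return "".join(f"\\{b:o}" for b in data)

def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# Code point written as "\U1F602" in the tclsh session.
utf8 = chr(0x1F602).encode("utf-8")
expected = octal_escape(utf8)
print(expected)                  # \360\237\230\202

# Bytes from the TCL_UTF_MAX=4 dump, without the trailing "\r\n".
observed = bytes([0o360, 0o302, 0o230, 0o302, 0o230, 0o302, 0o202])
print(len(utf8), len(observed))  # 4 7
print(is_valid_utf8(observed))   # False
```

The observed sequence starts with the 4-byte lead byte \360 but continues with new 2-byte sequences instead of continuation bytes, so it is not well-formed UTF-8. Whether that alone explains the gnome-terminal freeze is not established in the ticket.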
User Comments:
jan.nijtmans added on 2019-11-13 14:25:53:
Thanks for confirming the fix! .. And for the bug report to start with ... !!!!

chw added on 2019-11-13 14:07:20:
Bingo! The exact same sequence I've run against [b5633ba3bd] now works both with "\U1F602" and "\uD83D\uDE02" notation. I think you can close the ticket now, thanks for fixing.

jan.nijtmans added on 2019-11-13 12:39:01:
Please, try again with [e377ac273f]. I think I really got it now ...

chw added on 2019-11-13 09:51:51:
Again on Debian 9, core-8-6-branch [b5633ba3bd] using

    $ cd .../unix ; CC="cc -DTCL_UTF_MAX=4" ./configure --prefix=/tmp/mytcl --disable-static ; make
    $ ./tclsh
    % set X "\U1F602"

the gnome-terminal hangs and the strace dump is

    write(1, "\360\302\230\302\230\302\202\r\n", 9)

Using surrogate pair notation outputs the expected emoji

    % set X "\uD83D\uDE02"

and strace dump is

    write(1, "\360\237\230\202\r\n", 6) = 6

Now let's do the opposite direction

    % set X "\uD83D\uDE02"
    % set F [open OUT w]
    % puts -nonewline $F $X
    % close $F
    % set F [open OUT]
    % set Y [read $F] ; close $F
    % set Y

the gnome-terminal hangs again and the strace dump is

    write(1, "\303\260\302\230\302\230\302\202\r\n", 10)

jan.nijtmans added on 2019-11-13 09:21:36:
Did you bisect? If you did, I guess that commit [9e1984c250d1a859] introduced this bug. This commit made it possible that TclUtfToUniChar() is called when it points to the 2nd character of a valid 4-byte UTF-8 character. The macro didn't account for that. Thanks!

jan.nijtmans added on 2019-11-13 09:10:00:
This should be fixed in [b5633ba3bd8fa74e]. Can you confirm that this fixes this? Thanks!

jan.nijtmans added on 2019-11-13 07:52:19:
Thanks for the report.
On core-8-6-10-rc this is definitely expected to work, so raising to "minor"

chw added on 2019-11-13 06:47:37:
For reference, the core-8-6-9 branch built with

    $ cd .../unix ; CC="cc -DTCL_UTF_MAX=4" ./configure --prefix=/tmp/mytcl --disable-static ; make
    $ ./tclsh
    % set X "\U1F602"

outputs the expected emoji and the strace dump is:

    write(1, "\360\237\230\202\r\n", 6) = 6

chw added on 2019-11-13 06:42:25:
The core-8-7-a3-rc branch built with

    $ cd .../unix ; CC="cc -DTCL_UTF_MAX=4" ./configure --prefix=/tmp/mytcl --disable-static ; make
    $ ./tclsh
    % set X "\U1F602"

gives some weird output in gnome-terminal and the strace dump is:

    write(1, "\360\313\234\313\234\342\200\232\r\n", 10)
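The byte dumps quoted in the comments above can be decoded outside Tcl to see which code points were actually written. The Python sketch below is not from the ticket, and `to_surrogates` and `code_points` are hypothetical helper names. It first derives the "\uD83D\uDE02" surrogate pair from U+1F602, then decodes the read-back dump from core-8-6-branch and the tail of the core-8-7-a3-rc dump.

```python
# Illustrative only; helper names are hypothetical, not Tcl APIs.

def to_surrogates(cp):
    """Split a code point above U+FFFF into a UTF-16 surrogate pair."""
    v = cp - 0x10000
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)

def code_points(text):
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]

hi, lo = to_surrogates(0x1F602)
print(f"\\u{hi:04X}\\u{lo:04X}")   # \uD83D\uDE02

# Read-back dump from the core-8-6-branch comment (without "\r\n").
readback = bytes([0o303, 0o260, 0o302, 0o230, 0o302, 0o230, 0o302, 0o202])
print(code_points(readback.decode("utf-8")))
# ['U+00F0', 'U+0098', 'U+0098', 'U+0082']

# core-8-7-a3-rc dump after its leading \360 byte (without "\r\n").
tail_87 = bytes([0o313, 0o234, 0o313, 0o234, 0o342, 0o200, 0o232])
print(code_points(tail_87.decode("utf-8")))
# ['U+02DC', 'U+02DC', 'U+201A']

# The same three raw bytes, read as Latin-1 and as cp1252.
raw = bytes([0x98, 0x98, 0x82])
print(raw.decode("latin-1") == readback.decode("utf-8")[1:])  # True
print(raw.decode("cp1252") == tail_87.decode("utf-8"))        # True
```

Both garbled outputs correspond to the raw bytes 0x98 0x98 0x82 emitted as separate characters, via Latin-1 on the 8.6 branch and via cp1252 on 8.7a3. This is only an observation about the dumps; the ticket itself attributes the bug to how TclUtfToUniChar() handles a pointer inside a 4-byte sequence.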
