Tcl Source Code

Check-in [82477e9d3a]
Login
Bounty program for improvements to Tcl and certain Tcl packages.
Tcl 2019 Conference, Houston/TX, US, Nov 4-8
Send your abstracts to [email protected]
or submit via the online form by Sep 9.

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:For Tcl >= 8.7, always compile-in the extended Unicode tables, no matter the value of TCL_UTF_MAX. Do this in all Tcl versions, in order to prevent merge conflicts in future Unicode table updates.
Downloads: Tarball | ZIP archive | SQL archive
Timelines: family | ancestors | descendants | both | core-8-branch
Files: files | file ages | folders
SHA3-256: 82477e9d3a00b3f44d331956f6ef36297878b458694cb7560c3c363d5900294d
User & Date: jan.nijtmans 2019-03-17 22:16:41
Context
2019-03-18
22:14
enlarge a few small buffers, which could overflow using Unicode characters > /UFFFF. Eliminate some... check-in: 41c373ad8f user: jan.nijtmans tags: core-8-branch
20:07
Add 4 new encodings, and add documentation. check-in: 0ac59eb0c6 user: jan.nijtmans tags: utf-max
15:25
merge 8.7 check-in: 9c1a58d079 user: dgp tags: tip-445-api-fix
2019-03-17
22:17
Merge 8.7 check-in: 5bfbe84775 user: jan.nijtmans tags: trunk
22:16
For Tcl >= 8.7, always compile-in the extended Unicode tables, no matter the value of TCL_UTF_MAX. D... check-in: 82477e9d3a user: jan.nijtmans tags: core-8-branch
22:15
For Tcl >= 8.7, always compile-in the extended Unicode tables, no matter the value of TCL_UTF_MAX. D... check-in: 5c8a1f59fa user: jan.nijtmans tags: core-8-6-branch
2019-03-15
20:52
Eliminate usage of mp_isneg(), just check bignum->sign directly (as libtommath itself does) Make Tcl... check-in: 515a22d41d user: jan.nijtmans tags: core-8-branch
Changes
Hide Diffs Unified Diffs Show Whitespace Changes Patch

Changes to generic/tclUniData.c.

191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
....
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
....
1609
1610
1611
1612
1613
1614
1615
1616
1617
1618
1619
1620
1621
1622
1623
....
1668
1669
1670
1671
1672
1673
1674
1675
1676
1677
1678
1679
    9920, 9920, 9920, 9920, 9920, 9920, 9920, 9920, 9920, 9920, 9920, 9920,
    9920, 9920, 9920, 9920, 9920, 1344, 1344, 1344, 1344, 1344, 1344, 1344,
    1344, 1344, 1344, 1344, 9952, 1344, 1344, 9984, 1824, 10016, 10048,
    10080, 1344, 1344, 10112, 10144, 1344, 1344, 1344, 1344, 1344, 1344,
    1344, 1344, 1344, 1344, 10176, 10208, 1344, 10240, 1344, 10272, 10304,
    10336, 10368, 10400, 10432, 1344, 1344, 1344, 10464, 10496, 64, 10528,
    10560, 10592, 4736, 10624, 10656
#if TCL_UTF_MAX > 3
    ,10688, 10720, 10752, 1824, 1344, 1344, 1344, 8288, 10784, 10816, 10848,
    10880, 10912, 10944, 10976, 11008, 1824, 1824, 1824, 1824, 9280, 1344,
    11040, 11072, 1344, 11104, 11136, 11168, 11200, 1344, 11232, 1824,
    11264, 11296, 11328, 1344, 11360, 11392, 11424, 11456, 1344, 11488,
    1344, 11520, 1824, 1824, 1824, 1824, 1344, 1344, 1344, 1344, 1344,
    1344, 1344, 1344, 1344, 7776, 4704, 10272, 1824, 1824, 1824, 1824,
    11552, 11584, 11616, 11648, 4736, 11680, 1824, 11712, 11744, 11776,
................................................................................
    5, 6, 3, 3, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 92, 15, 15, 15,
    15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
    15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
    15, 15, 15, 15, 15, 15, 15, 15, 92, 92, 0, 0, 15, 15, 15, 15, 15, 15,
    0, 0, 15, 15, 15, 15, 15, 15, 0, 0, 15, 15, 15, 15, 15, 15, 0, 0, 15,
    15, 15, 0, 0, 0, 4, 4, 7, 11, 14, 4, 4, 0, 14, 7, 7, 7, 7, 14, 14,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 17, 17, 17, 14, 14, 0, 0
#if TCL_UTF_MAX > 3
    ,15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 0, 15, 15, 15, 15,
    15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
    15, 15, 15, 15, 15, 0, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
    15, 15, 15, 15, 15, 15, 15, 15, 0, 15, 15, 0, 15, 15, 15, 15, 15, 15,
    15, 15, 15, 15, 15, 15, 15, 15, 15, 0, 0, 15, 15, 15, 15, 15, 15, 15,
    15, 15, 15, 15, 15, 15, 15, 0, 0, 3, 3, 3, 0, 0, 0, 0, 18, 18, 18,
    18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
................................................................................
    -2750143, -976319, -2746047, 2763650, 2762882, -2759615, -2751679,
    -2760383, -2760127, -2768575, 1859714, -9044927, -10823615, -12158,
    -10830783, -10833599, -10832575, -10830015, -10817983, -10824127,
    -10818751, 237633, -12223, -10830527, -9058239, 237698, 9949314,
    18, 17, 10305, 10370, 8769, 8834
};

#if TCL_UTF_MAX > 3
#   define UNICODE_OUT_OF_RANGE(ch) (((ch) & 0x1fffff) >= 0x2fa20)
#else
#   define UNICODE_OUT_OF_RANGE(ch) (((ch) & 0x1f0000) != 0)
#endif

/*
 * The following constants are used to determine the category of a
................................................................................
#define GetDelta(info) ((info) >> 8)

/*
 * This macro extracts the information about a character from the
 * Unicode character tables.
 */

#if TCL_UTF_MAX > 3
#   define GetUniCharInfo(ch) (groups[groupMap[pageMap[((ch) & 0x1fffff) >> OFFSET_BITS] | ((ch) & ((1 << OFFSET_BITS)-1))]])
#else
#   define GetUniCharInfo(ch) (groups[groupMap[pageMap[((ch) & 0xffff) >> OFFSET_BITS] | ((ch) & ((1 << OFFSET_BITS)-1))]])
#endif






|







 







|







 







|







 







|




191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
....
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
....
1609
1610
1611
1612
1613
1614
1615
1616
1617
1618
1619
1620
1621
1622
1623
....
1668
1669
1670
1671
1672
1673
1674
1675
1676
1677
1678
1679
    9920, 9920, 9920, 9920, 9920, 9920, 9920, 9920, 9920, 9920, 9920, 9920,
    9920, 9920, 9920, 9920, 9920, 1344, 1344, 1344, 1344, 1344, 1344, 1344,
    1344, 1344, 1344, 1344, 9952, 1344, 1344, 9984, 1824, 10016, 10048,
    10080, 1344, 1344, 10112, 10144, 1344, 1344, 1344, 1344, 1344, 1344,
    1344, 1344, 1344, 1344, 10176, 10208, 1344, 10240, 1344, 10272, 10304,
    10336, 10368, 10400, 10432, 1344, 1344, 1344, 10464, 10496, 64, 10528,
    10560, 10592, 4736, 10624, 10656
#if TCL_UTF_MAX > 3 || TCL_MAJOR_VERSION > 8 || TCL_MINOR_VERSION > 6
    ,10688, 10720, 10752, 1824, 1344, 1344, 1344, 8288, 10784, 10816, 10848,
    10880, 10912, 10944, 10976, 11008, 1824, 1824, 1824, 1824, 9280, 1344,
    11040, 11072, 1344, 11104, 11136, 11168, 11200, 1344, 11232, 1824,
    11264, 11296, 11328, 1344, 11360, 11392, 11424, 11456, 1344, 11488,
    1344, 11520, 1824, 1824, 1824, 1824, 1344, 1344, 1344, 1344, 1344,
    1344, 1344, 1344, 1344, 7776, 4704, 10272, 1824, 1824, 1824, 1824,
    11552, 11584, 11616, 11648, 4736, 11680, 1824, 11712, 11744, 11776,
................................................................................
    5, 6, 3, 3, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 92, 15, 15, 15,
    15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
    15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
    15, 15, 15, 15, 15, 15, 15, 15, 92, 92, 0, 0, 15, 15, 15, 15, 15, 15,
    0, 0, 15, 15, 15, 15, 15, 15, 0, 0, 15, 15, 15, 15, 15, 15, 0, 0, 15,
    15, 15, 0, 0, 0, 4, 4, 7, 11, 14, 4, 4, 0, 14, 7, 7, 7, 7, 14, 14,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 17, 17, 17, 14, 14, 0, 0
#if TCL_UTF_MAX > 3 || TCL_MAJOR_VERSION > 8 || TCL_MINOR_VERSION > 6
    ,15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 0, 15, 15, 15, 15,
    15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
    15, 15, 15, 15, 15, 0, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
    15, 15, 15, 15, 15, 15, 15, 15, 0, 15, 15, 0, 15, 15, 15, 15, 15, 15,
    15, 15, 15, 15, 15, 15, 15, 15, 15, 0, 0, 15, 15, 15, 15, 15, 15, 15,
    15, 15, 15, 15, 15, 15, 15, 0, 0, 3, 3, 3, 0, 0, 0, 0, 18, 18, 18,
    18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
................................................................................
    -2750143, -976319, -2746047, 2763650, 2762882, -2759615, -2751679,
    -2760383, -2760127, -2768575, 1859714, -9044927, -10823615, -12158,
    -10830783, -10833599, -10832575, -10830015, -10817983, -10824127,
    -10818751, 237633, -12223, -10830527, -9058239, 237698, 9949314,
    18, 17, 10305, 10370, 8769, 8834
};

#if TCL_UTF_MAX > 3 || TCL_MAJOR_VERSION > 8 || TCL_MINOR_VERSION > 6
#   define UNICODE_OUT_OF_RANGE(ch) (((ch) & 0x1fffff) >= 0x2fa20)
#else
#   define UNICODE_OUT_OF_RANGE(ch) (((ch) & 0x1f0000) != 0)
#endif

/*
 * The following constants are used to determine the category of a
................................................................................
#define GetDelta(info) ((info) >> 8)

/*
 * This macro extracts the information about a character from the
 * Unicode character tables.
 */

#if TCL_UTF_MAX > 3 || TCL_MAJOR_VERSION > 8 || TCL_MINOR_VERSION > 6
#   define GetUniCharInfo(ch) (groups[groupMap[pageMap[((ch) & 0x1fffff) >> OFFSET_BITS] | ((ch) & ((1 << OFFSET_BITS)-1))]])
#else
#   define GetUniCharInfo(ch) (groups[groupMap[pageMap[((ch) & 0xffff) >> OFFSET_BITS] | ((ch) & ((1 << OFFSET_BITS)-1))]])
#endif

Changes to tools/uniParse.tcl.

208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
...
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
...
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
...
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
    set last [expr {[llength $pMap] - 1}]
    for {set i 0} {$i <= $last} {incr i} {
	if {$i == [expr {0x10000 >> $shift}]} {
	    set line [string trimright $line " \t,"]
	    puts $f $line
	    set lastpage [expr {[lindex $line end] >> $shift}]
	    puts stdout "lastpage: $lastpage"
	    puts $f "#if TCL_UTF_MAX > 3"
	    set line "    ,"
	}
	append line [lindex $pMap $i]
	if {$i != $last} {
	    append line ", "
	}
	if {[string length $line] > 70} {
................................................................................
    set line "    "
    set lasti [expr {[llength $pages] - 1}]
    for {set i 0} {$i <= $lasti} {incr i} {
	set page [lindex $pages $i]
	set lastj [expr {[llength $page] - 1}]
	if {$i == ($lastpage + 1)} {
	    puts $f [string trimright $line " \t,"]
	    puts $f "#if TCL_UTF_MAX > 3"
	    set line "    ,"
	}
	for {set j 0} {$j <= $lastj} {incr j} {
	    append line [lindex $page $j]
	    if {$j != $lastj || $i != $lasti} {
		append line ", "
	    }
................................................................................
	    puts $f [string trimright $line]
	    set line "    "
	}
    }
    puts $f $line
    puts -nonewline $f "};

#if TCL_UTF_MAX > 3
#   define UNICODE_OUT_OF_RANGE(ch) (((ch) & 0x1fffff) >= [format 0x%x $next])
#else
#   define UNICODE_OUT_OF_RANGE(ch) (((ch) & 0x1f0000) != 0)
#endif

/*
 * The following constants are used to determine the category of a
................................................................................
#define GetDelta(info) ((info) >> 8)

/*
 * This macro extracts the information about a character from the
 * Unicode character tables.
 */

#if TCL_UTF_MAX > 3
#   define GetUniCharInfo(ch) (groups\[groupMap\[pageMap\[((ch) & 0x1fffff) >> OFFSET_BITS\] | ((ch) & ((1 << OFFSET_BITS)-1))\]\])
#else
#   define GetUniCharInfo(ch) (groups\[groupMap\[pageMap\[((ch) & 0xffff) >> OFFSET_BITS\] | ((ch) & ((1 << OFFSET_BITS)-1))\]\])
#endif
"

    close $f
}

uni::main

return






|







 







|







 







|







 







|












208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
...
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
...
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
...
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
    set last [expr {[llength $pMap] - 1}]
    for {set i 0} {$i <= $last} {incr i} {
	if {$i == [expr {0x10000 >> $shift}]} {
	    set line [string trimright $line " \t,"]
	    puts $f $line
	    set lastpage [expr {[lindex $line end] >> $shift}]
	    puts stdout "lastpage: $lastpage"
	    puts $f "#if TCL_UTF_MAX > 3 || TCL_MAJOR_VERSION > 8 || TCL_MINOR_VERSION > 6"
	    set line "    ,"
	}
	append line [lindex $pMap $i]
	if {$i != $last} {
	    append line ", "
	}
	if {[string length $line] > 70} {
................................................................................
    set line "    "
    set lasti [expr {[llength $pages] - 1}]
    for {set i 0} {$i <= $lasti} {incr i} {
	set page [lindex $pages $i]
	set lastj [expr {[llength $page] - 1}]
	if {$i == ($lastpage + 1)} {
	    puts $f [string trimright $line " \t,"]
	    puts $f "#if TCL_UTF_MAX > 3 || TCL_MAJOR_VERSION > 8 || TCL_MINOR_VERSION > 6"
	    set line "    ,"
	}
	for {set j 0} {$j <= $lastj} {incr j} {
	    append line [lindex $page $j]
	    if {$j != $lastj || $i != $lasti} {
		append line ", "
	    }
................................................................................
	    puts $f [string trimright $line]
	    set line "    "
	}
    }
    puts $f $line
    puts -nonewline $f "};

#if TCL_UTF_MAX > 3 || TCL_MAJOR_VERSION > 8 || TCL_MINOR_VERSION > 6
#   define UNICODE_OUT_OF_RANGE(ch) (((ch) & 0x1fffff) >= [format 0x%x $next])
#else
#   define UNICODE_OUT_OF_RANGE(ch) (((ch) & 0x1f0000) != 0)
#endif

/*
 * The following constants are used to determine the category of a
................................................................................
#define GetDelta(info) ((info) >> 8)

/*
 * This macro extracts the information about a character from the
 * Unicode character tables.
 */

#if TCL_UTF_MAX > 3 || TCL_MAJOR_VERSION > 8 || TCL_MINOR_VERSION > 6
#   define GetUniCharInfo(ch) (groups\[groupMap\[pageMap\[((ch) & 0x1fffff) >> OFFSET_BITS\] | ((ch) & ((1 << OFFSET_BITS)-1))\]\])
#else
#   define GetUniCharInfo(ch) (groups\[groupMap\[pageMap\[((ch) & 0xffff) >> OFFSET_BITS\] | ((ch) & ((1 << OFFSET_BITS)-1))\]\])
#endif
"

    close $f
}

uni::main

return