Tcl Source Code

Changes On Branch tip-388-impl

Changes In Branch tip-388-impl Excluding Merge-Ins

This is equivalent to a diff from 3ea7c67cbf to 8b3fef2633

2011-09-16
13:23
IMPLEMENTATION OF TIP #388 check-in: 4d6af4f7a4 user: jan.nijtmans tags: trunk, potential incompatibility
08:34
[Bug 3391977]: Ensure that the -headers option to http::geturl overrides the -type option (importan... check-in: ece59da1db user: dkf tags: trunk
08:14
merge to feature branch check-in: 7c746c8b38 user: jan.nijtmans tags: tip-389-impl
08:12
merge trunk to feature branch Closed-Leaf check-in: 8b3fef2633 user: jan.nijtmans tags: tip-388-impl
2011-09-15
16:27
3408408 Partial improvement by sharing as literals the computed values of constant subexpressions wh... check-in: 3ea7c67cbf user: dgp tags: trunk
2011-09-13
20:04
3390638 Workaround broken solaris studio cc optimizer. Thanks to Wolfgang S. Kechel. check-in: b9fb2d7653 user: dgp tags: trunk
2011-08-29
07:32
Merge to feature branch check-in: a28c1f710a user: jan.nijtmans tags: tip-388-impl

Changes to doc/Tcl.n.

     2      2   '\" Copyright (c) 1993 The Regents of the University of California.
     3      3   '\" Copyright (c) 1994-1996 Sun Microsystems, Inc.
     4      4   '\"
     5      5   '\" See the file "license.terms" for information on usage and redistribution
     6      6   '\" of this file, and for a DISCLAIMER OF ALL WARRANTIES.
     7      7   '\"
     8      8   .so man.macros
     9         -.TH Tcl n "8.5" Tcl "Tcl Built-In Commands"
            9  +.TH Tcl n "8.6" Tcl "Tcl Built-In Commands"
    10     10   .BS
    11     11   .SH NAME
    12     12   Tcl \- Tool Command Language
    13     13   .SH SYNOPSIS
    14     14   Summary of Tcl language syntax.
    15     15   .BE
    16     16   .SH DESCRIPTION
................................................................................
   189    189   .TP 7
   190    190   \e\e
   191    191   Backslash
   192    192   .PQ \e "" .
   193    193   .TP 7
   194    194   \e\fIooo\fR 
   195    195   .
   196         -The digits \fIooo\fR (one, two, or three of them) give an eight-bit octal 
   197         -value for the Unicode character that will be inserted.  The upper bits of the
   198         -Unicode character will be 0.
          196  +The digits \fIooo\fR (one, two, or three of them) give an eight-bit octal 
          197  +value for the Unicode character that will be inserted, in the range \fI000\fR
          198  +- \fI377\fR.  The parser will stop just before this range overflows, or when
          199  +the maximum of three digits is reached.  The upper bits of the Unicode
          200  +character will be 0.
   199    201   .TP 7
   200    202   \e\fBx\fIhh\fR 
   201    203   .
   202         -The hexadecimal digits \fIhh\fR give an eight-bit hexadecimal value for the
   203         -Unicode character that will be inserted.  Any number of hexadecimal digits
   204         -may be present; however, all but the last two are ignored (the result is
   205         -always a one-byte quantity).  The upper bits of the Unicode character will
   206         -be 0.
          204  +The hexadecimal digits \fIhh\fR (one or two of them) give an eight-bit
          205  +hexadecimal value for the Unicode character that will be inserted.  The upper
          206  +bits of the Unicode character will be 0.
   207    207   .TP 7
   208    208   \e\fBu\fIhhhh\fR 
   209    209   .
   210    210   The hexadecimal digits \fIhhhh\fR (one, two, three, or four of them) give a
   211    211   sixteen-bit hexadecimal value for the Unicode character that will be
   212         -inserted.
          212  +inserted.  The upper bits of the Unicode character will be 0.
          213  +.TP 7
          214  +\e\fBU\fIhhhhhhhh\fR 
          215  +.
          216  +The hexadecimal digits \fIhhhhhhhh\fR (one up to eight of them) give a
          217  +twenty-one-bit hexadecimal value for the Unicode character that will be
          218  +inserted, in the range U+0000..U+10FFFF.  The parser will stop just
          219  +before this range overflows, or when the maximum of eight digits
          220  +is reached.  The upper bits of the Unicode character will be 0.
          221  +.PP
          222  +The range U+010000..U+10FFFD is reserved for the future.
   213    223   .PP
   214    224   Backslash substitution is not performed on words enclosed in braces,
   215    225   except for backslash-newline as described above.
   216    226   .RE
   217    227   .IP "[10] \fBComments.\fR"
   218    228   If a hash character
   219    229   .PQ #
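
The \U rule described in the doc/Tcl.n hunk above (up to eight hexadecimal digits, stopping just before the value would leave the range U+0000..U+10FFFF) can be sketched in C as follows. This is an illustrative sketch only, not the Tcl implementation: the name parse_hex_escape and its signature are hypothetical, and the early-stop test (value > 0x10FFF) is borrowed from the check visible in this diff.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of Tcl): read at most max_digits
 * hexadecimal digits from src. Parsing stops at the first non-hex
 * character, or as soon as one more digit would push the value
 * past 0x10FFFF. The value is stored in *result; the return value
 * is the number of digits consumed.
 */
static size_t
parse_hex_escape(const char *src, size_t max_digits, unsigned *result)
{
    unsigned value = 0;
    size_t used = 0;

    while (used < max_digits) {
        unsigned char ch = (unsigned char) src[used];

        /* value > 0x10FFF means value * 16 would exceed 0x10FFFF. */
        if (!isxdigit(ch) || value > 0x10FFF) {
            break;
        }
        value <<= 4;
        if (ch >= 'a') {
            value |= (unsigned) (ch - 'a' + 10);
        } else if (ch >= 'A') {
            value |= (unsigned) (ch - 'A' + 10);
        } else {
            value |= (unsigned) (ch - '0');
        }
        used++;
    }
    *result = value;
    return used;
}
```

With this sketch, the input "4e21" consumes four digits and yields 0x4E21, while "00110000" stops after seven digits with the value 0x11000, leaving the final "0" unconsumed.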

Changes to doc/re_syntax.n.

   355    355   .TP
   356    356   \fB\et\fR
   357    357   .
   358    358   horizontal tab, as in C
   359    359   .TP
   360    360   \fB\eu\fIwxyz\fR
   361    361   .
   362         -(where \fIwxyz\fR is exactly four hexadecimal digits) the Unicode
          362  +(where \fIwxyz\fR is one up to four hexadecimal digits) the Unicode
   363    363   character \fBU+\fIwxyz\fR in the local byte ordering
   364    364   .TP
   365    365   \fB\eU\fIstuvwxyz\fR
   366    366   .
   367         -(where \fIstuvwxyz\fR is exactly eight hexadecimal digits) reserved
   368         -for a somewhat-hypothetical Unicode extension to 32 bits
          367  +(where \fIstuvwxyz\fR is one up to eight hexadecimal digits) reserved
          368  +for a Unicode extension up to 21 bits. The digits are parsed until the
          369  +first non-hexadecimal character is encountered, the maximum of eight
          370  +hexadecimal digits is reached, or an overflow would occur in the maximum
          371  +value of \fBU+\fI10ffff\fR.
   369    372   .TP
   370    373   \fB\ev\fR
   371    374   .
   372    375   vertical tab, as in C are all available.
   373    376   .TP
   374         -\fB\ex\fIhhh\fR
          377  +\fB\ex\fIhh\fR
   375    378   .
   376         -(where \fIhhh\fR is any sequence of hexadecimal digits) the character
   377         -whose hexadecimal value is \fB0x\fIhhh\fR (a single character no
   378         -matter how many hexadecimal digits are used).
          379  +(where \fIhh\fR is one or two hexadecimal digits) the character
          380  +whose hexadecimal value is \fB0x\fIhh\fR.
   379    381   .TP
   380    382   \fB\e0\fR
   381    383   .
   382    384   the character whose value is \fB0\fR
   383    385   .TP
          386  +\fB\e\fIxyz\fR
          387  +.
          388  +(where \fIxyz\fR is exactly three octal digits, and is not a \fIback
          389  +reference\fR (see below)) the character whose octal value is
          390  +\fB0\fIxyz\fR. The first digit must be in the range 0-3, otherwise
          391  +the two-digit form is assumed.
          392  +.TP
   384    393   \fB\e\fIxy\fR
   385    394   .
   386    395   (where \fIxy\fR is exactly two octal digits, and is not a \fIback
   387    396   reference\fR (see below)) the character whose octal value is
   388    397   \fB0\fIxy\fR
   389         -.TP
   390         -\fB\e\fIxyz\fR
   391         -.
   392         -(where \fIxyz\fR is exactly three octal digits, and is not a back
   393         -reference (see below)) the character whose octal value is
   394         -\fB0\fIxyz\fR
   395    398   .RE
   396    399   .PP
   397    400   Hexadecimal digits are
   398    401   .QR \fB0\fR \fB9\fR ,
   399    402   .QR \fBa\fR \fBf\fR ,
   400    403   and
   401    404   .QR \fBA\fR \fBF\fR .

Changes to generic/regc_lex.c.

   738    738    ^ static int lexescape(struct vars *);
   739    739    */
   740    740   static int			/* not actually used, but convenient for RETV */
   741    741   lexescape(
   742    742       struct vars *v)
   743    743   {
   744    744       chr c;
          745  +    int i;
   745    746       static const chr alert[] = {
   746    747   	CHR('a'), CHR('l'), CHR('e'), CHR('r'), CHR('t')
   747    748       };
   748    749       static const chr esc[] = {
   749    750   	CHR('E'), CHR('S'), CHR('C')
   750    751       };
   751    752       const chr *save;
................................................................................
   814    815   	NOTE(REG_ULOCALE);
   815    816   	RETV(CCLASS, 'S');
   816    817   	break;
   817    818       case CHR('t'):
   818    819   	RETV(PLAIN, CHR('\t'));
   819    820   	break;
   820    821       case CHR('u'):
   821         -	c = lexdigits(v, 16, 4, 4);
          822  +	c = (uchr) lexdigits(v, 16, 1, 4);
   822    823   	if (ISERR()) {
   823    824   	    FAILW(REG_EESCAPE);
   824    825   	}
   825    826   	RETV(PLAIN, c);
   826    827   	break;
   827    828       case CHR('U'):
   828         -	c = lexdigits(v, 16, 8, 8);
          829  +	i = lexdigits(v, 16, 1, 8);
   829    830   	if (ISERR()) {
   830    831   	    FAILW(REG_EESCAPE);
   831    832   	}
   832         -	RETV(PLAIN, c);
          833  +	if (i > 0xFFFF) {
          834  +	    /* TODO: output a Surrogate pair
          835  +	     */
          836  +	    i = 0xFFFD;
          837  +	}
          838  +	RETV(PLAIN, (uchr) i);
   833    839   	break;
   834    840       case CHR('v'):
   835    841   	RETV(PLAIN, CHR('\v'));
   836    842   	break;
   837    843       case CHR('w'):
   838    844   	NOTE(REG_ULOCALE);
   839    845   	RETV(CCLASS, 'w');
................................................................................
   840    846   	break;
   841    847       case CHR('W'):
   842    848   	NOTE(REG_ULOCALE);
   843    849   	RETV(CCLASS, 'W');
   844    850   	break;
   845    851       case CHR('x'):
   846    852   	NOTE(REG_UUNPORT);
   847         -	c = lexdigits(v, 16, 1, 255);	/* REs >255 long outside spec */
          853  +	c = (uchr) lexdigits(v, 16, 1, 2);
   848    854   	if (ISERR()) {
   849    855   	    FAILW(REG_EESCAPE);
   850    856   	}
   851    857   	RETV(PLAIN, c);
   852    858   	break;
   853    859       case CHR('y'):
   854    860   	NOTE(REG_ULOCALE);
................................................................................
   862    868   	RETV(SEND, 0);
   863    869   	break;
   864    870       case CHR('1'): case CHR('2'): case CHR('3'): case CHR('4'):
   865    871       case CHR('5'): case CHR('6'): case CHR('7'): case CHR('8'):
   866    872       case CHR('9'):
   867    873   	save = v->now;
   868    874   	v->now--;		/* put first digit back */
   869         -	c = lexdigits(v, 10, 1, 255);	/* REs >255 long outside spec */
          875  +	c = (uchr) lexdigits(v, 10, 1, 255);	/* REs >255 long outside spec */
   870    876   	if (ISERR()) {
   871    877   	    FAILW(REG_EESCAPE);
   872    878   	}
   873    879   
   874    880   	/*
   875    881   	 * Ugly heuristic (first test is "exactly 1 digit?")
   876    882   	 */
................................................................................
   889    895   	/*
   890    896   	 * And fall through into octal number.
   891    897   	 */
   892    898   
   893    899       case CHR('0'):
   894    900   	NOTE(REG_UUNPORT);
   895    901   	v->now--;		/* put first digit back */
   896         -	c = lexdigits(v, 8, 1, 3);
          902  +	c = (uchr) lexdigits(v, 8, 1, 3);
   897    903   	if (ISERR()) {
   898    904   	    FAILW(REG_EESCAPE);
          905  +	}
          906  +	if (c > 0xff) {
          907  +	    /* out of range, so we handled one digit too much */
          908  +	    v->now--;
          909  +	    c >>= 3;
   899    910   	}
   900    911   	RETV(PLAIN, c);
   901    912   	break;
   902    913       default:
   903    914   	assert(iscalpha(c));
   904    915   	FAILW(REG_EESCAPE);	/* unknown alphabetic escape */
   905    916   	break;
   906    917       }
   907    918       assert(NOTREACHED);
   908    919   }
   909    920   
   910    921   /*
   911    922    - lexdigits - slurp up digits and return chr value
   912         - ^ static chr lexdigits(struct vars *, int, int, int);
          923  + ^ static int lexdigits(struct vars *, int, int, int);
   913    924    */
   914         -static chr			/* chr value; errors signalled via ERR */
          925  +static int			/* chr value; errors signalled via ERR */
   915    926   lexdigits(
   916    927       struct vars *v,
   917    928       int base,
   918    929       int minlen,
   919    930       int maxlen)
   920    931   {
   921         -    uchr n;			/* unsigned to avoid overflow misbehavior */
          932  +    int n;
   922    933       int len;
   923    934       chr c;
   924    935       int d;
   925    936       const uchr ub = (uchr) base;
   926    937   
   927    938       n = 0;
   928    939       for (len = 0; len < maxlen && !ATEOS(); len++) {
          940  +	if (n > 0x10fff) {
          941  +	    /* Stop when continuing would otherwise overflow */
          942  +	    break;
          943  +	}
   929    944   	c = *v->now++;
   930    945   	switch (c) {
   931    946   	case CHR('0'): case CHR('1'): case CHR('2'): case CHR('3'):
   932    947   	case CHR('4'): case CHR('5'): case CHR('6'): case CHR('7'):
   933    948   	case CHR('8'): case CHR('9'):
   934    949   	    d = DIGITVAL(c);
   935    950   	    break;
................................................................................
   954    969   	}
   955    970   	n = n*ub + (uchr)d;
   956    971       }
   957    972       if (len < minlen) {
   958    973   	ERR(REG_EESCAPE);
   959    974       }
   960    975   
   961         -    return (chr)n;
          976  +    return n;
   962    977   }
   963    978   
   964    979   /*
   965    980    - brenext - get next BRE token
   966    981    * This is much like EREs except for all the stupid backslashes and the
   967    982    * context-dependency of some things.
   968    983    ^ static int brenext(struct vars *, pchr);
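
The octal handling added to lexescape() in the regc_lex.c hunk above reads up to three octal digits and, if the result exceeds 0xff, gives the last digit back. A standalone sketch of that back-off idea follows. It is a simplified illustration under stated assumptions: lex_octal_backoff is a hypothetical name, it works on a plain char string rather than the lexer's struct vars, and it omits error reporting.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of Tcl): read up to three octal
 * digits from src. If the three-digit value does not fit in eight
 * bits, un-read the last digit and drop it from the value, so the
 * escape is treated as the two-digit form.
 */
static size_t
lex_octal_backoff(const char *src, unsigned *result)
{
    unsigned value = 0;
    size_t used = 0;

    while (used < 3 && src[used] >= '0' && src[used] <= '7') {
        value = value * 8 + (unsigned) (src[used] - '0');
        used++;
    }
    if (value > 0xff) {
        /* Out of range: one digit too many was consumed. */
        used--;
        value >>= 3;
    }
    *result = value;
    return used;
}
```

For the input "701b" this consumes two digits and yields 070 (56, the character "8"), leaving "1b" as ordinary text, which is consistent with test 15.13 in tests/reg.test.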

Changes to generic/regcomp.c.

    75     75   /* === regc_lex.c === */
    76     76   static void lexstart(struct vars *);
    77     77   static void prefixes(struct vars *);
    78     78   static void lexnest(struct vars *, const chr *, const chr *);
    79     79   static void lexword(struct vars *);
    80     80   static int next(struct vars *);
    81     81   static int lexescape(struct vars *);
    82         -static chr lexdigits(struct vars *, int, int, int);
           82  +static int lexdigits(struct vars *, int, int, int);
    83     83   static int brenext(struct vars *, pchr);
    84     84   static void skip(struct vars *);
    85     85   static chr newline(NOPARMS);
    86     86   #ifdef REG_DEBUG
    87     87   static const chr *ch(NOPARMS);
    88     88   #endif
    89     89   static chr chrnamed(struct vars *, const chr *, const chr *, pchr);

Changes to generic/regcustom.h.

    93     93   typedef Tcl_UniChar chr;	/* The type itself. */
    94     94   typedef int pchr;		/* What it promotes to. */
    95     95   typedef unsigned uchr;		/* Unsigned type that will hold a chr. */
    96     96   typedef int celt;		/* Type to hold chr, or NOCELT */
    97     97   #define	NOCELT (-1)		/* Celt value which is not valid chr */
    98     98   #define	CHR(c) (UCHAR(c))	/* Turn char literal into chr literal */
    99     99   #define	DIGITVAL(c) ((c)-'0')	/* Turn chr digit into its value */
   100         -#if TCL_UTF_MAX > 3
          100  +#if TCL_UTF_MAX > 4
   101    101   #define	CHRBITS	32		/* Bits in a chr; must not use sizeof */
   102    102   #define	CHR_MIN	0x00000000	/* Smallest and largest chr; the value */
   103    103   #define	CHR_MAX	0xffffffff	/* CHR_MAX-CHR_MIN+1 should fit in uchr */
   104    104   #else
   105    105   #define	CHRBITS	16		/* Bits in a chr; must not use sizeof */
   106    106   #define	CHR_MIN	0x0000		/* Smallest and largest chr; the value */
   107    107   #define	CHR_MAX	0xffff		/* CHR_MAX-CHR_MIN+1 should fit in uchr */

Changes to generic/tcl.h.

  2149   2149   #define TCL_CONVERT_MULTIBYTE	(-1)
  2150   2150   #define TCL_CONVERT_SYNTAX	(-2)
  2151   2151   #define TCL_CONVERT_UNKNOWN	(-3)
  2152   2152   #define TCL_CONVERT_NOSPACE	(-4)
  2153   2153   
  2154   2154   /*
  2155   2155    * The maximum number of bytes that are necessary to represent a single
  2156         - * Unicode character in UTF-8. The valid values should be 3 or 6 (or perhaps 1
  2157         - * if we want to support a non-unicode enabled core). If 3, then Tcl_UniChar
  2158         - * must be 2-bytes in size (UCS-2) (the default). If 6, then Tcl_UniChar must
  2159         - * be 4-bytes in size (UCS-4). At this time UCS-2 mode is the default and
  2160         - * recommended mode. UCS-4 is experimental and not recommended. It works for
  2161         - * the core, but most extensions expect UCS-2.
         2156  + * Unicode character in UTF-8. The valid values should be 3, 4 or 6
         2157  + * (or perhaps 1 if we want to support a non-unicode enabled core). If 3 or
         2158  + * 4, then Tcl_UniChar must be 2-bytes in size (UCS-2) (the default). If 6,
         2159  + * then Tcl_UniChar must be 4-bytes in size (UCS-4). At this time UCS-2 mode
         2160  + * is the default and recommended mode. UCS-4 is experimental and not
         2161  + * recommended. It works for the core, but most extensions expect UCS-2.
  2162   2162    */
  2163   2163   
  2164   2164   #ifndef TCL_UTF_MAX
  2165   2165   #define TCL_UTF_MAX		3
  2166   2166   #endif
  2167   2167   
  2168   2168   /*
  2169   2169    * This represents a Unicode character. Any changes to this should also be
  2170   2170    * reflected in regcustom.h.
  2171   2171    */
  2172   2172   
  2173         -#if TCL_UTF_MAX > 3
         2173  +#if TCL_UTF_MAX > 4
  2174   2174       /*
  2175   2175        * unsigned int isn't 100% accurate as it should be a strict 4-byte value
  2176   2176        * (perhaps wchar_t). 64-bit systems may have troubles. The size of this
  2177   2177        * value must be reflected correctly in regcustom.h and
  2178   2178        * in tclEncoding.c.
  2179   2179        * XXX: Tcl is currently UCS-2 and planning UTF-16 for the Unicode
  2180   2180        * XXX: string rep that Tcl_UniChar represents.  Changing the size

Changes to generic/tclParse.c.

   750    750   {
   751    751       int result = 0;
   752    752       register const char *p = src;
   753    753   
   754    754       while (numBytes--) {
   755    755   	unsigned char digit = UCHAR(*p);
   756    756   
   757         -	if (!isxdigit(digit)) {
          757  +	if (!isxdigit(digit) || (result > 0x10fff)) {
   758    758   	    break;
   759    759   	}
   760    760   
   761    761   	p++;
   762    762   	result <<= 4;
   763    763   
   764    764   	if (digit >= 'a') {
................................................................................
   862    862       case 't':
   863    863   	result = 0x9;
   864    864   	break;
   865    865       case 'v':
   866    866   	result = 0xb;
   867    867   	break;
   868    868       case 'x':
   869         -	count += TclParseHex(p+1, numBytes-2, &result);
          869  +	count += TclParseHex(p+1, (numBytes > 3) ? 2 : numBytes-2, &result);
   870    870   	if (count == 2) {
   871    871   	    /*
   872    872   	     * No hexadigits -> This is just "x".
   873    873   	     */
   874    874   
   875    875   	    result = 'x';
   876    876   	} else {
................................................................................
   884    884   	count += TclParseHex(p+1, (numBytes > 5) ? 4 : numBytes-2, &result);
   885    885   	if (count == 2) {
   886    886   	    /*
   887    887   	     * No hexadigits -> This is just "u".
   888    888   	     */
   889    889   	    result = 'u';
   890    890   	}
          891  +	break;
          892  +    case 'U':
          893  +	count += TclParseHex(p+1, (numBytes > 9) ? 8 : numBytes-2, &result);
          894  +	if (count == 2) {
          895  +	    /*
          896  +	     * No hexadigits -> This is just "U".
          897  +	     */
          898  +	    result = 'U';
          899  +	}
   891    900   	break;
   892    901       case '\n':
   893    902   	count--;
   894    903   	do {
   895    904   	    p++;
   896    905   	    count++;
   897    906   	} while ((count < numBytes) && ((*p == ' ') || (*p == '\t')));
................................................................................
   913    922   		    || (UCHAR(*p) >= '8')) {
   914    923   		break;
   915    924   	    }
   916    925   	    count = 3;
   917    926   	    result = (result << 3) + (*p - '0');
   918    927   	    p++;
   919    928   	    if ((numBytes == 3) || !isdigit(UCHAR(*p))	/* INTL: digit */
   920         -		    || (UCHAR(*p) >= '8')) {
          929  +		    || (UCHAR(*p) >= '8') || (result >= 0x20)) {
   921    930   		break;
   922    931   	    }
   923    932   	    count = 4;
   924    933   	    result = UCHAR((result << 3) + (*p - '0'));
   925    934   	    break;
   926    935   	}
   927    936   
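
In contrast to the regex lexer, the tclParse.c hunk above checks the range before consuming the third octal digit (the added result >= 0x20 test). A minimal sketch of that look-ahead variant is shown below. The function name parse_octal_escape is hypothetical and the loop form is a simplification of the unrolled code in the diff.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of Tcl): read up to three octal
 * digits from src, refusing a further digit when the value so far
 * is already 0x20 or more, because value * 8 would then exceed 0377.
 */
static size_t
parse_octal_escape(const char *src, unsigned *result)
{
    unsigned value = 0;
    size_t used = 0;

    while (used < 3 && src[used] >= '0' && src[used] <= '7') {
        if (value >= 0x20) {
            break;		/* another digit would overflow eight bits */
        }
        value = (value << 3) + (unsigned) (src[used] - '0');
        used++;
    }
    *result = value;
    return used;
}
```

Here "741" consumes only "74" and yields 60, matching the new bsCheck \741 60 case in tests/utf.test, while "340" consumes all three digits and yields 224.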

Changes to tests/reg.test.

   622    622   expectMatch	13.10 MP	"a\\cHb"	"a\bb"	"a\bb"
   623    623   expectMatch	13.11 LMP	"a\\e"		"a\033"	"a\033"
   624    624   expectMatch	13.12 P		"a\\fb"		"a\fb"	"a\fb"
   625    625   expectMatch	13.13 P		"a\\nb"		"a\nb"	"a\nb"
   626    626   expectMatch	13.14 P		"a\\rb"		"a\rb"	"a\rb"
   627    627   expectMatch	13.15 P		"a\\tb"		"a\tb"	"a\tb"
   628    628   expectMatch	13.16 P		"a\\u0008x"	"a\bx"	"a\bx"
   629         -expectError	13.17 -		{a\u008x}	EESCAPE
          629  +expectMatch	13.17 P		{a\u008x}	"a\bx"	"a\bx"
   630    630   expectMatch	13.18 P		"a\\u00088x"	"a\b8x"	"a\b8x"
   631    631   expectMatch	13.19 P		"a\\U00000008x"	"a\bx"	"a\bx"
   632         -expectError	13.20 -		{a\U0000008x}	EESCAPE
          632  +expectMatch	13.20 P		{a\U0000008x}	"a\bx"	"a\bx"
   633    633   expectMatch	13.21 P		"a\\vb"		"a\vb"	"a\vb"
   634    634   expectMatch	13.22 MP	"a\\x08x"	"a\bx"	"a\bx"
   635    635   expectError	13.23 -		{a\xq}		EESCAPE
   636         -expectMatch	13.24 MP	"a\\x0008x"	"a\bx"	"a\bx"
          636  +expectMatch	13.24 MP	"a\\x08x"	"a\bx"	"a\bx"
   637    637   expectError	13.25 -		{a\z}		EESCAPE
   638    638   expectMatch	13.26 MP	"a\\010b"	"a\bb"	"a\bb"
          639  +expectMatch	13.27 P		"a\\U00001234x"	"a\u1234x"	"a\u1234x"
          640  +expectMatch	13.28 P		{a\U00001234x}	"a\u1234x"	"a\u1234x"
          641  +expectMatch	13.29 P		"a\\U0001234x"	"a\u1234x"	"a\u1234x"
          642  +expectMatch	13.30 P		{a\U0001234x}	"a\u1234x"	"a\u1234x"
          643  +expectMatch	13.31 P		"a\\U000012345x"	"a\u12345x"	"a\u12345x"
          644  +expectMatch	13.32 P		{a\U000012345x}	"a\u12345x"	"a\u12345x"
          645  +expectMatch	13.33 P		"a\\U1000000x"	"a\ufffd0x"	"a\ufffd0x"
          646  +expectMatch	13.34 P		{a\U1000000x}	"a\ufffd0x"	"a\ufffd0x"
   639    647   
   640    648   
   641    649   doing 14 "back references"
   642    650   # ugh
   643    651   expectMatch	14.1  RP	{a(b*)c\1}	abbcbb	abbcbb	bb
   644    652   expectMatch	14.2  RP	{a(b*)c\1}	ac	ac	""
   645    653   expectNomatch	14.3  RP	{a(b*)c\1}	abbcb
................................................................................
   678    686   	"abbbbbbbbbbbc" abbbbbbbbbbbc b b b b b b b b b b
   679    687   # but we're fussy about border cases -- guys who want octal should use the zero
   680    688   expectError	15.9  -	{a((((((((((b\10))))))))))c}	ESUBREG
   681    689   # BREs don't have octal, EREs don't have backrefs
   682    690   expectMatch	15.10 MP	"a\\12b"	"a\nb"	"a\nb"
   683    691   expectError	15.11 b		{a\12b}		ESUBREG
   684    692   expectMatch	15.12 eAS	{a\12b}		a12b	a12b
          693  +expectMatch	15.13 MP	{a\701b}	a\u00381b	a\u00381b
   685    694   
   686    695   
   687    696   doing 16 "expanded syntax"
   688    697   expectMatch	16.1 xP		"a b c"		"abc"	"abc"
   689    698   expectMatch	16.2 xP		"a b #oops\nc\td"	"abcd"	"abcd"
   690    699   expectMatch	16.3 x		"a\\ b\\\tc"	"a b\tc"	"a b\tc"
   691    700   expectMatch	16.4 xP		"a b\\#c"	"ab#c"	"ab#c"

Changes to tests/utf.test.

   167    167   bsCheck \14	12
   168    168   bsCheck \141	97
   169    169   bsCheck b\0	98
   170    170   bsCheck \x	120
   171    171   bsCheck \xa	10
   172    172   bsCheck \xA	10
   173    173   bsCheck \x41	65
   174         -bsCheck \x541	65
          174  +bsCheck \x541	84
   175    175   bsCheck \u	117
   176    176   bsCheck \uk	117
   177    177   bsCheck \u41	65
   178    178   bsCheck \ua	10
   179    179   bsCheck \uA	10
   180    180   bsCheck \340	224
   181    181   bsCheck \ua1	161
   182    182   bsCheck \u4e21	20001
          183  +bsCheck \741	60
          184  +bsCheck \U	85
          185  +bsCheck \Uk	85
          186  +bsCheck \U41	65
          187  +bsCheck \Ua	10
          188  +bsCheck \UA	10
          189  +bsCheck \Ua1	161
          190  +bsCheck \U4e21	20001
          191  +bsCheck \U004e21	20001
          192  +bsCheck \U00004e21	20001
          193  +bsCheck \U00110000	65533
          194  +bsCheck \Uffffffff	65533
   183    195   
   184    196   test utf-11.1 {Tcl_UtfToUpper} {
   185    197       string toupper {}
   186    198   } {}
   187    199   test utf-11.2 {Tcl_UtfToUpper} {
   188    200       string toupper abc
   189    201   } ABC