Tcl Source Code

Check-in [3e8ada19f5]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Fix Tcl_UtfToUniCharDString() function, handling invalid byte at the end of the string: Not quite correct for bytes between 0x80-0x9F, according to TIP
Downloads: Tarball | ZIP archive
Timelines: family | ancestors | descendants | both | core-8-branch
Files: files | file ages | folders
SHA3-256: 3e8ada19f5d691e05350bdc0a83d0609513db5e546d791ed81140082c420e307
User & Date: jan.nijtmans 2019-03-20 22:45:51.323
Context
2019-03-21
19:56
Remove incorrect comment. Simplify handling of last bytes in Tcl_UniCharToUtfDString(), since TclUt... check-in: 33251a211f user: jan.nijtmans tags: core-8-branch
2019-03-20
22:54
Merge 8.7 check-in: 3ea5d3e8a3 user: jan.nijtmans tags: utf-max
22:51
Merge 8.7 check-in: bb9b52ab82 user: jan.nijtmans tags: trunk
22:45
Fix Tcl_UtfToUniCharDString() function, handling invalid byte at the end of the string: Not quite co... check-in: 3e8ada19f5 user: jan.nijtmans tags: core-8-branch
2019-03-18
22:32
Comment Comment Tcl_UniCharToUtf() better, what happens handling surrogates. Add type cast in tclUtf... check-in: b02df08680 user: jan.nijtmans tags: core-8-branch
Changes
Unified Diff Ignore Whitespace Patch
Changes to generic/tclUtf.c.
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
	    if ((unsigned)(*chPtr - 0x10000) <= 0xFFFFF) {
		return 4;
	    }
#endif
	}

	/*
	 * A four-byte-character lead-byte not followed by two trail-bytes
	 * represents itself.
	 */
    }

    *chPtr = byte;
    return 1;
}







|







449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
	    if ((unsigned)(*chPtr - 0x10000) <= 0xFFFFF) {
		return 4;
	    }
#endif
	}

	/*
	 * A four-byte-character lead-byte not followed by three trail-bytes
	 * represents itself.
	 */
    }

    *chPtr = byte;
    return 1;
}
615
616
617
618
619
620
621


622
623
624
625
626
627
628
629
630
631
632
    end = src + length - 4;
    while (p < end) {
	p += TclUtfToUniChar(p, &ch);
	*w++ = ch;
    }
    end += 4;
    while (p < end) {


	if (Tcl_UtfCharComplete(p, end-p)) {
	    p += TclUtfToUniChar(p, &ch);
	} else if (((UCHAR(*p)-0x80)) < 0x20) {
	    ch = cp1252[UCHAR(*p++)-0x80];
	} else {
	    ch = UCHAR(*p++);
	}
	*w++ = ch;
    }
    *w = '\0';
    Tcl_DStringSetLength(dsPtr,







>
>
|

<
<







615
616
617
618
619
620
621
622
623
624
625


626
627
628
629
630
631
632
    end = src + length - 4;
    while (p < end) {
	p += TclUtfToUniChar(p, &ch);
	*w++ = ch;
    }
    end += 4;
    while (p < end) {
	if (((unsigned)(UCHAR(*p)-0x80)) < 0x20) {
	    ch = cp1252[UCHAR(*p++)-0x80];
	} else if (Tcl_UtfCharComplete(p, end-p)) {
	    p += TclUtfToUniChar(p, &ch);


	} else {
	    ch = UCHAR(*p++);
	}
	*w++ = ch;
    }
    *w = '\0';
    Tcl_DStringSetLength(dsPtr,
669
670
671
672
673
674
675


676
677
678
679
680
681
682
683
684
685
686
    end = src + length - 4;
    while (p < end) {
	p += TclUtfToWChar(p, &ch);
	*w++ = ch;
    }
    end += 4;
    while (p < end) {


	if (Tcl_UtfCharComplete(p, end-p)) {
	    p += TclUtfToWChar(p, &ch);
	} else if (((UCHAR(*p)-0x80)) < 0x20) {
	    ch = cp1252[UCHAR(*p++)-0x80];
	} else {
	    ch = UCHAR(*p++);
	}
	*w++ = ch;
    }
    *w = '\0';
    Tcl_DStringSetLength(dsPtr,







>
>
|
|
<
<







669
670
671
672
673
674
675
676
677
678
679


680
681
682
683
684
685
686
    end = src + length - 4;
    while (p < end) {
	p += TclUtfToWChar(p, &ch);
	*w++ = ch;
    }
    end += 4;
    while (p < end) {
	if (((unsigned)(UCHAR(*p)-0x80)) < 0x20) {
	    ch = cp1252[UCHAR(*p++)-0x80];
	} else if (Tcl_UtfCharComplete(p, end-p)) {
		    p += TclUtfToWChar(p, &ch);


	} else {
	    ch = UCHAR(*p++);
	}
	*w++ = ch;
    }
    *w = '\0';
    Tcl_DStringSetLength(dsPtr,