Tcl Source Code: Changes On Branch encodings-with-flags

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Changes In Branch encodings-with-flags Excluding Merge-Ins

This is equivalent to a diff from 59b0904ac7 to b3471f0625

2022-03-17
09:51		TIP #601: Make "encoding convertto/convertfrom" throw exceptions check-in: fb919be8a3 user: jan.nijtmans tags: core-8-branch
2022-03-14
18:06		TIP605 encoding failindex: prepare TCL 9.0 main branch with branch "encodings-with-flags" merged. check-in: c8fc04b93b user: oehhar tags: tip607-encoding-failindex
15:46		TIP607 encoding failindex: start implementation check-in: a611c76903 user: oehhar tags: tip601-encoding-failindex, tip607-encoding-failindex
2022-03-05
21:56		-nothrow -> -nocomplain Closed-Leaf check-in: b3471f0625 user: jan.nijtmans tags: encodings-with-flags
10:32		Add "const" to Tcl_SetNotifier() argument. Should have been part of TIP #27, looooooong ago. This si... check-in: a993dee980 user: jan.nijtmans tags: core-8-branch
2022-03-04
16:21		Merge 8.7 check-in: 791cc676b8 user: jan.nijtmans tags: encodings-with-flags
16:19		Merge 8.7 check-in: 1e892e4df9 user: jan.nijtmans tags: trunk, main
16:18		Merge 8.6 check-in: 59b0904ac7 user: jan.nijtmans tags: core-8-branch
16:15		Fix [8d8bb39962]: incorrect notifier initialization in tclXtNotify.c check-in: 362f6d184c user: jan.nijtmans tags: core-8-6-branch
2022-02-24
22:20		3 more files with TCL_UTF_MAX checks check-in: 70eb0efb69 user: jan.nijtmans tags: core-8-branch

Changes to doc/Encoding.3.

Changes to generic/tcl.decls.

Changes to generic/tcl.h.

Changes to generic/tclCmdAH.c.

Changes to generic/tclDecls.h.

Changes to generic/tclEncoding.c.

Changes to generic/tclStubInit.c.

Changes to tests/chanio.test.

Changes to tests/cmdAH.test.

Changes to tests/encoding.test.

Changes to tests/http.test.

Changes to tests/io.test.

Changes to tests/main.test.

Changes to tests/safe.test.

Changes to tests/source.test.

︙			︙
21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37	.sp int \fBTcl_GetEncodingFromObj\fR(\fIinterp, objPtr, encodingPtr\fR) .sp char * \fBTcl_ExternalToUtfDString\fR(\fIencoding, src, srcLen, dstPtr\fR) .sp char * \fBTcl_UtfToExternalDString\fR(\fIencoding, src, srcLen, dstPtr\fR) .sp int \fBTcl_ExternalToUtf\fR(\fIinterp, encoding, src, srcLen, flags, statePtr, dst, dstLen, srcReadPtr, dstWrotePtr, dstCharsPtr\fR) .sp int \fBTcl_UtfToExternal\fR(\fIinterp, encoding, src, srcLen, flags, statePtr, dst, dstLen, srcReadPtr, dstWrotePtr, dstCharsPtr\fR)	> > > > > >	21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43	.sp int \fBTcl_GetEncodingFromObj\fR(\fIinterp, objPtr, encodingPtr\fR) .sp char * \fBTcl_ExternalToUtfDString\fR(\fIencoding, src, srcLen, dstPtr\fR) .sp size_t \fBTcl_ExternalToUtfDStringEx\fR(\fIencoding, src, srcLen, flags, dstPtr\fR) .sp char * \fBTcl_UtfToExternalDString\fR(\fIencoding, src, srcLen, dstPtr\fR) .sp size_t \fBTcl_UtfToExternalDStringEx\fR(\fIencoding, src, srcLen, flags, dstPtr\fR) .sp int \fBTcl_ExternalToUtf\fR(\fIinterp, encoding, src, srcLen, flags, statePtr, dst, dstLen, srcReadPtr, dstWrotePtr, dstCharsPtr\fR) .sp int \fBTcl_UtfToExternal\fR(\fIinterp, encoding, src, srcLen, flags, statePtr, dst, dstLen, srcReadPtr, dstWrotePtr, dstCharsPtr\fR)
︙			︙
104 105 106 107 108 109 110 ~~111~~ 112 113 114 115 116 117 118	converted. \fBTCL_ENCODING_END\fR signifies that the source buffer is the last block in a (potentially multi-block) input stream, telling the conversion routine to perform any finalization that needs to occur after the last byte is converted and then to reset to an initial state. \fBTCL_ENCODING_STOPONERROR\fR signifies that the conversion routine should return immediately upon reading a source character that does not exist in the target encoding; otherwise a default fallback character will ~~automatically be substituted.~~ .AP Tcl_EncodingState statePtr in/out Used when converting a (generally long or indefinite length) byte stream in a piece-by-piece fashion. The conversion routine stores its current state in \fIstatePtr\fR after \fIsrc\fR (the buffer containing the current piece) has been converted; that state information must be passed back when converting the next piece of the stream so the conversion routine knows what state it was in when it left off at the end of the	\| > > >	110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127	converted. \fBTCL_ENCODING_END\fR signifies that the source buffer is the last block in a (potentially multi-block) input stream, telling the conversion routine to perform any finalization that needs to occur after the last byte is converted and then to reset to an initial state. \fBTCL_ENCODING_STOPONERROR\fR signifies that the conversion routine should return immediately upon reading a source character that does not exist in the target encoding; otherwise a default fallback character will automatically be substituted. The flag \fBTCL_ENCODING_NOCOMPLAIN\fR has no effect, it is reserved for Tcl 9.0. The flag \fBTCL_ENCODING_MODIFIED\fR makes \fBTcl_UtfToExternalDStringEx\fR and \fBTcl_UtfToExternal\fR produce the byte sequence \exC0\ex80 in stead of \ex00, for the utf-8/cesu-8 encoders. .AP Tcl_EncodingState statePtr in/out Used when converting a (generally long or indefinite length) byte stream in a piece-by-piece fashion. The conversion routine stores its current state in \fIstatePtr\fR after \fIsrc\fR (the buffer containing the current piece) has been converted; that state information must be passed back when converting the next piece of the stream so the conversion routine knows what state it was in when it left off at the end of the
︙			︙
203 204 205 206 207 208 209 210 211 212 213 214 215 216	\fBTcl_ExternalToUtfDString\fR converts a source buffer \fIsrc\fR from the specified \fIencoding\fR into UTF-8. The converted bytes are stored in \fIdstPtr\fR, which is then null-terminated. The caller should eventually call \fBTcl_DStringFree\fR to free any information stored in \fIdstPtr\fR. When converting, if any of the characters in the source buffer cannot be represented in the target encoding, a default fallback character will be used. The return value is a pointer to the value stored in the DString. .PP \fBTcl_ExternalToUtf\fR converts a source buffer \fIsrc\fR from the specified \fIencoding\fR into UTF-8. Up to \fIsrcLen\fR bytes are converted from the source buffer and up to \fIdstLen\fR converted bytes are stored in \fIdst\fR. In all cases, \fIsrcReadPtr\fR is filled with the number of bytes that were successfully converted from \fIsrc\fR and \fIdstWrotePtr\fR is filled with the corresponding number of bytes that were stored in \fIdst\fR. The return	> > > > >	212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230	\fBTcl_ExternalToUtfDString\fR converts a source buffer \fIsrc\fR from the specified \fIencoding\fR into UTF-8. The converted bytes are stored in \fIdstPtr\fR, which is then null-terminated. The caller should eventually call \fBTcl_DStringFree\fR to free any information stored in \fIdstPtr\fR. When converting, if any of the characters in the source buffer cannot be represented in the target encoding, a default fallback character will be used. The return value is a pointer to the value stored in the DString. .PP \fBTcl_ExternalToUtfDStringEx\fR is the same as \fBTcl_ExternalToUtfDString\fR, but it has an additional flags parameter. The return value is the index of the first byte in the input string causing a conversion error. Or (size_t)-1 if all is OK. .PP \fBTcl_ExternalToUtf\fR converts a source buffer \fIsrc\fR from the specified \fIencoding\fR into UTF-8. Up to \fIsrcLen\fR bytes are converted from the source buffer and up to \fIdstLen\fR converted bytes are stored in \fIdst\fR. In all cases, \fIsrcReadPtr\fR is filled with the number of bytes that were successfully converted from \fIsrc\fR and \fIdstWrotePtr\fR is filled with the corresponding number of bytes that were stored in \fIdst\fR. The return
︙			︙
241 242 243 244 245 246 247 248 249 250 251 252 253 254	into the specified \fIencoding\fR. The converted bytes are stored in \fIdstPtr\fR, which is then terminated with the appropriate encoding-specific null. The caller should eventually call \fBTcl_DStringFree\fR to free any information stored in \fIdstPtr\fR. When converting, if any of the characters in the source buffer cannot be represented in the target encoding, a default fallback character will be used. The return value is a pointer to the value stored in the DString. .PP \fBTcl_UtfToExternal\fR converts a source buffer \fIsrc\fR from UTF-8 into the specified \fIencoding\fR. Up to \fIsrcLen\fR bytes are converted from the source buffer and up to \fIdstLen\fR converted bytes are stored in \fIdst\fR. In all cases, \fIsrcReadPtr\fR is filled with the number of bytes that were successfully converted from \fIsrc\fR and \fIdstWrotePtr\fR is filled with the corresponding number of bytes that were stored in	> > > > >	255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273	into the specified \fIencoding\fR. The converted bytes are stored in \fIdstPtr\fR, which is then terminated with the appropriate encoding-specific null. The caller should eventually call \fBTcl_DStringFree\fR to free any information stored in \fIdstPtr\fR. When converting, if any of the characters in the source buffer cannot be represented in the target encoding, a default fallback character will be used. The return value is a pointer to the value stored in the DString. .PP \fBTcl_UtfToExternalDStringEx\fR is the same as \fBTcl_UtfToExternalDString\fR, but it has an additional flags parameter. The return value is the index of the first byte of an utf-8 byte-sequence in the input string causing a conversion error. Or (size_t)-1 if all is OK. .PP \fBTcl_UtfToExternal\fR converts a source buffer \fIsrc\fR from UTF-8 into the specified \fIencoding\fR. Up to \fIsrcLen\fR bytes are converted from the source buffer and up to \fIdstLen\fR converted bytes are stored in \fIdst\fR. In all cases, \fIsrcReadPtr\fR is filled with the number of bytes that were successfully converted from \fIsrc\fR and \fIdstWrotePtr\fR is filled with the corresponding number of bytes that were stored in
︙			︙

︙			︙
509 510 511 512 513 514 515 ~~516 517~~ 518 519 520 521 522 523 524	/ Tcl_Command TclInitEncodingCmd( Tcl_Interp interp) /* Tcl interpreter */ { static const EnsembleImplMap encodingImplMap[] = { ~~{"convertfrom", EncodingConvertfromObjCmd, TclCompileBasic1~~Or2~~ArgCmd, NULL, NULL, 0}, {"convertto", EncodingConverttoObjCmd, TclCompileBasic1~~Or2~~ArgCmd, NULL, NULL, 0},~~ {"dirs", EncodingDirsObjCmd, TclCompileBasic0Or1ArgCmd, NULL, NULL, 1}, {"names", EncodingNamesObjCmd, TclCompileBasic0ArgCmd, NULL, NULL, 0}, {"system", EncodingSystemObjCmd, TclCompileBasic0Or1ArgCmd, NULL, NULL, 1}, {NULL, NULL, NULL, NULL, NULL, 0} }; return TclMakeEnsemble(interp, "encoding", encodingImplMap);	\| \|	509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524	/ Tcl_Command TclInitEncodingCmd( Tcl_Interp interp) /* Tcl interpreter */ { static const EnsembleImplMap encodingImplMap[] = { {"convertfrom", EncodingConvertfromObjCmd, TclCompileBasic1To3ArgCmd, NULL, NULL, 0}, {"convertto", EncodingConverttoObjCmd, TclCompileBasic1To3ArgCmd, NULL, NULL, 0}, {"dirs", EncodingDirsObjCmd, TclCompileBasic0Or1ArgCmd, NULL, NULL, 1}, {"names", EncodingNamesObjCmd, TclCompileBasic0ArgCmd, NULL, NULL, 0}, {"system", EncodingSystemObjCmd, TclCompileBasic0Or1ArgCmd, NULL, NULL, 1}, {NULL, NULL, NULL, NULL, NULL, 0} }; return TclMakeEnsemble(interp, "encoding", encodingImplMap);
︙			︙
546 547 548 549 550 551 552 553 554 555 556 ~~557 558 559 560~~ ~~561~~ 562 ~~563~~ 564 565 566 567 568 569 ~~570~~ ~~571~~ 572 573 574 575 576 577 578	Tcl_Obj const objv[]) / Argument objects. / { Tcl_Obj data; /* Byte array to convert / Tcl_DString ds; / Buffer to hold the string / Tcl_Encoding encoding; / Encoding to use / int length; / Length of the byte array being converted / const char bytesPtr; /* Pointer to the first byte of the array / if (objc == 2) { encoding = Tcl_GetEncoding(interp, NULL); data = objv[1]; ~~} else if (objc ~~== 3~~) { if (Tcl_GetEncodingFromObj(interp, objv[1], &encoding) != TCL_OK) { return TCL_ERROR; }~~ ~~data =~~ obj~~v[2];~~ } else { ~~Tcl_WrongNumArgs(interp, 1, objv, "?encoding? data");~~ return TCL_ERROR; } / * Convert the string into a byte array in 'ds' / ~~bytesPtr = (char ) Tcl_GetByteArrayFromObj(data, &length);~~ ~~Tcl_ExternalToUtfDString(encoding, bytesPtr, length, ~~&ds);~~~~ /* * Note that we cannot use Tcl_DStringResult here because it will * truncate the string at the first null byte. */ Tcl_SetObjResult(interp, TclDStringToObj(&ds));	> > > > > > > > > > > > \| \| \| \| > > > > \| > > > > > \| > > > \| > > > > > > \| > > > > > > > > > > >	546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619	Tcl_Obj const objv[]) / Argument objects. / { Tcl_Obj data; /* Byte array to convert / Tcl_DString ds; / Buffer to hold the string / Tcl_Encoding encoding; / Encoding to use / int length; / Length of the byte array being converted / const char bytesPtr; /* Pointer to the first byte of the array / #if TCL_MAJOR_VERSION > 8 \|\| defined(TCL_NO_DEPRECATED) int flags = TCL_ENCODING_STOPONERROR; #else int flags = TCL_ENCODING_NOCOMPLAIN; #endif size_t result; if (objc == 2) { encoding = Tcl_GetEncoding(interp, NULL); data = objv[1]; } else if ((unsigned)(objc - 2) < 3) { data = objv[objc - 1]; bytesPtr = Tcl_GetString(objv[1]); if (bytesPtr[0] == '-' && bytesPtr[1] == 'n' && !strncmp(bytesPtr, "-nocomplain", strlen(bytesPtr))) { flags = TCL_ENCODING_NOCOMPLAIN; } else if (objc < 4) { if (Tcl_GetEncodingFromObj(interp, objv[objc - 2], &encoding) != TCL_OK) { return TCL_ERROR; } goto encConvFromOK; } else { goto encConvFromError; } if (objc < 4) { encoding = Tcl_GetEncoding(interp, NULL); } else if (Tcl_GetEncodingFromObj(interp, objv[objc - 2], &encoding) != TCL_OK) { return TCL_ERROR; } } else { encConvFromError: Tcl_WrongNumArgs(interp, 1, objv, "?-nocomplain? ?encoding? data"); return TCL_ERROR; } encConvFromOK: / * Convert the string into a byte array in 'ds' / #if !defined(TCL_NO_DEPRECATED) && (TCL_MAJOR_VERSION < 9) if (!(flags & TCL_ENCODING_STOPONERROR)) { bytesPtr = (char ) Tcl_GetByteArrayFromObj(data, &length); } else #endif bytesPtr = (char ) TclGetBytesFromObj(interp, data, &length); if (bytesPtr == NULL) { return TCL_ERROR; } result = Tcl_ExternalToUtfDStringEx(encoding, bytesPtr, length, flags, &ds); if ((flags & TCL_ENCODING_STOPONERROR) && (result != (size_t)-1)) { char buf[TCL_INTEGER_SPACE]; sprintf(buf, "%" TCL_Z_MODIFIER "u", result); Tcl_SetObjResult(interp, Tcl_ObjPrintf("unexpected byte sequence starting at index %" TCL_Z_MODIFIER "u: '\\x%X'", result, UCHAR(bytesPtr[result]))); Tcl_SetErrorCode(interp, "TCL", "ENCODING", "ILLEGALSEQUENCE", buf, NULL); Tcl_DStringFree(&ds); return TCL_ERROR; } / * Note that we cannot use Tcl_DStringResult here because it will * truncate the string at the first null byte. */ Tcl_SetObjResult(interp, TclDStringToObj(&ds));
︙			︙
608 609 610 611 612 613 614 615 616 617 618 ~~619 620 621 622~~ ~~623~~ 624 ~~625~~ 626 627 628 629 630 631 632 633 ~~634~~ 635 636 637 638 639 640 641	Tcl_Obj const objv[]) / Argument objects. / { Tcl_Obj data; /* String to convert / Tcl_DString ds; / Buffer to hold the byte array / Tcl_Encoding encoding; / Encoding to use / int length; / Length of the string being converted / const char stringPtr; /* Pointer to the first byte of the string / if (objc == 2) { encoding = Tcl_GetEncoding(interp, NULL); data = objv[1]; ~~} else if (objc ~~== 3~~) { if (Tcl_GetEncodingFromObj(interp, objv[1], &encoding) != TCL_OK) { return TCL_ERROR; }~~ ~~data =~~ obj~~v[2];~~ } else { ~~Tcl_WrongNumArgs(interp, 1, objv, "?encoding? data");~~ return TCL_ERROR; } / * Convert the string to a byte array in 'ds' / stringPtr = TclGetStringFromObj(data, &length); ~~Tcl_UtfToExternalDString(encoding, stringPtr, length, ~~&ds);~~~~ Tcl_SetObjResult(interp, Tcl_NewByteArrayObj((unsigned char) Tcl_DStringValue(&ds), Tcl_DStringLength(&ds))); Tcl_DStringFree(&ds); /* * We're done with the encoding	> > > > > > > > > > > > \| \| \| \| > > > > \| > > > > > \| > \| > > > > > > > > > > > > > >	649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718	Tcl_Obj const objv[]) / Argument objects. / { Tcl_Obj data; /* String to convert / Tcl_DString ds; / Buffer to hold the byte array / Tcl_Encoding encoding; / Encoding to use / int length; / Length of the string being converted / const char stringPtr; /* Pointer to the first byte of the string / size_t result; #if TCL_MAJOR_VERSION > 8 \|\| defined(TCL_NO_DEPRECATED) int flags = TCL_ENCODING_STOPONERROR; #else int flags = TCL_ENCODING_NOCOMPLAIN; #endif if (objc == 2) { encoding = Tcl_GetEncoding(interp, NULL); data = objv[1]; } else if ((unsigned)(objc - 2) < 3) { data = objv[objc - 1]; stringPtr = Tcl_GetString(objv[1]); if (stringPtr[0] == '-' && stringPtr[1] == 'n' && !strncmp(stringPtr, "-nocomplain", strlen(stringPtr))) { flags = TCL_ENCODING_NOCOMPLAIN; } else if (objc < 4) { if (Tcl_GetEncodingFromObj(interp, objv[objc - 2], &encoding) != TCL_OK) { return TCL_ERROR; } goto encConvToOK; } else { goto encConvToError; } if (objc < 4) { encoding = Tcl_GetEncoding(interp, NULL); } else if (Tcl_GetEncodingFromObj(interp, objv[objc - 2], &encoding) != TCL_OK) { return TCL_ERROR; } } else { encConvToError: Tcl_WrongNumArgs(interp, 1, objv, "?-nocomplain? ?encoding? data"); return TCL_ERROR; } encConvToOK: / * Convert the string to a byte array in 'ds' / stringPtr = TclGetStringFromObj(data, &length); result = Tcl_UtfToExternalDStringEx(encoding, stringPtr, length, flags, &ds); if ((flags & TCL_ENCODING_STOPONERROR) && (result != (size_t)-1)) { size_t pos = Tcl_NumUtfChars(stringPtr, result); int ucs4; char buf[TCL_INTEGER_SPACE]; TclUtfToUCS4(&stringPtr[result], &ucs4); sprintf(buf, "%" TCL_Z_MODIFIER "u", result); Tcl_SetObjResult(interp, Tcl_ObjPrintf("unexpected character at index %" TCL_Z_MODIFIER "u: 'U+%06X'", pos, ucs4)); Tcl_SetErrorCode(interp, "TCL", "ENCODING", "ILLEGALSEQUENCE", buf, NULL); Tcl_DStringFree(&ds); return TCL_ERROR; } Tcl_SetObjResult(interp, Tcl_NewByteArrayObj((unsigned char) Tcl_DStringValue(&ds), Tcl_DStringLength(&ds))); Tcl_DStringFree(&ds); /* * We're done with the encoding
︙			︙

︙			︙
1938 1939 1940 1941 1942 1943 1944 ~~1945~~ ~~1946~~ 1947 1948 1949 1950 1951 1952 1953	EXTERN int Tcl_UtfCharComplete(const char src, int length); / 655 / EXTERN const char Tcl_UtfNext(const char src); / 656 / EXTERN const char Tcl_UtfPrev(const char src, const char start); /* 657 / EXTERN int Tcl_UniCharIsUnicode(int ch); ~~/ ~~Slot 658 is reserved~~ /~~ ~~/ ~~Slot 659 is reserved~~ /~~ / 660 / EXTERN int Tcl_AsyncMarkFromSignal(Tcl_AsyncHandler async, int sigNumber); / Slot 661 is reserved / / Slot 662 is reserved / / Slot 663 is reserved / / Slot 664 is reserved */	\| > > > \| > > >	1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959	EXTERN int Tcl_UtfCharComplete(const char src, int length); / 655 / EXTERN const char Tcl_UtfNext(const char src); / 656 / EXTERN const char Tcl_UtfPrev(const char src, const char start); /* 657 / EXTERN int Tcl_UniCharIsUnicode(int ch); / 658 / EXTERN size_t Tcl_ExternalToUtfDStringEx(Tcl_Encoding encoding, const char src, int srcLen, int flags, Tcl_DString dsPtr); / 659 / EXTERN size_t Tcl_UtfToExternalDStringEx(Tcl_Encoding encoding, const char src, int srcLen, int flags, Tcl_DString dsPtr); / 660 / EXTERN int Tcl_AsyncMarkFromSignal(Tcl_AsyncHandler async, int sigNumber); / Slot 661 is reserved / / Slot 662 is reserved / / Slot 663 is reserved / / Slot 664 is reserved */
︙			︙
2645 2646 2647 2648 2649 2650 2651 ~~2652 2653~~ 2654 2655 2656 2657 2658 2659 2660	char * (tclGetStringFromObj) (Tcl_Obj objPtr, size_t lengthPtr); / 651 / Tcl_UniChar (tclGetUnicodeFromObj) (Tcl_Obj objPtr, size_t lengthPtr); / 652 / unsigned char (tclGetByteArrayFromObj) (Tcl_Obj objPtr, size_t numBytesPtr); / 653 / int (tcl_UtfCharComplete) (const char src, int length); / 654 / const char (tcl_UtfNext) (const char src); /* 655 / const char (tcl_UtfPrev) (const char src, const char start); / 656 / int (tcl_UniCharIsUnicode) (int ch); /* 657 / ~~void~~ (~~res~~er~~ved~~658~~)(void);~~ ~~void~~ (~~res~~er~~ved~~659~~)(void);~~ int (tcl_AsyncMarkFromSignal) (Tcl_AsyncHandler async, int sigNumber); /* 660 / void (reserved661)(void); void (reserved662)(void); void (reserved663)(void); void (reserved664)(void); void (reserved665)(void); void (*reserved666)(void);	\| \|	2651 2652 2653 2654 2655 2656 2657 2658 2659 2660 2661 2662 2663 2664 2665 2666	char * (tclGetStringFromObj) (Tcl_Obj objPtr, size_t lengthPtr); / 651 / Tcl_UniChar (tclGetUnicodeFromObj) (Tcl_Obj objPtr, size_t lengthPtr); / 652 / unsigned char (tclGetByteArrayFromObj) (Tcl_Obj objPtr, size_t numBytesPtr); / 653 / int (tcl_UtfCharComplete) (const char src, int length); / 654 / const char (tcl_UtfNext) (const char src); /* 655 / const char (tcl_UtfPrev) (const char src, const char start); / 656 / int (tcl_UniCharIsUnicode) (int ch); /* 657 / size_t (tcl_ExternalToUtfDStringEx) (Tcl_Encoding encoding, const char src, int srcLen, int flags, Tcl_DString dsPtr); /* 658 / size_t (tcl_UtfToExternalDStringEx) (Tcl_Encoding encoding, const char src, int srcLen, int flags, Tcl_DString dsPtr); /* 659 / int (tcl_AsyncMarkFromSignal) (Tcl_AsyncHandler async, int sigNumber); /* 660 / void (reserved661)(void); void (reserved662)(void); void (reserved663)(void); void (reserved664)(void); void (reserved665)(void); void (*reserved666)(void);
︙			︙
4002 4003 4004 4005 4006 4007 4008 ~~4009~~ ~~4010~~ 4011 4012 4013 4014 4015 4016 4017	(tclStubsPtr->tcl_UtfCharComplete) /* 654 / #define Tcl_UtfNext \ (tclStubsPtr->tcl_UtfNext) / 655 / #define Tcl_UtfPrev \ (tclStubsPtr->tcl_UtfPrev) / 656 / #define Tcl_UniCharIsUnicode \ (tclStubsPtr->tcl_UniCharIsUnicode) / 657 / ~~/ ~~Slot~~ 658 ~~is reserved~~ /~~ ~~/ ~~Slot~~ 659 ~~is reserved~~ /~~ #define Tcl_AsyncMarkFromSignal \ (tclStubsPtr->tcl_AsyncMarkFromSignal) / 660 / / Slot 661 is reserved / / Slot 662 is reserved / / Slot 663 is reserved / / Slot 664 is reserved / / Slot 665 is reserved */	> \| > \|	4008 4009 4010 4011 4012 4013 4014 4015 4016 4017 4018 4019 4020 4021 4022 4023 4024 4025	(tclStubsPtr->tcl_UtfCharComplete) /* 654 / #define Tcl_UtfNext \ (tclStubsPtr->tcl_UtfNext) / 655 / #define Tcl_UtfPrev \ (tclStubsPtr->tcl_UtfPrev) / 656 / #define Tcl_UniCharIsUnicode \ (tclStubsPtr->tcl_UniCharIsUnicode) / 657 / #define Tcl_ExternalToUtfDStringEx \ (tclStubsPtr->tcl_ExternalToUtfDStringEx) / 658 / #define Tcl_UtfToExternalDStringEx \ (tclStubsPtr->tcl_UtfToExternalDStringEx) / 659 / #define Tcl_AsyncMarkFromSignal \ (tclStubsPtr->tcl_AsyncMarkFromSignal) / 660 / / Slot 661 is reserved / / Slot 662 is reserved / / Slot 663 is reserved / / Slot 664 is reserved / / Slot 665 is reserved */
︙			︙

︙			︙
511 512 513 514 515 516 517 ~~518~~ 519 520 ~~521~~ 522 523 524 525 526 527 528	* * Side effects: * Depends on the memory, object, and IO subsystems. * --------------------------------------------------------------------------- / ~~/* Those flags must not conflict with other TCL_ENCODING_* flags in tcl.h /~~ / Since TCL_ENCODING_MODIFIED is only used for utf-8/cesu-8 and * TCL_ENCODING_LE is only used for utf-16/utf-32/ucs-2. re-use the same value / ~~#define TCL_ENCODING_MODIFIED 0x20 / Converting NULL bytes to 0xC0 0x80 /~~ #define TCL_ENCODING_LE TCL_ENCODING_MODIFIED / Little-endian encoding / #define TCL_ENCODING_UTF 0x200 / For UTF-8 encoding, allow 4-byte output sequences */ void TclInitEncodingSubsystem(void) { Tcl_EncodingType type;	< <	511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526	* * Side effects: * Depends on the memory, object, and IO subsystems. * --------------------------------------------------------------------------- / /* Since TCL_ENCODING_MODIFIED is only used for utf-8/cesu-8 and * TCL_ENCODING_LE is only used for utf-16/utf-32/ucs-2. re-use the same value / #define TCL_ENCODING_LE TCL_ENCODING_MODIFIED / Little-endian encoding / #define TCL_ENCODING_UTF 0x200 / For UTF-8 encoding, allow 4-byte output sequences */ void TclInitEncodingSubsystem(void) { Tcl_EncodingType type;
︙			︙
1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 ~~1150~~ 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 ~~1167~~ 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 ~~1180~~ 1181 1182 1183 1184 1185 1186 1187	* for the default system encoding. / const char src, /* Source string in specified encoding. / int srcLen, / Source string length in bytes, or < 0 for * encoding-specific string length. / Tcl_DString dstPtr) /* Uninitialized or free DString in which the * converted string is stored. / { char dst; Tcl_EncodingState state; const Encoding encodingPtr; ~~int ~~flags,~~ dstLen, result, soFar, srcRead, dstWrote, dstChars;~~ Tcl_DStringInit(dstPtr); dst = Tcl_DStringValue(dstPtr); dstLen = dstPtr->spaceAvl - 1; if (encoding == NULL) { encoding = systemEncoding; } encodingPtr = (Encoding ) encoding; if (src == NULL) { srcLen = 0; } else if (srcLen < 0) { srcLen = encodingPtr->lengthProc(src); } ~~flags = TCL_ENCODING_START \| TCL_ENCODING_END;~~ if (encodingPtr->toUtfProc == UtfToUtfProc) { flags \|= TCL_ENCODING_MODIFIED \| TCL_ENCODING_UTF; } while (1) { result = encodingPtr->toUtfProc(encodingPtr->clientData, src, srcLen, flags, &state, dst, dstLen, &srcRead, &dstWrote, &dstChars); soFar = dst + dstWrote - Tcl_DStringValue(dstPtr); src += srcRead; if (result != TCL_CONVERT_NOSPACE) { Tcl_DStringSetLength(dstPtr, soFar); ~~return ~~Tcl_D~~St~~ringValue(dstPtr~~);~~ } flags &= ~TCL_ENCODING_START; srcLen -= srcRead; if (Tcl_DStringLength(dstPtr) == 0) { Tcl_DStringSetLength(dstPtr, dstLen); } Tcl_DStringSetLength(dstPtr, 2 * Tcl_DStringLength(dstPtr) + 1);	> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > \| > \| \|	1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231	* for the default system encoding. / const char src, /* Source string in specified encoding. / int srcLen, / Source string length in bytes, or < 0 for * encoding-specific string length. / Tcl_DString dstPtr) /* Uninitialized or free DString in which the * converted string is stored. / { Tcl_ExternalToUtfDStringEx(encoding, src, srcLen, 0, dstPtr); return Tcl_DStringValue(dstPtr); } / ------------------------------------------------------------------------- * Tcl_ExternalToUtfDStringEx -- * * Convert a source buffer from the specified encoding into UTF-8. * The parameter flags controls the behavior, if any of the bytes in * the source buffer are invalid or cannot be represented in utf-8. * Possible flags values: * TCL_ENCODING_STOPONERROR: don't replace invalid characters/bytes but * return the first error position (Default in Tcl 9.0). * TCL_ENCODING_NOCOMPLAIN: replace invalid characters/bytes by a default * fallback character. Always return -1 (Default in Tcl 8.7). * TCL_ENCODING_MODIFIED: convert NULL bytes to \xC0\x80 in stead of 0x00. * Only valid for "utf-8" and "cesu-8". This flag may be used together * with the other flags. * * Results: * The converted bytes are stored in the DString, which is then NULL * terminated in an encoding-specific manner. The return value is * the error position in the source string or -1 if no conversion error * is reported. * * Side effects: * None. * ------------------------------------------------------------------------- / size_t Tcl_ExternalToUtfDStringEx( Tcl_Encoding encoding, /* The encoding for the source string, or NULL * for the default system encoding. / const char src, /* Source string in specified encoding. / int srcLen, / Source string length in bytes, or < 0 for * encoding-specific string length. / int flags, / Conversion control flags. / Tcl_DString dstPtr) /* Uninitialized or free DString in which the * converted string is stored. / { char dst; Tcl_EncodingState state; const Encoding encodingPtr; int dstLen, result, soFar, srcRead, dstWrote, dstChars; const char srcStart = src; Tcl_DStringInit(dstPtr); dst = Tcl_DStringValue(dstPtr); dstLen = dstPtr->spaceAvl - 1; if (encoding == NULL) { encoding = systemEncoding; } encodingPtr = (Encoding ) encoding; if (src == NULL) { srcLen = 0; } else if (srcLen < 0) { srcLen = encodingPtr->lengthProc(src); } flags \|= TCL_ENCODING_START \| TCL_ENCODING_END; if (encodingPtr->toUtfProc == UtfToUtfProc) { flags \|= TCL_ENCODING_MODIFIED \| TCL_ENCODING_UTF; } while (1) { result = encodingPtr->toUtfProc(encodingPtr->clientData, src, srcLen, flags, &state, dst, dstLen, &srcRead, &dstWrote, &dstChars); soFar = dst + dstWrote - Tcl_DStringValue(dstPtr); src += srcRead; if (result != TCL_CONVERT_NOSPACE) { Tcl_DStringSetLength(dstPtr, soFar); return (result == TCL_OK) ? (size_t)-1 : (size_t)(src - srcStart); } flags &= ~TCL_ENCODING_START; srcLen -= srcRead; if (Tcl_DStringLength(dstPtr) == 0) { Tcl_DStringSetLength(dstPtr, dstLen); } Tcl_DStringSetLength(dstPtr, 2 Tcl_DStringLength(dstPtr) + 1);
︙			︙
1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 ~~1342~~ 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 ~~1358~~ 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 ~~1371~~ 1372 1373 1374 1375 1376 1377 1378	* NULL for the default system encoding. / const char src, /* Source string in UTF-8. / int srcLen, / Source string length in bytes, or < 0 for * strlen(). / Tcl_DString dstPtr) /* Uninitialized or free DString in which the * converted string is stored. / { char dst; Tcl_EncodingState state; const Encoding encodingPtr; ~~int ~~flags,~~ dstLen, result, soFar, srcRead, dstWrote, dstChars;~~ Tcl_DStringInit(dstPtr); dst = Tcl_DStringValue(dstPtr); dstLen = dstPtr->spaceAvl - 1; if (encoding == NULL) { encoding = systemEncoding; } encodingPtr = (Encoding ) encoding; if (src == NULL) { srcLen = 0; } else if (srcLen < 0) { srcLen = strlen(src); } ~~flags = TCL_ENCODING_START \| TCL_ENCODING_END;~~ while (1) { result = encodingPtr->fromUtfProc(encodingPtr->clientData, src, srcLen, flags, &state, dst, dstLen, &srcRead, &dstWrote, &dstChars); soFar = dst + dstWrote - Tcl_DStringValue(dstPtr); src += srcRead; if (result != TCL_CONVERT_NOSPACE) { int i = soFar + encodingPtr->nullSize - 1; while (i >= soFar) { Tcl_DStringSetLength(dstPtr, i--); } ~~return ~~Tcl_D~~St~~ringValue(dstPtr~~);~~ } flags &= ~TCL_ENCODING_START; srcLen -= srcRead; if (Tcl_DStringLength(dstPtr) == 0) { Tcl_DStringSetLength(dstPtr, dstLen); }	> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > \| > \| \|	1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469	* NULL for the default system encoding. / const char src, /* Source string in UTF-8. / int srcLen, / Source string length in bytes, or < 0 for * strlen(). / Tcl_DString dstPtr) /* Uninitialized or free DString in which the * converted string is stored. / { Tcl_UtfToExternalDStringEx(encoding, src, srcLen, 0, dstPtr); return Tcl_DStringValue(dstPtr); } / ------------------------------------------------------------------------- * Tcl_UtfToExternalDStringEx -- * * Convert a source buffer from UTF-8 to the specified encoding. * The parameter flags controls the behavior, if any of the bytes in * the source buffer are invalid or cannot be represented in the * target encoding. * Possible flags values: * TCL_ENCODING_STOPONERROR: don't replace invalid characters/bytes but * return the first error position (Default in Tcl 9.0). * TCL_ENCODING_NOCOMPLAIN: replace invalid characters/bytes by a default * fallback character. Always return -1 (Default in Tcl 8.7). * TCL_ENCODING_MODIFIED: convert NULL bytes to \xC0\x80 in stead of 0x00. * Only valid for "utf-8" and "cesu-8". This flag may be used together * with the other flags. * * Results: * The converted bytes are stored in the DString, which is then NULL * terminated in an encoding-specific manner. The return value is * the error position in the source string or -1 if no conversion error * is reported. * * Side effects: * None. * ------------------------------------------------------------------------- / size_t Tcl_UtfToExternalDStringEx( Tcl_Encoding encoding, /* The encoding for the converted string, or * NULL for the default system encoding. / const char src, /* Source string in UTF-8. / int srcLen, / Source string length in bytes, or < 0 for * strlen(). / int flags, / Conversion control flags. / Tcl_DString dstPtr) /* Uninitialized or free DString in which the * converted string is stored. / { char dst; Tcl_EncodingState state; const Encoding encodingPtr; int dstLen, result, soFar, srcRead, dstWrote, dstChars; const char srcStart = src; Tcl_DStringInit(dstPtr); dst = Tcl_DStringValue(dstPtr); dstLen = dstPtr->spaceAvl - 1; if (encoding == NULL) { encoding = systemEncoding; } encodingPtr = (Encoding *) encoding; if (src == NULL) { srcLen = 0; } else if (srcLen < 0) { srcLen = strlen(src); } flags \|= TCL_ENCODING_START \| TCL_ENCODING_END; while (1) { result = encodingPtr->fromUtfProc(encodingPtr->clientData, src, srcLen, flags, &state, dst, dstLen, &srcRead, &dstWrote, &dstChars); soFar = dst + dstWrote - Tcl_DStringValue(dstPtr); src += srcRead; if (result != TCL_CONVERT_NOSPACE) { int i = soFar + encodingPtr->nullSize - 1; while (i >= soFar) { Tcl_DStringSetLength(dstPtr, i--); } return (result == TCL_OK) ? (size_t)-1 : (size_t)(src - srcStart); } flags &= ~TCL_ENCODING_START; srcLen -= srcRead; if (Tcl_DStringLength(dstPtr) == 0) { Tcl_DStringSetLength(dstPtr, dstLen); }
︙			︙
2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204	* Returns TCL_OK if conversion was successful. * * Side effects: * None. * ------------------------------------------------------------------------- / static int UtfToUtfProc( ClientData clientData, /* additional flags, e.g. TCL_ENCODING_MODIFIED / const char src, /* Source string in UTF-8. / int srcLen, / Source string length in bytes. / int flags, / Conversion control flags. */	> > > > > >	2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 2300 2301	* Returns TCL_OK if conversion was successful. * * Side effects: * None. * ------------------------------------------------------------------------- / #if TCL_MAJOR_VERSION > 8 \|\| defined(TCL_NO_DEPRECATED) # define STOPONERROR !(flags & TCL_ENCODING_NOCOMPLAIN) #else # define STOPONERROR (flags & TCL_ENCODING_STOPONERROR) #endif static int UtfToUtfProc( ClientData clientData, /* additional flags, e.g. TCL_ENCODING_MODIFIED / const char src, /* Source string in UTF-8. / int srcLen, / Source string length in bytes. / int flags, / Conversion control flags. */
︙			︙
2274 2275 2276 2277 2278 2279 2280 ~~2281~~ 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 ~~2296~~ 2297 2298 2299 2300 2301 2302 2303	* Always check before using TclUtfToUCS4. Not doing can so * cause it run beyond the end of the buffer! If we happen such an * incomplete char its bytes are made to represent themselves * unless the user has explicitly asked to be told. / if (flags & TCL_ENCODING_MODIFIED) { ~~if (~~flags & TCL_ENCODING_~~STOPONERROR) {~~ result = TCL_CONVERT_MULTIBYTE; break; } ch = UCHAR(src++); } else { char chbuf[2]; chbuf[0] = UCHAR(src++); chbuf[1] = 0; TclUtfToUCS4(chbuf, &ch); } dst += Tcl_UniCharToUtf(ch, dst); } else { int low; const char saveSrc = src; size_t len = TclUtfToUCS4(src, &ch); ~~if ((len < 2) && (ch != 0) && ~~(flags & TCL_ENCODING_~~STOPONERROR)~~ && (flags & TCL_ENCODING_MODIFIED)) { result = TCL_CONVERT_SYNTAX; break; } src += len; if (!(flags & TCL_ENCODING_UTF) && (ch > 0x3FF)) { if (ch > 0xFFFF) {	\| \|	2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 2390 2391 2392 2393 2394 2395 2396 2397 2398 2399 2400	* Always check before using TclUtfToUCS4. Not doing can so * cause it run beyond the end of the buffer! If we happen such an * incomplete char its bytes are made to represent themselves * unless the user has explicitly asked to be told. / if (flags & TCL_ENCODING_MODIFIED) { if (STOPONERROR) { result = TCL_CONVERT_MULTIBYTE; break; } ch = UCHAR(src++); } else { char chbuf[2]; chbuf[0] = UCHAR(src++); chbuf[1] = 0; TclUtfToUCS4(chbuf, &ch); } dst += Tcl_UniCharToUtf(ch, dst); } else { int low; const char saveSrc = src; size_t len = TclUtfToUCS4(src, &ch); if ((len < 2) && (ch != 0) && STOPONERROR && (flags & TCL_ENCODING_MODIFIED)) { result = TCL_CONVERT_SYNTAX; break; } src += len; if (!(flags & TCL_ENCODING_UTF) && (ch > 0x3FF)) { if (ch > 0xFFFF) {
︙			︙
2314 2315 2316 2317 2318 2319 2320 ~~2321~~ 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 ~~2336~~ 2337 2338 2339 2340 2341 2342 2343	* A surrogate character is detected, handle especially. / low = ch; len = (src <= srcEnd-3) ? TclUtfToUCS4(src, &low) : 0; if (((low & ~0x3FF) != 0xDC00) \|\| (ch & 0x400)) { ~~if (~~flags & TCL_ENCODING_~~STOPONERROR) {~~ result = TCL_CONVERT_UNKNOWN; src = saveSrc; break; } cesu8: dst++ = (char) (((ch >> 12) \| 0xE0) & 0xEF); dst++ = (char) (((ch >> 6) \| 0x80) & 0xBF); dst++ = (char) ((ch \| 0x80) & 0xBF); continue; } src += len; dst += Tcl_UniCharToUtf(ch, dst); ch = low; } else if (!Tcl_UniCharIsUnicode(ch)) { ~~if (~~flags & TCL_ENCODING_~~STOPONERROR) {~~ result = TCL_CONVERT_UNKNOWN; src = saveSrc; break; } if (!(flags & TCL_ENCODING_MODIFIED)) { ch = 0xFFFD; }	> \| \|	2411 2412 2413 2414 2415 2416 2417 2418 2419 2420 2421 2422 2423 2424 2425 2426 2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 2441	* A surrogate character is detected, handle especially. / low = ch; len = (src <= srcEnd-3) ? TclUtfToUCS4(src, &low) : 0; if (((low & ~0x3FF) != 0xDC00) \|\| (ch & 0x400)) { if (STOPONERROR) { result = TCL_CONVERT_UNKNOWN; src = saveSrc; break; } cesu8: dst++ = (char) (((ch >> 12) \| 0xE0) & 0xEF); dst++ = (char) (((ch >> 6) \| 0x80) & 0xBF); dst++ = (char) ((ch \| 0x80) & 0xBF); continue; } src += len; dst += Tcl_UniCharToUtf(ch, dst); ch = low; } else if (!Tcl_UniCharIsUnicode(ch)) { if (STOPONERROR) { result = TCL_CONVERT_UNKNOWN; src = saveSrc; break; } if (!(flags & TCL_ENCODING_MODIFIED)) { ch = 0xFFFD; }
︙			︙
2515 2516 2517 2518 2519 2520 2521 ~~2522~~ 2523 2524 2525 2526 2527 2528 2529	} if (dst > dstEnd) { result = TCL_CONVERT_NOSPACE; break; } len = TclUtfToUCS4(src, &ch); if (!Tcl_UniCharIsUnicode(ch)) { ~~if (~~flags & TCL_ENCODING_~~STOPONERROR) {~~ result = TCL_CONVERT_UNKNOWN; break; } ch = 0xFFFD; } src += len; if (flags & TCL_ENCODING_LE) {	\|	2613 2614 2615 2616 2617 2618 2619 2620 2621 2622 2623 2624 2625 2626 2627	} if (dst > dstEnd) { result = TCL_CONVERT_NOSPACE; break; } len = TclUtfToUCS4(src, &ch); if (!Tcl_UniCharIsUnicode(ch)) { if (STOPONERROR) { result = TCL_CONVERT_UNKNOWN; break; } ch = 0xFFFD; } src += len; if (flags & TCL_ENCODING_LE) {
︙			︙
2718 2719 2720 2721 2722 2723 2724 ~~2725~~ 2726 2727 2728 2729 2730 2731 2732	} if (dst > dstEnd) { result = TCL_CONVERT_NOSPACE; break; } len = TclUtfToUCS4(src, &ch); if (!Tcl_UniCharIsUnicode(ch)) { ~~if (~~flags & TCL_ENCODING_~~STOPONERROR) {~~ result = TCL_CONVERT_UNKNOWN; break; } ch = 0xFFFD; } src += len; if (flags & TCL_ENCODING_LE) {	\|	2816 2817 2818 2819 2820 2821 2822 2823 2824 2825 2826 2827 2828 2829 2830	} if (dst > dstEnd) { result = TCL_CONVERT_NOSPACE; break; } len = TclUtfToUCS4(src, &ch); if (!Tcl_UniCharIsUnicode(ch)) { if (STOPONERROR) { result = TCL_CONVERT_UNKNOWN; break; } ch = 0xFFFD; } src += len; if (flags & TCL_ENCODING_LE) {
︙			︙
2938 2939 2940 2941 2942 2943 2944 ~~2945~~ 2946 2947 2948 2949 2950 2951 2952	break; } ch = toUnicode[byte][((unsigned char ) src)]; } else { ch = pageZero[byte]; } if ((ch == 0) && (byte != 0)) { ~~if (~~flags & TCL_ENCODING_~~STOPONERROR) {~~ result = TCL_CONVERT_SYNTAX; break; } if (prefixBytes[byte]) { src--; } ch = (Tcl_UniChar) byte;	\|	3036 3037 3038 3039 3040 3041 3042 3043 3044 3045 3046 3047 3048 3049 3050	break; } ch = toUnicode[byte][((unsigned char ) src)]; } else { ch = pageZero[byte]; } if ((ch == 0) && (byte != 0)) { if (STOPONERROR) { result = TCL_CONVERT_SYNTAX; break; } if (prefixBytes[byte]) { src--; } ch = (Tcl_UniChar) byte;
︙			︙
3054 3055 3056 3057 3058 3059 3060 ~~3061~~ 3062 3063 3064 3065 3066 3067 3068	if (!len) { word = 0; } else #endif word = fromUnicode[(ch >> 8)][ch & 0xFF]; if ((word == 0) && (ch != 0)) { ~~if (~~flags & TCL_ENCODING_~~STOPONERROR) {~~ result = TCL_CONVERT_UNKNOWN; break; } word = dataPtr->fallback; } if (prefixBytes[(word >> 8)] != 0) { if (dst + 1 > dstEnd) {	\|	3152 3153 3154 3155 3156 3157 3158 3159 3160 3161 3162 3163 3164 3165 3166	if (!len) { word = 0; } else #endif word = fromUnicode[(ch >> 8)][ch & 0xFF]; if ((word == 0) && (ch != 0)) { if (STOPONERROR) { result = TCL_CONVERT_UNKNOWN; break; } word = dataPtr->fallback; } if (prefixBytes[(word >> 8)] != 0) { if (dst + 1 > dstEnd) {
︙			︙
3242 3243 3244 3245 3246 3247 3248 ~~3249~~ 3250 3251 3252 3253 3254 3255 3256	*/ if (ch > 0xFF #if TCL_UTF_MAX < 4 \|\| ((ch >= 0xD800) && (len < 3)) #endif ) { ~~if (~~flags & TCL_ENCODING_~~STOPONERROR) {~~ result = TCL_CONVERT_UNKNOWN; break; } #if TCL_UTF_MAX < 4 if ((ch >= 0xD800) && (len < 3)) { len = 4; }	\|	3340 3341 3342 3343 3344 3345 3346 3347 3348 3349 3350 3351 3352 3353 3354	*/ if (ch > 0xFF #if TCL_UTF_MAX < 4 \|\| ((ch >= 0xD800) && (len < 3)) #endif ) { if (STOPONERROR) { result = TCL_CONVERT_UNKNOWN; break; } #if TCL_UTF_MAX < 4 if ((ch >= 0xD800) && (len < 3)) { len = 4; }
︙			︙
3469 3470 3471 3472 3473 3474 3475 ~~3476~~ 3477 3478 3479 3480 3481 3482 3483	* We have a split-up or unrecognized escape sequence. If we * checked all the sequences, then it's a syntax error, otherwise * we need more bytes to determine a match. / if ((checked == dataPtr->numSubTables + 2) \|\| (flags & TCL_ENCODING_END)) { ~~if (~~(flags & TCL_ENCODING_~~STOPONERROR~~) == 0~~) {~~ / * Skip the unknown escape sequence. */ src += longest; continue; }	\|	3567 3568 3569 3570 3571 3572 3573 3574 3575 3576 3577 3578 3579 3580 3581	* We have a split-up or unrecognized escape sequence. If we * checked all the sequences, then it's a syntax error, otherwise * we need more bytes to determine a match. / if ((checked == dataPtr->numSubTables + 2) \|\| (flags & TCL_ENCODING_END)) { if (!STOPONERROR) { / * Skip the unknown escape sequence. */ src += longest; continue; }
︙			︙
3644 3645 3646 3647 3648 3649 3650 ~~3651~~ 3652 3653 3654 3655 3656 3657 3658	if (word != 0) { break; } } if (word == 0) { state = oldState; ~~if (~~flags & TCL_ENCODING_~~STOPONERROR) {~~ result = TCL_CONVERT_UNKNOWN; break; } encodingPtr = GetTableEncoding(dataPtr, state); tableDataPtr = (const TableEncodingData *)encodingPtr->clientData; word = tableDataPtr->fallback; }	\|	3742 3743 3744 3745 3746 3747 3748 3749 3750 3751 3752 3753 3754 3755 3756	if (word != 0) { break; } } if (word == 0) { state = oldState; if (STOPONERROR) { result = TCL_CONVERT_UNKNOWN; break; } encodingPtr = GetTableEncoding(dataPtr, state); tableDataPtr = (const TableEncodingData *)encodingPtr->clientData; word = tableDataPtr->fallback; }
︙			︙

︙			︙
245 246 247 248 249 250 251 ~~252~~ 253 254 255 256 257 258 259 260 ~~261~~ 262 263 264 265 266 267 268	set f [open $path(test1) w] chan configure $f -encoding ascii -buffering line -translation crlf chan puts -nonewline $f "\n12" contents $path(test1) } -cleanup { chan close $f } -result "\r\n12" ~~test chan-io-3.4 {WriteChars: loop over stage buffer} {~~ # stage buffer maps to more than can be queued at once. set f [open $path(test1) w] chan configure $f -encoding jis0208 -buffersize 16 chan puts -nonewline $f "\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\" set x [list [contents $path(test1)]] chan close $f lappend x [contents $path(test1)] } [list "!)!)!)!)!)!)!)!)" "!)!)!)!)!)!)!)!)!)!)!)!)!)!)!)"] ~~test chan-io-3.5 {WriteChars: saved != 0} {~~ # Bytes produced by UtfToExternal from end of last channel buffer had to # be moved to beginning of next channel buffer to preserve requested # buffersize. set f [open $path(test1) w] chan configure $f -encoding jis0208 -buffersize 17 chan puts -nonewline $f "\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\" set x [list [contents $path(test1)]]	\| \|	245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268	set f [open $path(test1) w] chan configure $f -encoding ascii -buffering line -translation crlf chan puts -nonewline $f "\n12" contents $path(test1) } -cleanup { chan close $f } -result "\r\n12" test chan-io-3.4 {WriteChars: loop over stage buffer} deprecated { # stage buffer maps to more than can be queued at once. set f [open $path(test1) w] chan configure $f -encoding jis0208 -buffersize 16 chan puts -nonewline $f "\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\" set x [list [contents $path(test1)]] chan close $f lappend x [contents $path(test1)] } [list "!)!)!)!)!)!)!)!)" "!)!)!)!)!)!)!)!)!)!)!)!)!)!)!)"] test chan-io-3.5 {WriteChars: saved != 0} deprecated { # Bytes produced by UtfToExternal from end of last channel buffer had to # be moved to beginning of next channel buffer to preserve requested # buffersize. set f [open $path(test1) w] chan configure $f -encoding jis0208 -buffersize 17 chan puts -nonewline $f "\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\" set x [list [contents $path(test1)]]
︙			︙
281 282 283 284 285 286 287 ~~288~~ 289 290 291 292 293 294 295	set f [open $path(test1) w] chan configure $f -encoding shiftjis -buffersize 16 chan puts -nonewline $f "12345678901234ＡＢ" set x [list [contents $path(test1)]] chan close $f lappend x [contents $path(test1)] } [list "12345678901234\x82\x60" "12345678901234\x82\x60\x82\x61"] ~~test chan-io-3.7 {WriteChars: (bufPtr->nextAdded > bufPtr->length)} {~~ # When translating UTF-8 to external, the produced bytes went past end of # the channel buffer. This is done on purpose - we then truncate the bytes # at the end of the partial character to preserve the requested blocksize # on flush. The truncated bytes are moved to the beginning of the next # channel buffer. set f [open $path(test1) w] chan configure $f -encoding jis0208 -buffersize 17	\|	281 282 283 284 285 286 287 288 289 290 291 292 293 294 295	set f [open $path(test1) w] chan configure $f -encoding shiftjis -buffersize 16 chan puts -nonewline $f "12345678901234ＡＢ" set x [list [contents $path(test1)]] chan close $f lappend x [contents $path(test1)] } [list "12345678901234\x82\x60" "12345678901234\x82\x60\x82\x61"] test chan-io-3.7 {WriteChars: (bufPtr->nextAdded > bufPtr->length)} deprecated { # When translating UTF-8 to external, the produced bytes went past end of # the channel buffer. This is done on purpose - we then truncate the bytes # at the end of the partial character to preserve the requested blocksize # on flush. The truncated bytes are moved to the beginning of the next # channel buffer. set f [open $path(test1) w] chan configure $f -encoding jis0208 -buffersize 17
︙			︙

︙			︙
174 175 176 177 178 179 180 ~~181~~ 182 183 184 185 186 187 188	encoding } -result {wrong # args: should be "encoding subcommand ?arg ...?"} test cmdAH-4.2 {Tcl_EncodingObjCmd} -returnCodes error -body { encoding foo } -result {unknown or ambiguous subcommand "foo": must be convertfrom, convertto, dirs, names, or system} test cmdAH-4.3 {Tcl_EncodingObjCmd} -returnCodes error -body { encoding convertto ~~} -result {wrong # args: should be "encoding convertto ?encoding? data"}~~ test cmdAH-4.4 {Tcl_EncodingObjCmd} -returnCodes error -body { encoding convertto foo bar } -result {unknown encoding "foo"} test cmdAH-4.5 {Tcl_EncodingObjCmd} -setup { set system [encoding system] } -body { encoding system jis0208	\|	174 175 176 177 178 179 180 181 182 183 184 185 186 187 188	encoding } -result {wrong # args: should be "encoding subcommand ?arg ...?"} test cmdAH-4.2 {Tcl_EncodingObjCmd} -returnCodes error -body { encoding foo } -result {unknown or ambiguous subcommand "foo": must be convertfrom, convertto, dirs, names, or system} test cmdAH-4.3 {Tcl_EncodingObjCmd} -returnCodes error -body { encoding convertto } -result {wrong # args: should be "encoding convertto ?-nocomplain? ?encoding? data"} test cmdAH-4.4 {Tcl_EncodingObjCmd} -returnCodes error -body { encoding convertto foo bar } -result {unknown encoding "foo"} test cmdAH-4.5 {Tcl_EncodingObjCmd} -setup { set system [encoding system] } -body { encoding system jis0208
︙			︙
196 197 198 199 200 201 202 ~~203~~ 204 205 206 207 208 209 210	encoding system iso8859-1 encoding convertto jis0208 乎 } -cleanup { encoding system $system } -result 8C test cmdAH-4.7 {Tcl_EncodingObjCmd} -returnCodes error -body { encoding convertfrom ~~} -result {wrong # args: should be "encoding convertfrom ?encoding? data"}~~ test cmdAH-4.8 {Tcl_EncodingObjCmd} -returnCodes error -body { encoding convertfrom foo bar } -result {unknown encoding "foo"} test cmdAH-4.9 {Tcl_EncodingObjCmd} -setup { set system [encoding system] } -body { encoding system jis0208	\|	196 197 198 199 200 201 202 203 204 205 206 207 208 209 210	encoding system iso8859-1 encoding convertto jis0208 乎 } -cleanup { encoding system $system } -result 8C test cmdAH-4.7 {Tcl_EncodingObjCmd} -returnCodes error -body { encoding convertfrom } -result {wrong # args: should be "encoding convertfrom ?-nocomplain? ?encoding? data"} test cmdAH-4.8 {Tcl_EncodingObjCmd} -returnCodes error -body { encoding convertfrom foo bar } -result {unknown encoding "foo"} test cmdAH-4.9 {Tcl_EncodingObjCmd} -setup { set system [encoding system] } -body { encoding system jis0208
︙			︙

︙			︙
18 19 20 21 22 23 24 25 26 27 28 29 30 31	variable x catch { ::tcltest::loadTestedCommands package require -exact tcl::test [info patchlevel] } proc toutf {args} { variable x lappend x "toutf $args" } proc fromutf {args} { variable x lappend x "fromutf $args"	> >	18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33	variable x catch { ::tcltest::loadTestedCommands package require -exact tcl::test [info patchlevel] } testConstraint deprecated [expr {![info exists tcl_precision]}] proc toutf {args} { variable x lappend x "toutf $args" } proc fromutf {args} { variable x lappend x "fromutf $args"
︙			︙
293 294 295 296 297 298 299 ~~300~~ 301 302 303 304 305 306 307	test encoding-11.11 {encoding: extended Unicode UTF-32} { viewable [encoding convertto utf-32be 😹] } "\x00\x01\xF69 (\\u0000\\u0001\\u00F69)" # OpenEncodingFile is fully tested by the rest of the tests in this file. test encoding-12.1 {LoadTableEncoding: normal encoding} { set x [encoding convertto iso8859-3 Ġ] ~~append x [encoding convertto iso8859-3 Õ]~~ append x [encoding convertfrom iso8859-3 Õ] } "Õ?Ġ" test encoding-12.2 {LoadTableEncoding: single-byte encoding} { set x [encoding convertto iso8859-3 abĠg] append x [encoding convertfrom iso8859-3 abÕg] } "abÕgabĠg" test encoding-12.3 {LoadTableEncoding: multi-byte encoding} {	\|	295 296 297 298 299 300 301 302 303 304 305 306 307 308 309	test encoding-11.11 {encoding: extended Unicode UTF-32} { viewable [encoding convertto utf-32be 😹] } "\x00\x01\xF69 (\\u0000\\u0001\\u00F69)" # OpenEncodingFile is fully tested by the rest of the tests in this file. test encoding-12.1 {LoadTableEncoding: normal encoding} { set x [encoding convertto iso8859-3 Ġ] append x [encoding convertto -nocomplain iso8859-3 Õ] append x [encoding convertfrom iso8859-3 Õ] } "Õ?Ġ" test encoding-12.2 {LoadTableEncoding: single-byte encoding} { set x [encoding convertto iso8859-3 abĠg] append x [encoding convertfrom iso8859-3 abÕg] } "abÕgabĠg" test encoding-12.3 {LoadTableEncoding: multi-byte encoding} {
︙			︙
342 343 344 345 346 347 348 ~~349~~ 350 351 352 353 354 ~~355~~ 356 357 358 359 360 ~~361~~ 362 363 364 365 366 ~~367~~ 368 369 370 371 372 ~~373~~ 374 375 376 377 378 ~~379~~ 380 381 382 383 384 ~~385~~ 386 387 388 389 390 ~~391~~ 392 393 394 395 396 ~~397~~ 398 399 400 401 402 ~~403~~ 404 405 406 407 408 ~~409~~ 410 411 412 413 414 415 416	test encoding-15.5 {UtfToUtfProc emoji character input} { set x \xF0\x9F\x98\x82 set y [encoding convertfrom utf-8 \xF0\x9F\x98\x82] list [string length $x] $y } "4 😂" test encoding-15.6 {UtfToUtfProc emoji character output} { set x \uDE02\uD83D\uDE02\uD83D ~~set y [encoding convertto utf-8 \uDE02\uD83D\uDE02\uD83D]~~ binary scan $y H* z list [string length $y] $z } {10 edb882f09f9882eda0bd} test encoding-15.7 {UtfToUtfProc emoji character output} { set x \uDE02\uD83D\uD83D ~~set y [encoding convertto utf-8 \uDE02\uD83D\uD83D]~~ binary scan $y H* z list [string length $x] [string length $y] $z } {3 9 edb882eda0bdeda0bd} test encoding-15.8 {UtfToUtfProc emoji character output} { set x \uDE02\uD83Dé ~~set y [encoding convertto utf-8 \uDE02\uD83Dé]~~ binary scan $y H* z list [string length $x] [string length $y] $z } {3 8 edb882eda0bdc3a9} test encoding-15.9 {UtfToUtfProc emoji character output} { set x \uDE02\uD83DX ~~set y [encoding convertto utf-8 \uDE02\uD83DX]~~ binary scan $y H* z list [string length $x] [string length $y] $z } {3 7 edb882eda0bd58} test encoding-15.10 {UtfToUtfProc high surrogate character output} { set x \uDE02é ~~set y [encoding convertto utf-8 \uDE02é]~~ binary scan $y H* z list [string length $x] [string length $y] $z } {2 5 edb882c3a9} test encoding-15.11 {UtfToUtfProc low surrogate character output} { set x \uDA02é ~~set y [encoding convertto utf-8 \uDA02é]~~ binary scan $y H* z list [string length $x] [string length $y] $z } {2 5 eda882c3a9} test encoding-15.12 {UtfToUtfProc high surrogate character output} { set x \uDE02Y ~~set y [encoding convertto utf-8 \uDE02Y]~~ binary scan $y H* z list [string length $x] [string length $y] $z } {2 4 edb88259} test encoding-15.13 {UtfToUtfProc low surrogate character output} { set x \uDA02Y ~~set y [encoding convertto utf-8 \uDA02Y]~~ binary scan $y H* z list [string length $x] [string length $y] $z } {2 4 eda88259} test encoding-15.14 {UtfToUtfProc high surrogate character output} { set x \uDE02 ~~set y [encoding convertto utf-8 \uDE02]~~ binary scan $y H* z list [string length $x] [string length $y] $z } {1 3 edb882} test encoding-15.15 {UtfToUtfProc low surrogate character output} { set x \uDA02 ~~set y [encoding convertto utf-8 \uDA02]~~ binary scan $y H* z list [string length $x] [string length $y] $z } {1 3 eda882} test encoding-15.16 {UtfToUtfProc: Invalid 4-byte UTF-8, see [ed29806ba]} { set x \xF0\xA0\xA1\xC2 ~~set y [encoding convertfrom utf-8 \xF0\xA0\xA1\xC2]~~ list [string length $x] $y } "4 \xF0\xA0\xA1\xC2" test encoding-15.17 {UtfToUtfProc emoji character output} { set x 😂 set y [encoding convertto utf-8 😂] binary scan $y H* z list [string length $y] $z	\| \| \| \| \| \| \| \| \| \| \|	344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418	test encoding-15.5 {UtfToUtfProc emoji character input} { set x \xF0\x9F\x98\x82 set y [encoding convertfrom utf-8 \xF0\x9F\x98\x82] list [string length $x] $y } "4 😂" test encoding-15.6 {UtfToUtfProc emoji character output} { set x \uDE02\uD83D\uDE02\uD83D set y [encoding convertto -nocomplain utf-8 \uDE02\uD83D\uDE02\uD83D] binary scan $y H* z list [string length $y] $z } {10 edb882f09f9882eda0bd} test encoding-15.7 {UtfToUtfProc emoji character output} { set x \uDE02\uD83D\uD83D set y [encoding convertto -nocomplain utf-8 \uDE02\uD83D\uD83D] binary scan $y H* z list [string length $x] [string length $y] $z } {3 9 edb882eda0bdeda0bd} test encoding-15.8 {UtfToUtfProc emoji character output} { set x \uDE02\uD83Dé set y [encoding convertto -nocomplain utf-8 \uDE02\uD83Dé] binary scan $y H* z list [string length $x] [string length $y] $z } {3 8 edb882eda0bdc3a9} test encoding-15.9 {UtfToUtfProc emoji character output} { set x \uDE02\uD83DX set y [encoding convertto -nocomplain utf-8 \uDE02\uD83DX] binary scan $y H* z list [string length $x] [string length $y] $z } {3 7 edb882eda0bd58} test encoding-15.10 {UtfToUtfProc high surrogate character output} { set x \uDE02é set y [encoding convertto -nocomplain utf-8 \uDE02é] binary scan $y H* z list [string length $x] [string length $y] $z } {2 5 edb882c3a9} test encoding-15.11 {UtfToUtfProc low surrogate character output} { set x \uDA02é set y [encoding convertto -nocomplain utf-8 \uDA02é] binary scan $y H* z list [string length $x] [string length $y] $z } {2 5 eda882c3a9} test encoding-15.12 {UtfToUtfProc high surrogate character output} { set x \uDE02Y set y [encoding convertto -nocomplain utf-8 \uDE02Y] binary scan $y H* z list [string length $x] [string length $y] $z } {2 4 edb88259} test encoding-15.13 {UtfToUtfProc low surrogate character output} { set x \uDA02Y set y [encoding convertto -nocomplain utf-8 \uDA02Y] binary scan $y H* z list [string length $x] [string length $y] $z } {2 4 eda88259} test encoding-15.14 {UtfToUtfProc high surrogate character output} { set x \uDE02 set y [encoding convertto -nocomplain utf-8 \uDE02] binary scan $y H* z list [string length $x] [string length $y] $z } {1 3 edb882} test encoding-15.15 {UtfToUtfProc low surrogate character output} { set x \uDA02 set y [encoding convertto -nocomplain utf-8 \uDA02] binary scan $y H* z list [string length $x] [string length $y] $z } {1 3 eda882} test encoding-15.16 {UtfToUtfProc: Invalid 4-byte UTF-8, see [ed29806ba]} { set x \xF0\xA0\xA1\xC2 set y [encoding convertfrom -nocomplain utf-8 \xF0\xA0\xA1\xC2] list [string length $x] $y } "4 \xF0\xA0\xA1\xC2" test encoding-15.17 {UtfToUtfProc emoji character output} { set x 😂 set y [encoding convertto utf-8 😂] binary scan $y H* z list [string length $y] $z
︙			︙
483 484 485 486 487 488 489 ~~490~~ 491 492 ~~493~~ 494 495 496 497 498 499 500	test encoding-17.1 {UtfToUtf16Proc} -body { encoding convertto utf-16 "\U460DC" } -result "\xD8\xD8\xDC\xDC" test encoding-17.2 {UtfToUcs2Proc} -body { encoding convertfrom utf-16 [encoding convertto ucs-2 "\U460DC"] } -result "\uFFFD" test encoding-17.3 {UtfToUtf16Proc} -body { ~~encoding convertto utf-16be "\uDCDC"~~ } -result "\xFF\xFD" test encoding-17.4 {UtfToUtf16Proc} -body { ~~encoding convertto utf-16le "\uD8D8"~~ } -result "\xFD\xFF" test encoding-17.5 {UtfToUtf16Proc} -body { encoding convertto utf-32le "\U460DC" } -result "\xDC\x60\x04\x00" test encoding-17.6 {UtfToUtf16Proc} -body { encoding convertto utf-32be "\U460DC" } -result "\x00\x04\x60\xDC"	\| \|	485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502	test encoding-17.1 {UtfToUtf16Proc} -body { encoding convertto utf-16 "\U460DC" } -result "\xD8\xD8\xDC\xDC" test encoding-17.2 {UtfToUcs2Proc} -body { encoding convertfrom utf-16 [encoding convertto ucs-2 "\U460DC"] } -result "\uFFFD" test encoding-17.3 {UtfToUtf16Proc} -body { encoding convertto -nocomplain utf-16be "\uDCDC" } -result "\xFF\xFD" test encoding-17.4 {UtfToUtf16Proc} -body { encoding convertto -nocomplain utf-16le "\uD8D8" } -result "\xFD\xFF" test encoding-17.5 {UtfToUtf16Proc} -body { encoding convertto utf-32le "\U460DC" } -result "\xDC\x60\x04\x00" test encoding-17.6 {UtfToUtf16Proc} -body { encoding convertto utf-32be "\U460DC" } -result "\x00\x04\x60\xDC"
︙			︙
611 612 613 614 615 616 617 ~~618~~ 619 620 ~~621~~ 622 623 624 625 626 ~~627~~ 628 629 ~~630~~ 631 632 633 634 635 ~~636~~ 637 638 639 640 641 642 643 644	list $count [viewable $line] } [list 3 "乎乞也 (\\u4E4E\\u4E5E\\u4E5F)"] test encoding-24.4 {Parse valid or invalid utf-8} { string length [encoding convertfrom utf-8 "\xC0\x80"] } 1 test encoding-24.5 {Parse valid or invalid utf-8} { ~~string length [encoding convertfrom utf-8 "\xC0\x81"]~~ } 2 test encoding-24.6 {Parse valid or invalid utf-8} { ~~string length [encoding convertfrom utf-8 "\xC1\xBF"]~~ } 2 test encoding-24.7 {Parse valid or invalid utf-8} { string length [encoding convertfrom utf-8 "\xC2\x80"] } 1 test encoding-24.8 {Parse valid or invalid utf-8} { ~~string length [encoding convertfrom utf-8 "\xE0\x80\x80"]~~ } 3 test encoding-24.9 {Parse valid or invalid utf-8} { ~~string length [encoding convertfrom utf-8 "\xE0\x9F\xBF"]~~ } 3 test encoding-24.10 {Parse valid or invalid utf-8} { string length [encoding convertfrom utf-8 "\xE0\xA0\x80"] } 1 test encoding-24.11 {Parse valid or invalid utf-8} { ~~string length [encoding convertfrom utf-8 "\xEF\x~~BF\xBF~~"]~~ } 1 file delete [file join [temporaryDirectory] iso2022.txt] # # Begin jajp encoding round-trip conformity tests # proc foreach-jisx0208 {varName command} {	\| \| \| \| > > > > > > > > > \| > > > > > > > > > > > > > > > > > > > > > > > > > > >	613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682	list $count [viewable $line] } [list 3 "乎乞也 (\\u4E4E\\u4E5E\\u4E5F)"] test encoding-24.4 {Parse valid or invalid utf-8} { string length [encoding convertfrom utf-8 "\xC0\x80"] } 1 test encoding-24.5 {Parse valid or invalid utf-8} { string length [encoding convertfrom -nocomplain utf-8 "\xC0\x81"] } 2 test encoding-24.6 {Parse valid or invalid utf-8} { string length [encoding convertfrom -nocomplain utf-8 "\xC1\xBF"] } 2 test encoding-24.7 {Parse valid or invalid utf-8} { string length [encoding convertfrom utf-8 "\xC2\x80"] } 1 test encoding-24.8 {Parse valid or invalid utf-8} { string length [encoding convertfrom -nocomplain utf-8 "\xE0\x80\x80"] } 3 test encoding-24.9 {Parse valid or invalid utf-8} { string length [encoding convertfrom -nocomplain utf-8 "\xE0\x9F\xBF"] } 3 test encoding-24.10 {Parse valid or invalid utf-8} { string length [encoding convertfrom utf-8 "\xE0\xA0\x80"] } 1 test encoding-24.11 {Parse valid or invalid utf-8} { string length [encoding convertfrom -nocomplain utf-8 "\xEF\xBF\xBF"] } 1 test encoding-24.12 {Parse valid or invalid utf-8} -constraints deprecated -body { encoding convertfrom utf-8 "\xC0\x81" } -returnCodes 1 -result {unexpected byte sequence starting at index 0: '\xC0'} test encoding-24.13 {Parse valid or invalid utf-8} -constraints deprecated -body { encoding convertfrom utf-8 "\xC1\xBF" } -returnCodes 1 -result {unexpected byte sequence starting at index 0: '\xC1'} test encoding-24.14 {Parse valid or invalid utf-8} { string length [encoding convertfrom utf-8 "\xC2\x80"] } 1 test encoding-24.15 {Parse valid or invalid utf-8} -constraints deprecated -body { encoding convertfrom utf-8 "Z\xE0\x80" } -returnCodes 1 -result {unexpected byte sequence starting at index 1: '\xE0'} test encoding-24.16 {Parse valid or invalid utf-8} -constraints {testbytestring deprecated} -body { encoding convertto utf-8 [testbytestring "Z\u4343\x80"] } -returnCodes 1 -result {expected byte sequence but character 1 was '䍃' (U+004343)} test encoding-24.17 {Parse valid or invalid utf-8} -constraints {testbytestring deprecated} -body { encoding convertto utf-8 [testbytestring "Z\xE0\x80"] } -result "Z\xC3\xA0\xE2\x82\xAC" test encoding-24.18 {Parse valid or invalid utf-8} -constraints {testbytestring deprecated} -body { encoding convertto utf-8 [testbytestring "Z\xE0\x80xxxxxx"] } -result "Z\xC3\xA0\xE2\x82\xACxxxxxx" test encoding-24.19 {Parse valid or invalid utf-8} -constraints deprecated -body { encoding convertto utf-8 "ZX\uD800" } -returnCodes 1 -match glob -result "unexpected character at index 2: 'U+00D800'" test encoding-24.20 {Parse with -nocomplain but without providing encoding} { string length [encoding convertfrom -nocomplain "\x20"] } 1 test encoding-24.21 {Parse with -nocomplain but without providing encoding} { string length [encoding convertto -nocomplain "\x20"] } 1 test encoding-24.22 {Syntax error, two encodings} -body { encoding convertfrom iso8859-1 utf-8 "ZX\uD800" } -returnCodes 1 -result {wrong # args: should be "::tcl::encoding::convertfrom ?-nocomplain? ?encoding? data"} test encoding-24.23 {Syntax error, two encodings} -body { encoding convertto iso8859-1 utf-8 "ZX\uD800" } -returnCodes 1 -result {wrong # args: should be "::tcl::encoding::convertto ?-nocomplain? ?encoding? data"} file delete [file join [temporaryDirectory] iso2022.txt] # # Begin jajp encoding round-trip conformity tests # proc foreach-jisx0208 {varName command} {
︙			︙
786 787 788 789 790 791 792 ~~793~~ 794 795 796 797 798 799 800	} test encoding-28.0 {all encodings load} -body { set string hello foreach name [encoding names] { incr count ~~encoding convertto $name $string~~ # discard the cached internal representation of Tcl_Encoding # Unfortunately, without this, encoding 2-1 fails. llength $name } return $count } -result [expr {[info exists ::tcl_precision] ? 92 : 91}]	\|	824 825 826 827 828 829 830 831 832 833 834 835 836 837 838	} test encoding-28.0 {all encodings load} -body { set string hello foreach name [encoding names] { incr count encoding convertto -nocomplain $name $string # discard the cached internal representation of Tcl_Encoding # Unfortunately, without this, encoding 2-1 fails. llength $name } return $count } -result [expr {[info exists ::tcl_precision] ? 92 : 91}]
︙			︙

︙			︙
264 265 266 267 268 269 270 ~~271~~ 272 273 274 275 276 277 278 279 280 ~~281~~ 282 283 284 285 286 287 288	set f [open $path(test1) w] fconfigure $f -encoding ascii -buffering line -translation crlf puts -nonewline $f "\n12" set x [contents $path(test1)] close $f set x } "\r\n12" ~~test io-3.4 {WriteChars: loop over stage buffer} {~~ # stage buffer maps to more than can be queued at once. set f [open $path(test1) w] fconfigure $f -encoding jis0208 -buffersize 16 puts -nonewline $f "\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\" set x [list [contents $path(test1)]] close $f lappend x [contents $path(test1)] } [list "!)!)!)!)!)!)!)!)" "!)!)!)!)!)!)!)!)!)!)!)!)!)!)!)"] ~~test io-3.5 {WriteChars: saved != 0} {~~ # Bytes produced by UtfToExternal from end of last channel buffer # had to be moved to beginning of next channel buffer to preserve # requested buffersize. set f [open $path(test1) w] fconfigure $f -encoding jis0208 -buffersize 17 puts -nonewline $f "\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"	\| \|	264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288	set f [open $path(test1) w] fconfigure $f -encoding ascii -buffering line -translation crlf puts -nonewline $f "\n12" set x [contents $path(test1)] close $f set x } "\r\n12" test io-3.4 {WriteChars: loop over stage buffer} deprecated { # stage buffer maps to more than can be queued at once. set f [open $path(test1) w] fconfigure $f -encoding jis0208 -buffersize 16 puts -nonewline $f "\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\" set x [list [contents $path(test1)]] close $f lappend x [contents $path(test1)] } [list "!)!)!)!)!)!)!)!)" "!)!)!)!)!)!)!)!)!)!)!)!)!)!)!)"] test io-3.5 {WriteChars: saved != 0} deprecated { # Bytes produced by UtfToExternal from end of last channel buffer # had to be moved to beginning of next channel buffer to preserve # requested buffersize. set f [open $path(test1) w] fconfigure $f -encoding jis0208 -buffersize 17 puts -nonewline $f "\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"
︙			︙
303 304 305 306 307 308 309 ~~310~~ 311 312 313 314 315 316 317	set f [open $path(test1) w] fconfigure $f -encoding shiftjis -buffersize 16 puts -nonewline $f "12345678901234ＡＢ" set x [list [contents $path(test1)]] close $f lappend x [contents $path(test1)] } [list "12345678901234\x82\x60" "12345678901234\x82\x60\x82\x61"] ~~test io-3.7 {WriteChars: (bufPtr->nextAdded > bufPtr->length)} {~~ # When translating UTF-8 to external, the produced bytes went past end # of the channel buffer. This is done purpose -- we then truncate the # bytes at the end of the partial character to preserve the requested # blocksize on flush. The truncated bytes are moved to the beginning # of the next channel buffer. set f [open $path(test1) w]	\|	303 304 305 306 307 308 309 310 311 312 313 314 315 316 317	set f [open $path(test1) w] fconfigure $f -encoding shiftjis -buffersize 16 puts -nonewline $f "12345678901234ＡＢ" set x [list [contents $path(test1)]] close $f lappend x [contents $path(test1)] } [list "12345678901234\x82\x60" "12345678901234\x82\x60\x82\x61"] test io-3.7 {WriteChars: (bufPtr->nextAdded > bufPtr->length)} deprecated { # When translating UTF-8 to external, the produced bytes went past end # of the channel buffer. This is done purpose -- we then truncate the # bytes at the end of the partial character to preserve the requested # blocksize on flush. The truncated bytes are moved to the beginning # of the next channel buffer. set f [open $path(test1) w]
︙			︙
1528 1529 1530 1531 1532 1533 1534 ~~1535~~ 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 ~~1546~~ 1547 1548 1549 1550 1551 1552 1553	close $f set f [open $path(test1)] fconfigure $f -encoding utf-8 -buffersize 10 set in [read $f] close $f scan [string index $in end] %c } 160 ~~test io-12.9 {ReadChars: multibyte chars split} {~~ set f [open $path(test1) w] fconfigure $f -translation binary puts -nonewline $f [string repeat a 9]\xC2 close $f set f [open $path(test1)] fconfigure $f -encoding utf-8 -buffersize 10 set in [read $f] close $f scan [string index $in end] %c } 194 ~~test io-12.10 {ReadChars: multibyte chars split} {~~ set f [open $path(test1) w] fconfigure $f -translation binary puts -nonewline $f [string repeat a 9]\xC2 close $f set f [open $path(test1)] fconfigure $f -encoding utf-8 -buffersize 11 set in [read $f]	\| \|	1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553	close $f set f [open $path(test1)] fconfigure $f -encoding utf-8 -buffersize 10 set in [read $f] close $f scan [string index $in end] %c } 160 test io-12.9 {ReadChars: multibyte chars split} deprecated { set f [open $path(test1) w] fconfigure $f -translation binary puts -nonewline $f [string repeat a 9]\xC2 close $f set f [open $path(test1)] fconfigure $f -encoding utf-8 -buffersize 10 set in [read $f] close $f scan [string index $in end] %c } 194 test io-12.10 {ReadChars: multibyte chars split} deprecated { set f [open $path(test1) w] fconfigure $f -translation binary puts -nonewline $f [string repeat a 9]\xC2 close $f set f [open $path(test1)] fconfigure $f -encoding utf-8 -buffersize 11 set in [read $f]
︙			︙