Tcl Source Code: View Ticket

Ticket UUID:	73bb42fb3f35cd613af6fcea465e35bbfd352216
Title:	Panic "Buffer Underflow, BUFFER_PADDING not enough"
Type:	Bug	Version:	8.7+
Submitter:	apnadkarni	Created on:	2025-01-14 11:23:18
Subsystem:	- New Builtin Commands	Assigned To:	nobody
Priority:	7 High	Severity:	Critical
Status:	Closed	Last Modified:	2025-03-09 03:46:00
Resolution:	Fixed	Closed By:	apnadkarni
		Closed on:	2025-03-09 03:46:00
Description:	Following script causes the panic quoted in the subject line (9.0.0, 9.0.1, trunk) `set buffersize 8 writeFile crasher.txt binary [string repeat x [expr $buffersize-1]]\r\xe9XXXXXXXXXXXXXXXXXXXXXXXX set fd [open crasher.txt] fconfigure $fd -encoding utf-8 -buffersize $buffersize -translation crlf catch {read $fd [expr $buffersize+1]} e; # (Correctly) raises encoding error puts $e close $fd set fd [open crasher.txt] fconfigure $fd -encoding utf-8 -buffersize $buffersize -translation crlf read $fd $buffersize; # <- Panic` Note the crash seems to only happen when the CR is at the buffer edge and the iso8859-1 encoding is read as utf-8. Otherwise, an encoding error is correctly raised. However, I'm not sure what other circumstances would cause the same effect.
User Comments:	apnadkarni added on 2025-03-09 03:46:00: Fair point. Added a comment based on my own analysis and understanding. oehhar added on 2025-03-07 12:29:23: Thanks to the team for the finding and the great fix, I appreciate! I would appreciate some comments in the source code including a reference to this bug. This code is so complex that we don't find us there. So, commenting effort is crucial IMHO. Thanks for all, Harald sebres added on 2025-03-06 13:17:47: Fixed in all branches 8.7+, thus closed. Thx Ashok! apnadkarni added on 2025-03-05 05:54:52: While the panic is the same, the underlying bug I believe is separate from what was reported in this ticket. A slightly simpler version to reproduce for debugging. `% writeFile x.txt binary \xD6[string repeat _ 20] % set f [open x.txt] file257cf8b69f0 % fconfigure $f -blocking 0 -buffersize 10 -translation lf -eofchar {} % read $f 1 Buffer Underflow, BUFFER_PADDING not enough` I understand what the problem is. Kind of. The encoding routines return an error (as they should). ReadChars overwrites that code with TCL_OK (presumably to allow callers up the stack to retrieve whatever could be successfully read). Because the error was right at the first byte, no characters are decoded so further in ReadChars it presumes more data is needed and returns -1. The caller DoReadChars then checks for EOF and since that is not set, repeats the call. Possible fix here which checks for encoding error flag in addition to eof but not sure that is the right place or if it is complete. Test suite passes (fwiw) as does the bug test. sebres added on 2025-03-04 23:08:50: Nope, it continues (thus reopened)... After long research of an issue posted in chat by emiliano (350K large `gitk` over huge repo made boom with BU-panic), I could reduce it to following script, that "successfully" generates the "Buffer Underflow" with current 9.x: `set f [open [list \| [info nameofexecutable] << {fconfigure stdout -translation binary; puts \xD6[string repeat _ 20]}]]; fconfigure $f -blocking 0 -buffersize 10 -translation lf -eofchar {}; while {![string length [set d [read $f]]]} {after 10}; puts $d; close $f` After all it is crucial combi of -profile strict, -blocking 0 and -buffersize smaller than read content. In this particular case it reads 20 bytes with buffer 10 bytes (in case of `gitk` read 500K by 4K buffer) with single not utf-8 char inside: the "trigger" is `\xD6`-char (`Ö` as single-byte, quasi in "binary" or cp1252 form), what shall cause encoding error in utf-8, but it causes the panic instead. I'm in, but a bit busy right now, so posting it for the record and when someone could fix it faster than me. P.S. by the way, because there is no application- or interp-wide option to switch default `-profile` (greetings to all with the opinion "we want no such global configuration facilities"), to be able to use application further, people need to implement ugly mock-ups like this now: `rename ::open ::_mckpd_open proc ::open args { set ch [::_mckpd_open {}$args]; chan configure $ch -profile tcl8; return $ch }` (nice, isn't it?) oehhar added on 2025-01-29 17:51:58: Thanks. Merged in with commit [9e6cf84324]. I am not 100% sure but it fixes the issue and is an improvement. There might be other cases which are not jet handled. Thanks, Harald apnadkarni added on 2025-01-28 06:04:41: Harald, I'm ok with your patch being merged in. apnadkarni added on 2025-01-20 10:34:01: The destination buffer and source are counted in bytes. The `TCL_ENCODING_CHAR_LIMIT` flag pertains to the last parameter which is a character, not byte, count. oehhar added on 2025-01-20 09:59:59: Thanks Ashok, I have the impression, that "count" is in Bytes. So, it should be set to 4 (UNICODE_CHAR_MAX) to get at minimum the next character. But in this routine, we want the result "/n", "/r" or other character (which is not handled here), so one byte might be enough. Thanks, Harald apnadkarni added on 2025-01-20 09:46:08: Sorry, I did take a look but thought I would just wait for the Monday meeting to discuss. Your change looks ok to me. Just as a precaution to ensure maximum of 2 chars returned, one could set `count` to 2 and add the `TCL_ENCODING_CHAR_LIMIT` flag to the second `Tcl_ExternalToUtf` call but not mandatory. oehhar added on 2025-01-20 09:22:28: Ashok, the week-end is over and no sign from you (what is ok). If you like, I may do a tracing session Tuesday evening to look, what the corrected version really does. I am relatively sure, that the fix is required, e.g. not checking the result is not ok. But I am not sure, if it is correct and complete. At least my introduced source code comment should be enhanced. Thank you and take care, Harald apnadkarni added on 2025-01-16 01:29:55: Thanks Harald. I will review this weekend but looks fine. Whether the \r should show up in the decoded data is a good question. If translation is CRLF I think it is fine if it does not. /Ashok oehhar added on 2025-01-15 17:01:34: Hi Wizards, there are now the provided test cases with commit [d6a8c4e91a]. Maybe, someone: May review Start CI check on the branch I am probably at my end. THanks for all, Harald oehhar added on 2025-01-15 12:52:40: Another interesting side-effect is, that it is hard to locate the error position from the error dict. The error dict contains in both test cases: % catch {read $fd $buffersize} e d 1 % set e error reading "file32f59a8": invalid or incomplete multibyte or wide character % set d -data xxxxxxx -code 1 -level 0 -errorstack {INNER {invokeStk1 read file32f59a8 8}} -errorcode {POSIX EILSEQ {invalid or incomplete multibyte or wide character}} -errorinfo {error reading "file32f59a8": invalid or incomplete multibyte or wide character while executing The "-data" key contains the last correct data, but the "\r" never appears. That is not a bug. It is just a lesson learned, that with "-translation crlf" (or auto), there might be hidden data between the error position and the returned good data. Thanks for all, Harald oehhar added on 2025-01-15 12:41:35: The check was added in checkin [db261a5c5d] starting branch [73bb42fb3f-panic-buffer]. It works to solve the issue. This is only empiric stupid testing, sorry. Thanks, Harald oehhar added on 2025-01-15 08:54:30: Yes. The 2nd reason I have seen is that the limitation to one output byte always returns "TCL_CONVERT_NOSPACE" instead of the encoding error. Perhaps, that would also be a point to look into. Thanks, Harald apnadkarni added on 2025-01-15 07:48:12: Thanks for following up. I think you are on the right track. The second Tcl_ExternalToUtf needs to be checked for at least the TCL_CONVERT_SYNTAX error. You're right in that the whole channel i/o <-> encoding interaction is more than a little head splitting; lots of conditions, corner cases, back tracking etc. oehhar added on 2025-01-14 17:24:22: generic/tclio.c Line 6574 / * Space is made at the beginning of the buffer to copy the * previous unused bytes there. Check first if the buffer we are * using actually has enough space at its beginning for the data * we are copying. Because if not we will write over the buffer * management information, especially the 'nextPtr'. * * Note that the BUFFER_PADDING (See AllocChannelBuffer) is used * to prevent exactly this situation. I.e. it should never happen. * Therefore it is ok to panic should it happen despite the * precautions. / if (nextPtr->nextRemoved < srcLen) { Tcl_Panic("Buffer Underflow, BUFFER_PADDING not enough"); } Current values: srclen = 17 nextPtr->nextRemoved = 16 src: "\réXXXXXXXXXXXXXXX" Is the final assertion. The way to this result: First, "XXXXXXX\r" is received and decoded without issue. We fall in Case 2 of line 6401, as the "\n" is missing in the translation. In the next round: code = Tcl_ExternalToUtf(NULL, encoding, src, srcLen, flags, &statePtr->inputEncodingState, dst, dstLimit, &srcRead, &dstDecoded, &numChars); with src = "xxxxxxx\r" srcLen = 8 srcRead=dstDecoded=numChars=7 returns code = -4: (TCL_CONVERT_NOSPACE) The following test: if (code == TCL_CONVERT_UNKNOWN \|\| code == TCL_CONVERT_SYNTAX \|\| (code == TCL_CONVERT_MULTIBYTE && GotFlag(statePtr, CHANNEL_EOF))) { SetFlag(statePtr, CHANNEL_ENCODING_ERROR); code = TCL_OK; } does not fire as "TCL_CONVERT_NOSPACE" is not handled. Then, the "\r" is not contained in the decoded data and thus, the translation works. The first 7 characters are returned by ReadChars Next round, only a one byte "\r" is decoded and failes due to "\r\n" eol requirement. This is correctly detected in line 6552 and "CHANNEL_NEED_MORE_DATA" is set. Next round, only one byte "\r" is decoded and failes again. This time, more data is available in the next buffer. Thus, the not consumed "\r" is prepended to the next buffer and ReadChars is called again (Line 6583) with "CharsToRead=1", The buffer now contains 9 characters starting from "\r". nextPtr->nextRemoved -= srcLen; memcpy(RemovePoint(nextPtr), src, srcLen); RecycleBuffer(statePtr, bufPtr, 0); statePtr->inQueueHead = nextPtr; Tcl_SetObjLength(objPtr, numBytes); return ReadChars(statePtr, objPtr, charsToRead, factorPtr); Then, the encoding decode (line 6325) is called with all 9 characters and a destination limit of 4. This results in code=-4 (TCL_CONVERT_NOSPACE). and only the "\r" decoded. Due to the flag "ENCODING_CHAR_LIMIT", the output length is limited to dstCharsPtr (numChars on input), which is 1. Then, it is remarked, that we still have characters. The encoding is translated in line 6449. Tcl_ExternalToUtf(NULL, encoding, src, srcLen, (statePtr->inputEncodingFlags \| TCL_ENCODING_NO_TERMINATE), &statePtr->inputEncodingState, buffer, sizeof(buffer), &read, &decoded, &count); with the parameters: src: \r\xe9XXXXXX srclen = 9 This returns -1 (TCL_CONVERT_SYNTAX) which is ignored. The following handling handles \r\n and eof, but not the error case. Again, we return with "NEED_MORE_DATA" flag. I think, by adding a result check to Tcl_ExternalToUtf would probably the right moment to react on this. But lets continue. ReadChar is called again for 1 character. Tcl_ExternalToUtf is called again with 1 char output limit, resulting in -4 Again, additional characters are decoded using Tcl_ExternalToUtf. The resulting error is not checked again. In this round, it is tried to add the currently 9 characters to the beginning of the next buffer and ReadChar is called with this extended buffer (line 6582). Again, Tcl_ExternalToUtf is now called with 17 source characters and only 1 output character resulting in -4. Again, result is "NEED_MORE_DATA". In the next round, the panic condition fires, as it is tried to put the 9 characters in front of the next buffer, but there is not sufficient space. In a nutchell: As the "\n" is missing, ReadChar returns with "NEED_MORE_DATA" This additional data is accumulated in a 24 byte buffer, until there is no space and the panic fires. The root cause is a non checked encoding conversion in line 6449. The purpose is only to read a "\n", but an encoding error arises. This error is not checked and the following handling ignores the error. There may be many solutions: catch the encoding error there and handle it think about the whole procedure and get it somewhere else Headbloat for today. Thanks for all, Harald apnadkarni added on 2025-01-14 15:14:52: Harald, Currently have other things to finish so if you have the inclination do take a look. /Ashok oehhar added on 2025-01-14 14:17:22: I can reproduce with 9.0.1 32bit WIndows. Nice message box ;-). Shall I start a debugging session with VS2022 ? I suppose, you already did... Take care, Harald