Ticket UUID: | 6b49149b4e7c774d6b8a168eeaa9a72a152f12c5 | |||
Title: | Tkinter problem with text get | |||
Type: | Bug | Version: | 8.7 | |
Submitter: | jan.nijtmans | Created on: | 2023-04-16 09:14:17 | |
Subsystem: | 18. [text] | Assigned To: | jan.nijtmans | |
Priority: | 5 Medium | Severity: | Important | |
Status: | Open | Last Modified: | 2023-04-25 12:38:19 | |
Resolution: | None | Closed By: | nobody | |
Closed on: | ||||
Description: |
Reported by Marc Culler: I also wanted to tell you about some other side effects of changes to Tk 8.7. I wanted to start using Tk 8.7 with tkinter in Python. I expected that I would be able to just replace 8.6 with 8.7, since we are so careful about backwards compatibility. But I was wrong. I had to make several changes to tkinter.py. The reason was that there were several places where the authors assumed that the result of calling a Tk command would be a pointer that could be treated as a C string, but in fact it was a Tcl_Obj*. Somehow those needed to be explicitly converted to python strings. I will attach my patch file. | |||
User Comments: |
marc_culler (claiming to be Marc Culler) added on 2023-04-25 12:38:19:
@chrstphrchvz:
You are correct. I was misreporting the situation. And I did not need to change any code in _tkinter.c to work around this. Just in tkinter.py. chrstphrchvz added on 2023-04-22 06:03:15: For now, it would help to have "utf32string" registered, if there is no other advisable way to retrieve its Tcl_ObjType*. At the same time, it seems that using str() explicitly, as done in Marc’s patch, is not uncommon in Tkinter’s own Python code, and might be the right thing to do whenever a Python string is expected (rather than assume FromObj() won’t return a _tkinter.Tcl_Obj). Maybe someday Tkinter’s Python code will get updated with PEP 484 type hints and encourage using str() where appropriate. jan.nijtmans added on 2023-04-19 08:46:00: > Or would Tcl prefer "string" not be registered in 9.0 either? Personally, I think that the registration was a bad idea at all, so I would prefer to get rid of the "string" registration at all. But, I'm realistic: There's too much code outside depending on that. So I don't think that's reasonable to be done for 9.0 (maybe even never). I'm Ok with registering the "utf32string" type for 8.7, if it helps for Python. chrstphrchvz added on 2023-04-18 22:26:38: A better choice than string is lower might be string length, since regardless of whether it gets compiled, it will use TclGetCharLength() which requires a Unicode representation for non-bytearray values longer than 1 character. "string" is registered in Tcl 9.0. It seems contrary to precedent for a type to become registered (rather than unregistered) in a later version. Is there a reason to not also register the equivalent "utf32string" type in 8.7? Or would Tcl prefer "string" not be registered in 9.0 either? chrstphrchvz added on 2023-04-18 15:47:49: I have only tried Jan’s workaround without Tkinter so far, but it helps me notice that the text widget get command under Tcl 8.7 was returning values with the unregistered "utf32string" type rather than the registered UTF-16 "string" type. So the issue is likely due to Tkinter’s FromObj() not recognizing "utf32string". FromObj() apparently only intends to create Python strings for pure strings and Unicode string values, while the string or Unicode representation for values with an unrecognized type is only retrieved upon request (using the Python str() function). So although I can see how both Jan and Marc’s workarounds should be effective for one specific command, I believe Tkinter would want to address this more generally by having FromObj() also recognize "utf32string". Is there a better way to retrieve the Tcl_Obj* of "utf32string", without needing anything like TCL_UTF_MAX > 3 or new C APIs, than something like the following? Tcl_Obj *sPtr1 = Tcl_NewStringObj("s", 1); Tcl_Eval(interp, // Start with a pure string "set s a; " // Run a command which causes s to have Tcl’s preferred Unicode string type "string is lower $s"); Tcl_Obj *a = Tcl_ObjGetVar2(interp, sPtr1, NULL, 0); Tcl_ObjType *unicodeStringType = a->typePtr; And, I would find it less confusing for "utf32string" to not be renamed to "string" in Tcl 9.0. jan.nijtmans added on 2023-04-18 07:48:44: So, can anyone confirm if my patch works?: [571b0fd902e11ec4] Of course, there will be multiple similar places, but this should be a start. chrstphrchvz added on 2023-04-18 06:59:19: Also, I wonder if Marc is actually describing places in Tkinter’s Python code which fail to get a Python string from a Python object whose class is _tkinter.Tcl_Obj, rather than instances in its C code of a Tcl_Obj* being mistaken for a pointer to a C string. The latter sounds like a fairly serious error regardless of Tcl version used, and one which I would be very curious to see but have not spotted myself. chrstphrchvz added on 2023-04-18 06:49:58: Tkinter was apparently never updated to support the TIP 93 text widget get command syntax: it only implements support for a single index range (as seen in the snippet from Marc’s patch), and not multiple ranges. So how is this issue possibly due to a Tcl list being returned? jan.nijtmans added on 2023-04-17 07:19:10: The text "get" method can return either a single string, but it also could be a list of strings, which - each of them - should be threated as a C string. I think that would be difficult to handle in tkinter, so in this case it's OK to do that in the Tk C code (IMHU). jan.nijtmans added on 2023-04-16 18:41:29: Marc's change is simply: --- Python-3.11.1/Lib/tkinter/__init__.py 2022-12-06 13:05:27 +++ Python-3.11.1.new/Lib/tkinter/__init__.py 2023-01-07 20:52:16 @@ -3774,7 +3774,7 @@def get(self, index1, index2=None): """Return the text from INDEX1 to INDEX2 (not included).""" - return self.tk.call(self._w, 'get', index1, index2) + return str(self.tk.call(self._w, 'get', index1, index2)) # (Image commands are new in 8.0)def image_cget(self, index, option): Basically the same can be done in Tk C-code as well: [571b0fd902e11ec4] Doing this in Tk C-code has the advantage that other extensions than tkInter, which might make the same assumptions, will be fixed by it as well. I'm OK with either solution. chrstphrchvz added on 2023-04-16 14:36:08: …in other words, the issue here may be another Tkinter 8.7 incompatibility--one which may affect other usage besides getting text widget contents--and so I am doubtful that Tk should implement a workaround for this issue. I would be interested in seeing Marc’s changes to tkinter.py. chrstphrchvz added on 2023-04-16 14:09:30: Tkinter must make changes to how it uses Tcl C APIs for 8.7/9.0 compatibility anyway. I am looking into contributing a few of them: https://github.com/python/cpython/issues/103194 |
