Tk Library Source Code

View Ticket
Ticket UUID: 2783215
Title: search result is incorrect with multibyte characters
Type: Bug
Version: None
Submitter: zhangweiwu
Created on: 2009-04-28 21:25:19
Subsystem: ldap
Assigned To: mic42
Priority: 5 Medium
Severity:
Status: Closed
Last Modified: 2009-04-30 10:38:23
Resolution: Invalid
Closed By: zhangweiwu
Closed on: 2009-04-30 03:38:23
Description:
This problem is best explained with the screenshot I attached.

In my case all Chinese characters are broken, and I expect other multibyte UTF-8 string data to be broken as well. The problem is easily reproducible; we see it on Windows as well as on Linux (where the screenshot was taken). The screenshot was taken on a system with a UTF-8 locale setting.
User Comments: zhangweiwu added on 2009-04-30 10:38:23:

Thanks for your support. I do not understand it, but the fix does work for me (using encoding convertfrom utf-8).

The reason I do not understand it: I used to code in PHP, and there the data obtained from an LDAP search is UTF-8. In particular, I don't see what sense it makes to convert from utf-8 when the current locale is utf-8, which would mean that converting from utf-8 to utf-8 solves the problem! But anyway, thanks for the quick response; two days later we would have had a big problem delaying the business.

mic42 added on 2009-04-30 03:38:43:
The results of an ldap search are not converted automatically, because they might be binary data rather than UTF-8 encoded text, for example jpegPhoto attributes or similar things.

Please check whether you get the correct result if you apply [encoding convertfrom utf-8 $data] to the results you get. If that is the case, this is the expected behaviour; if not, it's really a bug.
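
A minimal sketch of that check, with the raw search result simulated rather than fetched from a server. The helper name decodeAttributes, its binaryAttrs parameter, and the flat {name values name values ...} list shape are assumptions made for illustration only and are not part of the ldap package. It shows why the conversion is not a no-op: the undecoded value is a string holding one character per UTF-8 byte, and [encoding convertfrom utf-8] turns those bytes into the intended characters.

```tcl
# Hypothetical helper (not part of the ldap package): decode every value
# as UTF-8 except for attributes listed as binary.
# Assumes attrs is a flat list of the form {name {value ...} name {value ...}}.
proc decodeAttributes {attrs {binaryAttrs {jpegPhoto}}} {
    set decoded {}
    foreach {name values} $attrs {
        if {[lsearch -exact -nocase $binaryAttrs $name] < 0} {
            # Text attribute: interpret the raw bytes as UTF-8.
            set converted {}
            foreach v $values {
                lappend converted [encoding convertfrom utf-8 $v]
            }
            set values $converted
        }
        # Binary attributes are passed through untouched.
        lappend decoded $name $values
    }
    return $decoded
}

# Simulate undecoded data: the UTF-8 bytes of a two-character Chinese string.
set raw [encoding convertto utf-8 "\u4e2d\u6587"]

# Simulated attribute list with one text and one binary attribute.
set entry [list cn [list $raw] jpegPhoto [list "\xff\xd8\xff"]]
set fixed [decodeAttributes $entry]

puts [string length $raw]                        ;# number of bytes
puts [string length [lindex [dict get $fixed cn] 0]] ;# number of characters
```

The undecoded value has one character per byte, so its length is larger than the number of characters the user expects to see; after decoding, the lengths match the visible text. The jpegPhoto value is left alone, which is the reason the package does not convert everything by default.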

The ldapx package has some helpers to do this automatically, see the documentation at:
http://tcllib.sourceforge.net/doc/ldapx.html especially the -utf8 option.

zhangweiwu added on 2009-04-29 04:25:19:

File Added - 324852: tclbug.png

Attachments: