Tk Library Source Code

Ticket UUID: 2913700
Title: extra character in range [!..~]
Type: Bug
Version: None
Submitter: jeremy_c
Created on: 2009-12-13 14:07:41
Subsystem: mime
Assigned To: eee
Priority: 5 Medium
Severity:
Status: Closed
Last Modified: 2013-01-11 10:59:53
Resolution: Rejected
Closed By: eee
Closed on: 2013-01-11 03:59:53
Description:
When parsing email messages, I ran across one message out of thousands that mime would not parse. It is a spam message; nonetheless, shouldn't it parse? The error is:

expecting character in range [!..~]
    while executing
"$state(encoding) -mode decode  -- $state(string)"
    (procedure "mime::getbody" line 104)
    invoked from within
"mime::getbody $m"
    (procedure "_body" line 5)
    invoked from within
"_body $m"
    (procedure "insert" line 55)
    invoked from within
"insert $content"
    (procedure "import_file" line 17)
    invoked from within
"import_file {*}$args"
    (procedure "::mailroom::message::import" line 5)
    invoked from within
"::mailroom::message import -file $filename"
    ("foreach" body line 3)
    invoked from within
"foreach filename $messages {
  puts "Importing $filename"
  ::mailroom::message import -file $filename
}"
    (file "test.tcl" line 12)
User Comments: eee added on 2013-01-11 10:59:53:
The mime module documents its behavior on error. Since the input was invalid, the module behaved as documented.

In the general case, and especially when dealing with hostile input such as arbitrary email, the application needs to account for the possibility that invalid input will cause an error, and the application needs to handle the error in some manner, such as offering the user the raw content of the email.
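
A minimal sketch of the kind of application-side handling described above, assuming Tcl 8.5 or later and tcllib's mime package. The procs decode_with_mime and body_or_raw are hypothetical helpers, not part of tcllib, and the three-element return shape is only one possible convention. The sketch treats each file as a single-part message, as in the traceback.

```tcl
# Hypothetical helper: decode the body of one message file with tcllib's mime.
# The package is loaded here so that body_or_raw can also be used with
# another decoder when mime is not installed.
proc decode_with_mime {filename} {
    package require mime
    set token [mime::initialize -file $filename]
    set failed [catch {mime::getbody $token} result]
    # Release the token whether or not decoding succeeded.
    catch {mime::finalize $token}
    if {$failed} {
        error $result
    }
    return $result
}

# Hypothetical helper: return {decoded <body>} on success, or
# {raw <file content> <error message>} when the decoder raises an error.
proc body_or_raw {filename {decoder decode_with_mime}} {
    if {![catch {{*}$decoder $filename} result]} {
        return [list decoded $result]
    }
    # Fall back to the undecoded bytes; keep the error text for logging.
    set chan [open $filename r]
    fconfigure $chan -translation binary
    set raw [read $chan]
    close $chan
    return [list raw $raw $result]
}
```

In an import loop like the one in the traceback, the caller could inspect the first element of the result and store or display the raw content when it is "raw", rather than letting one bad message abort the whole run.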

eee added on 2013-01-11 04:46:03:
The problem here is the handling of invalid data. The email declares a content type of "text/plain; charset=us-ascii" and a content transfer encoding of "quoted-printable". However, there are form feed characters and eight-bit "smart quotes" characters embedded in both the headers and the body of the message.
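
For illustration, a rough pre-check for this kind of data might look like the sketch below. The proc non_qp_chars is a hypothetical name, and the rule it applies (printable ASCII plus tab, CR, LF and space) is an assumption about what a us-ascii, quoted-printable message may contain; it does not reproduce the exact checks made by the decoder.

```tcl
# Hypothetical helper: list the characters in $data that fall outside
# printable ASCII and common whitespace, as a flat list of
# position/hex-code pairs. Expects $data to have been read in binary mode.
proc non_qp_chars {data} {
    set found {}
    foreach idx [regexp -all -inline -indices {[^\t\n\r -~]} $data] {
        set pos [lindex $idx 0]
        # Convert the offending character to its numeric code.
        scan [string index $data $pos] %c code
        lappend found $pos [format 0x%02X $code]
    }
    return $found
}
```

An empty result means no such characters were found; a non-empty one gives the application a reason to skip decoding or to show the message raw.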

This error occurs at line 1574 of mime.tcl, when mime::getbody tries to decode what it thinks is quoted-printable data read from a file, and it looks like the same would occur at line 1639 if the invalid data were passed to mime::initialize as a string. Hmm, looks like it could be triggered at line 806 too? In all cases, it's calling a command "quoted-printable -mode decode" that appears to be coming from the Trf package.

I'm really not sure what the correct way to handle something like this should be.

jeremy_c added on 2009-12-13 21:07:42:

File Added - 355053: bad_email_20091204_091126

Attachments: