Tcl Source Code

Ticket UUID: 338d979f5be43e5d147f7a9c7d50eff9cd10917a
Title: Image data without content type is corrupted by http package
Type: Bug
Version: 2.9.1
Submitter: sbron
Created on: 2020-07-24 20:07:22
Subsystem: 29. http Package
Assigned To: nobody
Priority: 5 Medium
Severity: Minor
Status: Closed
Last Modified: 2022-09-10 11:58:57
Resolution: Fixed
Closed By: kjnash
Closed on: 2022-09-10 11:58:57
Description:

When a web site doesn't provide a content type, RFC 2616 specifies: "If and only if the media type is not given by a Content-Type field, the recipient MAY attempt to guess the media type via inspection of its content and/or the name extension(s) of the URI used to identify the resource. If the media type remains unknown, the recipient SHOULD treat it as type "application/octet-stream"." The http package does not follow this directive and treats all data with an unknown content type as text. This is an unfortunate choice, because once the data has been converted to text (translating \r\n to \n), information is lost that can't be recovered. Reporting text as binary data, on the other hand, can easily be corrected.
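
To illustrate why the loss can't be undone, here is a minimal sketch that only simulates the end-of-line translation with string map; it does not call the http package. The 8-byte PNG signature contains a \r\n pair, so translated image data no longer starts with a valid signature. The isPng helper is a hypothetical name used for illustration only.

```tcl
# The 8-byte PNG signature: 0x89 "PNG" CR LF 0x1A LF
set signature [binary format H* 89504e470d0a1a0a]

# Simulate the text conversion described above (\r\n becomes \n).
# This is an approximation for illustration, not the package's own code.
set asText [string map [list \r\n \n] $signature]

puts [string length $signature]   ;# 8 bytes before translation
puts [string length $asText]      ;# 7 bytes after: the CR is gone

# Hypothetical helper: 1 if the data still begins with the PNG signature.
proc isPng {data} {
    set expected [binary format H* 89504e470d0a1a0a]
    return [expr {[string range $data 0 7] eq $expected}]
}

puts [isPng $signature]   ;# 1
puts [isPng $asText]      ;# 0
```

Nothing in the translated data records which \n bytes were originally \r\n, which is why the conversion can't be reversed afterwards.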

As an example, a popular Dutch web site produces a PNG image without providing a Content-Type header: https://image.buienradar.nl/2.0/image/sprite/RadarMapRainNL. See also the attached script.

The only workaround is to add the -binary option to http::geturl, but that treats every response as binary, even when a site does provide a Content-Type header.
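
A caller can soften that drawback by fetching with -binary 1 and deciding afterwards, based on the response headers, whether to convert the body to text. The following is a rough sketch under stated assumptions, not the fix that was applied to the package: decodeBody is a hypothetical helper, only text/* types are treated as text, and charset handling is reduced to UTF-8 versus the ISO-8859-1 default that RFC 2616 gives for text types.

```tcl
# Hypothetical helper (not part of the http package): decide after the
# fact whether a body fetched with -binary 1 should be treated as text.
#   meta - header name/value list, as returned by http::meta
#   body - raw response bytes, as returned by http::data
proc decodeBody {meta body} {
    # Header names are case-insensitive, so search for Content-Type.
    set type ""
    foreach {name value} $meta {
        if {[string equal -nocase $name "content-type"]} {
            set type [string tolower [string trim $value]]
        }
    }
    # No Content-Type, or a non-text type: leave the bytes untouched,
    # in the spirit of "application/octet-stream".
    if {![string match "text/*" $type]} {
        return $body
    }
    # Simplifying assumption: only a UTF-8 charset is recognized;
    # everything else falls back to ISO-8859-1.
    set enc iso8859-1
    if {[regexp {charset\s*=\s*"?utf-8"?} $type]} {
        set enc utf-8
    }
    set text [encoding convertfrom $enc $body]
    return [string map [list \r\n \n] $text]
}

# Usage with the real package (requires network access; https URLs
# additionally need a TLS socket command registered via http::register):
#   package require http
#   set token [http::geturl $url -binary 1]
#   set data [decodeBody [http::meta $token] [http::data $token]]
#   http::cleanup $token
```

Because the raw bytes are kept until the content type is known, a response without a Content-Type header comes back unmodified, while text responses are still decoded.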

User Comments:

kjnash added on 2022-09-10 11:58:57:
Fixed in commit [763efd4e7f], branch http-bugfixes-2022H2.

Attachments: