Tcl Library Source Code

Ticket UUID: acd8c27943874c37dc6eaade0ab574950edcbc59
Title: fileutil::fileType - detects some pdf files just as text
Type: Bug
Version: 1.17
Submitter: pooryorick
Created on: 2015-05-28 00:49:24
Subsystem: fileutil
Assigned To: aku
Priority: 5 Medium
Severity: Minor
Status: Closed
Last Modified: 2015-06-02 00:57:03
Resolution: Fixed
Closed By: aku
Closed on: 2015-06-02 00:57:03
Description:

Some PDF files fail fileType's binary test and, as a result, are not reported as PDF files either; they are classified only as text. The following script generates such a file:

#!/usr/bin/env tclsh

package require fileutil
package require pdf4tcl

set fname pdf4tcl_01.pdf
::pdf4tcl::new mypdf -file $fname 
mypdf startPage
mypdf setFont 12 Courier
mypdf text Hello
mypdf finish
mypdf destroy

puts [fileutil::fileType $fname]

The following small change would make fileType return "text pdf" for such a file:

--- modules/fileutil/fileutil.tcl
+++ modules/fileutil/fileutil.tcl
@@ -1654,14 +1654,14 @@
         }
     } elseif { $binary && [string match "MM\x00\**" $test] } {
         lappend type graphic tiff
     } elseif { $binary && [string match "BM*" $test] && [string range $test 6 9] == "\x00\x00\x00\x00" } {
         lappend type graphic bitmap
-    } elseif { $binary && [string match "\%PDF\-*" $test] } {
-        lappend type pdf
     } elseif { ! $binary && [string match -nocase "*\<html\>*" $test] } {
         lappend type html
+    } elseif {[string match "\%PDF\-*" $test] } {
+        lappend type pdf
     } elseif { [string match "\%\!PS\-*" $test] } {
        lappend type ps
        if { [string match "* EPSF\-*" $test] } {
            lappend type eps
        }
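
The idea behind the change can be illustrated outside of fileutil. The following is only a minimal sketch: the proc name sketchFileType and the control-character pattern used as the "binary" test are assumptions made for illustration, not the code fileutil actually uses. It shows that once the PDF check no longer depends on the binary flag, a PDF header made of printable characters is still tagged as pdf.

```tcl
# Hypothetical helper, NOT part of fileutil: classify the first bytes of a file.
proc sketchFileType {header} {
    # Assumption: data counts as binary when it contains control characters
    # other than tab, newline, form feed and carriage return.
    set binary [regexp {[\u0000-\u0008\u000b\u000e-\u001f]} $header]
    set type [expr {$binary ? "binary" : "text"}]

    # The PDF magic is checked regardless of the binary flag.
    if {[string match {%PDF-*} $header]} {
        lappend type pdf
    }
    return $type
}

# A header with only printable characters, as in the generated file above.
puts [sketchFileType "%PDF-1.4\n1 0 obj\n<<\n>>\nendobj\n"]

# A header that also contains control characters.
puts [sketchFileType "%PDF-1.4\n%\u0000\u0001\u0002\n"]
```

With the check decoupled from the binary flag, the first call yields text pdf and the second yields binary pdf.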

User Comments:

aku added on 2015-06-02 00:57:03:

Applied in commit [2b866cf322] and pushed. The test suite was extended using the script's output as a demo file. Version bumped to 1.14.11.

We should also check what the fumagic packages report for such a file, since they behave more like Unix file(1).