Ticket UUID: | 953854 | |||
Title: | Errors when parsing HTML to a tree | |||
Type: | Patch | Version: | None | |
Submitter: | nobody | Created on: | 2004-05-14 10:11:51 | |
Subsystem: | htmlparse | Assigned To: | andreas_kupries | |
Priority: | 1 Zero | Severity: | ||
Status: | Closed | Last Modified: | 2006-01-18 13:15:37 | |
Resolution: | Accepted | Closed By: | andreas_kupries | |
Closed on: | 2006-01-18 06:15:37 | |||
Description: |
Hello,

In proc ::htmlparse::mapEscapes, the line:

    return [subst $new]

should change to:

    return [subst -nobackslashes -novariables $new]

Otherwise, if new contains a backslash (\), subst breaks the string (especially noticeable in Windows paths).

-------

In proc ::htmlparse::Reorder, the lines:

    if { $sibling == {} || (![string compare $tp [$tree get $sibling type]]) } {
        break
    }

should change to:

    if { $sibling == "" } { break }
    if { [lsearch "h1 h2 h3 h4 h5 h6 p li" [$tree get $sibling type]] != -1 } {
        break
    }

The second option is less aggressive when reordering tags.

Regards,
Ramon Ribó [email protected] | |||
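The backslash problem described in the report can be reproduced outside htmlparse. The following is a minimal sketch, not the library's code: the value assigned to `new` is a hypothetical stand-in for what mapEscapes might hold at that point (a Windows-style path followed by a bracketed command).

```tcl
# Hypothetical stand-in for $new: a Windows path plus a bracketed command.
set new {C:\temp\new [string toupper amp]}

# Plain subst performs backslash, variable and command substitution,
# so \t and \n in the path are turned into a tab and a newline.
set plain [subst $new]

# With -nobackslashes -novariables only the bracketed command is run;
# backslashes and dollar signs pass through untouched.
set safe [subst -nobackslashes -novariables $new]

puts [list $plain]
puts [list $safe]
```

Command substitution is deliberately left enabled in the second call, since the proposed change only switches off backslash and variable processing.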
User Comments: |
andreas_kupries added on 2006-01-18 13:15:36:
Logged In: YES user_id=75003

Mostly accepted. The changes to mapEscapes are outdated; this was fixed in a different way, by an additional quoting step protecting Tcl's special characters. The reordering advice was taken. Examples are in the test suite, actually. The relevant test cases have been updated.

andreas_kupries added on 2004-09-30 04:46:10:
Logged In: YES user_id=75003

Do you have small examples which demonstrate the bad behaviour? They would also become test cases.

nobody added on 2004-05-14 17:11:52:
File Added - 87140: htmlparse.tcl |
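The 2006-01-18 comment mentions a quoting step that protects Tcl's special characters before substitution. The ticket does not show that code, so the following is only a sketch of the general idea under that assumption: the proc name `quoteSpecials` and the sample string are hypothetical, and the regular expression is one common way to escape brackets, dollar signs and backslashes, not necessarily the one used in htmlparse.

```tcl
# Hypothetical helper: put a backslash in front of every character that
# subst would otherwise interpret ([, ], $ and \).
proc quoteSpecials {text} {
    regsub -all -- {([][$\\])} $text {\\\1} quoted
    return $quoted
}

# Hypothetical input containing all four special characters.
set raw {C:\temp costs $5 [sic]}

set quoted   [quoteSpecials $raw]
# A plain subst now only strips the protective backslashes again.
set restored [subst $quoted]

puts $quoted
puts $restored
```

With the input quoted this way, an unrestricted subst can still expand whatever markers the parser inserts afterwards, while text that came from the document itself survives unchanged.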
Attachments:
- htmlparse.tcl added by nobody on 2004-05-14 17:11:52.