2015-09-21
19:07 | • Closed ticket [0e0e150e49]: Fix for quantified regexp back-references plus 7 other changes artifact: 4a2d0afe79 user: dgp
19:04 | [1115587][0e0e150e49] Major fix for regexp handling of quantified backrefs. Contributed by Tom Lane ... check-in: c8dfe06653 user: dgp tags: trunk
2015-09-19
15:55 | • Add attachment quantified-backrefs.patch to ticket [0e0e150e49] artifact: 5d27502ea2 user: tgl
15:55 | • New ticket [0e0e150e49] Fix for quantified regexp back-references. artifact: 3e9cf926a1 user: tgl
Ticket UUID: | 0e0e150e49479e3f3f7b20efa1817813216fe2ad | |||
Title: | Fix for quantified regexp back-references | |||
Type: | Patch | Version: | 8.6.4 | |
Submitter: | tgl | Created on: | 2015-09-19 15:55:29 | |
Subsystem: | 43. Regexp | Assigned To: | dgp | |
Priority: | 5 Medium | Severity: | Important | |
Status: | Closed | Last Modified: | 2015-09-21 19:07:42 | |
Resolution: | Accepted | Closed By: | dgp | |
Closed on: | 2015-09-21 19:07:42 | |||
Description: |
The attached patch solves the problems with quantified back-references that were previously discussed in http://core.tcl.tk/tcl/tktview?name=1115587.

This patch represents a port of work that's been done on Postgres' copy of the regexp library over the past several years, specifically these commits:

    5223f96d92fd6fb6fcf260da9f9cb111831f0b37
    173e29aa5deefd9e71c183583ba37805c8102a72
    3cbfe485e44d055b9e6a27e47069729375059f8c
    4dd78bf37aa29d04b3f358b08c4a2fa43cf828e7
    2a4c46e0baf2d51117cd4468b28705d01ffcbff9
    3694b4d7e1aa02f917f9d18c550fbb49b96efa83

which you can look at in the Postgres repo at http://git.postgresql.org/gitweb/?p=postgresql.git if you want a sense of the development history.

This submission would probably be easier to follow if I'd submitted individual patches equivalent to each of those steps ... but transposing code between Postgres and Tcl layout conventions is enough of a pain in the rear that I couldn't muster the energy to do it repeatedly. In fact, this still doesn't follow Tcl layout conventions very well, partly because I'm not totally certain what they are. I'm hoping you have a suitable reformatting tool.

Anyway, the core of the fix is to introduce an explicit "iteration" subre type, rather than relying completely on a compile-time transformation, as I'd speculated about in my comments in 1115587. Subsequent cleanup includes getting rid of the useless "retry memory" stuff and folding the parallel dissect() and cdissect() code paths into a single implementation, which is why the regexec.c changes are so bulky-looking.

We've been using this successfully in Postgres for several years, with only a couple of minor bugs discovered (see the last two commits mentioned above). So I now feel confident enough in it to recommend that you adopt it.

This supersedes my previous submission at ticket 3487443, which I've now closed.
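
For readers who want a concrete picture of what a "quantified back-reference" is, here is a minimal sketch. It is an illustration only, under several assumptions: it uses Python's `re` module as a stand-in (not Tcl's regexp engine and not the code in the patch), the pattern `([bc])\1*` is just a representative example, and both function names are hypothetical. The hand-rolled checker walks the repetitions one at a time and re-checks the captured text on each pass, which is one way to read the idea of handling iteration explicitly at match time; it makes no claim about how the patch itself is structured.

```python
import re

# A representative pattern: group 1 captures "b" or "c", and the
# back-reference \1 carries a quantifier (*).
PATTERN = re.compile(r"([bc])\1*")


def matches_with_engine(text: str) -> bool:
    """Whole-string match using Python's re engine (a stand-in, NOT Tcl's)."""
    return PATTERN.fullmatch(text) is not None


def matches_by_explicit_iteration(text: str) -> bool:
    """Hand-rolled check: capture once, then verify every repetition."""
    if not text or text[0] not in "bc":
        return False
    captured = text[0]
    pos = 1
    # Each pass of this loop is one iteration of \1*. The captured text is
    # compared on every pass, so a mismatch in any iteration is rejected.
    while pos < len(text):
        if not text.startswith(captured, pos):
            return False
        pos += len(captured)
    return True


if __name__ == "__main__":
    for sample in ["b", "bbbbb", "ccc", "bbc", "xxx", ""]:
        print(repr(sample),
              matches_with_engine(sample),
              matches_by_explicit_iteration(sample))
```

Running the script prints, for each sample string, whether the regex engine and the explicit loop accept it; the two columns should agree, since a string is accepted only when every repetition equals the captured character.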
User Comments: |
dgp added on 2015-09-21 19:07:42:
Fix accepted for release in Tcl 8.6.5. Unfortunately, 8.5 and 8.6 have diverged too much to adapt this patch to apply to 8.5.19 with the effort I'm willing to spare. If fixing this in continuing 8.5.* releases is important, the best way may be simply to copy all the 8.6 source files r*.c as-is back to the 8.5 branch. I don't know of any differences that have value to be preserved.
Attachments:
- quantified-backrefs.patch added by tgl on 2015-09-19 15:55:52.