Tcl Source Code

Check-in [93531a5a13]

Overview
Comment: [11250a236d] Made the documentation of non-greediness overrides more obvious.
SHA1: 93531a5a13c88e48159b027a1983a8f0841e7c57
User & Date: dkf 2015-05-18 08:20:43
Context
2015-11-05 01:48  merge trunk; partial while hunting for a merge bug - updated to 2015-05-18 check-in: 26f9bf8072 user: msofer tags: mig-optimize
2015-05-19 20:34  Simplify code generation for a list of literals. Generates slightly simpler bytecode too. check-in: 1a292c2874 user: dkf tags: trunk
2015-05-19 19:22  Don't cause string rep generation in [list <lit> <lit> ... <lit>] bytecode. Candidate for merge to t... check-in: a3d89bfa32 user: dgp tags: dgp-defer-string-rep
2015-05-18 14:44  merge trunk Leaf check-in: 91c27597c9 user: dgp tags: bug-3608714
2015-05-18 08:20  [11250a236d] Made the documentation of non-greediness overrides more obvious. check-in: 93531a5a13 user: dkf tags: trunk
2015-05-18 07:51  [c11a51c482] Stop race condition with -accept config option, and allow overriding of it via -headers... check-in: ab0370691f user: dkf tags: trunk
Changes

Changes to doc/re_syntax.n.

Before (lines 679-695):

Subject to the constraints imposed by the rules for matching the whole
RE, subexpressions also match the longest or shortest possible
substrings, based on their preferences, with subexpressions starting
earlier in the RE taking priority over ones starting later. Note that
outer subexpressions thus take priority over their component
subexpressions.
.PP
Note that the quantifiers \fB{1,1}\fR and \fB{1,1}?\fR can be used to
force longest and shortest preference, respectively, on a
subexpression or a whole RE.
.PP
Match lengths are measured in characters, not collating elements. An
empty string is considered longer than no match at all. For example,
.QW \fBbb*\fR
matches the three middle characters of
.QW \fBabbbc\fR ,
.QW \fB(week|wee)(night|knights)\fR

After (lines 679-719):

Subject to the constraints imposed by the rules for matching the whole
RE, subexpressions also match the longest or shortest possible
substrings, based on their preferences, with subexpressions starting
earlier in the RE taking priority over ones starting later. Note that
outer subexpressions thus take priority over their component
subexpressions.
.PP
The quantifiers \fB{1,1}\fR and \fB{1,1}?\fR can be used to
force longest and shortest preference, respectively, on a
subexpression or a whole RE.
.RS
.PP
\fBNOTE:\fR This means that you can usually make a RE be non-greedy overall by
putting \fB{1,1}?\fR after one of the first non-constraint atoms or
parenthesized sub-expressions in it. \fIIt pays to experiment\fR with the
placing of this non-greediness override on a suitable range of input texts
when you are writing a RE if you are using this level of complexity.
.PP
For example, this regular expression is non-greedy, and will match the
shortest substring possible given that
.QW \fBabc\fR
will be matched as early as possible (the quantifier does not change that):
.PP
.CS
ab{1,1}?c.*x.*cba
.CE
.PP
The atom
.QW \fBa\fR
has no greediness preference, we explicitly give one for
.QW \fBb\fR ,
and the remaining quantifiers are overridden to be non-greedy by the preceding
non-greedy quantifier.
.RE
.PP
Match lengths are measured in characters, not collating elements. An
empty string is considered longer than no match at all. For example,
.QW \fBbb*\fR
matches the three middle characters of
.QW \fBabbbc\fR ,
.QW \fB(week|wee)(night|knights)\fR
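
The effect described in the new NOTE can be tried out with a short Tcl script. This is only a minimal sketch: the sample string and the variable names (text, greedy, nongreedy) are assumptions chosen for illustration and are not part of the check-in. It compares the documented example RE against the same RE without the {1,1}? override.

```tcl
# Hypothetical sample input: the text offers two possible end points ("cba").
set text "abc-x-cba-x-cba"

# No override: every quantifier is greedy, so the match runs to the last "cba".
regexp {abc.*x.*cba} $text greedy

# {1,1}? on the first quantified atom gives the whole RE a shortest-match
# preference, so the later .* quantifiers behave non-greedily as well.
regexp {ab{1,1}?c.*x.*cba} $text nongreedy

puts "greedy:     $greedy"      ;# abc-x-cba-x-cba
puts "non-greedy: $nongreedy"   ;# abc-x-cba
```

Both matches start at the same position, because the override changes only how far the match extends, not where it begins. As the NOTE itself advises, it is worth trying such an override on a range of inputs before relying on it.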