changeset 2255:03d9d549c3fa
[xemacs-hg @ 2004-09-08 10:32:50 by stephent]
update for shy groups <87zn41uk4j.fsf@tleepslib.sk.tsukuba.ac.jp>
author | stephent |
---|---|
date | Wed, 08 Sep 2004 10:32:55 +0000 |
parents | cf4470caf504 |
children | 6ffd69eff907 |
files | man/ChangeLog man/lispref/searching.texi |
diffstat | 2 files changed, 28 insertions(+), 13 deletions(-) |
--- a/man/ChangeLog Wed Sep 08 10:22:01 2004 +0000
+++ b/man/ChangeLog Wed Sep 08 10:32:55 2004 +0000
@@ -1,3 +1,9 @@
+2004-09-08 Stephen J. Turnbull <stephen@xemacs.org>
+
+ * lispref/searching.texi (Syntax of Regexps): Add example of use
+ of shy groups in variable subexpression, correct rumor that there
+ may be substantial performance gain.
+
 2004-08-13 Stephen J. Turnbull <stephen@xemacs.org>
 
  * xemacs/help.texi (Misc Help): Info-goto-emacs-key-command-node
@@ -21,7 +27,7 @@
 
  * internals/internals.texi (Techniques for XEmacs Developers): Be
  specific when discussing optimization.
- (Techniques for XEmacs Developers): Fragments that are meaningful
+ (Techniques for XEmacs Developers): Fragments that are meaningless
  by themselves or contain placeholders should be @samp, not @code.
  (Modules for Internationalization): Add description of
  mule-coding.c and further deprecate mule.c.
--- a/man/lispref/searching.texi Wed Sep 08 10:22:01 2004 +0000
+++ b/man/lispref/searching.texi Wed Sep 08 10:32:55 2004 +0000
@@ -446,7 +446,7 @@
 matches the same text that matched the @var{digit}th occurrence of a
 @samp{\( @dots{} \)} construct.
 
-In other words, after the end of a @samp{\( @dots{} \)} construct. the
+In other words, after the end of a @samp{\( @dots{} \)} construct, the
 matcher remembers the beginning and end of the text matched by that
 construct. Then, later on in the regular expression, you can use
 @samp{\} followed by @var{digit} to match that same text, whatever it
@@ -473,19 +473,28 @@
 This is useful when you need a lot of grouping @samp{\( @dots{} \)}
 constructs, but only want to remember one or two -- or if you have
 more than nine groupings and need to use backreferences to refer to
-the groupings at the end.
+the groupings at the end. It also allows construction of regular
+expressions from variable subexpressions that contain varying numbers of
+non-capturing subexpressions, without disturbing the group counts for
+the main expression. For example
+
+@example
+(let ((sre (if foo "\\(?:bar\\|baz\\)" "quux")))
+  (re-search-forward (format "a\\(b+ %s c+\\) d" sre) nil t)
+  (match-string 1))
+@end example
 
-Using @samp{\(?: @dots{} \)} rather than @samp{\( @dots{} \)} when you
-don't need the captured substrings ought to speed up your programs some,
-since it shortens the code path followed by the regular expression
-engine, as well as the amount of memory allocation and string copying it
-must do. The actual performance gain to be observed has not been
-measured or quantified as of this writing.
-@c This is used to good advantage by the font-locking code, and by
-@c `regexp-opt.el'.
+It is very tedious to write this kind of code without shy groups, even
+if you know what all the alternative subexpressions will look like.
 
-The shy grouping operator has been borrowed from Perl, and has not been
-available prior to XEmacs 20.3, nor is it available in FSF Emacs.
+Using @samp{\(?: @dots{} \)} rather than @samp{\( @dots{} \)} should
+give little performance gain, as the start of each group must be
+recorded for the purpose of back-tracking in any case, and no string
+copying is done until @code{match-string} is called.
+
+The shy grouping operator has been borrowed from Perl, and was not
+available prior to XEmacs 20.3, and has only been available in GNU Emacs
+since version 21.
 
 @item \w
 @cindex @samp{\w} in regexp
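
Reviewer note: the group-count point made in the new searching.texi text can be checked outside Emacs. The sketch below is not part of this changeset. It uses Python's `re` module as a stand-in, where a shy group is written `(?:...)` rather than the Emacs `\(?: ... \)`. The function name `trailing_number` and the sample strings are hypothetical, chosen only for illustration.

```python
import re


def trailing_number(sub_pattern, text):
    """Return group 1 of the composed pattern, or None if nothing matches.

    The outer pattern is written on the assumption that its own capture,
    the digits at the end, is group 1. That only holds when the
    interpolated subexpression adds no capturing groups of its own.
    """
    match = re.search(r"item %s (\d+)" % sub_pattern, text)
    return match.group(1) if match else None


# Shy (non-capturing) group: the numbering of the outer pattern is unchanged.
print(trailing_number(r"(?:red|blue)", "item red 42"))

# Plain subexpression with no group at all: same numbering.
print(trailing_number(r"green", "item green 7"))

# Capturing group: it takes over number 1, so the caller gets the colour
# instead of the digits it expected.
print(trailing_number(r"(red|blue)", "item red 42"))
```

The first two calls yield the digits and the third yields the colour, which is the renumbering hazard that a shy group avoids when a regexp is assembled from variable parts.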