changeset 2255:03d9d549c3fa

[xemacs-hg @ 2004-09-08 10:32:50 by stephent] update for shy groups <87zn41uk4j.fsf@tleepslib.sk.tsukuba.ac.jp>
author stephent
date Wed, 08 Sep 2004 10:32:55 +0000
parents cf4470caf504
children 6ffd69eff907
files man/ChangeLog man/lispref/searching.texi
diffstat 2 files changed, 28 insertions(+), 13 deletions(-)
--- a/man/ChangeLog	Wed Sep 08 10:22:01 2004 +0000
+++ b/man/ChangeLog	Wed Sep 08 10:32:55 2004 +0000
@@ -1,3 +1,9 @@
+2004-09-08  Stephen J. Turnbull  <stephen@xemacs.org>
+
+	* lispref/searching.texi (Syntax of Regexps): Add example of use
+	of shy groups in variable subexpression, correct rumor that there
+	may be substantial performance gain.
+
 2004-08-13  Stephen J. Turnbull  <stephen@xemacs.org>
 
 	* xemacs/help.texi (Misc Help): Info-goto-emacs-key-command-node
@@ -21,7 +27,7 @@
 
 	* internals/internals.texi (Techniques for XEmacs Developers): Be
 	specific when discussing optimization.
-	(Techniques for XEmacs Developers): Fragments that are meaningful
+	(Techniques for XEmacs Developers): Fragments that are meaningless
 	by themselves or contain placeholders should be @samp, not @code.
 	(Modules for Internationalization): Add description of mule-coding.c
 	and further deprecate mule.c.
--- a/man/lispref/searching.texi	Wed Sep 08 10:22:01 2004 +0000
+++ b/man/lispref/searching.texi	Wed Sep 08 10:32:55 2004 +0000
@@ -446,7 +446,7 @@
 matches the same text that matched the @var{digit}th occurrence of a
 @samp{\( @dots{} \)} construct.
 
-In other words, after the end of a @samp{\( @dots{} \)} construct.  the
+In other words, after the end of a @samp{\( @dots{} \)} construct, the
 matcher remembers the beginning and end of the text matched by that
 construct.  Then, later on in the regular expression, you can use
 @samp{\} followed by @var{digit} to match that same text, whatever it
@@ -473,19 +473,28 @@
 This is useful when you need a lot of grouping @samp{\( @dots{} \)}
 constructs, but only want to remember one or two -- or if you have
 more than nine groupings and need to use backreferences to refer to
-the groupings at the end.
+the groupings at the end.  It also allows construction of regular
+expressions from variable subexpressions that contain varying numbers of
+non-capturing subexpressions, without disturbing the group counts for
+the main expression.  For example
+
+@example
+(let ((sre (if foo "\\(?:bar\\|baz\\)" "quux")))
+  (re-search-forward (format "a\\(b+ %s c+\\) d" sre) nil t)
+  (match-string 1))
+@end example
 
-Using @samp{\(?: @dots{} \)} rather than @samp{\( @dots{} \)} when you
-don't need the captured substrings ought to speed up your programs some,
-since it shortens the code path followed by the regular expression
-engine, as well as the amount of memory allocation and string copying it
-must do.  The actual performance gain to be observed has not been
-measured or quantified as of this writing.
-@c This is used to good advantage by the font-locking code, and by
-@c `regexp-opt.el'.
+It is very tedious to write this kind of code without shy groups, even
+if you know what all the alternative subexpressions will look like.
 
-The shy grouping operator has been borrowed from Perl, and has not been
-available prior to XEmacs 20.3, nor is it available in FSF Emacs.
+Using @samp{\(?: @dots{} \)} rather than @samp{\( @dots{} \)} should
+give little performance gain, as the start of each group must be
+recorded for the purpose of back-tracking in any case, and no string
+copying is done until @code{match-string} is called.
+
+The shy grouping operator has been borrowed from Perl, and was not
+available prior to XEmacs 20.3, and has only been available in GNU Emacs
+since version 21.
 
 @item \w
 @cindex @samp{\w} in regexp