diff man/lispref/searching.texi @ 371:cc15677e0335 r21-2b1

Import from CVS: tag r21-2b1
author cvs
date Mon, 13 Aug 2007 11:03:08 +0200
parents a4f53d9b3154
children 6240c7796c7a
--- a/man/lispref/searching.texi	Mon Aug 13 11:01:58 2007 +0200
+++ b/man/lispref/searching.texi	Mon Aug 13 11:03:08 2007 +0200
@@ -159,7 +159,7 @@
   A @dfn{regular expression} (@dfn{regexp}, for short) is a pattern that
 denotes a (possibly infinite) set of strings.  Searching for matches for
 a regexp is a very powerful operation.  This section explains how to write
-regexps; the following section says how to search using them.
+regexps; the following section says how to search for them.
 
  To gain a thorough understanding of regular expressions and how to use
 them to best advantage, we recommend that you study @cite{Mastering
@@ -167,15 +167,14 @@
 1997}. (It's known as the "Hip Owls" book, because of the picture on its
 cover.)  You might also read the manuals to @ref{(gawk)Top},
 @ref{(ed)Top}, @cite{sed}, @cite{grep}, @ref{(perl)Top},
-@ref{(regex)Top}, @ref{(rx)Top}, @cite{pcre}, and @ref{(flex)Top}.  All
-of these programs and libraries make effective use of regular
-expressions.
+@ref{(regex)Top}, @ref{(rx)Top}, @cite{pcre}, and @ref{(flex)Top}, which
+also make good use of regular expressions.
 
  The XEmacs regular expression syntax most closely resembles that of
 @cite{ed}, or @cite{grep}, the GNU versions of which all utilize the GNU
 @cite{regex} library.  XEmacs' version of @cite{regex} has recently been
-extended with some Perl--like capabilities, which are described in the
-next section.
+extended with some Perl--like capabilities, described in the next
+section.
 
 @menu
 * Syntax of Regexps::       Rules for writing regular expressions.
@@ -264,7 +263,7 @@
 @cindex @samp{?} in regexp
 is a quantifying suffix operator similar to @samp{*}, except that the
 preceding expression can match either once or not at all.  For example,
-@samp{ca?r} matches @samp{car} or @samp{cr}, but does not match anything
+@samp{ca?r} matches @samp{car} or @samp{cr}, but does not match anyhing
 else.
 
 @item *?
@@ -274,10 +273,10 @@
 @dfn{non-greedy} quantifier, a regexp construct borrowed from Perl.
 @c Did perl get this from somewhere?  What's the real history of *? ?
 
-This construct is very useful for when you want to match the text inside
-a pair of delimiters.  For instance, @samp{/\*.*?\*/} will match C
-comments in a string.  This could not be so elegantly achieved without
-the use of a non-greedy quantifier.
+This construct very useful for when you want to match the text inside a
+pair of delimiters.  For instance, @samp{/\*.*?\*/} will match C
+comments in a string.  This could not be achieved without the use of
+greedy quantifier.
 
 This construct has not been available prior to XEmacs 20.4.  It is not
 available in FSF Emacs.
@@ -456,80 +455,24 @@
 the same exact text.
 
 @item \(?: @dots{} \)
-@cindex @samp{(?:} in regexp
+@cindex @samp{\(?:} in regexp
 @cindex regexp grouping
 is called a @dfn{shy} grouping operator, and it is used just like
-@samp{\( @dots{} \)}, except that it does not cause the match
+@samp{\( @dots{} \)}, except that it does not cause the matched
 substring to be recorded for future reference.
 
-This is useful when you need to use a lot of nested grouping @samp{\(
-@dots{} \)} constructs to express complex alternation, but only want to
-memoize, or capture, one or two of the subexpression matches.  Since
-@samp{\(?: @dots{} \)} doesn't capture a sub-match, it also doesn't need
-to be counted when you count @samp{\( @dots{} \)} groups to figure the
-@samp{match-string} index.  That turns out to be a very convenient
-characteristic.
-
-This situation occurs where parts of a regular expression have been
-automaticly generated by a program that builds them from lists of
-strings, and the static code following the matching operation must
-access a specific match number.  Here's an example that shows this.
-
-We will assume that @code{(require 'regexp-opt)} has been executed
-already, to ensure that @file{regexp-opt.el}, which is part of the
-@code{xemacs-devel} package, is loaded.
-@ifinfo
-Please evaluate that @code{require} expression now, using @kbd{C-x C-e},
-if you intend to try the following example.
-@end ifinfo
-In a real program, lets pretend that @var{varnames} would be a list of
-strings holding the names of some variables extracted somehow from the
-text of a program source you are editing and running this function on.
-For the purposes of this illustration, we can just bind it in the
-@code{let*} expression.
+This is useful when you need a lot of grouping @samp{\( @dots{} \)}
+constructs, but only want to remember one or two.  Then you can use 
+not want to remember them for later use with @code{match-string}.
 
-@example
-@group
-(let* ((varnames '("k" "n" "i" "j" "varname"))
-       (keys-regexp (regexp-opt
-                     (mapcar #'symbol-name
-                             '(if then else elif
-                               case in of do while
-                               with for next unless
-                               cond begin end))))
-      (varname-regexp (regexp-opt varnames))
-      (contrived-regexp (concat "\\(" keys-regexp "\\)"
-                                "\\s-(\\s-\\("
-                                varname-regexp
-                                "\\)\\s-)"))
-      (keyname "")
-      (varname ""))
-  ;; @r{In the body of this particular defun, we:}
-  (re-search-forward contrived-regexp nil t)
-  ;; @r{@dots{} and it finds a match.  Now we want to extract the}
-  ;; @r{text that it matched on, and save it into @code{keyname}}
-  ;; @r{and @code{varname}.}
-  (setq keyname (match-string 1)
-        varname (match-string 2))
-  ;; @r{@dots{} and then do something with those values.}
-  (list keyname varname))
-
-;; @r{Here's something for it to match, so you can try it with}
-;; @kbd{C-x C-e}
-;; while ( j ) do ...
-@end group
-@end example
-
-Here you should see that if the regular expression returned by
-@code{regexp-opt} did not use @samp{\(?: @dots{} \)} for grouping, and
-instead used @samp{\( @dots{} \)}, it would be necessary to count the
-number of opening parentheses in the @code{keys-regexp} and to use that
-figure to calculate which match number is matched by the
-@code{varname-regexp}.  It is much more convenient to be able to just
-ask for the second match string.
-
-@c This is used to good advantage by the font-locking code....
-@c ... It will be.  It's not yet, but will be.
+Using @samp{\(?: @dots{} \)} rather than @samp{\( @dots{} \)} when you
+don't need the captured substrings ought to speed up your programs some,
+since it shortens the code path followed by the regular expression
+engine, as well as the amount of memory allocation and string copying it
+must do.  The actual performance gain to be observed has not been
+measured or quantified as of this writing.
+@c This is used to good advantage by the font-locking code, and by
+@c `regexp-opt.el'.  ... It will be.  It's not yet, but will be.
 
 The shy grouping operator has been borrowed from Perl, and has not been
 available prior to XEmacs 20.3, nor is it available in FSF Emacs.
@@ -699,10 +642,10 @@
 
   In XEmacs, you can search for the next match for a regexp either
 incrementally or not.  Incremental search commands are described in the
-@cite{The XEmacs Lisp Reference Manual}.  @xref{Regexp Search, , Regular
-Expression Search, xemacs, The XEmacs Lisp Reference Manual}.  Here we
-describe only the search functions useful in programs.  The principal
-one is @code{re-search-forward}.
+@cite{The XEmacs Reference Manual}.  @xref{Regexp Search, , Regular Expression
+Search, emacs, The XEmacs Reference Manual}.  Here we describe only the search
+functions useful in programs.  The principal one is
+@code{re-search-forward}.
 
 @deffn Command re-search-forward regexp &optional limit noerror repeat
 This function searches forward in the current buffer for a string of
@@ -1153,7 +1096,7 @@
 
 If @var{count} is zero, then the value is the position of the start of
 the entire match.  Otherwise, @var{count} specifies a subexpression in
-the regular expression, and the value of the function is the starting
+the regular expresion, and the value of the function is the starting
 position of the match for that subexpression.
 
 The value is @code{nil} for a subexpression inside a @samp{\|}