xemacs-beta: man/lispref/searching.texi comparison

comparison man/lispref/searching.texi @ 314:341dac730539 r21-0b55

Import from CVS: tag r21-0b55

author	cvs
date	Mon, 13 Aug 2007 10:44:22 +0200
parents	70ad99077275
children	512e409c26a2


-:2905de29931f
+:341dac730539
 @cindex regexp
 A @dfn{regular expression} (@dfn{regexp}, for short) is a pattern that
 denotes a (possibly infinite) set of strings.  Searching for matches for
 a regexp is a very powerful operation.  This section explains how to write
-regexps; the following section says how to search for them.
+regexps; the following section says how to search using them.
 To gain a thorough understanding of regular expressions and how to use
 them to best advantage, we recommend that you study @cite{Mastering
 Regular Expressions, by Jeffrey E.F. Friedl, O'Reilly and Associates,
 1997}. (It's known as the "Hip Owls" book, because of the picture on its
 cover.)  You might also read the manuals to @ref{(gawk)Top},
 @ref{(ed)Top}, @cite{sed}, @cite{grep}, @ref{(perl)Top},
-@ref{(regex)Top}, @ref{(rx)Top}, @cite{pcre}, and @ref{(flex)Top}, which
+@ref{(regex)Top}, @ref{(rx)Top}, @cite{pcre}, and @ref{(flex)Top}.  All
-also make good use of regular expressions.
+of these programs and libraries make effective use of regular
+expressions.
 The XEmacs regular expression syntax most closely resembles that of
 @cite{ed}, or @cite{grep}, the GNU versions of which all utilize the GNU
 @cite{regex} library.  XEmacs' version of @cite{regex} has recently been
-extended with some Perl--like capabilities, described in the next
+extended with some Perl--like capabilities, which are described in the
-section.
+next section.
 @menu
 * Syntax of Regexps::       Rules for writing regular expressions.
 * Regexp Example::          Illustrates regular expression syntax.
 @end menu
 @item ?
 @cindex @samp{?} in regexp
 is a quantifying suffix operator similar to @samp{*}, except that the
 preceding expression can match either once or not at all.  For example,
-@samp{ca?r} matches @samp{car} or @samp{cr}, but does not match anyhing
+@samp{ca?r} matches @samp{car} or @samp{cr}, but does not match anything
 else.
 @item *?
 @cindex @samp{*?} in regexp
 works just like @samp{*}, except that rather than matching the longest
 match, it matches the shortest match.  @samp{*?} is known as a
 @dfn{non-greedy} quantifier, a regexp construct borrowed from Perl.
 @c Did perl get this from somewhere?  What's the real history of *? ?
-This construct very useful for when you want to match the text inside a
+This construct is very useful for when you want to match the text inside
-pair of delimiters.  For instance, @samp{/\*.*?\*/} will match C
+a pair of delimiters.  For instance, @samp{/\*.*?\*/} will match C
-comments in a string.  This could not be achieved without the use of
+comments in a string.  This could not be so elegantly achieved without
-greedy quantifier.
+the use of a non-greedy quantifier.
 This construct has not been available prior to XEmacs 20.4.  It is not
 available in FSF Emacs.
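 As a minimal sketch of the difference (the sample string and the
 variable name @code{text} are made up for illustration and do not come
 from the XEmacs sources), compare the greedy and non-greedy forms on a
 string that holds two comments:
 @example
 @group
 ;; TEXT is a hypothetical sample holding two C comments on one line.
 (let ((text "a /* one */ b /* two */ c"))
   (list
    ;; Greedy `*' runs on to the last "*/" in the string.
    (progn (string-match "/\\*.*\\*/" text)
           (match-string 0 text))
    ;; Non-greedy `*?' stops at the first "*/" it can reach.
    (progn (string-match "/\\*.*?\\*/" text)
           (match-string 0 text))))
 ;; The first element is "/* one */ b /* two */"; the second is "/* one */".
 @end group
 @end example
 Note that @samp{.} does not match a newline, so this sketch only covers
 comments that fit on a single line.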
 @item +?
 composed of two identical halves.  The @samp{\(.*\)} matches the first
 half, which may be anything, but the @samp{\1} that follows must match
 the same exact text.
 @item \(?: @dots{} \)
-@cindex @samp{\(?:} in regexp
+@cindex @samp{(?:} in regexp
 @cindex regexp grouping
 is called a @dfn{shy} grouping operator, and it is used just like
-@samp{\( @dots{} \)}, except that it does not cause the matched
+@samp{\( @dots{} \)}, except that it does not cause the match
 substring to be recorded for future reference.
-This is useful when you need a lot of grouping @samp{\( @dots{} \)}
+This is useful when you need to use a lot of nested grouping @samp{\(
-constructs, but only want to remember one or two.  Then you can use
+@dots{} \)} constructs to express complex alternation, but only want to
-not want to remember them for later use with @code{match-string}.
+memoize, or capture, one or two of the subexpression matches.  Since
+@samp{\(?: @dots{} \)} doesn't capture a submatch, it also doesn't need
-Using @samp{\(?: @dots{} \)} rather than @samp{\( @dots{} \)} when you
+to be counted when you count @samp{\( @dots{} \)} groups to figure the
-don't need the captured substrings ought to speed up your programs some,
+@samp{match-string} index.  That turns out to be a very convenient
-since it shortens the code path followed by the regular expression
+characteristic.
-engine, as well as the amount of memory allocation and string copying it
-must do.  The actual performance gain to be observed has not been
+This situation occurs where parts of a regular expression have been
-measured or quantified as of this writing.
+automatically generated by a program that builds them from lists of
-@c This is used to good advantage by the font-locking code, and by
+strings, and the static code following the matching operation must
-@c `regexp-opt.el'.  ... It will be.  It's not yet, but will be.
+access a specific match number.  Here's an example that shows this:
+@example
+@group
+;; Assume that:
+(require 'regexp-opt) ;; gets executed at toplevel
+;;; `regexp-opt.el' is part of the "xemacs-devel" package.
+;; ... and that VARNAMES is a list of strings holding the names of some
+;; variables extracted from the program source you are editing and
+;; running this function on.  For this example, it will just be bound
+;; in the let* expression.
+(let* ((varnames '("k" "n" "i" "j" "varname"))
+       (keys-regexp (regexp-opt
+                     (mapcar #'symbol-name
+                             '(if then else elif
+                               case in of do while
+                               with for next unless
+                               cond begin end))))
+       (varname-regexp (regexp-opt varnames))
+       (contrived-regexp (concat "\\(" keys-regexp "\\)"
+                                 "\\s-(\\s-\\("
+                                 varname-regexp
+                                 "\\)\\s-)"))
+       (keyname "")
+       (varname ""))
+  ;; In the body of this particular defun, we:
+  (re-search-forward contrived-regexp nil t)
+  ;; ... and it finds a match.  Now we want to extract the text that
+  ;; it matched on, and save it into KEYNAME and VARNAME.
+  (setq keyname (match-string 1)
+        varname (match-string 2))
+  ;; ... and then do something with those values.
+  (list keyname varname))
+;; Here's something for it to match, so you can try it with `C-x C-e'.
+;; while ( j ) do ...
+@end group
+@end example
+Here you can see that if the regular expression returned by
+@samp{regexp-opt} did not use @samp{\(?: @dots{} \)} for grouping, and
+instead used @samp{\( @dots{} \)}, it would be necessary to count the
+number of opening parentheses in the @samp{keys-regexp} and to use that
+figure to calculate which match number is matched by the
+@code{varname-regexp}.  It is much more convenient to be able to just
+ask for the second match string.
+@c This is used to good advantage by the font-locking code....
+@c ... It will be.  It's not yet, but will be.
 The shy grouping operator has been borrowed from Perl, and has not been
 available prior to XEmacs 20.3, nor is it available in FSF Emacs.
 @item \w
