diff man/lispref/searching.texi @ 371:cc15677e0335 r21-2b1
Import from CVS: tag r21-2b1
author | cvs |
---|---|
date | Mon, 13 Aug 2007 11:03:08 +0200 |
parents | a4f53d9b3154 |
children | 6240c7796c7a |
--- a/man/lispref/searching.texi Mon Aug 13 11:01:58 2007 +0200
+++ b/man/lispref/searching.texi Mon Aug 13 11:03:08 2007 +0200
@@ -159,7 +159,7 @@
 A @dfn{regular expression} (@dfn{regexp}, for short) is a pattern that
 denotes a (possibly infinite) set of strings. Searching for matches for
 a regexp is a very powerful operation. This section explains how to write
-regexps; the following section says how to search using them.
+regexps; the following section says how to search for them.
 
 To gain a thorough understanding of regular expressions and how to use
 them to best advantage, we recommend that you study @cite{Mastering
@@ -167,15 +167,14 @@
 1997}. (It's known as the "Hip Owls" book, because of the picture on its
 cover.) You might also read the manuals to @ref{(gawk)Top},
 @ref{(ed)Top}, @cite{sed}, @cite{grep}, @ref{(perl)Top},
-@ref{(regex)Top}, @ref{(rx)Top}, @cite{pcre}, and @ref{(flex)Top}. All
-of these programs and libraries make effective use of regular
-expressions.
+@ref{(regex)Top}, @ref{(rx)Top}, @cite{pcre}, and @ref{(flex)Top}, which
+also make good use of regular expressions.
 
 The XEmacs regular expression syntax most closely resembles that of
 @cite{ed}, or @cite{grep}, the GNU versions of which all utilize the GNU
 @cite{regex} library. XEmacs' version of @cite{regex} has recently been
-extended with some Perl--like capabilities, which are described in the
-next section.
+extended with some Perl--like capabilities, described in the next
+section.
 
 @menu
 * Syntax of Regexps:: Rules for writing regular expressions.
@@ -264,7 +263,7 @@
 @cindex @samp{?} in regexp
 is a quantifying suffix operator similar to @samp{*}, except that the
 preceding expression can match either once or not at all. For example,
-@samp{ca?r} matches @samp{car} or @samp{cr}, but does not match anything
+@samp{ca?r} matches @samp{car} or @samp{cr}, but does not match anyhing
 else.
 
 @item *?
@@ -274,10 +273,10 @@
 @dfn{non-greedy} quantifier, a regexp construct borrowed from Perl.
 @c Did perl get this from somewhere? What's the real history of *? ?
 
-This construct is very useful for when you want to match the text inside
-a pair of delimiters. For instance, @samp{/\*.*?\*/} will match C
-comments in a string. This could not be so elegantly achieved without
-the use of a non-greedy quantifier.
+This construct very useful for when you want to match the text inside a
+pair of delimiters. For instance, @samp{/\*.*?\*/} will match C
+comments in a string. This could not be achieved without the use of
+greedy quantifier.
 
 This construct has not been available prior to XEmacs 20.4. It is not
 available in FSF Emacs.
@@ -456,80 +455,24 @@
 the same exact text.
 
 @item \(?: @dots{} \)
-@cindex @samp{(?:} in regexp
+@cindex @samp{\(?:} in regexp
 @cindex regexp grouping
 is called a @dfn{shy} grouping operator, and it is used just like
-@samp{\( @dots{} \)}, except that it does not cause the match
+@samp{\( @dots{} \)}, except that it does not cause the matched
 substring to be recorded for future reference.
 
-This is useful when you need to use a lot of nested grouping @samp{\(
-@dots{} \)} constructs to express complex alternation, but only want to
-memoize, or capture, one or two of the subexpression matches. Since
-@samp{\(?: @dots{} \)} doesn't capture a sub-match, it also doesn't need
-to be counted when you count @samp{\( @dots{} \)} groups to figure the
-@samp{match-string} index. That turns out to be a very convenient
-characteristic.
-
-This situation occurs where parts of a regular expression have been
-automaticly generated by a program that builds them from lists of
-strings, and the static code following the matching operation must
-access a specific match number. Here's an example that shows this.
-
-We will assume that @code{(require 'regexp-opt)} has been executed
-already, to ensure that @file{regexp-opt.el}, which is part of the
-@code{xemacs-devel} package, is loaded.
-@ifinfo
-Please evaluate that @code{require} expression now, using @kbd{C-x C-e},
-if you intend to try the following example.
-@end ifinfo
-In a real program, lets pretend that @var{varnames} would be a list of
-strings holding the names of some variables extracted somehow from the
-text of a program source you are editing and running this function on.
-For the purposes of this illustration, we can just bind it in the
-@code{let*} expression.
+This is useful when you need a lot of grouping @samp{\( @dots{} \)}
+constructs, but only want to remember one or two. Then you can use
+not want to remember them for later use with @code{match-string}.
 
-@example
-@group
-(let* ((varnames '("k" "n" "i" "j" "varname"))
-       (keys-regexp (regexp-opt
-                     (mapcar #'symbol-name
-                             '(if then else elif
-                               case in of do while
-                               with for next unless
-                               cond begin end))))
-       (varname-regexp (regexp-opt varnames))
-       (contrived-regexp (concat "\\(" keys-regexp "\\)"
-                                 "\\s-(\\s-\\("
-                                 varname-regexp
-                                 "\\)\\s-)"))
-       (keyname "")
-       (varname ""))
-  ;; @r{In the body of this particular defun, we:}
-  (re-search-forward contrived-regexp nil t)
-  ;; @r{@dots{} and it finds a match. Now we want to extract the}
-  ;; @r{text that it matched on, and save it into @code{keyname}}
-  ;; @r{and @code{varname}.}
-  (setq keyname (match-string 1)
-        varname (match-string 2))
-  ;; @r{@dots{} and then do something with those values.}
-  (list keyname varname))
-
-;; @r{Here's something for it to match, so you can try it with}
-;; @kbd{C-x C-e}
-;; while ( j ) do ...
-@end group
-@end example
-
-Here you should see that if the regular expression returned by
-@code{regexp-opt} did not use @samp{\(?: @dots{} \)} for grouping, and
-instead used @samp{\( @dots{} \)}, it would be necessary to count the
-number of opening parentheses in the @code{keys-regexp} and to use that
-figure to calculate which match number is matched by the
-@code{varname-regexp}. It is much more convenient to be able to just
-ask for the second match string.
-
-@c This is used to good advantage by the font-locking code....
-@c ... It will be. It's not yet, but will be.
+Using @samp{\(?: @dots{} \)} rather than @samp{\( @dots{} \)} when you
+don't need the captured substrings ought to speed up your programs some,
+since it shortens the code path followed by the regular expression
+engine, as well as the amount of memory allocation and string copying it
+must do. The actual performance gain to be observed has not been
+measured or quantified as of this writing.
+@c This is used to good advantage by the font-locking code, and by
+@c `regexp-opt.el'. ... It will be. It's not yet, but will be.
 
 The shy grouping operator has been borrowed from Perl, and has not been
 available prior to XEmacs 20.3, nor is it available in FSF Emacs.
@@ -699,10 +642,10 @@
 
 In XEmacs, you can search for the next match for a regexp either
 incrementally or not. Incremental search commands are described in the
-@cite{The XEmacs Lisp Reference Manual}. @xref{Regexp Search, , Regular
-Expression Search, xemacs, The XEmacs Lisp Reference Manual}. Here we
-describe only the search functions useful in programs. The principal
-one is @code{re-search-forward}.
+@cite{The XEmacs Reference Manual}. @xref{Regexp Search, , Regular Expression
+Search, emacs, The XEmacs Reference Manual}. Here we describe only the search
+functions useful in programs. The principal one is
+@code{re-search-forward}.
 
 @deffn Command re-search-forward regexp &optional limit noerror repeat
 This function searches forward in the current buffer for a string of
@@ -1153,7 +1096,7 @@
 
 If @var{count} is zero, then the value is the position of the start of
 the entire match. Otherwise, @var{count} specifies a subexpression in
-the regular expression, and the value of the function is the starting
+the regular expresion, and the value of the function is the starting
 position of the match for that subexpression.
 
 The value is @code{nil} for a subexpression inside a @samp{\|}