diff man/lispref/searching.texi @ 280:7df0dd720c89 r21-0b38
Import from CVS: tag r21-0b38
author | cvs |
---|---|
date | Mon, 13 Aug 2007 10:32:22 +0200 |
parents | 084402c475ba |
children | c9fe270a4101 |
--- a/man/lispref/searching.texi Mon Aug 13 10:31:30 2007 +0200
+++ b/man/lispref/searching.texi Mon Aug 13 10:32:22 2007 +0200
@@ -173,7 +173,7 @@
 The XEmacs regular expression syntax most closely resembles that of
 @cite{ed}, or @cite{grep}, the GNU versions of which all utilize the GNU
 @cite{regex} library. XEmacs' version of @cite{regex} has recently been
-extended with some perl--like capabilities, described in the next
+extended with some Perl--like capabilities, described in the next
 section.
 
 @menu
@@ -269,26 +269,17 @@
 @item *?
 @cindex @samp{*?} in regexp
 works just like @samp{*}, except that rather than matching the longest
-match, it matches the shortest match. This is known as a "non-greedy"
-quantifier. It is a syntax that comes to us from perl. It is very
-useful for situations where you want to match the text inside a pair of
-delimiters.
+match, it matches the shortest match. @samp{*?} is known as a
+@dfn{non-greedy} quantifier, a regexp construct borrowed from Perl.
 @c Did perl get this from somewhere? What's the real history of *? ?
 
-@lisp
-@group
-(setq s "/ blah / / blah2 /")
-     @result{} "/ blah / / blah2 /"
-(string-match "/.*/" s)
-     @result{} 0
-(match-string 0 s)
-     @result{} "/ blah / / blah2 /"
-(string-match "/.*?/" s)
-     @result{} 0
-(match-string 0 s)
-     @result{} "/ blah /"
-@end group
-@end lisp
+This construct very useful for when you want to match the text inside a
+pair of delimiters. For instance, @samp{/\*.*?\*/} will match C
+comments in a string. This could not be achieved without the use of
+greedy quantifier.
+
+This construct has not been available prior to XEmacs 20.4. It is not
+available in FSF Emacs.
 
 @item +?
 @cindex @samp{+?} in regexp
@@ -297,26 +288,10 @@
 @item \@{n,m\@}
 @c Note the spacing after the close brace is deliberate.
 @cindex @samp{\@{n,m\@} }in regexp
-this is an interval quantifier, which is analogous to @samp{*} or
-@samp{+}, but specifies that the expression must match at least @samp{n}
-times, but no more than @samp{m} times. This syntax comes to us from
-@cite{ed}, @cite{grep}, and @cite{perl}. The @cite{etags} utility also
-supports it.
-
-@lisp
-@group
-(setq s "12 123 1234 12345")
-     @result{} "12 123 1234 12345"
-(string-match "[0-9]\\@{2,4\\@}" s)
-     @result{} 0
-(match-string 0 s)
-     @result{} "12"
-(string-match "[0-9]\\@{3,4\\@}" s)
-     @result{} 3
-(match-string 0 s)
-     @result{} "123"
-@end group
-@end lisp
+serves as an interval quantifier, analogous to @samp{*} or @samp{+}, but
+specifies that the expression must match at least @var{n} times, but no
+more than @var{m} times. This syntax is supported by most Unix regexp
+utilities, and has been introduced to XEmacs for the version 20.3.
 
 @item [ @dots{} ]
 @cindex character set (in regexp)
@@ -482,26 +457,13 @@
 @item \(?: @dots{} \)
 @cindex @samp{(?:} in regex
 @cindex regexp grouping
-is called a "shy" grouping operator, and it is used just like @samp{\(
-@dots{} \)}, except that it does not cause the matched substring to be
-recorded for future reference. This can be useful at times when a
-program wants to refer to a specific @samp{\( @dots{} \)} group's number
-(eg. in a @code{match-string} or @code{match-beginning} function
-application) and you need to use grouping constructs for an alternation
-or multi--character repetition inside a regular expression string that
-can change each time the code is run, but you don't want those groups
-counting because they'd change the reference number of the group you
-want to refer to that is inside the static part of your generated
-regular expression.
+is called a @dfn{shy} grouping operator, and it is used just like
+@samp{\( @dots{} \)}, except that it does not cause the matched
+substring to be recorded for future reference.
 
-@lisp
-;; @r{Here `dynamic-regex' might contain shy groups.}
-(re-search-forward
- (concat "\\(" dynamic-regex "\\)\\(-?[0-9]\\@{2,4\\@}\\)"))
-;; @r{and this `match-string' will still refer to the integer}
-;; @r{captured by the second group in the `concat' string.}
-(match-string 2)
-@end lisp
+This is useful when you need a lot of grouping @samp{\( @dots{} \)}
+constructs, but only want to remember one or two. Then you can use
+not want to remember them for later use with @code{match-string}.
 
 Using @samp{\(?: @dots{} \)} rather than @samp{\( @dots{} \)} when you
 don't need the captured substrings ought to speed up your programs some,
@@ -509,8 +471,11 @@
 engine, as well as the amount of memory allocation and string copying it
 must do. The actual performance gain to be observed has not been
 measured or quantified as of this writing.
-@c This is used to good advantage by the font-locking code, and by `regexp-opt.el'.
-@c ... It will be. It's not yet, but will be.
+@c This is used to good advantage by the font-locking code, and by
+@c `regexp-opt.el'. ... It will be. It's not yet, but will be.
+
+The shy grouping operator has been borrowed from Perl, and has not been
+available prior to XEmacs 20.3, nor is it available in FSF Emacs.
 
 @item \w
 @cindex @samp{\w} in regexp
@@ -792,6 +757,35 @@
 @end example
 @end defun
 
+@defun split-string string &optional pattern
+This function splits @var{string} to substrings delimited by
+@var{pattern}, and returns a list of substrings. If @var{pattern} is
+omitted, it defaults to @samp{[ \f\t\n\r\v]+}, which means that it
+splits @var{string} by white--space.
+
+@example
+@group
+(split-string "foo bar")
+     @result{} ("foo" "bar")
+@end group
+
+@group
+(split-string "something")
+     @result{} ("something")
+@end group
+
+@group
+(split-string "a:b:c" ":")
+     @result{} ("a" "b" "c")
+@end group
+
+@group
+(split-string ":a::b:c" ":")
+     @result{} ("" "a" "" "b" "c")
+@end group
+@end example
+@end defun
+
 @defun looking-at regexp
 This function determines whether the text in the current buffer directly
 following point matches the regular expression @var{regexp}. ``Directly