annotate man/lispref/searching.texi @ 4885:6772ce4d982b
Fix hash tables, #'member*, #'assoc*, #'eql compiler macros if bignums
lisp/ChangeLog addition:
2010-01-24 Aidan Kehoe <kehoea@parhasard.net>
Correct the semantics of #'member*, #'eql, #'assoc* in the
presence of bignums; change the integerp byte code to fixnump
semantics.
* bytecomp.el (fixnump, integerp, byte-compile-integerp):
Change the integerp byte code to fixnump; add a byte-compile
method to integerp using fixnump and numberp and avoiding a
funcall most of the time, since in the non-core contexts where
integerp is used, it's mostly distinguishing between fixnums and
things that are not numbers at all.
* byte-optimize.el (side-effect-free-fns, byte-after-unbind-ops)
(byte-compile-side-effect-and-error-free-ops):
Replace the integerp bytecode with fixnump; add fixnump to the
side-effect-free-fns. Add the other extended number type
predicates to the list in passing.
* obsolete.el (floatp-safe): Mark this as obsolete.
* cl.el (eql): Go into more detail in the docstring here. Don't
bother checking whether both arguments are numbers; one is enough,
#'equal will fail correctly if they have distinct types.
(subst): Replace a call to #'integerp (deciding whether to use
#'memq or not) with one to #'fixnump.
Delete most-positive-fixnum, most-negative-fixnum from this file;
they're now always in C, so they can't be modified from Lisp.
* cl-seq.el (member*, assoc*, rassoc*):
Correct these functions in the presence of bignums.
* cl-macs.el (cl-make-type-test): The type test for a fixnum is
now fixnump. Ditch floatp-safe, use floatp instead.
(eql): Correct this compiler macro in the presence of bignums.
(assoc*): Correct this compiler macro in the presence of bignums.
* simple.el (undo):
Change #'integerp to #'fixnump here, since we use #'delq with the
same value as ELT a few lines down.
src/ChangeLog addition:
2010-01-24 Aidan Kehoe <kehoea@parhasard.net>
Fix problems with #'eql, extended number types, and the hash table
implementation; change the Bintegerp bytecode to fixnump semantics
even on bignum builds, since #'integerp can have a fast
implementation in terms of #'fixnump for most of its extant uses,
but not vice-versa.
* lisp.h: Always #include number.h; we want the macros provided in
it, even if the various number types are not available.
* number.h (NON_FIXNUM_NUMBER_P): New macro, giving 1 when its
argument is of non-immediate number type. Equivalent to FLOATP if
WITH_NUMBER_TYPES is not defined.
* elhash.c (lisp_object_eql_equal, lisp_object_eql_hash):
Use NON_FIXNUM_NUMBER_P in these functions, instead of FLOATP,
giving more correct behaviour in the presence of the extended
number types.
* bytecode.c (Bfixnump, execute_optimized_program):
Rename Bintegerp to Bfixnump; change its semantics to reflect the
new name on builds with bignum support.
* data.c (Ffixnump, Fintegerp, syms_of_data, vars_of_data):
Always make #'fixnump available, even on non-BIGNUM builds;
always implement #'integerp in this file, even on BIGNUM builds.
Move most-positive-fixnum, most-negative-fixnum here from
number.c, so they are Lisp constants even on builds without number
types, and attempts to change or bind them error.
Use the NUMBERP and INTEGERP macros even on builds without
extended number types.
* data.c (fixnum_char_or_marker_to_int):
Rename this function from integer_char_or_marker_to_int, to better
reflect the arguments it accepts.
* number.c (Fevenp, Foddp, syms_of_number):
Never provide #'integerp in this file. Remove #'oddp,
#'evenp; their implementations are overridden by those in cl.el.
* number.c (vars_of_number):
most-positive-fixnum, most-negative-fixnum are no longer here.
man/ChangeLog addition:
2010-01-23 Aidan Kehoe <kehoea@parhasard.net>
Generally: be careful to say fixnum, not integer, when talking
about fixed-precision integral types. I'm sure I've missed
instances, both here and in the docstrings, but this is a decent
start.
* lispref/text.texi (Columns):
Document where only fixnums, not integers generally, are accepted.
(Registers):
Remove some ancient char-int confoundance here.
* lispref/strings.texi (Creating Strings, Creating Strings):
Be more exact in describing where fixnums but not integers in
general are accepted.
(Creating Strings): Use a more contemporary example to illustrate
how concat deals with lists including integers above #xFF. Delete
some obsolete documentation on same.
(Char Table Types): Document that only fixnums are accepted as
values in syntax tables.
* lispref/searching.texi (String Search, Search and Replace):
Be exact in describing where fixnums but not integers in general
are accepted.
* lispref/range-tables.texi (Range Tables): Be exact in describing
them; only fixnums are accepted to describe ranges.
* lispref/os.texi (Killing XEmacs, User Identification)
(Time of Day, Time Conversion):
Be more exact about using fixnum where only fixed-precision
integers are accepted.
* lispref/objects.texi (Integer Type): Be more exact (and
up-to-date) about the possible values for
integers. Cross-reference to documentation of the bignum extension.
(Equality Predicates):
(Range Table Type):
(Array Type): Use fixnum, not integer, to describe a
fixed-precision integer.
(Syntax Table Type): Correct some English syntax here.
* lispref/numbers.texi (Numbers): Change the phrasing here to use
fixnum to mean the fixed-precision integers normal in emacs.
Document that our terminology deviates from that of Common Lisp,
and that we're working on it.
(Compatibility Issues): Reiterate the Common Lisp versus Emacs
Lisp compatibility issues.
(Comparison of Numbers, Arithmetic Operations):
* lispref/commands.texi (Command Loop Info, Working With Events):
* lispref/buffers.texi (Modification Time):
Be more exact in describing where fixnums but not integers in
general are accepted.
author | Aidan Kehoe <kehoea@parhasard.net> |
---|---|
date | Sun, 24 Jan 2010 15:21:27 +0000 |
parents | 3660d327399f |
children | 755ae5b97edb |
rev | line source |
---|---|
428 | 1 @c -*-texinfo-*- |
2 @c This is part of the XEmacs Lisp Reference Manual. | |
444 | 3 @c Copyright (C) 1990, 1991, 1992, 1993, 1994 Free Software Foundation, Inc. |
428 | 4 @c See the file lispref.texi for copying conditions. |
5 @setfilename ../../info/searching.info | |
6 @node Searching and Matching, Syntax Tables, Text, Top | |
7 @chapter Searching and Matching | |
8 @cindex searching | |
9 | |
10 XEmacs provides two ways to search through a buffer for specified | |
11 text: exact string searches and regular expression searches. After a | |
12 regular expression search, you can examine the @dfn{match data} to | |
13 determine which text matched the whole regular expression or various | |
14 portions of it. | |
15 | |
16 @menu | |
17 * String Search:: Search for an exact match. | |
18 * Regular Expressions:: Describing classes of strings. | |
19 * Regexp Search:: Searching for a match for a regexp. | |
20 * POSIX Regexps:: Searching POSIX-style for the longest match. | |
21 * Search and Replace:: Internals of @code{query-replace}. | |
22 * Match Data:: Finding out which part of the text matched | |
23 various parts of a regexp, after regexp search. | |
24 * Searching and Case:: Case-independent or case-significant searching. | |
25 * Standard Regexps:: Useful regexps for finding sentences, pages,... | |
26 @end menu | |
27 | |
28 The @samp{skip-chars@dots{}} functions also perform a kind of searching. | |
29 @xref{Skipping Characters}. | |
30 | |
31 @node String Search | |
32 @section Searching for Strings | |
33 @cindex string search | |
34 | |
35 These are the primitive functions for searching through the text in a | |
36 buffer. They are meant for use in programs, but you may call them | |
37 interactively. If you do so, they prompt for the search string; | |
444 | 38 @var{limit} and @var{noerror} are set to @code{nil}, and @var{count} |
428 | 39 is set to 1. |
40 | |
444 | 41 @deffn Command search-forward string &optional limit noerror count buffer |
428 | 42 This function searches forward from point for an exact match for |
43 @var{string}. If successful, it sets point to the end of the occurrence | |
44 found, and returns the new value of point. If no match is found, the | |
45 value and side effects depend on @var{noerror} (see below). | |
46 | |
47 In the following example, point is initially at the beginning of the | |
48 line. Then @code{(search-forward "fox")} moves point after the last | |
49 letter of @samp{fox}: | |
50 | |
51 @example | |
52 @group | |
53 ---------- Buffer: foo ---------- | |
54 @point{}The quick brown fox jumped over the lazy dog. | |
55 ---------- Buffer: foo ---------- | |
56 @end group | |
57 | |
58 @group | |
59 (search-forward "fox") | |
60 @result{} 20 | |
61 | |
62 ---------- Buffer: foo ---------- | |
63 The quick brown fox@point{} jumped over the lazy dog. | |
64 ---------- Buffer: foo ---------- | |
65 @end group | |
66 @end example | |
67 | |
68 The argument @var{limit} specifies the upper bound to the search. (It | |
69 must be a position in the current buffer.) No match extending after | |
70 that position is accepted. If @var{limit} is omitted or @code{nil}, it | |
71 defaults to the end of the accessible portion of the buffer. | |
72 | |
73 @kindex search-failed | |
74 What happens when the search fails depends on the value of | |
75 @var{noerror}. If @var{noerror} is @code{nil}, a @code{search-failed} | |
76 error is signaled. If @var{noerror} is @code{t}, @code{search-forward} | |
77 returns @code{nil} and does nothing. If @var{noerror} is neither | |
78 @code{nil} nor @code{t}, then @code{search-forward} moves point to the | |
79 upper bound and returns @code{nil}. (It would be more consistent now | |
80 to return the new position of point in that case, but some programs | |
81 may depend on a value of @code{nil}.) | |
82 | |
4885 | 83 If @var{count} is supplied (it must be a fixnum), then the search is |
444 | 84 repeated that many times (each time starting at the end of the previous |
85 time's match). If @var{count} is negative, the search direction is | |
86 backward. If the successive searches succeed, the function succeeds, | |
87 moving point and returning its new value. Otherwise the search fails. | |
88 | |
89 @var{buffer} is the buffer to search in, and defaults to the current buffer. | |
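
As a rough illustration of how @var{count} and @var{noerror} interact,
the following sketch searches a temporary buffer. The buffer text is
invented for the example, and the positions in the closing comment
assume that point starts at the beginning of the buffer.

@example
@group
(with-temp-buffer
  (insert "one fox, two fox, red fox")
  (goto-char (point-min))
  (list
   ;; COUNT is 2, so point ends up after the second "fox".
   (search-forward "fox" nil t 2)
   ;; NOERROR is t: a failed search returns nil and leaves point alone.
   (search-forward "dog" nil t)
   (point)
   ;; NOERROR is neither nil nor t: point moves to the upper bound.
   (search-forward "dog" nil 'move)
   (point)))
;; evaluates to (17 nil 17 nil 26)
@end group
@end example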
428 | 90 @end deffn |
91 | |
444 | 92 @deffn Command search-backward string &optional limit noerror count buffer |
428 | 93 This function searches backward from point for @var{string}. It is |
94 just like @code{search-forward} except that it searches backwards and | |
95 leaves point at the beginning of the match. | |
96 @end deffn | |
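
A minimal sketch of the difference from @code{search-forward}, reusing
the sample sentence from above: after inserting the text, point is at
the end of the buffer, and the backward search leaves it at the start
of the match rather than after it.

@example
@group
(with-temp-buffer
  (insert "The quick brown fox jumped over the lazy dog.")
  ;; Point is now at the end of the buffer; search back toward the start.
  (list (search-backward "fox")
        ;; Point is before the "f", so the match text follows point.
        (looking-at "fox")))
;; evaluates to (17 t)
@end group
@end example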
97 | |
444 | 98 @deffn Command word-search-forward string &optional limit noerror count buffer |
428 | 99 @cindex word search |
100 This function searches forward from point for a ``word'' match for | |
101 @var{string}. If it finds a match, it sets point to the end of the | |
102 match found, and returns the new value of point. | |
103 | |
104 Word matching regards @var{string} as a sequence of words, disregarding | |
105 punctuation that separates them. It searches the buffer for the same | |
106 sequence of words. Each word must be distinct in the buffer (searching | |
107 for the word @samp{ball} does not match the word @samp{balls}), but the | |
108 details of punctuation and spacing are ignored (searching for @samp{ball | |
109 boy} does match @samp{ball. Boy!}). | |
110 | |
111 In this example, point is initially at the beginning of the buffer; the | |
112 search leaves it between the @samp{y} and the @samp{!}. | |
113 | |
114 @example | |
115 @group | |
116 ---------- Buffer: foo ---------- | |
117 @point{}He said "Please! Find | |
118 the ball boy!" | |
119 ---------- Buffer: foo ---------- | |
120 @end group | |
121 | |
122 @group | |
123 (word-search-forward "Please find the ball, boy.") | |
124 @result{} 35 | |
125 | |
126 ---------- Buffer: foo ---------- | |
127 He said "Please! Find | |
128 the ball boy@point{}!" | |
129 ---------- Buffer: foo ---------- | |
130 @end group | |
131 @end example | |
132 | |
133 If @var{limit} is non-@code{nil} (it must be a position in the current | |
134 buffer), then it is the upper bound to the search. The match found must | |
135 not extend after that position. | |
136 | |
137 If @var{noerror} is @code{nil}, then @code{word-search-forward} signals | |
138 an error if the search fails. If @var{noerror} is @code{t}, then it | |
139 returns @code{nil} instead of signaling an error. If @var{noerror} is | |
140 neither @code{nil} nor @code{t}, it moves point to @var{limit} (or the | |
141 end of the buffer) and returns @code{nil}. | |
142 | |
444 | 143 If @var{count} is non-@code{nil}, then the search is repeated that many |
428 | 144 times. Point is positioned at the end of the last match. |
444 | 145 |
146 @var{buffer} is the buffer to search in, and defaults to the current buffer. | |
428 | 147 @end deffn |
148 | |
444 | 149 @deffn Command word-search-backward string &optional limit noerror count buffer |
428 | 150 This function searches backward from point for a word match to |
151 @var{string}. This function is just like @code{word-search-forward} | |
152 except that it searches backward and normally leaves point at the | |
153 beginning of the match. | |
154 @end deffn | |
155 | |
156 @node Regular Expressions | |
157 @section Regular Expressions | |
158 @cindex regular expression | |
159 @cindex regexp | |
160 | |
161 A @dfn{regular expression} (@dfn{regexp}, for short) is a pattern that | |
162 denotes a (possibly infinite) set of strings. Searching for matches for | |
163 a regexp is a very powerful operation. This section explains how to write | |
164 regexps; the following section says how to search for them. | |
165 | |
166 To gain a thorough understanding of regular expressions and how to use | |
167 them to best advantage, we recommend that you study @cite{Mastering | |
168 Regular Expressions, by Jeffrey E.F. Friedl, O'Reilly and Associates, | |
169 1997}. (It's known as the "Hip Owls" book, because of the picture on its | |
170 cover.) You might also read the manuals to @ref{(gawk)Top}, | |
171 @ref{(ed)Top}, @cite{sed}, @cite{grep}, @ref{(perl)Top}, | |
172 @ref{(regex)Top}, @ref{(rx)Top}, @cite{pcre}, and @ref{(flex)Top}, which | |
173 also make good use of regular expressions. | |
174 | |
175 The XEmacs regular expression syntax most closely resembles that of | |
176 @cite{ed} or @cite{grep}, the GNU versions of which both use the GNU | |
177 @cite{regex} library. XEmacs' version of @cite{regex} has recently been | |
178 extended with some Perl-like capabilities, described in the next | |
179 section. | |
180 | |
181 @menu | |
182 * Syntax of Regexps:: Rules for writing regular expressions. | |
183 * Regexp Example:: Illustrates regular expression syntax. | |
184 @end menu | |
185 | |
186 @node Syntax of Regexps | |
187 @subsection Syntax of Regular Expressions | |
188 | |
189 Regular expressions have a syntax in which a few characters are | |
190 special constructs and the rest are @dfn{ordinary}. An ordinary | |
191 character is a simple regular expression that matches that character and | |
192 nothing else. The special characters are @samp{.}, @samp{*}, @samp{+}, | |
193 @samp{?}, @samp{[}, @samp{]}, @samp{^}, @samp{$}, and @samp{\}; no new | |
194 special characters will be defined in the future. Any other character | |
195 appearing in a regular expression is ordinary, unless a @samp{\} | |
196 precedes it. | |
197 | |
198 For example, @samp{f} is not a special character, so it is ordinary, and | |
199 therefore @samp{f} is a regular expression that matches the string | |
200 @samp{f} and no other string. (It does @emph{not} match the string | |
201 @samp{ff}.) Likewise, @samp{o} is a regular expression that matches | |
202 only @samp{o}.@refill | |
203 | |
204 Any two regular expressions @var{a} and @var{b} can be concatenated. The | |
205 result is a regular expression that matches a string if @var{a} matches | |
206 some amount of the beginning of that string and @var{b} matches the rest of | |
207 the string.@refill | |
208 | |
209 As a simple example, we can concatenate the regular expressions @samp{f} | |
210 and @samp{o} to get the regular expression @samp{fo}, which matches only | |
211 the string @samp{fo}. Still trivial. To do something more powerful, you | |
212 need to use one of the special characters. Here is a list of them: | |
213 | |
214 @need 1200 | |
215 @table @kbd | |
216 @item .@: @r{(Period)} | |
217 @cindex @samp{.} in regexp | |
218 is a special character that matches any single character except a newline. | |
219 Using concatenation, we can make regular expressions like @samp{a.b}, which | |
220 matches any three-character string that begins with @samp{a} and ends with | |
221 @samp{b}.@refill | |
222 | |
223 @item * | |
224 @cindex @samp{*} in regexp | |
225 is not a construct by itself; it is a quantifying suffix operator that | |
226 means to repeat the preceding regular expression as many times as | |
227 possible. In @samp{fo*}, the @samp{*} applies to the @samp{o}, so | |
228 @samp{fo*} matches one @samp{f} followed by any number of @samp{o}s. | |
229 The case of zero @samp{o}s is allowed: @samp{fo*} does match | |
230 @samp{f}.@refill | |
231 | |
232 @samp{*} always applies to the @emph{smallest} possible preceding | |
233 expression. Thus, @samp{fo*} has a repeating @samp{o}, not a | |
234 repeating @samp{fo}.@refill | |
235 | |
236 The matcher processes a @samp{*} construct by matching, immediately, as | |
237 many repetitions as can be found; it is "greedy". Then it continues | |
238 with the rest of the pattern. If that fails, backtracking occurs, | |
239 discarding some of the matches of the @samp{*}-modified construct in | |
240 case that makes it possible to match the rest of the pattern. For | |
241 example, in matching @samp{ca*ar} against the string @samp{caaar}, the | |
242 @samp{a*} first tries to match all three @samp{a}s; but the rest of the | |
243 pattern is @samp{ar} and there is only @samp{r} left to match, so this | |
244 try fails. The next alternative is for @samp{a*} to match only two | |
245 @samp{a}s. With this choice, the rest of the regexp matches | |
246 successfully.@refill | |
247 | |
248 Nested repetition operators can be extremely slow if they specify | |
249 backtracking loops. For example, it could take hours for the regular | |
250 expression @samp{\(x+y*\)*a} to match the sequence | |
251 @samp{xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxz}. The slowness is because | |
252 Emacs must try each imaginable way of grouping the 35 @samp{x}'s before | |
253 concluding that none of them can work. To make sure your regular | |
254 expressions run fast, check nested repetitions carefully. | |
255 | |
256 @item + | |
257 @cindex @samp{+} in regexp | |
258 is a quantifying suffix operator similar to @samp{*} except that the | |
259 preceding expression must match at least once. It is also "greedy". | |
260 So, for example, @samp{ca+r} matches the strings @samp{car} and | |
261 @samp{caaaar} but not the string @samp{cr}, whereas @samp{ca*r} matches | |
262 all three strings. | |
263 | |
264 @item ? | |
265 @cindex @samp{?} in regexp | |
266 is a quantifying suffix operator similar to @samp{*}, except that the | |
267 preceding expression can match either once or not at all. For example, | |
268 @samp{ca?r} matches @samp{car} or @samp{cr}, but does not match anything | |
269 else. | |
270 | |
271 @item *? | |
272 @cindex @samp{*?} in regexp | |
273 works just like @samp{*}, except that rather than matching the longest | |
274 match, it matches the shortest match. @samp{*?} is known as a | |
275 @dfn{non-greedy} quantifier, a regexp construct borrowed from Perl. | |
276 @c Did perl get this from somewhere? What's the real history of *? ? | |
277 | |
442 | 278 This construct is very useful for when you want to match the text inside |
279 a pair of delimiters. For instance, @samp{/\*.*?\*/} will match C | |
280 comments in a string. This could not easily be achieved without the use | |
281 of a non-greedy quantifier. | |
428 | 282 |
283 This construct was not available prior to XEmacs 20.4. It is not | |
284 available in FSF Emacs. | |
285 | |
286 @item +? | |
287 @cindex @samp{+?} in regexp | |
442 | 288 is the non-greedy version of @samp{+}. |
289 | |
290 @item ?? | |
291 @cindex @samp{??} in regexp | |
292 is the non-greedy version of @samp{?}. | |
428 | 293 |
294 @item \@{n,m\@} | |
295 @c Note the spacing after the close brace is deliberate. | |
296 @cindex @samp{\@{n,m\@} }in regexp | |
297 serves as an interval quantifier, analogous to @samp{*} or @samp{+}, but | |
298 specifies that the expression must match at least @var{n} times, but no | |
299 more than @var{m} times. This syntax is supported by most Unix regexp | |
300 utilities, and was introduced in XEmacs 20.3. | |
301 | |
442 | 302 Unfortunately, the non-greedy version of this quantifier does not exist |
303 currently, although it does in Perl. | |
304 | |
428 | 305 @item [ @dots{} ] |
306 @cindex character set (in regexp) | |
307 @cindex @samp{[} in regexp | |
308 @cindex @samp{]} in regexp | |
309 @samp{[} begins a @dfn{character set}, which is terminated by a | |
310 @samp{]}. In the simplest case, the characters between the two brackets | |
311 form the set. Thus, @samp{[ad]} matches either one @samp{a} or one | |
312 @samp{d}, and @samp{[ad]*} matches any string composed of just @samp{a}s | |
313 and @samp{d}s (including the empty string), from which it follows that | |
314 @samp{c[ad]*r} matches @samp{cr}, @samp{car}, @samp{cdr}, | |
315 @samp{caddaar}, etc.@refill | |
316 | |
317 The usual regular expression special characters are not special inside a | |
318 character set. A completely different set of special characters exists | |
319 inside character sets: @samp{]}, @samp{-} and @samp{^}.@refill | |
320 | |
321 @samp{-} is used for ranges of characters. To write a range, write two | |
322 characters with a @samp{-} between them. Thus, @samp{[a-z]} matches any | |
323 lower case letter. Ranges may be intermixed freely with individual | |
324 characters, as in @samp{[a-z$%.]}, which matches any lower case letter | |
325 or @samp{$}, @samp{%}, or a period.@refill | |
326 | |
327 To include a @samp{]} in a character set, make it the first character. | |
328 For example, @samp{[]a]} matches @samp{]} or @samp{a}. To include a | |
329 @samp{-}, write @samp{-} as the first character in the set, or put it | |
330 immediately after a range. (You can replace one individual character | |
331 @var{c} with the range @samp{@var{c}-@var{c}} to make a place to put the | |
332 @samp{-}.) There is no way to write a set containing just @samp{-} and | |
333 @samp{]}. | |
334 | |
335 To include @samp{^} in a set, put it anywhere but at the beginning of | |
336 the set. | |
337 | |
338 @item [^ @dots{} ] | |
339 @cindex @samp{^} in regexp | |
340 @samp{[^} begins a @dfn{complement character set}, which matches any | |
341 character except the ones specified. Thus, @samp{[^a-z0-9A-Z]} | |
342 matches all characters @emph{except} letters and digits.@refill | |
343 | |
344 @samp{^} is not special in a character set unless it is the first | |
345 character. The character following the @samp{^} is treated as if it | |
346 were first (thus, @samp{-} and @samp{]} are not special there). | |
347 | |
348 Note that a complement character set can match a newline, unless | |
349 newline is mentioned as one of the characters not to match. | |
350 | |
351 @item ^ | |
352 @cindex @samp{^} in regexp | |
353 @cindex beginning of line in regexp | |
354 is a special character that matches the empty string, but only at the | |
355 beginning of a line in the text being matched. Otherwise it fails to | |
356 match anything. Thus, @samp{^foo} matches a @samp{foo} that occurs at | |
357 the beginning of a line. | |
358 | |
359 When matching a string instead of a buffer, @samp{^} matches at the | |
360 beginning of the string or after a newline character @samp{\n}. | |
361 | |
362 @item $ | |
363 @cindex @samp{$} in regexp | |
364 is similar to @samp{^} but matches only at the end of a line. Thus, | |
365 @samp{x+$} matches a string of one @samp{x} or more at the end of a line. | |
366 | |
367 When matching a string instead of a buffer, @samp{$} matches at the end | |
368 of the string or before a newline character @samp{\n}. | |
369 | |
370 @item \ | |
371 @cindex @samp{\} in regexp | |
372 has two functions: it quotes the special characters (including | |
373 @samp{\}), and it introduces additional special constructs. | |
374 | |
375 Because @samp{\} quotes special characters, @samp{\$} is a regular | |
376 expression that matches only @samp{$}, and @samp{\[} is a regular | |
377 expression that matches only @samp{[}, and so on. | |
378 | |
379 Note that @samp{\} also has special meaning in the read syntax of Lisp | |
380 strings (@pxref{String Type}), and must be quoted with @samp{\}. For | |
381 example, the regular expression that matches the @samp{\} character is | |
382 @samp{\\}. To write a Lisp string that contains the characters | |
383 @samp{\\}, Lisp syntax requires you to quote each @samp{\} with another | |
384 @samp{\}. Therefore, the read syntax for a regular expression matching | |
385 @samp{\} is @code{"\\\\"}.@refill | |
386 @end table | |
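
The special characters described above can be tried out on strings
with @code{string-match}. The following is only a sketch:
@code{my-first-match} is a hypothetical helper written for this
illustration, not an XEmacs function, and the sample strings are
invented.

@example
(defun my-first-match (regexp string)
  "Return the text of the first match of REGEXP in STRING, or nil.
Hypothetical helper for illustration only."
  (save-match-data
    (when (string-match regexp string)
      (match-string 0 string))))

;; Greedy `*' backtracks so that the trailing "ar" can still match.
(my-first-match "ca*ar" "caaar")                     ; "caaar"
;; `+' needs at least one repetition; `?' allows none.
(my-first-match "ca+r" "cr")                         ; nil
(my-first-match "ca?r" "cr")                         ; "cr"
;; Greedy versus non-greedy matching of C comments.
(my-first-match "/\\*.*\\*/" "/* a */ x /* b */")    ; "/* a */ x /* b */"
(my-first-match "/\\*.*?\\*/" "/* a */ x /* b */")   ; "/* a */"
;; A `]' placed first in a character set is an ordinary member.
(my-first-match "[]a]+" "x]a]y")                     ; "]a]"
;; In a string, `^' also matches just after a newline.
(my-first-match "^foo" "bar\nfoo")                   ; "foo"
;; Four backslashes in Lisp syntax match one literal backslash.
(my-first-match "\\\\" "a\\b")                       ; "\\"
@end example

The results shown in the comments are printed in Lisp read syntax, so
the last one is a string containing a single backslash.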
387 | |
388 @strong{Please note:} For historical compatibility, special characters | |
389 are treated as ordinary ones if they are in contexts where their special | |
390 meanings make no sense. For example, @samp{*foo} treats @samp{*} as | |
391 ordinary since there is no preceding expression on which the @samp{*} | |
392 can act. It is poor practice to depend on this behavior; quote the | |
393 special character anyway, regardless of where it appears.@refill | |
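
A small sketch of this compatibility behavior, using an invented
sample string: both forms find the same match, but only the second
says what it means.

@example
(string-match "*foo" "a*foo")     ; 1; the leading `*' is taken literally
(string-match "\\*foo" "a*foo")   ; 1; same match, stated explicitly
@end example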
394 | |
395 For the most part, @samp{\} followed by any character matches only | |
396 that character. However, there are several exceptions: characters | |
397 that, when preceded by @samp{\}, are special constructs. Such | |
398 characters are always ordinary when encountered on their own. Here | |
399 is a table of @samp{\} constructs: | |
400 | |
401 @table @kbd | |
402 @item \| | |
403 @cindex @samp{|} in regexp | |
404 @cindex regexp alternative | |
405 specifies an alternative. | |
406 Two regular expressions @var{a} and @var{b} with @samp{\|} in | |
407 between form an expression that matches anything that either @var{a} or | |
408 @var{b} matches.@refill | |
409 | |
410 Thus, @samp{foo\|bar} matches either @samp{foo} or @samp{bar} | |
411 but no other string.@refill | |
412 | |
413 @samp{\|} applies to the largest possible surrounding expressions. Only a | |
414 surrounding @samp{\( @dots{} \)} grouping can limit the grouping power of | |
415 @samp{\|}.@refill | |
416 | |
417 Full backtracking capability exists to handle multiple uses of @samp{\|}. | |
418 | |
419 @item \( @dots{} \) | |
420 @cindex @samp{(} in regexp | |
421 @cindex @samp{)} in regexp | |
422 @cindex regexp grouping | |
423 is a grouping construct that serves three purposes: | |
424 | |
425 @enumerate | |
426 @item | |
427 To enclose a set of @samp{\|} alternatives for other operations. | |
428 Thus, @samp{\(foo\|bar\)x} matches either @samp{foox} or @samp{barx}. | |
429 | |
430 @item | |
431 To enclose an expression for a suffix operator such as @samp{*} to act | |
432 on. Thus, @samp{ba\(na\)*} matches @samp{bananana}, etc., with any | |
433 (zero or more) number of @samp{na} strings.@refill | |
434 | |
435 @item | |
436 To record a matched substring for future reference. | |
437 @end enumerate | |
438 | |
439 This last application is not a consequence of the idea of a | |
440 parenthetical grouping; it is a separate feature that happens to be | |
441 assigned as a second meaning to the same @samp{\( @dots{} \)} construct | |
442 because there is no conflict in practice between the two meanings. | |
443 Here is an explanation of this feature: | |
444 | |
445 @item \@var{digit} | |
446 matches the same text that matched the @var{digit}th occurrence of a | |
447 @samp{\( @dots{} \)} construct. | |
448 | |
2255 | 449 In other words, after the end of a @samp{\( @dots{} \)} construct, the |
428 | 450 matcher remembers the beginning and end of the text matched by that |
451 construct. Then, later on in the regular expression, you can use | |
452 @samp{\} followed by @var{digit} to match that same text, whatever it | |
453 may have been. | |
454 | |
455 The strings matching the first nine @samp{\( @dots{} \)} constructs | |
456 appearing in a regular expression are assigned numbers 1 through 9 in | |
457 the order that the open parentheses appear in the regular expression. | |
458 So you can use @samp{\1} through @samp{\9} to refer to the text matched | |
459 by the corresponding @samp{\( @dots{} \)} constructs. | |
460 | |
461 For example, @samp{\(.*\)\1} matches any newline-free string that is | |
462 composed of two identical halves. The @samp{\(.*\)} matches the first | |
463 half, which may be anything, but the @samp{\1} that follows must match | |
464 the same exact text. | |
465 | |
466 @item \(?: @dots{} \) | |
467 @cindex @samp{\(?:} in regexp | |
468 @cindex regexp grouping | |
469 is called a @dfn{shy} grouping operator, and it is used just like | |
470 @samp{\( @dots{} \)}, except that it does not cause the matched | |
471 substring to be recorded for future reference. | |
472 | |
473 This is useful when you need a lot of grouping @samp{\( @dots{} \)} | |
442 | 474 constructs, but only want to remember one or two -- or if you have |
475 more than nine groupings and need to use backreferences to refer to | |
2255 | 476 the groupings at the end. It also allows construction of regular |
477 expressions from variable subexpressions that contain varying numbers of | |
478 non-capturing subexpressions, without disturbing the group counts for | |
479 the main expression. For example | |
480 | |
481 @example | |
482 (let ((sre (if foo "\\(?:bar\\|baz\\)" "quux"))) | |
483 (re-search-forward (format "a\\(b+ %s c+\\) d" sre) nil t) | |
484 (match-string 1)) | |
485 @end example | |
428 | 486 |
2255 | 487 It is very tedious to write this kind of code without shy groups, even |
488 if you know what all the alternative subexpressions will look like. | |
428 | 489 |
2255 | 490 Using @samp{\(?: @dots{} \)} rather than @samp{\( @dots{} \)} should |
491 give little performance gain, as the start of each group must be | |
492 recorded for the purpose of back-tracking in any case, and no string | |
493 copying is done until @code{match-string} is called. | |
494 | |
495 The shy grouping operator has been borrowed from Perl, and was not | |
496 available prior to XEmacs 20.3, and has only been available in GNU Emacs | |
497 since version 21. | |
428 | 498 |
499 @item \w | |
500 @cindex @samp{\w} in regexp | |
501 matches any word-constituent character. The editor syntax table | |
502 determines which characters these are. @xref{Syntax Tables}. | |
503 | |
504 @item \W | |
505 @cindex @samp{\W} in regexp | |
506 matches any character that is not a word constituent. | |
507 | |
508 @item \s@var{code} | |
509 @cindex @samp{\s} in regexp | |
510 matches any character whose syntax is @var{code}. Here @var{code} is a | |
511 character that represents a syntax code: thus, @samp{w} for word | |
512 constituent, @samp{-} for whitespace, @samp{(} for open parenthesis, | |
513 etc. @xref{Syntax Tables}, for a list of syntax codes and the | |
514 characters that stand for them. | |
515 | |
516 @item \S@var{code} | |
517 @cindex @samp{\S} in regexp | |
518 matches any character whose syntax is not @var{code}. | |
2608 | 519 |
520 @item \c@var{category} | |
521 @cindex @samp{\c} in regexp | |
522 matches any character in @var{category}. This construct is only available | |
523 under Mule. Categories and category tables are further described in | |
524 @ref{Category Tables}. They are a mechanism for constructing classes of characters | |
525 that can be local to a buffer, and that do not require complicated [] | |
526 expressions every time they are referenced. | |
527 | |
528 @item \C@var{category} | |
529 @cindex @samp{\C} in regexp | |
530 matches any character outside @var{category}. @xref{Category Tables}, | |
531 again, and note that this is only available under Mule. | |
428 | 532 @end table |
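
As a small illustration of the constructs above, the following sketch
applies @samp{\w}, @samp{\W} and @samp{\s-} with @code{string-match}.
It assumes that the standard syntax table is in effect in the current
buffer (letters are word constituents, the comma is punctuation, and the
space is whitespace); under a different syntax table the results may
differ.

@example
;; Assumes the standard syntax table in the current buffer.
(string-match "\\w+" "  hello, world")   ; @result{} 2 (start of "hello")
(match-end 0)                            ; @result{} 7 (just past "hello")
(string-match "\\W" "hello, world")      ; @result{} 5 (the comma)
(string-match "\\s-+" "foo   bar")       ; @result{} 3 (the run of spaces)
@end example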
533 | |
534 The following regular expression constructs match the empty string---that is, | |
535 they don't use up any characters---but whether they match depends on the | |
536 context. | |
537 | |
538 @table @kbd | |
539 @item \` | |
540 @cindex @samp{\`} in regexp | |
541 matches the empty string, but only at the beginning | |
542 of the buffer or string being matched against. | |
543 | |
544 @item \' | |
545 @cindex @samp{\'} in regexp | |
546 matches the empty string, but only at the end of | |
547 the buffer or string being matched against. | |
548 | |
549 @item \= | |
550 @cindex @samp{\=} in regexp | |
551 matches the empty string, but only at point. | |
552 (This construct is not defined when matching against a string.) | |
553 | |
554 @item \b | |
555 @cindex @samp{\b} in regexp | |
556 matches the empty string, but only at the beginning or | |
557 end of a word. Thus, @samp{\bfoo\b} matches any occurrence of | |
558 @samp{foo} as a separate word. @samp{\bballs?\b} matches | |
559 @samp{ball} or @samp{balls} as a separate word.@refill | |
560 | |
561 @item \B | |
562 @cindex @samp{\B} in regexp | |
563 matches the empty string, but @emph{not} at the beginning or | |
564 end of a word. | |
565 | |
566 @item \< | |
567 @cindex @samp{\<} in regexp | |
568 matches the empty string, but only at the beginning of a word. | |
569 | |
570 @item \> | |
571 @cindex @samp{\>} in regexp | |
572 matches the empty string, but only at the end of a word. | |
573 @end table | |
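
The following sketch shows a few of these context-dependent constructs
applied to strings with @code{string-match}. The sample strings are
made up for illustration, and the results assume that letters are word
constituents and the space is not, as in the standard syntax table.

@example
(string-match "\\bfoo\\b" "a foo b")    ; @result{} 2
(string-match "\\bfoo\\b" "afoo b")     ; @result{} nil (no boundary before "foo")
(string-match "\\`foo" "foo bar")       ; @result{} 0
(string-match "\\`bar" "foo bar")       ; @result{} nil ("bar" is not at the start)
(string-match "bar\\'" "foo bar")       ; @result{} 4
(string-match "\\<bar" "foobar bar")    ; @result{} 7 (only the separate word)
@end example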
574 | |
575 @kindex invalid-regexp | |
576 Not every string is a valid regular expression. For example, a string | |
577 with unbalanced square brackets is invalid (with a few exceptions, such | |
578 as @samp{[]]}), and so is a string that ends with a single @samp{\}. If | |
579 an invalid regular expression is passed to any of the search functions, | |
580 an @code{invalid-regexp} error is signaled. | |
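
A program that receives a regexp from the user may want to catch this
error rather than let it propagate. The following is a minimal sketch
of one way to do so; @code{my-regexp-valid-p} is a hypothetical helper
name, not a function provided by XEmacs.

@example
;; `my-regexp-valid-p' is a hypothetical helper, not an XEmacs function.
(defun my-regexp-valid-p (regexp)
  "Return t if REGEXP can be used for searching, nil if it is invalid."
  (condition-case nil
      (progn (string-match regexp "") t)
    (invalid-regexp nil)))

(my-regexp-valid-p "[a-z]+")   ; @result{} t
(my-regexp-valid-p "[a-z")     ; @result{} nil (unbalanced bracket)
(my-regexp-valid-p "foo\\")    ; @result{} nil (ends with a single backslash)
@end example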
581 | |
582 @defun regexp-quote string | |
583 This function returns a regular expression string that matches exactly | |
584 @var{string} and nothing else. This allows you to request an exact | |
585 string match when calling a function that wants a regular expression. | |
586 | |
587 @example | |
588 @group | |
589 (regexp-quote "^The cat$") | |
590 @result{} "\\^The cat\\$" | |
591 @end group | |
592 @end example | |
593 | |
594 One use of @code{regexp-quote} is to combine an exact string match with | |
595 context described as a regular expression. For example, this searches | |
596 for the string that is the value of @code{string}, surrounded by | |
597 whitespace: | |
598 | |
599 @example | |
600 @group | |
601 (re-search-forward | |
602 (concat "\\s-" (regexp-quote string) "\\s-")) | |
603 @end group | |
604 @end example | |
605 @end defun | |
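
The difference between a quoted and an unquoted string can be seen in a
short sketch. The sample strings are made up for illustration.

@example
(string-match "a.b" "axb a.b")                 ; @result{} 0 ("." matches "x")
(string-match (regexp-quote "a.b") "axb a.b")  ; @result{} 4 (literal "a.b" only)
(regexp-quote "a.b")                           ; @result{} "a\\.b"
@end example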
606 | |
607 @node Regexp Example | |
608 @subsection Complex Regexp Example | |
609 | |
610 Here is a complicated regexp, used by XEmacs to recognize the end of a | |
611 sentence together with any whitespace that follows. It is the value of | |
444 | 612 the variable @code{sentence-end}. |
428 | 613 |
614 First, we show the regexp as a string in Lisp syntax to distinguish | |
615 spaces from tab characters. The string constant begins and ends with a | |
616 double-quote. @samp{\"} stands for a double-quote as part of the | |
617 string, @samp{\\} for a backslash as part of the string, @samp{\t} for a | |
618 tab and @samp{\n} for a newline. | |
619 | |
620 @example | |
621 "[.?!][]\"')@}]*\\($\\| $\\|\t\\|  \\)[ \t\n]*" | |
622 @end example | |
623 | |
624 In contrast, if you evaluate the variable @code{sentence-end}, you | |
625 will see the following: | |
626 | |
627 @example | |
628 @group | |
629 sentence-end | |
630 @result{} | |
444 | 631 "[.?!][]\"')@}]*\\($\\| $\\| \\| \\)[ |
428 | 632 ]*" |
633 @end group | |
634 @end example | |
635 | |
636 @noindent | |
637 In this output, tab and newline appear as themselves. | |
638 | |
639 This regular expression contains four parts in succession and can be | |
640 deciphered as follows: | |
641 | |
642 @table @code | |
643 @item [.?!] | |
644 The first part of the pattern is a character set that matches any one of | |
645 three characters: period, question mark, and exclamation mark. The | |
646 match must begin with one of these three characters. | |
647 | |
648 @item []\"')@}]* | |
649 The second part of the pattern matches any closing braces and quotation | |
650 marks, zero or more of them, that may follow the period, question mark | |
651 or exclamation mark. The @code{\"} is Lisp syntax for a double-quote in | |
652 a string. The @samp{*} at the end indicates that the immediately | |
653 preceding regular expression (a character set, in this case) may be | |
654 repeated zero or more times. | |
655 | |
656 @item \\($\\|@ $\\|\t\\|@ @ \\) | |
657 The third part of the pattern matches the whitespace that follows the | |
658 end of a sentence: the end of a line, or a tab, or two spaces. The | |
659 double backslashes mark the parentheses and vertical bars as regular | |
660 expression syntax; the parentheses delimit a group and the vertical bars | |
661 separate alternatives. The dollar sign is used to match the end of a | |
662 line. | |
663 | |
664 @item [ \t\n]* | |
665 Finally, the last part of the pattern matches any additional whitespace | |
666 beyond the minimum needed to end a sentence. | |
667 @end table | |
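
To see the four parts working together, the following sketch applies a
copy of the regexp to a string. The variable name
@code{my-sentence-end} and the sample text are assumptions made for this
illustration; the regexp itself is the one shown above, with two spaces
in the last alternative of the group.

@example
;; `my-sentence-end' is a hypothetical variable holding a copy of the regexp.
;; Note the two spaces in the last alternative of the group.
(let ((my-sentence-end
       "[.?!][]\"')@}]*\\($\\| $\\|\t\\|  \\)[ \t\n]*")
      (text "Hello there.  Next sentence."))
  (list (string-match my-sentence-end text)
        (match-end 0)))
;; @result{} (11 14)
@end example

@noindent
The match starts at the period (index 11) and extends over the two
spaces that follow it, so it ends at index 14, where the next sentence
begins.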
668 | |
669 @node Regexp Search | |
670 @section Regular Expression Searching | |
671 @cindex regular expression searching | |
672 @cindex regexp searching | |
673 @cindex searching for regexp | |
674 | |
675 In XEmacs, you can search for the next match for a regexp either | |
676 incrementally or not. Incremental search commands are described in the | |
446 | 677 @cite{XEmacs User's Manual}. @xref{Regexp Search, , Regular Expression |
678 Search, xemacs, XEmacs User's Manual}. Here we describe only the search | |
428 | 679 functions useful in programs. The principal one is |
680 @code{re-search-forward}. | |
681 | |
444 | 682 @deffn Command re-search-forward regexp &optional limit noerror count buffer |
428 | 683 This function searches forward in the current buffer for a string of |
684 text that is matched by the regular expression @var{regexp}. The | |
685 function skips over any amount of text that is not matched by | |
686 @var{regexp}, and leaves point at the end of the first match found. | |
687 It returns the new value of point. | |
688 | |
689 If @var{limit} is non-@code{nil} (it must be a position in the current | |
690 buffer), then it is the upper bound to the search. No match extending | |
691 after that position is accepted. | |
692 | |
693 What happens when the search fails depends on the value of | |
694 @var{noerror}. If @var{noerror} is @code{nil}, a @code{search-failed} | |
695 error is signaled. If @var{noerror} is @code{t}, | |
696 @code{re-search-forward} does nothing and returns @code{nil}. If | |
697 @var{noerror} is neither @code{nil} nor @code{t}, then | |
698 @code{re-search-forward} moves point to @var{limit} (or the end of the | |
699 buffer) and returns @code{nil}. | |
700 | |
444 | 701 If @var{count} is supplied (it must be a positive number), then the |
428 | 702 search is repeated that many times (each time starting at the end of the |
703 previous time's match). If these successive searches succeed, the | |
704 function succeeds, moving point and returning its new value. Otherwise | |
705 the search fails. | |
706 | |
707 In the following example, point is initially before the @samp{T}. | |
708 Evaluating the search call moves point to the end of that line (between | |
709 the @samp{t} of @samp{hat} and the newline). | |
710 | |
711 @example | |
712 @group | |
713 ---------- Buffer: foo ---------- | |
714 I read "@point{}The cat in the hat | |
715 comes back" twice. | |
716 ---------- Buffer: foo ---------- | |
717 @end group | |
718 | |
719 @group | |
720 (re-search-forward "[a-z]+" nil t 5) | |
721 @result{} 27 | |
722 | |
723 ---------- Buffer: foo ---------- | |
724 I read "The cat in the hat@point{} | |
725 comes back" twice. | |
726 ---------- Buffer: foo ---------- | |
727 @end group | |
728 @end example | |
729 @end deffn | |
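
A common use of @code{re-search-forward} in programs is a loop that
visits every match in a buffer. The following is a minimal sketch of
that idiom; @code{my-collect-matches} is a hypothetical helper name.
Passing @code{t} for @var{noerror} makes the final, failing search
return @code{nil} and end the loop instead of signaling an error.

@example
;; `my-collect-matches' is a hypothetical helper, not an XEmacs function.
;; It assumes REGEXP cannot match the empty string; otherwise the
;; loop would not advance.
(defun my-collect-matches (regexp)
  "Return a list of all strings matching REGEXP in the current buffer."
  (save-excursion
    (goto-char (point-min))
    (let ((matches '()))
      (while (re-search-forward regexp nil t)
        (setq matches (cons (match-string 0) matches)))
      (nreverse matches))))

(with-temp-buffer
  (insert "I read \"The cat in the hat\ncomes back\" twice.")
  (my-collect-matches "[a-z]+at"))
;; @result{} ("cat" "hat")
@end example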
730 | |
444 | 731 @deffn Command re-search-backward regexp &optional limit noerror count buffer |
428 | 732 This function searches backward in the current buffer for a string of |
733 text that is matched by the regular expression @var{regexp}, leaving | |
734 point at the beginning of the first text found. | |
735 | |
736 This function is analogous to @code{re-search-forward}, but they are not | |
737 simple mirror images. @code{re-search-forward} finds the match whose | |
738 beginning is as close as possible to the starting point. If | |
739 @code{re-search-backward} were a perfect mirror image, it would find the | |
740 match whose end is as close as possible. However, in fact it finds the | |
741 match whose beginning is as close as possible. The reason is that | |
742 matching a regular expression at a given spot always works from | |
743 beginning to end, and starts at a specified beginning position. | |
744 | |
745 A true mirror-image of @code{re-search-forward} would require a special | |
746 feature for matching regexps from end to beginning. It's not worth the | |
747 trouble of implementing that. | |
748 @end deffn | |
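
The asymmetry described above can be observed with a repetition
construct. In the following sketch (the buffer contents are made up for
illustration), a backward search for @samp{a+} from the end of the
buffer finds only the last @samp{a}, because that is the match whose
beginning is closest to the starting point.

@example
(with-temp-buffer
  (insert "aaaa")                 ; point is now at the end, position 5
  (list (re-search-backward "a+")
        (match-beginning 0)
        (match-end 0)
        (match-string 0)))
;; @result{} (4 4 5 "a")
@end example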
749 | |
444 | 750 @defun string-match regexp string &optional start buffer |
428 | 751 This function returns the index of the start of the first match for |
752 the regular expression @var{regexp} in @var{string}, or @code{nil} if | |
753 there is no match. If @var{start} is non-@code{nil}, the search starts | |
754 at that index in @var{string}. | |
755 | |
444 | 756 |
757 Optional arg @var{buffer} controls how case folding is done (according | |
758 to the value of @code{case-fold-search} in @var{buffer} and | |
759 @var{buffer}'s case tables) and defaults to the current buffer. | |
760 | |
428 | 761 For example, |
762 | |
763 @example | |
764 @group | |
765 (string-match | |
766 "quick" "The quick brown fox jumped quickly.") | |
767 @result{} 4 | |
768 @end group | |
769 @group | |
770 (string-match | |
771 "quick" "The quick brown fox jumped quickly." 8) | |
772 @result{} 27 | |
773 @end group | |
774 @end example | |
775 | |
776 @noindent | |
777 The index of the first character of the | |
778 string is 0, the index of the second character is 1, and so on. | |
779 | |
780 After this function returns, the index of the first character beyond | |
781 the match is available as @code{(match-end 0)}. @xref{Match Data}. | |
782 | |
783 @example | |
784 @group | |
785 (string-match | |
786 "quick" "The quick brown fox jumped quickly." 8) | |
787 @result{} 27 | |
788 @end group | |
789 | |
790 @group | |
791 (match-end 0) | |
792 @result{} 32 | |
793 @end group | |
794 @end example | |
795 @end defun | |
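
The @var{start} argument and @code{match-end} can be combined to find
every match in a string. The following is a minimal sketch of such a
loop; @code{my-match-positions} is a hypothetical helper name.

@example
;; `my-match-positions' is a hypothetical helper, not an XEmacs function.
(defun my-match-positions (regexp string)
  "Return a list of the start indices of all matches for REGEXP in STRING."
  (let ((start 0)
        (positions '()))
    (while (and (<= start (length string))
                (string-match regexp string start))
      (setq positions (cons (match-beginning 0) positions))
      ;; Resume after the match; step one further after an empty
      ;; match so that the loop always makes progress.
      (setq start (if (= (match-end 0) (match-beginning 0))
                      (1+ (match-end 0))
                    (match-end 0))))
    (nreverse positions)))

(my-match-positions "quick" "The quick brown fox jumped quickly.")
;; @result{} (4 27)
@end example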
796 | |
1495 | 797 The function @code{split-string} can be used to parse a string into |
798 components delimited by text matching a regular expression. | |
799 | |
800 @defvar split-string-default-separators | |
801 The default value of @var{separators} for @code{split-string}, initially | |
802 @samp{"[ \f\t\n\r\v]+"}. | |
803 @end defvar | |
804 | |
805 @defun split-string string &optional separators omit-nulls | |
806 This function splits @var{string} into substrings delimited by matches | |
807 for the regular expression @var{separators}. Each match for | |
808 @var{separators} defines a splitting point; the substrings between the | |
809 splitting points are made into a list, which is the value returned by | |
810 @code{split-string}. If @var{omit-nulls} is @code{t}, null strings will | |
811 be removed from the result list. Otherwise, null strings are left in | |
812 the result. If @var{separators} is @code{nil} (or omitted), the default | |
813 is the value of @code{split-string-default-separators}. | |
814 | |
815 As a special case, when @var{separators} is @code{nil} (or omitted), | |
816 null strings are always omitted from the result. Thus: | |
817 | |
818 @example | |
819 (split-string " two words ") | |
820 @result{} ("two" "words") | |
821 @end example | |
822 | |
823 The result is not @samp{("" "two" "words" "")}, which would rarely be | |
824 useful. If you need such a result, use an explicit value for | |
825 @var{separators}: | |
826 | |
827 @example | |
828 (split-string " two words " split-string-default-separators) | |
829 @result{} ("" "two" "words" "") | |
830 @end example | |
831 | |
832 A few examples (there are more in the regression tests): | |
428 | 833 |
834 @example | |
835 @group | |
1495 | 836 (split-string "foo" "") |
837 @result{} ("" "f" "o" "o" "") | |
838 @end group | |
839 @group | |
840 (split-string "foo" "^") | |
841 @result{} ("" "foo") | |
842 @end group | |
843 @group | |
844 (split-string "foo" "$") | |
845 @result{} ("foo" "") | |
846 @end group | |
847 @group | |
848 (split-string "foo,bar" ",") | |
428 | 849 @result{} ("foo" "bar") |
850 @end group | |
851 @group | |
1495 | 852 (split-string ",foo,bar," ",") |
853 @result{} ("" "foo" "bar" "") | |
428 | 854 @end group |
855 @group | |
1495 | 856 (split-string ",foo,bar," "^,") |
857 @result{} ("" "foo,bar,") | |
428 | 858 @end group |
859 @group | |
1495 | 860 (split-string "foo,bar" "," t) |
861 @result{} ("foo" "bar") | |
862 @end group | |
863 @group | |
864 (split-string ",foo,bar," "," t) | |
865 @result{} ("foo" "bar") | |
428 | 866 @end group |
867 @end example | |
868 @end defun | |
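
As a sketch of how @code{split-string} might be used to parse simple
structured text, the following splits a comma-separated list and a
line of the form @samp{KEY=VALUE}. The helper name
@code{my-parse-setting} and the sample data are hypothetical.

@example
(split-string "alpha, beta,gamma" ", *")
;; @result{} ("alpha" "beta" "gamma")

;; `my-parse-setting' is a hypothetical helper, not an XEmacs function.
(defun my-parse-setting (line)
  "Split LINE of the form \"KEY=VALUE\" into a cons (KEY . VALUE)."
  (let ((parts (split-string line "=")))
    (cons (car parts) (car (cdr parts)))))

(my-parse-setting "editor=xemacs")
;; @result{} ("editor" . "xemacs")
@end example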
869 | |
870 @defun split-path path | |
871 This function splits a search path into a list of strings. The path | |
872 components are separated with the characters specified with | |
873 @code{path-separator}. Under Unix, @code{path-separator} will normally | |
874 be @samp{:}, while under Windows, it will be @samp{;}. | |
875 @end defun | |
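
The effect of @code{split-path} can be approximated with
@code{split-string} and @code{regexp-quote}. The following is only a
sketch of the idea under that assumption, not the actual definition of
@code{split-path}; @code{my-split-path} is a hypothetical name, and the
real function may treat empty components differently.

@example
;; `my-split-path' is a hypothetical approximation of `split-path'.
(defun my-split-path (path)
  "Split PATH at each occurrence of `path-separator'."
  (split-string path (regexp-quote path-separator)))

;; Bind `path-separator' explicitly so that the result does not
;; depend on the operating system.
(let ((path-separator ":"))
  (my-split-path "/bin:/usr/bin:/usr/local/bin"))
;; @result{} ("/bin" "/usr/bin" "/usr/local/bin")
@end example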
876 | |
444 | 877 @defun looking-at regexp &optional buffer |
428 | 878 This function determines whether the text in the current buffer directly |
879 following point matches the regular expression @var{regexp}. ``Directly | |
880 following'' means precisely that: the search is ``anchored'' and it can | |
881 succeed only starting with the first character following point. The | |
882 result is @code{t} if so, @code{nil} otherwise. | |
883 | |
884 This function does not move point, but it updates the match data, which | |
885 you can access using @code{match-beginning} and @code{match-end}. | |
886 @xref{Match Data}. | |
887 | |
888 In this example, point is located directly before the @samp{T}. If it | |
889 were anywhere else, the result would be @code{nil}. | |
890 | |
891 @example | |
892 @group | |
893 ---------- Buffer: foo ---------- | |
894 I read "@point{}The cat in the hat | |
895 comes back" twice. | |
896 ---------- Buffer: foo ---------- | |
897 | |
898 (looking-at "The cat in the hat$") | |
899 @result{} t | |
900 @end group | |
901 @end example | |
902 @end defun | |
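
Because @code{looking-at} sets the match data without moving point, it
is convenient for inspecting the text at point before deciding what to
do with it. The following sketch, with made-up buffer contents,
extracts two fields and shows that point has not moved.

@example
(with-temp-buffer
  (insert "count = 42")
  (goto-char (point-min))
  (when (looking-at "\\([a-z]+\\) *= *\\([0-9]+\\)")
    (list (match-string 1) (match-string 2) (point))))
;; @result{} ("count" "42" 1)
@end example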
903 | |
904 @node POSIX Regexps | |
905 @section POSIX Regular Expression Searching | |
906 | |
907 The usual regular expression functions do backtracking when necessary | |
908 to handle the @samp{\|} and repetition constructs, but they continue | |
909 this only until they find @emph{some} match. Then they succeed and | |
910 report the first match found. | |
911 | |
912 This section describes alternative search functions which perform the | |
913 full backtracking specified by the POSIX standard for regular expression | |
914 matching. They continue backtracking until they have tried all | |
915 possibilities and found all matches, so they can report the longest | |
916 match, as required by POSIX. This is much slower, so use these | |
917 functions only when you really need the longest match. | |
918 | |
919 In Emacs versions prior to 19.29, these functions did not exist, and | |
920 the functions described above implemented full POSIX backtracking. | |
921 | |
444 | 922 @deffn Command posix-search-forward regexp &optional limit noerror count buffer |
428 | 923 This is like @code{re-search-forward} except that it performs the full |
924 backtracking specified by the POSIX standard for regular expression | |
925 matching. | |
444 | 926 @end deffn |
428 | 927 |
444 | 928 @deffn Command posix-search-backward regexp &optional limit noerror count buffer |
428 | 929 This is like @code{re-search-backward} except that it performs the full |
930 backtracking specified by the POSIX standard for regular expression | |
931 matching. | |
444 | 932 @end deffn |
428 | 933 |
444 | 934 @defun posix-looking-at regexp &optional buffer |
428 | 935 This is like @code{looking-at} except that it performs the full |
936 backtracking specified by the POSIX standard for regular expression | |
937 matching. | |
938 @end defun | |
939 | |
444 | 940 @defun posix-string-match regexp string &optional start buffer |
428 | 941 This is like @code{string-match} except that it performs the full |
942 backtracking specified by the POSIX standard for regular expression | |
943 matching. | |
444 | 944 |
945 Optional arg @var{buffer} controls how case folding is done (according | |
946 to the value of @code{case-fold-search} in @var{buffer} and | |
947 @var{buffer}'s case tables) and defaults to the current buffer. | |
428 | 948 @end defun |
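
The difference between the two families of functions shows up when an
earlier alternative matches a shorter text than a later one. The
following sketch compares the end of the match reported by
@code{string-match} and by @code{posix-string-match} for the same regexp
and string; the result shown is what the description above leads one to
expect.

@example
(let ((text "abc"))
  (list (progn (string-match "a\\|ab" text) (match-end 0))
        (progn (posix-string-match "a\\|ab" text) (match-end 0))))
;; @result{} (1 2)
@end example

@noindent
The ordinary function stops at the first alternative that succeeds
(@samp{a}), while the POSIX function keeps backtracking and reports the
longest match (@samp{ab}).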
949 | |
950 @ignore | |
951 @deffn Command delete-matching-lines regexp | |
952 This function is identical to @code{delete-non-matching-lines}, save | |
953 that it deletes what @code{delete-non-matching-lines} keeps. | |
954 | |
955 In the example below, point is located on the first line of text. | |
956 | |
957 @example | |
958 @group | |
959 ---------- Buffer: foo ---------- | |
960 We hold these truths | |
961 to be self-evident, | |
962 that all men are created | |
963 equal, and that they are | |
964 ---------- Buffer: foo ---------- | |
965 @end group | |
966 | |
967 @group | |
968 (delete-matching-lines "the") | |
969 @result{} nil | |
970 | |
971 ---------- Buffer: foo ---------- | |
972 to be self-evident, | |
973 that all men are created | |
974 ---------- Buffer: foo ---------- | |
975 @end group | |
976 @end example | |
977 @end deffn | |
978 | |
979 @deffn Command flush-lines regexp | |
444 | 980 This function is an alias of @code{delete-matching-lines}. |
428 | 981 @end deffn |
982 | |
444 | 983 @deffn Command delete-non-matching-lines regexp |
428 | 984 This function deletes all lines following point which don't |
985 contain a match for the regular expression @var{regexp}. | |
444 | 986 @end deffn |
428 | 987 |
988 @deffn Command keep-lines regexp | |
989 This function is the same as @code{delete-non-matching-lines}. | |
990 @end deffn | |
991 | |
444 | 992 @deffn Command count-matches regexp |
428 | 993 This function counts the number of matches for @var{regexp} there are in |
994 the current buffer following point. It prints this number in | |
995 the echo area, returning the string printed. | |
996 @end deffn | |
997 | |
444 | 998 @deffn Command how-many regexp |
999 This function is an alias of @code{count-matches}. | |
428 | 1000 @end deffn |
1001 | |
444 | 1002 @deffn Command list-matching-lines regexp &optional nlines |
428 | 1003 This function is a synonym of @code{occur}. |
1004 Show all lines following point containing a match for @var{regexp}. | |
1005 Display each line with @var{nlines} lines before and after, | |
1006 or @code{-}@var{nlines} before if @var{nlines} is negative. | |
1007 @var{nlines} defaults to @code{list-matching-lines-default-context-lines}. | |
1008 Interactively it is the prefix arg. | |
1009 | |
1010 The lines are shown in a buffer named @samp{*Occur*}. | |
1011 It serves as a menu to find any of the occurrences in this buffer. | |
1012 @kbd{C-h m} (@code{describe-mode}) in that buffer gives help. | |
1013 @end deffn | |
1014 | |
1015 @defopt list-matching-lines-default-context-lines | |
1016 Default value is 0. | |
1017 Default number of context lines to include around a @code{list-matching-lines} | |
1018 match. A negative number means to include that many lines before the match. | |
1019 A positive number means to include that many lines both before and after. | |
1020 @end defopt | |
1021 @end ignore | |
1022 | |
1023 @node Search and Replace | |
1024 @section Search and Replace | |
1025 @cindex replacement | |
1026 | |
1027 @defun perform-replace from-string replacements query-flag regexp-flag delimited-flag &optional repeat-count map | |
1028 This function is the guts of @code{query-replace} and related commands. | |
1029 It searches for occurrences of @var{from-string} and replaces some or | |
1030 all of them. If @var{query-flag} is @code{nil}, it replaces all | |
1031 occurrences; otherwise, it asks the user what to do about each one. | |
1032 | |
1033 If @var{regexp-flag} is non-@code{nil}, then @var{from-string} is | |
1034 considered a regular expression; otherwise, it must match literally. If | |
1035 @var{delimited-flag} is non-@code{nil}, then only replacements | |
1036 surrounded by word boundaries are considered. | |
1037 | |
1038 The argument @var{replacements} specifies what to replace occurrences | |
1039 with. If it is a string, that string is used. It can also be a list of | |
1040 strings, to be used in cyclic order. | |
1041 | |
1042 If @var{repeat-count} is non-@code{nil}, it should be a fixnum. Then |
428 | 1043 it specifies how many times to use each of the strings in the |
1044 @var{replacements} list before advancing cyclically to the next one. | |
1045 | |
1046 Normally, the keymap @code{query-replace-map} defines the possible user | |
1047 responses for queries. The argument @var{map}, if non-@code{nil}, is a | |
1048 keymap to use instead of @code{query-replace-map}. | |
1049 @end defun | |
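
As an illustration of the flags, the following sketch calls
@code{perform-replace} without querying and with @var{delimited-flag}
set, so that only occurrences surrounded by word boundaries are
replaced. The buffer contents are made up for this example, and the
result shown assumes the behavior described above.

@example
(with-temp-buffer
  (insert "the cat concatenates")
  (goto-char (point-min))
  ;; QUERY-FLAG nil: replace without asking.
  ;; REGEXP-FLAG nil: "cat" is matched literally.
  ;; DELIMITED-FLAG t: only "cat" as a separate word is replaced.
  (perform-replace "cat" "dog" nil nil t)
  (buffer-string))
;; @result{} "the dog concatenates"
@end example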
1050 | |
1051 @defvar query-replace-map | |
1052 This variable holds a special keymap that defines the valid user | |
1053 responses for @code{query-replace} and related functions, as well as | |
1054 @code{y-or-n-p} and @code{map-y-or-n-p}. It is unusual in two ways: | |
1055 | |
1056 @itemize @bullet | |
1057 @item | |
1058 The ``key bindings'' are not commands, just symbols that are meaningful | |
1059 to the functions that use this map. | |
1060 | |
1061 @item | |
1062 Prefix keys are not supported; each key binding must be for a single event | |
1063 key sequence. This is because the functions don't use @code{read-key-sequence} to | |
1064 get the input; instead, they read a single event and look it up ``by hand.'' | |
1065 @end itemize | |
1066 @end defvar | |
1067 | |
1068 Here are the meaningful ``bindings'' for @code{query-replace-map}. | |
1069 Several of them are meaningful only for @code{query-replace} and | |
1070 friends. | |
1071 | |
1072 @table @code | |
1073 @item act | |
1074 Do take the action being considered---in other words, ``yes.'' | |
1075 | |
1076 @item skip | |
1077 Do not take action for this question---in other words, ``no.'' | |
1078 | |
1079 @item exit | |
1080 Answer this question ``no,'' and give up on the entire series of | |
1081 questions, assuming that the answers will be ``no.'' | |
1082 | |
1083 @item act-and-exit | |
1084 Answer this question ``yes,'' and give up on the entire series of | |
1085 questions, assuming that subsequent answers will be ``no.'' | |
1086 | |
1087 @item act-and-show | |
1088 Answer this question ``yes,'' but show the results---don't advance yet | |
1089 to the next question. | |
1090 | |
1091 @item automatic | |
1092 Answer this question and all subsequent questions in the series with | |
1093 ``yes,'' without further user interaction. | |
1094 | |
1095 @item backup | |
1096 Move back to the previous place that a question was asked about. | |
1097 | |
1098 @item edit | |
1099 Enter a recursive edit to deal with this question---instead of any | |
1100 other action that would normally be taken. | |
1101 | |
1102 @item delete-and-edit | |
1103 Delete the text being considered, then enter a recursive edit to replace | |
1104 it. | |
1105 | |
1106 @item recenter | |
1107 Redisplay and center the window, then ask the same question again. | |
1108 | |
1109 @item quit | |
1110 Perform a quit right away. Only @code{y-or-n-p} and related functions | |
1111 use this answer. | |
1112 | |
1113 @item help | |
1114 Display some help, then ask again. | |
1115 @end table | |
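
Since the ``bindings'' are ordinary symbols stored in a keymap, they
can be inspected with @code{lookup-key}, and a modified copy of the map
can be built for use as the @var{map} argument of
@code{perform-replace}. The following is a sketch; the variable name
@code{my-map} and the choice of the key @kbd{j} are arbitrary
assumptions made for illustration.

@example
(lookup-key query-replace-map "y")    ; @result{} act
(lookup-key query-replace-map "n")    ; @result{} skip

;; A copy with one extra binding.  The name `my-map' and the
;; key "j" are arbitrary choices for this example.
(let ((my-map (copy-keymap query-replace-map)))
  (define-key my-map "j" 'act)
  (lookup-key my-map "j"))            ; @result{} act
@end example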
1116 | |
1117 @node Match Data | |
1118 @section The Match Data | |
1119 @cindex match data | |
1120 | |
1121 XEmacs keeps track of the positions of the start and end of segments of | |
1122 text found during a regular expression search. This means, for example, | |
1123 that you can search for a complex pattern, such as a date in an Rmail | |
1124 message, and then extract parts of the match under control of the | |
1125 pattern. | |
1126 | |
1468 | 1127 Because the match data normally describe the most recent successful |
1128 search only, you must be careful not to do another search inadvertently | |
1129 between the search you wish to refer back to and the use of the match | |
1130 data. If you can't avoid another intervening search, you must save and | |
1131 restore the match data around it, to prevent it from being overwritten. | |
1132 | |
1133 To make it possible to write iterative or recursive code that repeatedly | |
1134 searches, and uses the data from the last successful search when no more | |
1135 matches can be found, a search or match which fails will preserve the | |
1136 match data from the last successful search. (You must not depend on | |
1137 match data being preserved in case the search or match signals an | |
1138 error.) If for some reason you need to clear the match data, you may | |
1139 use @code{(store-match-data nil)}. | |
428 | 1140 |
1141 @menu | |
1142 * Simple Match Data:: Accessing single items of match data, | |
1143 such as where a particular subexpression started. | |
1144 * Replacing Match:: Replacing a substring that was matched. | |
1145 * Entire Match Data:: Accessing the entire match data at once, as a list. | |
1146 * Saving Match Data:: Saving and restoring the match data. | |
1147 @end menu | |
1148 | |
1149 @node Simple Match Data | |
1150 @subsection Simple Match Data Access | |
1151 | |
1152 This section explains how to use the match data to find out what was | |
1153 matched by the last search or match operation. | |
1154 | |
1155 You can ask about the entire matching text, or about a particular | |
1156 parenthetical subexpression of a regular expression. The @var{count} | |
1157 argument in the functions below specifies which. If @var{count} is | |
1158 zero, you are asking about the entire match. If @var{count} is | |
1159 positive, it specifies which subexpression you want. | |
1160 | |
1161 Recall that the subexpressions of a regular expression are those | |
1162 expressions grouped with escaped parentheses, @samp{\(@dots{}\)}. The | |
1163 @var{count}th subexpression is found by counting occurrences of | |
1164 @samp{\(} from the beginning of the whole regular expression. The first | |
1165 subexpression is numbered 1, the second 2, and so on. Only regular | |
1166 expressions can have subexpressions---after a simple string search, the | |
1167 only information available is about the entire match. | |
1168 | |
1169 @defun match-string count &optional in-string | |
1170 This function returns, as a string, the text matched in the last search | |
1171 or match operation. It returns the entire text if @var{count} is zero, | |
1172 or just the portion corresponding to the @var{count}th parenthetical | |
1173 subexpression, if @var{count} is positive. If @var{count} is out of | |
1174 range, or if that subexpression didn't match anything, the value is | |
1175 @code{nil}. | |
1176 | |
1177 If the last such operation was done against a string with | |
1178 @code{string-match}, then you should pass the same string as the | |
1179 argument @var{in-string}. Otherwise, after a buffer search or match, | |
1180 you should omit @var{in-string} or pass @code{nil} for it; but you | |
1181 should make sure that the current buffer when you call | |
1182 @code{match-string} is the one in which you did the searching or | |
1183 matching. | |
1184 @end defun | |
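
For instance, the parts of a date can be extracted from a string by
grouping each part in the regexp. This is a minimal sketch; the sample
header text is made up, and the same string is passed as
@var{in-string} because the match was made with @code{string-match}.

@example
(let ((header "Date: 2010-01-24"))
  (when (string-match "\\([0-9]+\\)-\\([0-9]+\\)-\\([0-9]+\\)" header)
    (list (match-string 1 header)
          (match-string 2 header)
          (match-string 3 header))))
;; @result{} ("2010" "01" "24")
@end example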
1185 | |
1186 @defun match-beginning count | |
1187 This function returns the position of the start of text matched by the | |
1188 last regular expression searched for, or a subexpression of it. | |
1189 | |
1190 If @var{count} is zero, then the value is the position of the start of | |
1191 the entire match. Otherwise, @var{count} specifies a subexpression in | |
1192 the regular expression, and the value of the function is the starting | |
1193 position of the match for that subexpression. | |
1194 | |
1195 The value is @code{nil} for a subexpression inside a @samp{\|} | |
1196 alternative that wasn't used in the match. | |
1197 @end defun | |
1198 | |
1199 @defun match-end count | |
1200 This function is like @code{match-beginning} except that it returns the | |
1201 position of the end of the match, rather than the position of the | |
1202 beginning. | |
1203 @end defun | |
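
The following sketch shows the @code{nil} value for a subexpression in
an alternative that was not used. The regexp and string are made up
for illustration.

@example
(string-match "\\(a\\)\\|\\(b\\)" "xb")   ; @result{} 1
(match-beginning 1)                       ; @result{} nil (first alternative unused)
(match-beginning 2)                       ; @result{} 1
(match-end 2)                             ; @result{} 2
@end example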
1204 | |
1205 Here is an example of using the match data, with a comment showing the | |
1206 positions within the text: | |
1207 | |
1208 @example | |
1209 @group | |
1210 (string-match "\\(qu\\)\\(ick\\)" | |
1211 "The quick fox jumped quickly.") | |
444 | 1212 ;0123456789 |
428 | 1213 @result{} 4 |
1214 @end group | |
1215 | |
1216 @group | |
1217 (match-string 0 "The quick fox jumped quickly.") | |
1218 @result{} "quick" | |
1219 (match-string 1 "The quick fox jumped quickly.") | |
1220 @result{} "qu" | |
1221 (match-string 2 "The quick fox jumped quickly.") | |
1222 @result{} "ick" | |
1223 @end group | |
1224 | |
1225 @group | |
1226 (match-beginning 1) ; @r{The beginning of the match} | |
1227 @result{} 4 ; @r{with @samp{qu} is at index 4.} | |
1228 @end group | |
1229 | |
1230 @group | |
1231 (match-beginning 2) ; @r{The beginning of the match} | |
1232 @result{} 6 ; @r{with @samp{ick} is at index 6.} | |
1233 @end group | |
1234 | |
1235 @group | |
1236 (match-end 1) ; @r{The end of the match} | |
1237 @result{} 6 ; @r{with @samp{qu} is at index 6.} | |
1238 | |
1239 (match-end 2) ; @r{The end of the match} | |
1240 @result{} 9 ; @r{with @samp{ick} is at index 9.} | |
1241 @end group | |
1242 @end example | |
1243 | |
1244 Here is another example. Point is initially located at the beginning | |
1245 of the line. Searching moves point to between the space and the word | |
1246 @samp{in}. The beginning of the entire match is at the 9th character of | |
1247 the buffer (@samp{T}), and the beginning of the match for the first | |
1248 subexpression is at the 13th character (@samp{c}). | |
1249 | |
1250 @example | |
1251 @group | |
1252 (list | |
1253 (re-search-forward "The \\(cat \\)") | |
1254 (match-beginning 0) | |
1255 (match-beginning 1)) | |
1256 @result{} (17 9 13) | |
1257 @end group | |
1258 | |
1259 @group | |
1260 ---------- Buffer: foo ---------- | |
1261 I read "The cat @point{}in the hat comes back" twice. | |
1262 ^ ^ | |
1263 9 13 | |
1264 ---------- Buffer: foo ---------- | |
1265 @end group | |
1266 @end example | |
1267 | |
1268 @noindent | |
1269 (In this case, the index returned is a buffer position; the first | |
1270 character of the buffer counts as 1.) | |
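
The same search can be tried without preparing a buffer by hand. The
following is only a sketch: it assumes that @code{with-temp-buffer} is
available, and the temporary buffer stands in for the buffer @samp{foo}
shown above. It returns the value of the search together with the
beginning and the end of the entire match, which shows that the value
of @code{re-search-forward} coincides with @code{(match-end 0)}.

@example
@group
(with-temp-buffer
  ;; @r{Same text as in the buffer @samp{foo} above.}
  (insert "I read \"The cat in the hat comes back\" twice.")
  (goto-char (point-min))
  (list
    (re-search-forward "The \\(cat \\)")
    (match-beginning 0)
    (match-end 0)))
     @result{} (17 9 17)
@end group
@end example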

@node Replacing Match
@subsection Replacing the Text That Matched

The function @code{replace-match} replaces the text matched by the last
search with @var{replacement}.

@cindex case in replacements
@defun replace-match replacement &optional fixedcase literal string strbuffer
This function replaces the text in the buffer (or in @var{string}) that
was matched by the last search. It replaces that text with
@var{replacement}.

If you did the last search in a buffer, you should specify @code{nil}
for @var{string}. (An error will be signaled if you don't.) Then
@code{replace-match} does the replacement by editing the buffer; it
leaves point at the end of the replacement text, and returns @code{t}.

If you did the search in a string, pass the same string as @var{string}.
(An error will be signaled if you specify @code{nil}.) Then
@code{replace-match} does the replacement by constructing and returning
a new string.

If @var{fixedcase} is non-@code{nil}, then the case of the replacement
text is not changed; otherwise, the replacement text is converted to a
different case depending upon the capitalization of the text to be
replaced. If the original text is all upper case, the replacement text
is converted to upper case. If the first word of the original text is
capitalized, then the first word of the replacement text is capitalized.
If the original text contains just one word, and that word is a capital
letter, @code{replace-match} considers this a capitalized first word
rather than all upper case.

If @code{case-replace} is @code{nil}, then case conversion is not done,
regardless of the value of @var{fixedcase}. @xref{Searching and Case}.

If @var{literal} is non-@code{nil}, then @var{replacement} is inserted
exactly as it is, the only alterations being case changes as needed.
If it is @code{nil} (the default), then the character @samp{\} is treated
specially. If a @samp{\} appears in @var{replacement}, then it must be
part of one of the following sequences:

@table @asis
@item @samp{\&}
@cindex @samp{\&} in replacement
@samp{\&} stands for the entire text being replaced.

@item @samp{\@var{n}}
@cindex @samp{\@var{n}} in replacement
@cindex @samp{\@var{digit}} in replacement
@samp{\@var{n}}, where @var{n} is a digit, stands for the text that
matched the @var{n}th subexpression in the original regexp.
Subexpressions are those expressions grouped inside @samp{\(@dots{}\)}.

@item @samp{\\}
@cindex @samp{\\} in replacement
@samp{\\} stands for a single @samp{\} in the replacement text.

@item @samp{\u}
@cindex @samp{\u} in replacement
@samp{\u} means upcase the next character.

@item @samp{\l}
@cindex @samp{\l} in replacement
@samp{\l} means downcase the next character.

@item @samp{\U}
@cindex @samp{\U} in replacement
@samp{\U} means begin upcasing all following characters.

@item @samp{\L}
@cindex @samp{\L} in replacement
@samp{\L} means begin downcasing all following characters.

@item @samp{\E}
@cindex @samp{\E} in replacement
@samp{\E} means terminate the effect of any @samp{\U} or @samp{\L}.
@end table

Case changes made with @samp{\u}, @samp{\l}, @samp{\U}, and @samp{\L}
override all other case changes that may be made in the replaced text.

The fifth argument @var{strbuffer} may be a buffer to be used for
syntax-table and case-table lookup. If @var{strbuffer} is not a buffer,
the current buffer is used. When @var{string} is not a string, the
buffer that the match occurred in has automatically been remembered and
you do not need to specify it. @var{string} may also be an integer,
specifying the index of the subexpression to replace. When @var{string}
is not an integer, the ``subexpression'' is 0, @emph{i.e.}, the whole
match. An @code{invalid-argument} error will be signaled if you specify
a buffer when @var{string} is @code{nil}, or specify a subexpression
which was not matched.

It is not possible to specify both a buffer and a subexpression, but the
idiom
@example
(with-current-buffer @var{buffer} (replace-match ... @var{integer}))
@end example
may be used.

@end defun
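
As an illustration of the two modes of @code{replace-match}, here is a
minimal sketch. The sample text and the replacement @code{"\\2\\1"}
are made up for this example, and @code{with-temp-buffer} is assumed to
be available. @var{fixedcase} is non-@code{nil} in both calls, so no
case conversion takes place, and @var{literal} is @code{nil}, so
@samp{\1} and @samp{\2} stand for the two subexpressions.

@example
@group
;; @r{Search in a string: pass the same string as @var{string}.}
(let ((s "The quick fox"))
  (string-match "\\(qu\\)\\(ick\\)" s)
  (replace-match "\\2\\1" t nil s))
     @result{} "The ickqu fox"
@end group

@group
;; @r{Search in a buffer: leave @var{string} as @code{nil}.}
(with-temp-buffer
  (insert "The quick fox")
  (goto-char (point-min))
  (re-search-forward "\\(qu\\)\\(ick\\)")
  (replace-match "\\2\\1" t)
  (buffer-string))
     @result{} "The ickqu fox"
@end group
@end example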


@node Entire Match Data
@subsection Accessing the Entire Match Data

The functions @code{match-data} and @code{set-match-data} read or
write the entire match data, all at once.

@defun match-data &optional integers reuse
This function returns a newly constructed list containing all the
information on what text the last search matched. Element zero is the
position of the beginning of the match for the whole expression; element
one is the position of the end of the match for the expression. The
next two elements are the positions of the beginning and end of the
match for the first subexpression, and so on. In general, element
@ifinfo
number 2@var{n}
@end ifinfo
@tex
number {\mathsurround=0pt $2n$}
@end tex
corresponds to @code{(match-beginning @var{n})}; and
element
@ifinfo
number 2@var{n} + 1
@end ifinfo
@tex
number {\mathsurround=0pt $2n+1$}
@end tex
corresponds to @code{(match-end @var{n})}.

All the elements are markers or @code{nil} if matching was done on a
buffer, and all are integers or @code{nil} if matching was done on a
string with @code{string-match}. However, if the optional first
argument @var{integers} is non-@code{nil}, @code{match-data} always
uses integers (rather than markers) to represent buffer positions.

If the optional second argument @var{reuse} is a list, it is reused as
part of the value. If @var{reuse} is long enough to hold all the values,
and if @var{integers} is non-@code{nil}, no new Lisp objects are created.

As always, there must be no possibility of intervening searches between
the call to a search function and the call to @code{match-data} that is
intended to access the match data for that search.

@example
@group
(match-data)
     @result{} (#<marker at 9 in foo>
          #<marker at 17 in foo>
          #<marker at 13 in foo>
          #<marker at 17 in foo>)
@end group
@end example
@end defun
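
For a search in a string, the list contains integers rather than
markers. The following sketch reuses the sample string from the
@code{string-match} example earlier in this chapter; elements 0 and 1
describe the whole match, elements 2 and 3 the first subexpression, and
elements 4 and 5 the second.

@example
@group
(progn
  (string-match "\\(qu\\)\\(ick\\)"
                "The quick fox jumped quickly.")
  (match-data))
     @result{} (4 9 4 6 6 9)
@end group
@end example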

@defun set-match-data match-list
This function sets the match data from the elements of @var{match-list},
which should be a list that was the value of a previous call to
@code{match-data}.

If @var{match-list} refers to a buffer that doesn't exist, you don't get
an error; that sets the match data in a meaningless but harmless way.

@findex store-match-data
@code{store-match-data} is an alias for @code{set-match-data}.
@end defun

@node Saving Match Data
@subsection Saving and Restoring the Match Data

When you call a function that may do a search, you may need to save
and restore the match data around that call, if you want to preserve the
match data from an earlier search for later use. Here is an example
that shows the problem that arises if you fail to save the match data:

@example
@group
(re-search-forward "The \\(cat \\)")
     @result{} 48
(foo)                   ; @r{Perhaps @code{foo} does}
                        ;   @r{more searching.}
(match-end 0)
     @result{} 61              ; @r{Unexpected result---not 48!}
@end group
@end example

You can save and restore the match data with @code{save-match-data}:

@defspec save-match-data body@dots{}
This special form executes @var{body}, saving and restoring the match
data around it.
@end defspec
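
As a small sketch of the effect (the strings are sample data chosen for
this illustration), the inner search below replaces the match data only
while @var{body} runs; afterwards the data of the outer search is back
in place, so @code{match-beginning} still refers to @samp{qu}.

@example
@group
(let ((s "The quick fox"))
  (string-match "\\(qu\\)\\(ick\\)" s)
  (save-match-data
    (string-match "fox" s))     ; @r{Changes the match data temporarily.}
  (match-beginning 1))
     @result{} 4
@end group
@end example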

You can use @code{set-match-data} together with @code{match-data} to
imitate the effect of the special form @code{save-match-data}. This is
useful for writing code that can run in Emacs 18. Here is how:

@example
@group
(let ((data (match-data)))
  (unwind-protect
      @dots{}   ; @r{May change the original match data.}
    (set-match-data data)))
@end group
@end example

Emacs automatically saves and restores the match data when it runs
process filter functions (@pxref{Filter Functions}) and process
sentinels (@pxref{Sentinels}).

@ignore
Here is a function which restores the match data provided the buffer
associated with it still exists.

@smallexample
@group
(defun restore-match-data (data)
@c It is incorrect to split the first line of a doc string.
@c If there's a problem here, it should be solved in some other way.
  "Restore the match data DATA unless the buffer is missing."
  (catch 'foo
    (let ((d data))
@end group
      (while d
        (and (car d)
             (null (marker-buffer (car d)))
@group
             ;; @file{match-data} @r{buffer is deleted.}
             (throw 'foo nil))
        (setq d (cdr d)))
      (set-match-data data))))
@end group
@end smallexample
@end ignore

@node Searching and Case
@section Searching and Case
@cindex searching and case

By default, searches in Emacs ignore the case of the text they are
searching through; if you specify searching for @samp{FOO}, then
@samp{Foo} or @samp{foo} is also considered a match. Regexps, and in
particular character sets, are included: thus, @samp{[aB]} would match
@samp{a} or @samp{A} or @samp{b} or @samp{B}.

If you do not want this feature, set the variable
@code{case-fold-search} to @code{nil}. Then all letters must match
exactly, including case. This is a buffer-local variable; altering the
variable affects only the current buffer. (@xref{Intro to
Buffer-Local}.) Alternatively, you may change the value of
@code{default-case-fold-search}, which is the default value of
@code{case-fold-search} for buffers that do not override it.

Note that the user-level incremental search feature handles case
distinctions differently. When given a lower case letter, it looks for
a match of either case, but when given an upper case letter, it looks
for an upper case letter only. But this has nothing to do with the
searching functions used in Lisp code.

@defopt case-replace
This variable determines whether the replacement functions should
preserve case. If the variable is @code{nil}, that means to use the
replacement text verbatim. A non-@code{nil} value means to convert the
case of the replacement text according to the text being replaced.

The function @code{replace-match} is where this variable actually has
its effect. @xref{Replacing Match}.
@end defopt

@defopt case-fold-search
This buffer-local variable determines whether searches should ignore
case. If the variable is @code{nil} they do not ignore case; otherwise
they do ignore case.
@end defopt
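
In Lisp code, a common way to control case sensitivity for a single
search is to bind @code{case-fold-search} with @code{let} around the
call, rather than setting it. The following sketch uses a made-up
sample string to show both settings.

@example
@group
(let ((case-fold-search t))       ; @r{Ignore case.}
  (string-match "FOO" "a foo here"))
     @result{} 2
@end group

@group
(let ((case-fold-search nil))     ; @r{Letters must match exactly.}
  (string-match "FOO" "a foo here"))
     @result{} nil
@end group
@end example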

@defvar default-case-fold-search
The value of this variable is the default value for
@code{case-fold-search} in buffers that do not override it. This is the
same as @code{(default-value 'case-fold-search)}.
@end defvar

@node Standard Regexps
@section Standard Regular Expressions Used in Editing
@cindex regexps used standardly in editing
@cindex standard regexps used in editing

This section describes some variables that hold regular expressions
used for certain purposes in editing:

@defvar page-delimiter
This is the regexp describing line-beginnings that separate pages. The
default value is @code{"^\014"} (i.e., @code{"^^L"} or @code{"^\C-l"});
this matches a line that starts with a formfeed character.
@end defvar

The following two regular expressions should @emph{not} assume the
match always starts at the beginning of a line; they should not use
@samp{^} to anchor the match. Most often, the paragraph commands do
check for a match only at the beginning of a line, which means that
@samp{^} would be superfluous. When there is a nonzero left margin,
they accept matches that start after the left margin. In that case, a
@samp{^} would be incorrect. However, a @samp{^} is harmless in modes
where a left margin is never used.

@defvar paragraph-separate
This is the regular expression for recognizing the beginning of a line
that separates paragraphs. (If you change this, you may have to
change @code{paragraph-start} also.) The default value is
@w{@code{"[@ \t\f]*$"}}, which matches a line that consists entirely of
spaces, tabs, and form feeds (after its left margin).
@end defvar

@defvar paragraph-start
This is the regular expression for recognizing the beginning of a line
that starts @emph{or} separates paragraphs. The default value is
@w{@code{"[@ \t\n\f]"}}, which matches a line starting with a space, tab,
newline, or form feed (after its left margin).
@end defvar

@defvar sentence-end
This is the regular expression describing the end of a sentence. (All
paragraph boundaries also end sentences, regardless.) The default value
is:

@example
"[.?!][]\"')@}]*\\($\\| $\\|\t\\|  \\)[ \t\n]*"
@end example

This means a period, question mark or exclamation mark, followed
optionally by closing parenthetical characters (brackets, quotes,
parentheses or braces), followed by tabs, spaces or newlines.

For a detailed explanation of this regular expression, see @ref{Regexp
Example}.
@end defvar
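
To see what such a regexp matches, here is a minimal sketch. It writes
the regexp out as a literal string instead of relying on the run-time
value of @code{sentence-end}, and the sample text is made up for the
illustration. The match begins at the period and extends over the
whitespace that follows it.

@example
@group
(let ((re "[.?!][]\"')@}]*\\($\\| $\\|\t\\|  \\)[ \t\n]*")
      (text "It works.  Really!"))
  (string-match re text)
  ;; @r{Start and end of the first sentence end found in @code{text}.}
  (list (match-beginning 0) (match-end 0)))
     @result{} (8 11)
@end group
@end example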