comparison man/lispref/searching.texi @ 5653:3df910176b6a
Support predefined character classes in #'skip-chars-{forward,backward}, too
src/ChangeLog addition:
2012-05-04 Aidan Kehoe <kehoea@parhasard.net>
* regex.c:
Move various #defines and enums to regex.h, since we need them
when implementing #'skip-chars-{backward,forward}.
* regex.c (re_wctype):
* regex.c (re_iswctype):
Be more robust about case insensitivity here.
* regex.c (regex_compile):
* regex.h:
* regex.h (RE_ISWCTYPE_ARG_DECL):
* regex.h (CHAR_CLASS_MAX_LENGTH):
* search.c (skip_chars):
Implement support for the predefined character classes in this
function.
tests/ChangeLog addition:
2012-05-04 Aidan Kehoe <kehoea@parhasard.net>
* automated/regexp-tests.el (equal):
* automated/regexp-tests.el (Assert-char-class):
Correct a stray parenthesis; add tests for the predefined
character classes with #'skip-chars-{forward,backward}; update the
tests to reflect some changed design decisions on my part.
man/ChangeLog addition:
2012-05-04 Aidan Kehoe <kehoea@parhasard.net>
* lispref/searching.texi (Regular Expressions):
* lispref/searching.texi (Syntax of Regexps):
* lispref/searching.texi (Char Classes):
* lispref/searching.texi (Regexp Example):
Document the predefined character classes in this file.
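As an illustration of the behaviour described above, here is a minimal sketch of using a predefined character class with #'skip-chars-forward. The helper name `my-leading-hex-digits' is hypothetical and not part of this changeset; the sketch assumes a build that includes this change.

```elisp
;; Hypothetical helper, not part of the changeset: return the leading
;; run of hexadecimal digits in STRING, or "" if there is none.
(defun my-leading-hex-digits (string)
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    ;; The class name is given directly in the skip set; the outer
    ;; brackets of a regexp character set are not used here.
    (skip-chars-forward "[:xdigit:]")
    ;; Everything between the start of the buffer and point was skipped.
    (buffer-substring (point-min) (point))))

(my-leading-hex-digits "deadBEEF42 rest")
```

In a regular expression the same class would instead appear inside a character set, for example "[[:xdigit:]]+".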
author | Aidan Kehoe <kehoea@parhasard.net> |
---|---|
date | Fri, 04 May 2012 21:12:02 +0100 |
parents | a46c5c8d6564 |
children | 9fae6227ede5 |
5649:d026b665014f | 5653:3df910176b6a |
---|---|
178 extended with some Perl--like capabilities, described in the next | 178 extended with some Perl--like capabilities, described in the next |
179 section. | 179 section. |
180 | 180 |
181 @menu | 181 @menu |
182 * Syntax of Regexps:: Rules for writing regular expressions. | 182 * Syntax of Regexps:: Rules for writing regular expressions. |
183 * Char Classes:: Predefined character classes for searching. | |
183 * Regexp Example:: Illustrates regular expression syntax. | 184 * Regexp Example:: Illustrates regular expression syntax. |
184 @end menu | 185 @end menu |
185 | 186 |
186 @node Syntax of Regexps | 187 @node Syntax of Regexps |
187 @subsection Syntax of Regular Expressions | 188 @subsection Syntax of Regular Expressions |
333 @samp{]}. | 334 @samp{]}. |
334 | 335 |
335 To include @samp{^} in a set, put it anywhere but at the beginning of | 336 To include @samp{^} in a set, put it anywhere but at the beginning of |
336 the set. | 337 the set. |
337 | 338 |
339 It is also possible to specify named character classes as part of your | |
340 character set; for example, @samp{[:xdigit:]} will match hexadecimal | |
341 digits, @samp{[:nonascii:]} will match characters outside the basic | |
342 ASCII set. These are documented separately; @pxref{Char Classes}. | |
343 | |
338 @item [^ @dots{} ] | 344 @item [^ @dots{} ] |
339 @cindex @samp{^} in regexp | 345 @cindex @samp{^} in regexp |
340 @samp{[^} begins a @dfn{complement character set}, which matches any | 346 @samp{[^} begins a @dfn{complement character set}, which matches any |
341 character except the ones specified. Thus, @samp{[^a-z0-9A-Z]} | 347 character except the ones specified. Thus, @samp{[^a-z0-9A-Z]} |
342 matches all characters @emph{except} letters and digits.@refill | 348 matches all characters @emph{except} letters and digits.@refill |
601 (re-search-forward | 607 (re-search-forward |
602 (concat "\\s-" (regexp-quote string) "\\s-")) | 608 (concat "\\s-" (regexp-quote string) "\\s-")) |
603 @end group | 609 @end group |
604 @end example | 610 @end example |
605 @end defun | 611 @end defun |
612 | |
613 @node Char Classes | |
614 @subsection Char Classes | |
615 | |
616 These are the predefined character classes available within regular | |
617 expression character sets, and within @samp{skip-chars-forward} and | |
618 @samp{skip-chars-backward}. @xref{Skipping Characters}. | |
619 | |
620 @table @samp | |
621 @item [:alnum:] | |
622 This matches any ASCII letter or digit, or any non-ASCII character | |
623 with word syntax. | |
624 @item [:alpha:] | |
625 This matches any ASCII letter, or any non-ASCII character with word syntax. | |
626 @item [:ascii:] | |
627 This matches any character with a numeric value below @samp{?\x80}. | |
628 @item [:blank:] | |
629 This matches space or tab. | |
630 @item [:cntrl:] | |
631 This matches any character with a numeric value below @samp{?\x20}, | |
632 the code for space; these are the ASCII control characters. | |
633 @item [:digit:] | |
634 This matches the characters @samp{?0} to @samp{?9}, inclusive. | |
635 @item [:graph:] | |
636 This matches ``graphic'' characters, with numeric values greater than | |
637 @samp{?\x20}, exclusive of @samp{?\x7f}, the delete character. | |
638 @item [:lower:] | |
639 This matches minuscule characters, or any character with case | |
640 information if @samp{case-fold-search} is non-nil. | |
641 @item [:multibyte:] | |
642 This matches non-ASCII characters, that is, any character with a | |
643 numeric value above @samp{?\x7f}. | |
644 @item [:nonascii:] | |
645 This is equivalent to @samp{[:multibyte:]}. | |
646 @item [:print:] | |
647 This is equivalent to @samp{[:graph:]}, but also matches the space character, | |
648 @samp{?\x20}. | |
649 @item [:punct:] | |
650 This matches non-control, non-alphanumeric ASCII characters, or any | |
651 non-ASCII character without word syntax. | |
652 @item [:space:] | |
653 This matches any character with whitespace syntax. | |
654 @item [:unibyte:] | |
655 This is a GNU Emacs extension; in XEmacs it is equivalent to | |
656 @samp{[:ascii:]}. Note that this means it is not equivalent to | |
657 @samp{"\x00-\xff"}, which one might have assumed to be the case. | |
658 @item [:upper:] | |
659 This matches majuscule characters, or any character with case | |
660 information if @samp{case-fold-search} is non-nil. | |
661 @item [:word:] | |
662 This matches any character with word syntax. | |
663 @item [:xdigit:] | |
664 This matches hexadecimal digits, so the decimal digits @samp{0-9} and the | |
665 letters @samp{a-f} and @samp{A-F}. | |
666 @end table | |
606 | 667 |
607 @node Regexp Example | 668 @node Regexp Example |
608 @subsection Complex Regexp Example | 669 @subsection Complex Regexp Example |
609 | 670 |
610 Here is a complicated regexp, used by XEmacs to recognize the end of a | 671 Here is a complicated regexp, used by XEmacs to recognize the end of a |