Mercurial > hg > xemacs-beta
comparison src/ChangeLog @ 5648:3f4a234f4672
Support non-ASCII correctly in character classes, test this.
src/ChangeLog addition:
2012-04-21 Aidan Kehoe <kehoea@parhasard.net>
Support non-ASCII correctly in character classes ([:alnum:] and
friends).
* regex.c:
* regex.c (ISBLANK, ISUNIBYTE): New. Make these and friends
independent of the locale, since we want them to be consistent in
XEmacs.
* regex.c (print_partial_compiled_pattern): Print the flags for
charset_mule; don't print non-ASCII as the character values in
ranges, this breaks with locales.
* regex.c (enum):
Define various flags the charset_mule and charset_mule_not opcodes
can now take.
* regex.c (CHAR_CLASS_MAX_LENGTH): Update this.
* regex.c (re_iswctype, re_wctype): New, from GNU.
* regex.c (re_wctype_can_match_non_ascii): New; used when deciding
on whether to use charset_mule or the ASCII-only regex character
set opcode.
* regex.c (regex_compile):
Error correctly on long, non-existent character class names.
Break out the handling of charsets that can match non-ASCII into a
separate clause. Use compile_char_class when compiling character
classes.
* regex.c (compile_char_class): New. Used in regex_compile when
compiling character sets that may match non-ASCII.
* regex.c (re_compile_fastmap):
If there are flags set for charset_mule or charset_mule_not, we
can't use the fastmap (since we need to check syntax table values
that aren't available there).
* regex.c (re_match_2_internal):
Check the new flags passed to the charset_mule{,_not} opcode,
observe them if appropriate.
* regex.h:
* regex.h (enum):
Expose re_wctype_t here, imported from GNU.
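
The entry above describes two ideas: class predicates that do not depend on the C locale, and a check for whether a class can match outside ASCII (which decides between charset_mule and the ASCII-only opcode). A minimal C sketch of both follows. Every name in it (`sk_class`, `sk_iswctype`, `sk_class_can_match_non_ascii`, and the helpers) is hypothetical and not taken from regex.c. Treating every code point above 0x7F as a word constituent is an assumption standing in for a real syntax-table lookup, and the choice of which classes can match non-ASCII is likewise illustrative.

```c
#include <stdbool.h>

/* Hypothetical names; none of these are taken from regex.c. */
typedef enum { SK_ALNUM, SK_ALPHA, SK_BLANK, SK_DIGIT } sk_class;

/* Explicit ASCII range checks: unlike <ctype.h>, the answers never
   depend on setlocale(). */
static bool sk_ascii_digit(int c) { return c >= '0' && c <= '9'; }

static bool sk_ascii_alpha(int c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

static bool sk_blank(int c) { return c == ' ' || c == '\t'; }

/* Assumption: stand-in for a syntax-table lookup; here every code point
   above 0x7F counts as a word constituent. */
static bool sk_non_ascii_word(int c) { return c > 0x7F; }

/* Does code point c belong to class cls? */
static bool sk_iswctype(int c, sk_class cls)
{
    switch (cls) {
    case SK_ALPHA:
        return sk_ascii_alpha(c) || sk_non_ascii_word(c);
    case SK_ALNUM:
        return sk_ascii_alpha(c) || sk_ascii_digit(c) || sk_non_ascii_word(c);
    case SK_BLANK:
        return sk_blank(c);
    case SK_DIGIT:
        return sk_ascii_digit(c);
    }
    return false;
}

/* Assumption: only the word-like classes can match outside ASCII, so only
   they would need the multibyte-aware opcode. */
static bool sk_class_can_match_non_ascii(sk_class cls)
{
    return cls == SK_ALPHA || cls == SK_ALNUM;
}
```

In this sketch a compiler front end would call `sk_class_can_match_non_ascii` once per class while compiling a bracket expression, and fall back to a plain ASCII bitmap whenever it returns false.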
tests/ChangeLog addition:
2012-04-21 Aidan Kehoe <kehoea@parhasard.net>
* automated/regexp-tests.el:
* automated/regexp-tests.el (Assert-char-class):
Check that #'string-match errors correctly with an over-long
character class name.
Add tests for character class functionality that supports
non-ASCII characters. These tests expose bugs in GNU Emacs
24.0.94.2, but pass under current XEmacs.
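
The over-long class name check mentioned above can be illustrated with a small C sketch. It is not the regex.c code: `sk_parse_class_name`, `SK_NAME_MAX`, and the status values are hypothetical, the cap of 9 is only an illustrative value, and the sketch reports every failure (unterminated, over-long, or unknown name) as the same error, which may differ from what the real compiler does.

```c
#include <stddef.h>
#include <string.h>

#define SK_NAME_MAX 9   /* illustrative cap, not the value used in regex.c */

typedef enum { SK_OK, SK_ERR_CTYPE } sk_status;

/* Parse the NAME in "[:NAME:]", where p points just past "[:".
   Copies NAME into out (which must hold SK_NAME_MAX + 1 bytes) and
   reports an error if the name is unterminated, too long, or unknown. */
static sk_status sk_parse_class_name(const char *p, char *out)
{
    static const char *const known[] = { "alnum", "alpha", "blank", "digit" };
    size_t len = 0;

    while (p[len] != '\0' && p[len] != ':') {
        if (len == SK_NAME_MAX)
            return SK_ERR_CTYPE;   /* over-long: stop before overflowing out */
        out[len] = p[len];
        len++;
    }
    out[len] = '\0';

    if (p[len] != ':' || p[len + 1] != ']')
        return SK_ERR_CTYPE;       /* missing the closing ":]" */

    for (size_t i = 0; i < sizeof known / sizeof known[0]; i++)
        if (strcmp(out, known[i]) == 0)
            return SK_OK;

    return SK_ERR_CTYPE;           /* well-formed but unknown name */
}
```

The point of the length check is that a long, non-existent name is rejected before it can overrun the fixed-size buffer, rather than being silently truncated into something that happens to match a real class.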
author | Aidan Kehoe <kehoea@parhasard.net> |
---|---|
date | Sat, 21 Apr 2012 18:58:28 +0100 |
parents | 1d9f603e9125 |
children | d026b665014f |
5647:1d9f603e9125 | 5648:3f4a234f4672 |
---|---|
 | 1 2012-04-21 Aidan Kehoe <kehoea@parhasard.net> |
 | 2 |
 | 3 Support non-ASCII correctly in character classes ([:alnum:] and |
 | 4 friends). |
 | 5 |
 | 6 * regex.c: |
 | 7 * regex.c (ISBLANK, ISUNIBYTE): New. Make these and friends |
 | 8 independent of the locale, since we want them to be consistent in |
 | 9 XEmacs. |
 | 10 * regex.c (print_partial_compiled_pattern): Print the flags for |
 | 11 charset_mule; don't print non-ASCII as the character values in |
 | 12 ranges, this breaks with locales. |
 | 13 * regex.c (enum): |
 | 14 Define various flags the charset_mule and charset_mule_not opcodes |
 | 15 can now take. |
 | 16 * regex.c (CHAR_CLASS_MAX_LENGTH): Update this. |
 | 17 * regex.c (re_iswctype, re_wctype): New, from GNU. |
 | 18 * regex.c (re_wctype_can_match_non_ascii): New; used when deciding |
 | 19 on whether to use charset_mule or the ASCII-only regex character |
 | 20 set opcode. |
 | 21 * regex.c (regex_compile): |
 | 22 Error correctly on long, non-existent character class names. |
 | 23 Break out the handling of charsets that can match non-ASCII into a |
 | 24 separate clause. Use compile_char_class when compiling character |
 | 25 classes. |
 | 26 * regex.c (compile_char_class): New. Used in regex_compile when |
 | 27 compiling character sets that may match non-ASCII. |
 | 28 * regex.c (re_compile_fastmap): |
 | 29 If there are flags set for charset_mule or charset_mule_not, we |
 | 30 can't use the fastmap (since we need to check syntax table values |
 | 31 that aren't available there). |
 | 32 * regex.c (re_match_2_internal): |
 | 33 Check the new flags passed to the charset_mule{,_not} opcode, |
 | 34 observe them if appropriate. |
 | 35 * regex.h: |
 | 36 * regex.h (enum): |
 | 37 Expose re_wctype_t here, imported from GNU. |
 | 38 |
1 2012-04-21 Aidan Kehoe <kehoea@parhasard.net> | 39 2012-04-21 Aidan Kehoe <kehoea@parhasard.net> |
2 | 40 |
3 * regex.h (RE_SYNTAX_EMACS): | 41 * regex.h (RE_SYNTAX_EMACS): |
4 Turn on character classes ([:alnum:] and friends) by default. This | 42 Turn on character classes ([:alnum:] and friends) by default. This |
5 implementation is incomplete, am working on a version that handles | 43 implementation is incomplete, am working on a version that handles |