annotate src/rangetab.h @ 5648:3f4a234f4672

Support non-ASCII correctly in character classes, test this.

src/ChangeLog addition:

2012-04-21  Aidan Kehoe  <kehoea@parhasard.net>

Support non-ASCII correctly in character classes ([:alnum:] and friends).
* regex.c:
* regex.c (ISBLANK, ISUNIBYTE): New.
  Make these and friends independent of the locale, since we want them to be consistent in XEmacs.
* regex.c (print_partial_compiled_pattern):
  Print the flags for charset_mule; don't print non-ASCII as the character values in ranges, this breaks with locales.
* regex.c (enum):
  Define various flags the charset_mule and charset_mule_not opcodes can now take.
* regex.c (CHAR_CLASS_MAX_LENGTH): Update this.
* regex.c (re_iswctype, re_wctype): New, from GNU.
* regex.c (re_wctype_can_match_non_ascii): New; used when deciding on whether to use charset_mule or the ASCII-only regex character set opcode.
* regex.c (regex_compile):
  Error correctly on long, non-existent character class names.
  Break out the handling of charsets that can match non-ASCII into a separate clause.
  Use compile_char_class when compiling character classes.
* regex.c (compile_char_class): New. Used in regex_compile when compiling character sets that may match non-ASCII.
* regex.c (re_compile_fastmap):
  If there are flags set for charset_mule or charset_mule_not, we can't use the fastmap (since we need to check syntax table values that aren't available there).
* regex.c (re_match_2_internal):
  Check the new flags passed to the charset_mule{,_not} opcode, observe them if appropriate.
* regex.h:
* regex.h (enum):
  Expose re_wctype_t here, imported from GNU.

tests/ChangeLog addition:

2012-04-21  Aidan Kehoe  <kehoea@parhasard.net>

* automated/regexp-tests.el:
* automated/regexp-tests.el (Assert-char-class):
  Check that #'string-match errors correctly with an over-long character class name.
  Add tests for character class functionality that supports non-ASCII characters.
  These tests expose bugs in GNU Emacs 24.0.94.2, but pass under current XEmacs.
author Aidan Kehoe <kehoea@parhasard.net>
date Sat, 21 Apr 2012 18:58:28 +0100
parents 308d34e9f07d
children
changesets
 428  3ecd8885ac67  Import from CVS: tag r21-2-22 (cvs)
 440  8de8e3f6228a  Import from CVS: tag r21-2-28 (cvs; parents: 428)
 617  af57a77cbc92  [xemacs-hg @ 2001-06-18 07:09:50 by ben] (ben; parents: 440)
2421  ab71ad6ff3dd  [xemacs-hg @ 2004-12-06 03:50:53 by ben] (ben; parents: 793)
5118  e0db3c197671  merge up to latest default branch, doesn't compile yet (Ben Wing <ben@xemacs.org>; parents: 3017)
5127  a9c41067dd88  more cleanups, terminology clarification, lots of doc work (Ben Wing <ben@xemacs.org>; parents: 5120)
5168  cf900a2f1fa3  extract gap array from extents.c, use in range tables (Ben Wing <ben@xemacs.org>; parents: 5127)
5402  308d34e9f07d  Changed bulk of GPLv2 or later files identified by script (Mats Lidell <matsl@xemacs.org>; parents: 5168)

 rev  line  source
 428     1  /* XEmacs routines to deal with range tables.
 428     2  Copyright (C) 1995 Sun Microsystems, Inc.
5168     3  Copyright (C) 1995, 2004, 2010 Ben Wing.
 428     4
 428     5  This file is part of XEmacs.
 428     6
5402     7  XEmacs is free software: you can redistribute it and/or modify it
 428     8  under the terms of the GNU General Public License as published by the
5402     9  Free Software Foundation, either version 3 of the License, or (at your
5402    10  option) any later version.
 428    11
 428    12  XEmacs is distributed in the hope that it will be useful, but WITHOUT
 428    13  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 428    14  FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 428    15  for more details.
 428    16
 428    17  You should have received a copy of the GNU General Public License
5402    18  along with XEmacs. If not, see <http://www.gnu.org/licenses/>. */
 428    19
 428    20  /* Synched up with: Not in FSF. */
 428    21
 428    22  /* Extracted from rangetab.c by O. Galibert, 1998. */
 428    23
 440    24  #ifndef INCLUDED_rangetab_h_
 440    25  #define INCLUDED_rangetab_h_
 428    26
 428    27  typedef struct range_table_entry range_table_entry;
 428    28  struct range_table_entry
 428    29  {
5168    30  #ifdef NEW_GC
5168    31    NORMAL_LISP_OBJECT_HEADER header;
5168    32  #endif /* NEW_GC */
 428    33    EMACS_INT first;
 428    34    EMACS_INT last;
 428    35    Lisp_Object val;
 428    36  };
 428    37
 428    38  typedef struct
 428    39  {
 428    40    Dynarr_declare (range_table_entry);
 428    41  } range_table_entry_dynarr;
 428    42
2421    43  enum range_table_type
2421    44  {
2421    45    RANGE_START_CLOSED_END_OPEN,
2421    46    RANGE_START_CLOSED_END_CLOSED,
2421    47    RANGE_START_OPEN_END_CLOSED,
2421    48    RANGE_START_OPEN_END_OPEN
2421    49  };
2421    50
 428    51  struct Lisp_Range_Table
 428    52  {
5127    53    NORMAL_LISP_OBJECT_HEADER header;
5168    54    Gap_Array *entries;
2421    55    enum range_table_type type;
 428    56  };
 440    57  typedef struct Lisp_Range_Table Lisp_Range_Table;
 428    58
5118    59  DECLARE_LISP_OBJECT (range_table, Lisp_Range_Table);
 440    60  #define XRANGE_TABLE(x) XRECORD (x, range_table, Lisp_Range_Table)
 617    61  #define wrap_range_table(p) wrap_record (p, range_table)
 428    62  #define RANGE_TABLEP(x) RECORDP (x, range_table)
 428    63  #define CHECK_RANGE_TABLE(x) CHECK_RECORD (x, range_table)
 428    64
5168    65  #define rangetab_gap_array_at(ga, pos) \
5168    66    gap_array_at (ga, pos, struct range_table_entry)
5168    67  #define rangetab_gap_array_atp(ga, pos) \
5168    68    gap_array_atp (ga, pos, struct range_table_entry)
 440    69  #endif /* INCLUDED_rangetab_h_ */
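
The header only declares the four `range_table_type` values; it does not show how they are applied. As a rough illustration of what the names suggest, the sketch below tests whether a position falls inside an entry's `first`..`last` bounds under each type. It is an assumption-laden sketch, not XEmacs code: `demo_entry` and `demo_entry_contains` are hypothetical names, `long` stands in for `EMACS_INT`, the `Lisp_Object val` member is left out, and rangetab.c may well store or normalize bounds differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Same enumerators as in rangetab.h, repeated so the sketch compiles
   on its own. */
enum range_table_type
{
  RANGE_START_CLOSED_END_OPEN,
  RANGE_START_CLOSED_END_CLOSED,
  RANGE_START_OPEN_END_CLOSED,
  RANGE_START_OPEN_END_OPEN
};

/* Hypothetical stand-in for struct range_table_entry: long replaces
   EMACS_INT and the Lisp_Object value is omitted. */
struct demo_entry
{
  long first;
  long last;
};

/* Hypothetical helper, not part of XEmacs.  Returns true when POS lies
   within [first, last], with each endpoint included only if TYPE marks
   that end as closed.  This reading is inferred from the enumerator
   names alone. */
static bool
demo_entry_contains (const struct demo_entry *e,
                     enum range_table_type type, long pos)
{
  bool start_closed = (type == RANGE_START_CLOSED_END_OPEN
                       || type == RANGE_START_CLOSED_END_CLOSED);
  bool end_closed = (type == RANGE_START_CLOSED_END_CLOSED
                     || type == RANGE_START_OPEN_END_CLOSED);

  /* Precondition assumed by this sketch. */
  assert (e->first <= e->last);

  bool after_start = start_closed ? pos >= e->first : pos > e->first;
  bool before_end = end_closed ? pos <= e->last : pos < e->last;
  return after_start && before_end;
}
```

With an entry whose bounds are 10 and 20, position 10 is a member only under the two `START_CLOSED` types and position 20 only under the two `END_CLOSED` types, while 15 is a member under all four.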