comparison src/search.c @ 826:6728e641994e
[xemacs-hg @ 2002-05-05 11:30:15 by ben]
syntax cache, 8-bit-format, lots of code cleanup
README.packages: Update info about --package-path.
i.c: Create an inheritable event and pass it on to XEmacs, so that ^C
can be handled properly. Intercept ^C and signal the event.
"Stop Build" in VC++ now works.
bytecomp-runtime.el: Doc string changes.
compat.el: Some attempts to redo this to
make it truly useful and fix the "multiple versions interacting
with each other" problem. Not yet done. Currently doesn't work.
files.el: Use with-obsolete-variable to avoid warnings in new revert-buffer code.
xemacs.mak: Split up CFLAGS into a version without flags specifying the C
library. The problem seems to be that minitar depends on zlib,
which depends specifically on libc.lib, not on any of the other C
libraries. Unless you compile with libc.lib, you get errors --
specifically, no _errno in the other libraries, which must make it
something other than an int. (#### But this doesn't seem to obtain
in XEmacs, which also uses zlib, and can be linked with any of the
C libraries. Maybe zlib is used differently and doesn't need
errno, or maybe XEmacs provides an int errno; ... I don't
understand.)
Makefile.in.in: Fix so that packages are around when testing.
abbrev.c, alloc.c, buffer.c, buffer.h, bytecode.c, callint.c, casefiddle.c, casetab.c, casetab.h, charset.h, chartab.c, chartab.h, cmds.c, console-msw.h, console-stream.c, console-x.c, console.c, console.h, data.c, device-msw.c, device.c, device.h, dialog-msw.c, dialog-x.c, dired-msw.c, dired.c, doc.c, doprnt.c, dumper.c, editfns.c, elhash.c, emacs.c, eval.c, event-Xt.c, event-gtk.c, event-msw.c, event-stream.c, events.c, events.h, extents.c, extents.h, faces.c, file-coding.c, file-coding.h, fileio.c, fns.c, font-lock.c, frame-gtk.c, frame-msw.c, frame-x.c, frame.c, frame.h, glade.c, glyphs-gtk.c, glyphs-msw.c, glyphs-msw.h, glyphs-x.c, glyphs.c, glyphs.h, gui-msw.c, gui-x.c, gui.h, gutter.h, hash.h, indent.c, insdel.c, intl-win32.c, intl.c, keymap.c, lisp-disunion.h, lisp-union.h, lisp.h, lread.c, lrecord.h, lstream.c, lstream.h, marker.c, menubar-gtk.c, menubar-msw.c, menubar-x.c, menubar.c, minibuf.c, mule-ccl.c, mule-charset.c, mule-coding.c, mule-wnnfns.c, nas.c, objects-msw.c, objects-x.c, opaque.c, postgresql.c, print.c, process-nt.c, process-unix.c, process.c, process.h, profile.c, rangetab.c, redisplay-gtk.c, redisplay-msw.c, redisplay-output.c, redisplay-x.c, redisplay.c, redisplay.h, regex.c, regex.h, scrollbar-msw.c, search.c, select-x.c, specifier.c, specifier.h, symbols.c, symsinit.h, syntax.c, syntax.h, syswindows.h, tests.c, text.c, text.h, tooltalk.c, ui-byhand.c, ui-gtk.c, unicode.c, win32.c, window.c: Another big Ben patch.
-- FUNCTIONALITY CHANGES:
add partial support for 8-bit-fixed, 16-bit-fixed, and
32-bit-fixed formats. not quite done yet. (in particular, needs
functions to actually convert the buffer.) NOTE: lots of changes
to regex.c here. also, many new *_fmt() inline funs that take an
Internal_Format argument.
redo syntax cache code. make the cache per-buffer; keep the cache
valid across calls to functions that use it. also keep it valid
across insertions/deletions and extent changes, as much as is
possible. eliminate the junky regex-reentrancy code by passing in
the relevant lisp info to the regex routines as local vars.
add general mechanism in extents code for signalling extent changes.
fix numerous problems with the case-table implementation; yoshiki
never properly transferred many algorithms from old-style to
new-style case tables.
redo char tables to support a default argument, so that mapping
only occurs over changed args. change many chartab functions to
accept Lisp_Object instead of Lisp_Char_Table *.
comment out the code in font-lock.c by default, because
font-lock.el no longer uses it. we should consider eliminating it
entirely.
Don't output bell as ^G in console-stream when not a TTY.
add -mswindows-termination-handle to interface with i.c, so we can
properly kill a build.
add more error-checking to buffer/string macros.
add some additional buffer_or_string_() funs.
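
The regex-reentrancy item above (handing the relevant state to the regex routines as arguments instead of saving and restoring globals) can be illustrated with a minimal sketch. All names here (`toy_cache`, `toy_search`, `toy_nested_search`) are hypothetical and are not the XEmacs regex API; the real routines take a Lisp object, a buffer, and a syntax cache.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the state a matcher needs while it runs. */
struct toy_cache
{
  const char *object;  /* the text being searched */
  size_t lookups;      /* number of characters examined */
};

/* Return the index of the first TARGET in TEXT, or -1 if there is none.
   All per-call state lives in *CACHE, which the caller owns. */
long
toy_search (const char *text, char target, struct toy_cache *cache)
{
  size_t i;

  cache->object = text;
  cache->lookups = 0;
  for (i = 0; text[i] != '\0'; i++)
    {
      cache->lookups++;
      if (text[i] == target)
        return (long) i;
    }
  return -1;
}

/* Run a second search while the caller still cares about the state of
   the first one, as a Lisp callback might.  The second search gets its
   own cache on the stack, so nothing has to be saved and restored. */
long
toy_nested_search (const char *outer, const char *inner, char target,
                   struct toy_cache *outer_cache)
{
  struct toy_cache inner_cache;
  long outer_pos = toy_search (outer, target, outer_cache);
  long inner_pos = toy_search (inner, target, &inner_cache);

  /* The inner search did not disturb the outer state. */
  assert (outer_cache->object == outer);
  return outer_pos >= 0 ? outer_pos : inner_pos;
}
```

With a global cache, the second call would overwrite the first call's state, which is the situation the removed save/restore code had to guard against.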
-- INTERFACE CHANGES AFFECTING MORE CODE:
switch the arguments of write_c_string and friends to be
consistent with write_fmt_string, which must have printcharfun
first.
change BI_* macros to BYTE_* for increased clarity; similarly for
bi_* local vars.
change VOID_TO_LISP to be a one-argument function. eliminate
no-longer-needed CVOID_TO_LISP.
-- char/string macro changes:
rename MAKE_CHAR() to make_emchar() for slightly less confusion
with make_char(). (The former generates an Emchar, the latter a
Lisp object. Conceivably we should rename make_char() -> wrap_char()
and similarly for make_int(), make_float().)
Similar changes for other *CHAR* macros -- we now consistently use
names with `emchar' whenever we are working with Emchars. Any
remaining name with just `char' always refers to a Lisp object.
rename macros with XSTRING_* to string_* except for those that
reference actual fields in the Lisp_String object, following
conventions used elsewhere.
rename set_string_{data,length} macros (the only ones to work with
a Lisp_String_* instead of a Lisp_Object) to set_lispstringp_*
to make the difference clear.
try to be consistent about caps vs. lowercase in macro/inline-fun
names for chars and such, which wasn't the case before. we now
reserve caps either for XFOO_ macros that reference object fields
(e.g. XSTRING_DATA) or for things that have non-function semantics,
e.g. directly modifying an arg (BREAKUP_EMCHAR) or evaluating an
arg (any arg) more than once. otherwise, use lowercase.
here is a summary of most of the macros/inline funs changed by all
of the above changes:
BYTE_*_P -> byte_*_p
XSTRING_BYTE -> string_byte
set_string_data/length -> set_lispstringp_data/length
XSTRING_CHAR_LENGTH -> string_char_length
XSTRING_CHAR -> string_emchar
INTBYTE_FIRST_BYTE_P -> intbyte_first_byte_p
INTBYTE_LEADING_BYTE_P -> intbyte_leading_byte_p
charptr_copy_char -> charptr_copy_emchar
LEADING_BYTE_* -> leading_byte_*
CHAR_* -> EMCHAR_*
*_CHAR_* -> *_EMCHAR_*
*_CHAR -> *_EMCHAR
CHARSET_BY_* -> charset_by_*
BYTE_SHIFT_JIS* -> byte_shift_jis*
BYTE_BIG5* -> byte_big5*
REP_BYTES_BY_FIRST_BYTE -> rep_bytes_by_first_byte
char_to_unicode -> emchar_to_unicode
valid_char_p -> valid_emchar_p
Change intbyte_strcmp -> qxestrcmp_c (duplicated functionality).
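
The caps-vs-lowercase rule above can be shown with a small sketch. The names (`TOY_SQUARE`, `TOY_SPLIT`, `toy_square`, `toy_next`) are made up for illustration and do not exist in XEmacs; the point is only why a macro that evaluates an argument twice, or writes to its arguments, deserves a name that stands out.

```c
#include <assert.h>

/* Upper case: a macro that evaluates its argument more than once. */
#define TOY_SQUARE(x) ((x) * (x))

/* Upper case: a macro that directly modifies its arguments. */
#define TOY_SPLIT(n, tens, ones) \
  do { (tens) = (n) / 10; (ones) = (n) % 10; } while (0)

/* Lower case: an inline function with ordinary call semantics; its
   argument is evaluated exactly once. */
static inline int
toy_square (int x)
{
  return x * x;
}

/* Helper with a visible side effect: bump *COUNTER and return it. */
int
toy_next (int *counter)
{
  return ++*counter;
}
```

Calling `toy_square (toy_next (&n))` bumps the counter once, while `TOY_SQUARE (toy_next (&n))` bumps it twice and multiplies two different values. A reader who sees the upper-case name knows to check for that.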
-- INTERFACE CHANGES AFFECTING LESS CODE:
use DECLARE_INLINE_HEADER in various places.
remove '#ifdef emacs' from XEmacs-only files.
eliminate CHAR_TABLE_VALUE(), which duplicated the functionality
of get_char_table().
add BUFFER_TEXT_LOOP to simplify iterations over buffer text.
define typedefs for signed and unsigned types of fixed sizes
(INT_32_BIT, UINT_32_BIT, etc.).
create ALIGN_FOR_TYPE as a higher-level interface onto ALIGN_SIZE;
fix code to use it.
add charptr_emchar_len to return the text length of the character
pointed to by a ptr; use it in place of
charcount_to_bytecount(..., 1). add emchar_len to return the text
length of a given character.
add types Bytexpos and Charxpos to generalize Bytebpos/Bytecount
and Charbpos/Charcount, in code (particularly, the extents code
and redisplay code) that works with either kind of index. rename
redisplay struct params with names such as `charbpos' to
e.g. `charpos' when they are e.g. a Charxpos, not a Charbpos.
eliminate xxDEFUN in favor of DEFUN; no longer necessary with
changes a while back to doc.c.
split up big ugly combined list of EXFUNs in lisp.h on a
file-by-file basis, since other prototypes are similarly split.
rewrite some "*_UNSAFE" macros as inline funs and eliminate the
_UNSAFE suffix.
move most string code from lisp.h to text.h; the string code and
text.h code are now intertwined in such a fashion that they need
to be in the same place and partially interleaved. (you can't
create forward references for inline funs)
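
The ALIGN_FOR_TYPE item above describes a type-based wrapper over a size-based rounding macro. A minimal sketch of that layering follows, assuming ALIGN_SIZE rounds a length up to a multiple of a unit. The `TOY_` macro bodies are illustrative and not copied from the XEmacs headers, and `_Alignof` assumes a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Round LEN up to the next multiple of UNIT.  Upper case because UNIT
   is evaluated more than once. */
#define TOY_ALIGN_SIZE(len, unit) \
  ((((len) + (unit) - 1) / (unit)) * (unit))

/* Higher-level form: round LEN up to the alignment of TYPE, so callers
   name a type instead of working out its alignment themselves. */
#define TOY_ALIGN_FOR_TYPE(len, type) \
  TOY_ALIGN_SIZE (len, _Alignof (type))

/* Example caller: space needed for a LEN-byte header followed by COUNT
   doubles that must start on a suitably aligned boundary. */
size_t
toy_block_size (size_t len, size_t count)
{
  size_t start = TOY_ALIGN_FOR_TYPE (len, double);

  assert (start % _Alignof (double) == 0);
  return start + count * sizeof (double);
}
```

The type-based form keeps alignment arithmetic in one place, so a caller cannot pass a unit that does not match the type it is about to store.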
automated/lisp-tests.el, automated/symbol-tests.el, automated/test-harness.el: Fix test harness to output FAIL messages to stderr when in
batch mode.
Fix up some problems in lisp-tests/symbol-tests that were
causing spurious failures.
author | ben |
---|---|
date | Sun, 05 May 2002 11:33:57 +0000 |
parents | a634e3b7acc8 |
children | 5d09ddada9ae |
825:eb3bc15a6e0f | 826:6728e641994e |
---|---|
22 | 22 |
23 /* Synched up with: FSF 19.29, except for region-cache stuff. */ | 23 /* Synched up with: FSF 19.29, except for region-cache stuff. */ |
24 | 24 |
25 /* Hacked on for Mule by Ben Wing, December 1994 and August 1995. */ | 25 /* Hacked on for Mule by Ben Wing, December 1994 and August 1995. */ |
26 | 26 |
27 /* This file has been Mule-ized except for the TRT stuff. */ | 27 /* This file has been Mule-ized. */ |
28 | 28 |
29 #include <config.h> | 29 #include <config.h> |
30 #include "lisp.h" | 30 #include "lisp.h" |
31 | 31 |
32 #include "buffer.h" | 32 #include "buffer.h" |
83 to call re_set_registers after compiling a new pattern or after | 83 to call re_set_registers after compiling a new pattern or after |
84 setting the match registers, so that the regex functions will be | 84 setting the match registers, so that the regex functions will be |
85 able to free or re-allocate it properly. */ | 85 able to free or re-allocate it properly. */ |
86 | 86 |
87 /* Note: things get trickier under Mule because the values returned from | 87 /* Note: things get trickier under Mule because the values returned from |
88 the regexp routines are in Bytebposs but we need them to be in Charbpos's. | 88 the regexp routines are in Bytebpos's but we need them to be in Charbpos's. |
89 We take the easy way out for the moment and just convert them immediately. | 89 We take the easy way out for the moment and just convert them immediately. |
90 We could be more clever by not converting them until necessary, but | 90 We could be more clever by not converting them until necessary, but |
91 that gets real ugly real fast since the buffer might have changed and | 91 that gets real ugly real fast since the buffer might have changed and |
92 the positions might be out of sync or out of range. | 92 the positions might be out of sync or out of range. |
93 */ | 93 */ |
111 Lisp_Object Vskip_chars_range_table; | 111 Lisp_Object Vskip_chars_range_table; |
112 | 112 |
113 static void set_search_regs (struct buffer *buf, Charbpos beg, Charcount len); | 113 static void set_search_regs (struct buffer *buf, Charbpos beg, Charcount len); |
114 static void save_search_regs (void); | 114 static void save_search_regs (void); |
115 static Charbpos simple_search (struct buffer *buf, Intbyte *base_pat, | 115 static Charbpos simple_search (struct buffer *buf, Intbyte *base_pat, |
116 Bytecount len, Bytebpos pos, Bytebpos lim, | |
117 EMACS_INT n, Lisp_Object trt); | |
118 static Charbpos boyer_moore (struct buffer *buf, Intbyte *base_pat, | |
116 Bytecount len, Bytebpos pos, Bytebpos lim, | 119 Bytecount len, Bytebpos pos, Bytebpos lim, |
117 EMACS_INT n, Lisp_Object trt); | 120 EMACS_INT n, Lisp_Object trt, |
118 static Charbpos boyer_moore (struct buffer *buf, Intbyte *base_pat, | 121 Lisp_Object inverse_trt, int charset_base); |
119 Bytecount len, Bytebpos pos, Bytebpos lim, | |
120 EMACS_INT n, Lisp_Object trt, | |
121 Lisp_Object inverse_trt, int charset_base); | |
122 static Charbpos search_buffer (struct buffer *buf, Lisp_Object str, | 122 static Charbpos search_buffer (struct buffer *buf, Lisp_Object str, |
123 Charbpos charbpos, Charbpos buflim, EMACS_INT n, int RE, | 123 Charbpos charbpos, Charbpos buflim, EMACS_INT n, |
124 Lisp_Object trt, Lisp_Object inverse_trt, | 124 int RE, Lisp_Object trt, |
125 int posix); | 125 Lisp_Object inverse_trt, int posix); |
126 | |
127 struct regex_reentrancy | |
128 { | |
129 struct syntax_cache cache; | |
130 struct buffer *regex_emacs_buffer; | |
131 Lisp_Object regex_match_object; | |
132 }; | |
133 | |
134 typedef struct | |
135 { | |
136 Dynarr_declare (struct regex_reentrancy); | |
137 } regex_reentrancy_dynarr; | |
138 | |
139 static regex_reentrancy_dynarr *the_regex_reentrancy_dynarr; | |
140 | |
141 static Lisp_Object | |
142 restore_regex_reentrancy (Lisp_Object dummy) | |
143 { | |
144 struct regex_reentrancy rr = Dynarr_pop (the_regex_reentrancy_dynarr); | |
145 syntax_cache = rr.cache; | |
146 regex_emacs_buffer = rr.regex_emacs_buffer; | |
147 regex_match_object = rr.regex_match_object; | |
148 return Qnil; | |
149 } | |
150 | |
151 static int | |
152 begin_regex_reentrancy (void) | |
153 { | |
154 /* #### there is still a potential problem with the regex cache -- | |
155 the compiled regex could be overwritten. we'd need 20-fold | |
156 reentrancy, though. */ | |
157 struct regex_reentrancy rr; | |
158 rr.cache = syntax_cache; | |
159 rr.regex_emacs_buffer = regex_emacs_buffer; | |
160 rr.regex_match_object = regex_match_object; | |
161 if (!the_regex_reentrancy_dynarr) | |
162 the_regex_reentrancy_dynarr = Dynarr_new2 (regex_reentrancy_dynarr, | |
163 struct regex_reentrancy); | |
164 Dynarr_add (the_regex_reentrancy_dynarr, rr); | |
165 return record_unwind_protect (restore_regex_reentrancy, Qnil); | |
166 } | |
167 | 126 |
168 static void | 127 static void |
169 matcher_overflow (void) | 128 matcher_overflow (void) |
170 { | 129 { |
171 stack_overflow ("Stack overflow in regexp matcher", Qunbound); | 130 stack_overflow ("Stack overflow in regexp matcher", Qunbound); |
172 } | 131 } |
173 | 132 |
174 /* Compile a regexp and signal a Lisp error if anything goes wrong. | 133 /* Compile a regexp and signal a Lisp error if anything goes wrong. |
175 PATTERN is the pattern to compile. | 134 PATTERN is the pattern to compile. |
176 CP is the place to put the result. | 135 CP is the place to put the result. |
177 TRANSLATE is a translation table for ignoring case, or NULL for none. | 136 TRANSLATE is a translation table for ignoring case, or Qnil for none. |
178 REGP is the structure that says where to store the "register" | 137 REGP is the structure that says where to store the "register" |
179 values that will result from matching this pattern. | 138 values that will result from matching this pattern. |
180 If it is 0, we should compile the pattern not to record any | 139 If it is 0, we should compile the pattern not to record any |
181 subexpression bounds. | 140 subexpression bounds. |
182 POSIX is nonzero if we want full backtracking (POSIX style) | 141 POSIX is nonzero if we want full backtracking (POSIX style) |
183 for this pattern. 0 means backtrack only enough to get a valid match. */ | 142 for this pattern. 0 means backtrack only enough to get a valid match. */ |
184 | 143 |
185 static int | 144 static int |
186 compile_pattern_1 (struct regexp_cache *cp, Lisp_Object pattern, | 145 compile_pattern_1 (struct regexp_cache *cp, Lisp_Object pattern, |
187 Lisp_Object translate, struct re_registers *regp, int posix, | 146 struct re_registers *regp, Lisp_Object translate, |
188 Error_Behavior errb) | 147 int posix, Error_Behavior errb) |
189 { | 148 { |
190 const char *val; | 149 const char *val; |
191 reg_syntax_t old; | 150 reg_syntax_t old; |
192 | 151 |
193 cp->regexp = Qnil; | 152 cp->regexp = Qnil; |
211 } | 170 } |
212 | 171 |
213 /* Compile a regexp if necessary, but first check to see if there's one in | 172 /* Compile a regexp if necessary, but first check to see if there's one in |
214 the cache. | 173 the cache. |
215 PATTERN is the pattern to compile. | 174 PATTERN is the pattern to compile. |
216 TRANSLATE is a translation table for ignoring case, or NULL for none. | 175 TRANSLATE is a translation table for ignoring case, or Qnil for none. |
217 REGP is the structure that says where to store the "register" | 176 REGP is the structure that says where to store the "register" |
218 values that will result from matching this pattern. | 177 values that will result from matching this pattern. |
219 If it is 0, we should compile the pattern not to record any | 178 If it is 0, we should compile the pattern not to record any |
220 subexpression bounds. | 179 subexpression bounds. |
221 POSIX is nonzero if we want full backtracking (POSIX style) | 180 POSIX is nonzero if we want full backtracking (POSIX style) |
222 for this pattern. 0 means backtrack only enough to get a valid match. */ | 181 for this pattern. 0 means backtrack only enough to get a valid match. */ |
223 | 182 |
224 struct re_pattern_buffer * | 183 struct re_pattern_buffer * |
225 compile_pattern (Lisp_Object pattern, struct re_registers *regp, | 184 compile_pattern (Lisp_Object pattern, struct re_registers *regp, |
226 Lisp_Object translate, int posix, Error_Behavior errb) | 185 Lisp_Object translate, Lisp_Object searchobj, |
186 struct buffer *searchbuf, int posix, Error_Behavior errb) | |
227 { | 187 { |
228 struct regexp_cache *cp, **cpp; | 188 struct regexp_cache *cp, **cpp; |
229 | 189 |
230 for (cpp = &searchbuf_head; ; cpp = &cp->next) | 190 for (cpp = &searchbuf_head; ; cpp = &cp->next) |
231 { | 191 { |
232 cp = *cpp; | 192 cp = *cpp; |
193 /* &&#### once we fix up the fastmap code in regex.c for 8-bit-fixed, | |
194 we need to record and compare the buffer and format, since the | |
195 fastmap will reflect the state of the buffer -- and things get | |
196 more complicated if the buffer has changed formats or (esp.) has | |
197 kept the format but changed its interpretation! may need to have | |
198 the code that changes the interpretation go through and invalidate | |
199 cache entries for that buffer. */ | |
233 if (!NILP (Fstring_equal (cp->regexp, pattern)) | 200 if (!NILP (Fstring_equal (cp->regexp, pattern)) |
234 && EQ (cp->buf.translate, translate) | 201 && EQ (cp->buf.translate, translate) |
235 && cp->posix == posix) | 202 && cp->posix == posix) |
236 break; | 203 break; |
237 | 204 |
238 /* If we're at the end of the cache, compile into the last cell. */ | 205 /* If we're at the end of the cache, compile into the last cell. */ |
239 if (cp->next == 0) | 206 if (cp->next == 0) |
240 { | 207 { |
241 if (!compile_pattern_1 (cp, pattern, translate, regp, posix, | 208 if (!compile_pattern_1 (cp, pattern, regp, translate, |
242 errb)) | 209 posix, errb)) |
243 return 0; | 210 return 0; |
244 break; | 211 break; |
245 } | 212 } |
246 } | 213 } |
247 | 214 |
269 for (;;) | 236 for (;;) |
270 Fsignal (Qsearch_failed, list1 (arg)); | 237 Fsignal (Qsearch_failed, list1 (arg)); |
271 return Qnil; /* Not reached. */ | 238 return Qnil; /* Not reached. */ |
272 } | 239 } |
273 | 240 |
274 /* Convert the search registers from Bytebposs to Charbpos's. Needs to be | 241 /* Convert the search registers from Bytebpos's to Charbpos's. Needs to be |
275 done after each regexp match that uses the search regs. | 242 done after each regexp match that uses the search regs. |
276 | 243 |
277 We could get a potential speedup by not converting the search registers | 244 We could get a potential speedup by not converting the search registers |
278 until it's really necessary, e.g. when match-data or replace-match is | 245 until it's really necessary, e.g. when match-data or replace-match is |
279 called. However, this complexifies the code a lot (e.g. the buffer | 246 called. However, this complexifies the code a lot (e.g. the buffer |
280 could have changed and the Bytebposs stored might be invalid) and is | 247 could have changed and the Bytebpos's stored might be invalid) and is |
281 probably not a great time-saver. */ | 248 probably not a great time-saver. */ |
282 | 249 |
283 static void | 250 static void |
284 fixup_search_regs_for_buffer (struct buffer *buf) | 251 fixup_search_regs_for_buffer (struct buffer *buf) |
285 { | 252 { |
287 int num_regs = search_regs.num_regs; | 254 int num_regs = search_regs.num_regs; |
288 | 255 |
289 for (i = 0; i < num_regs; i++) | 256 for (i = 0; i < num_regs; i++) |
290 { | 257 { |
291 if (search_regs.start[i] >= 0) | 258 if (search_regs.start[i] >= 0) |
292 search_regs.start[i] = bytebpos_to_charbpos (buf, search_regs.start[i]); | 259 search_regs.start[i] = bytebpos_to_charbpos (buf, |
260 search_regs.start[i]); | |
293 if (search_regs.end[i] >= 0) | 261 if (search_regs.end[i] >= 0) |
294 search_regs.end[i] = bytebpos_to_charbpos (buf, search_regs.end[i]); | 262 search_regs.end[i] = bytebpos_to_charbpos (buf, search_regs.end[i]); |
295 } | 263 } |
296 } | 264 } |
297 | 265 |
326 | 294 |
327 | 295 |
328 static Lisp_Object | 296 static Lisp_Object |
329 looking_at_1 (Lisp_Object string, struct buffer *buf, int posix) | 297 looking_at_1 (Lisp_Object string, struct buffer *buf, int posix) |
330 { | 298 { |
331 /* This function has been Mule-ized, except for the trt table handling. */ | |
332 Lisp_Object val; | 299 Lisp_Object val; |
333 Bytebpos p1, p2; | 300 Bytebpos p1, p2; |
334 Bytecount s1, s2; | 301 Bytecount s1, s2; |
335 REGISTER int i; | 302 REGISTER int i; |
336 struct re_pattern_buffer *bufp; | 303 struct re_pattern_buffer *bufp; |
337 int count = begin_regex_reentrancy (); | 304 struct syntax_cache scache_struct; |
338 | 305 struct syntax_cache *scache = &scache_struct; |
306 | |
339 if (running_asynch_code) | 307 if (running_asynch_code) |
340 save_search_regs (); | 308 save_search_regs (); |
341 | 309 |
342 CHECK_STRING (string); | 310 CHECK_STRING (string); |
343 bufp = compile_pattern (string, &search_regs, | 311 bufp = compile_pattern (string, &search_regs, |
344 (!NILP (buf->case_fold_search) | 312 (!NILP (buf->case_fold_search) |
345 ? XCASE_TABLE_DOWNCASE (buf->case_table) : Qnil), | 313 ? XCASE_TABLE_DOWNCASE (buf->case_table) : Qnil), |
346 posix, ERROR_ME); | 314 wrap_buffer (buf), buf, posix, ERROR_ME); |
347 | 315 |
348 QUIT; | 316 QUIT; |
349 | 317 |
350 /* Get pointers and sizes of the two strings | 318 /* Get pointers and sizes of the two strings |
351 that make up the visible portion of the buffer. */ | 319 that make up the visible portion of the buffer. */ |
352 | 320 |
353 p1 = BI_BUF_BEGV (buf); | 321 p1 = BYTE_BUF_BEGV (buf); |
354 p2 = BI_BUF_CEILING_OF (buf, p1); | 322 p2 = BYTE_BUF_CEILING_OF (buf, p1); |
355 s1 = p2 - p1; | 323 s1 = p2 - p1; |
356 s2 = BI_BUF_ZV (buf) - p2; | 324 s2 = BYTE_BUF_ZV (buf) - p2; |
357 | 325 |
358 regex_match_object = Qnil; | 326 /* By making the regex object, regex buffer, and syntax cache arguments |
359 regex_emacs_buffer = buf; | 327 to re_{search,match}{,_2}, we've removed the need to do nasty things |
360 i = re_match_2 (bufp, (char *) BI_BUF_BYTE_ADDRESS (buf, p1), | 328 to deal with regex reentrancy. (See stack trace in signal.c for proof |
361 s1, (char *) BI_BUF_BYTE_ADDRESS (buf, p2), s2, | 329 that this can happen.) |
362 BI_BUF_PT (buf) - BI_BUF_BEGV (buf), &search_regs, | 330 |
363 BI_BUF_ZV (buf) - BI_BUF_BEGV (buf)); | 331 #### there is still a potential problem with the regex cache -- |
332 the compiled regex could be overwritten. we'd need 20-fold | |
333 reentrancy, though. Fix this. */ | |
334 | |
335 i = re_match_2 (bufp, (char *) BYTE_BUF_BYTE_ADDRESS (buf, p1), | |
336 s1, (char *) BYTE_BUF_BYTE_ADDRESS (buf, p2), s2, | |
337 BYTE_BUF_PT (buf) - BYTE_BUF_BEGV (buf), &search_regs, | |
338 BYTE_BUF_ZV (buf) - BYTE_BUF_BEGV (buf), wrap_buffer (buf), | |
339 buf, scache); | |
364 | 340 |
365 if (i == -2) | 341 if (i == -2) |
366 matcher_overflow (); | 342 matcher_overflow (); |
367 | 343 |
368 val = (0 <= i ? Qt : Qnil); | 344 val = (0 <= i ? Qt : Qnil); |
369 if (NILP (val)) | 345 if (NILP (val)) |
370 return unbind_to (count); | 346 return Qnil; |
371 { | 347 { |
372 int num_regs = search_regs.num_regs; | 348 int num_regs = search_regs.num_regs; |
373 for (i = 0; i < num_regs; i++) | 349 for (i = 0; i < num_regs; i++) |
374 if (search_regs.start[i] >= 0) | 350 if (search_regs.start[i] >= 0) |
375 { | 351 { |
376 search_regs.start[i] += BI_BUF_BEGV (buf); | 352 search_regs.start[i] += BYTE_BUF_BEGV (buf); |
377 search_regs.end[i] += BI_BUF_BEGV (buf); | 353 search_regs.end[i] += BYTE_BUF_BEGV (buf); |
378 } | 354 } |
379 } | 355 } |
380 last_thing_searched = wrap_buffer (buf); | 356 last_thing_searched = wrap_buffer (buf); |
381 fixup_search_regs_for_buffer (buf); | 357 fixup_search_regs_for_buffer (buf); |
382 return unbind_to_1 (count, val); | 358 return val; |
383 } | 359 } |
384 | 360 |
385 DEFUN ("looking-at", Flooking_at, 1, 2, 0, /* | 361 DEFUN ("looking-at", Flooking_at, 1, 2, 0, /* |
386 Return t if text after point matches regular expression REGEXP. | 362 Return t if text after point matches regular expression REGEXP. |
387 This function modifies the match data that `match-beginning', | 363 This function modifies the match data that `match-beginning', |
404 | 380 |
405 Optional argument BUFFER defaults to the current buffer. | 381 Optional argument BUFFER defaults to the current buffer. |
406 */ | 382 */ |
407 (regexp, buffer)) | 383 (regexp, buffer)) |
408 { | 384 { |
409 return looking_at_1 (regexp, decode_buffer (buffer, 0), 1); | 385 return looking_at_1 (regexp, decode_buffer (buffer, 0), 1); |
410 } | 386 } |
411 | 387 |
412 static Lisp_Object | 388 static Lisp_Object |
413 string_match_1 (Lisp_Object regexp, Lisp_Object string, Lisp_Object start, | 389 string_match_1 (Lisp_Object regexp, Lisp_Object string, Lisp_Object start, |
414 struct buffer *buf, int posix) | 390 struct buffer *buf, int posix) |
415 { | 391 { |
416 /* This function has been Mule-ized, except for the trt table handling. */ | |
417 Bytecount val; | 392 Bytecount val; |
418 Charcount s; | 393 Charcount s; |
419 struct re_pattern_buffer *bufp; | 394 struct re_pattern_buffer *bufp; |
420 int count = begin_regex_reentrancy (); | |
421 | 395 |
422 if (running_asynch_code) | 396 if (running_asynch_code) |
423 save_search_regs (); | 397 save_search_regs (); |
424 | 398 |
425 CHECK_STRING (regexp); | 399 CHECK_STRING (regexp); |
427 | 401 |
428 if (NILP (start)) | 402 if (NILP (start)) |
429 s = 0; | 403 s = 0; |
430 else | 404 else |
431 { | 405 { |
432 Charcount len = XSTRING_CHAR_LENGTH (string); | 406 Charcount len = string_char_length (string); |
433 | 407 |
434 CHECK_INT (start); | 408 CHECK_INT (start); |
435 s = XINT (start); | 409 s = XINT (start); |
436 if (s < 0 && -s <= len) | 410 if (s < 0 && -s <= len) |
437 s = len + s; | 411 s = len + s; |
441 | 415 |
442 | 416 |
443 bufp = compile_pattern (regexp, &search_regs, | 417 bufp = compile_pattern (regexp, &search_regs, |
444 (!NILP (buf->case_fold_search) | 418 (!NILP (buf->case_fold_search) |
445 ? XCASE_TABLE_DOWNCASE (buf->case_table) : Qnil), | 419 ? XCASE_TABLE_DOWNCASE (buf->case_table) : Qnil), |
446 0, ERROR_ME); | 420 string, buf, 0, ERROR_ME); |
447 QUIT; | 421 QUIT; |
448 { | 422 { |
449 Bytecount bis = string_index_char_to_byte (string, s); | 423 Bytecount bis = string_index_char_to_byte (string, s); |
450 regex_match_object = string; | 424 struct syntax_cache scache_struct; |
451 regex_emacs_buffer = buf; | 425 struct syntax_cache *scache = &scache_struct; |
426 | |
427 /* By making the regex object, regex buffer, and syntax cache arguments | |
428 to re_{search,match}{,_2}, we've removed the need to do nasty things | |
429 to deal with regex reentrancy. (See stack trace in signal.c for proof | |
430 that this can happen.) | |
431 | |
432 #### there is still a potential problem with the regex cache -- | |
433 the compiled regex could be overwritten. we'd need 20-fold | |
434 reentrancy, though. Fix this. */ | |
435 | |
452 val = re_search (bufp, (char *) XSTRING_DATA (string), | 436 val = re_search (bufp, (char *) XSTRING_DATA (string), |
453 XSTRING_LENGTH (string), bis, | 437 XSTRING_LENGTH (string), bis, |
454 XSTRING_LENGTH (string) - bis, | 438 XSTRING_LENGTH (string) - bis, |
455 &search_regs); | 439 &search_regs, string, buf, scache); |
456 } | 440 } |
457 if (val == -2) | 441 if (val == -2) |
458 matcher_overflow (); | 442 matcher_overflow (); |
459 if (val < 0) return unbind_to (count); | 443 if (val < 0) return Qnil; |
460 last_thing_searched = Qt; | 444 last_thing_searched = Qt; |
461 fixup_search_regs_for_string (string); | 445 fixup_search_regs_for_string (string); |
462 return | 446 return make_int (string_index_byte_to_char (string, val)); |
463 unbind_to_1 (count, | |
464 make_int (string_index_byte_to_char (string, val))); | |
465 } | 447 } |
466 | 448 |
467 DEFUN ("string-match", Fstring_match, 2, 4, 0, /* | 449 DEFUN ("string-match", Fstring_match, 2, 4, 0, /* |
468 Return index of start of first match for REGEXP in STRING, or nil. | 450 Return index of start of first match for REGEXP in STRING, or nil. |
469 If third arg START is non-nil, start search at that index in STRING. | 451 If third arg START is non-nil, start search at that index in STRING. |
470 For index of first char beyond the match, do (match-end 0). | 452 For index of first char beyond the match, do (match-end 0). |
471 `match-end' and `match-beginning' also give indices of substrings | 453 `match-end' and `match-beginning' also give indices of substrings |
472 matched by parenthesis constructs in the pattern. | 454 matched by parenthesis constructs in the pattern. |
473 | 455 |
474 Optional arg BUFFER controls how case folding is done (according to | 456 Optional arg BUFFER controls how case folding and syntax and category |
475 the value of `case-fold-search' in that buffer and that buffer's case | 457 lookup is done (according to the value of `case-fold-search' in that buffer |
476 tables) and defaults to the current buffer. | 458 and that buffer's case tables, syntax tables, and category table). If nil |
459 or unspecified, it defaults *NOT* to the current buffer but instead: | |
460 | |
461 -- the value of `case-fold-search' in the current buffer is still respected | |
462 because of idioms like | |
463 | |
464 (let ((case-fold-search nil)) | |
465 (string-match "^foo.*bar" string)) | |
466 | |
467 but the case, syntax, and category tables come from the standard tables, | |
468 which are accessed through functions `default-{case,syntax,category}-table' and serve as the parents of the | |
469 tables in particular buffer | |
470 | |
477 */ | 471 */ |
478 (regexp, string, start, buffer)) | 472 (regexp, string, start, buffer)) |
479 { | 473 { |
474 /* &&#### implement new interp for buffer arg; check code to see if it | |
475 makes more sense than prev */ | |
480 return string_match_1 (regexp, string, start, decode_buffer (buffer, 0), 0); | 476 return string_match_1 (regexp, string, start, decode_buffer (buffer, 0), 0); |
481 } | 477 } |
482 | 478 |
483 DEFUN ("posix-string-match", Fposix_string_match, 2, 4, 0, /* | 479 DEFUN ("posix-string-match", Fposix_string_match, 2, 4, 0, /* |
484 Return index of start of first match for REGEXP in STRING, or nil. | 480 Return index of start of first match for REGEXP in STRING, or nil. |
505 fast_string_match (Lisp_Object regexp, const Intbyte *nonreloc, | 501 fast_string_match (Lisp_Object regexp, const Intbyte *nonreloc, |
506 Lisp_Object reloc, Bytecount offset, | 502 Lisp_Object reloc, Bytecount offset, |
507 Bytecount length, int case_fold_search, | 503 Bytecount length, int case_fold_search, |
508 Error_Behavior errb, int no_quit) | 504 Error_Behavior errb, int no_quit) |
509 { | 505 { |
510 /* This function has been Mule-ized, except for the trt table handling. */ | |
511 Bytecount val; | 506 Bytecount val; |
512 Intbyte *newnonreloc = (Intbyte *) nonreloc; | 507 Intbyte *newnonreloc = (Intbyte *) nonreloc; |
513 struct re_pattern_buffer *bufp; | 508 struct re_pattern_buffer *bufp; |
514 int count; | 509 struct syntax_cache scache_struct; |
510 struct syntax_cache *scache = &scache_struct; | |
515 | 511 |
516 bufp = compile_pattern (regexp, 0, | 512 bufp = compile_pattern (regexp, 0, |
517 (case_fold_search | 513 (case_fold_search |
518 ? XCASE_TABLE_DOWNCASE (Vstandard_case_table) | 514 ? XCASE_TABLE_DOWNCASE (Vstandard_case_table) |
519 : Qnil), | 515 : Qnil), |
520 0, errb); | 516 reloc, 0, 0, errb); |
521 if (!bufp) | 517 if (!bufp) |
522 return -1; /* will only do this when errb != ERROR_ME */ | 518 return -1; /* will only do this when errb != ERROR_ME */ |
523 if (!no_quit) | 519 if (!no_quit) |
524 QUIT; | 520 QUIT; |
525 else | 521 else |
530 /* Don't need to protect against GC inside of re_search() due to QUIT; | 526 /* Don't need to protect against GC inside of re_search() due to QUIT; |
531 QUIT is GC-inhibited. */ | 527 QUIT is GC-inhibited. */ |
532 if (!NILP (reloc)) | 528 if (!NILP (reloc)) |
533 newnonreloc = XSTRING_DATA (reloc); | 529 newnonreloc = XSTRING_DATA (reloc); |
534 | 530 |
535 count = begin_regex_reentrancy (); | 531 /* By making the regex object, regex buffer, and syntax cache arguments |
536 /* #### evil current-buffer dependency */ | 532 to re_{search,match}{,_2}, we've removed the need to do nasty things |
537 regex_match_object = reloc; | 533 to deal with regex reentrancy. (See stack trace in signal.c for proof |
538 regex_emacs_buffer = current_buffer; | 534 that this can happen.) |
535 | |
536 #### there is still a potential problem with the regex cache -- | |
537 the compiled regex could be overwritten. we'd need 20-fold | |
538 reentrancy, though. Fix this. */ | |
539 | |
539 val = re_search (bufp, (char *) newnonreloc + offset, length, 0, | 540 val = re_search (bufp, (char *) newnonreloc + offset, length, 0, |
540 length, 0); | 541 length, 0, reloc, 0, scache); |
541 | 542 |
542 no_quit_in_re_search = 0; | 543 no_quit_in_re_search = 0; |
543 unbind_to (count); | |
544 return val; | 544 return val; |
545 } | 545 } |
546 | 546 |
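Note on the reentrancy comment in fast_string_match above: the change replaces global state (regex_match_object, regex_emacs_buffer) with arguments passed to re_search(). The following is a minimal sketch of that idea in isolation, not the XEmacs regex API. The names match_ctx and count_char are hypothetical, and the nested call merely stands in for a search triggered from inside another search.

```c
#include <stddef.h>

/* Hypothetical per-call state; stands in for the match object, buffer
   and syntax cache that the patch passes explicitly to re_search(). */
struct match_ctx
{
  const char *text;   /* text being searched */
  size_t len;         /* length of TEXT in bytes */
  int depth;          /* nesting depth, for demonstration only */
};

/* Count occurrences of C in CTX->text.  For each hit at depth 0,
   re-enter the searcher on a different text, the way a handler invoked
   mid-search might.  Because all state lives in CTX rather than in
   globals, the nested call cannot clobber the outer one. */
static size_t
count_char (const struct match_ctx *ctx, char c)
{
  size_t hits = 0;
  size_t i;

  for (i = 0; i < ctx->len; i++)
    if (ctx->text[i] == c)
      {
        hits++;
        if (ctx->depth == 0)
          {
            struct match_ctx inner = { "zzz", 3, 1 };
            (void) count_char (&inner, 'z');   /* nested search */
          }
      }
  return hits;
}
```

With globals, the nested call would have to save and restore the outer search's state (the job begin_regex_reentrancy() and unbind_to() did in the old column); with explicit arguments there is nothing to restore.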
547 Bytecount | 547 Bytecount |
548 fast_lisp_string_match (Lisp_Object regex, Lisp_Object string) | 548 fast_lisp_string_match (Lisp_Object regex, Lisp_Object string) |
598 to the number of TARGETs left unfound, and return END. | 598 to the number of TARGETs left unfound, and return END. |
599 | 599 |
600 If ALLOW_QUIT is non-zero, call QUIT periodically. */ | 600 If ALLOW_QUIT is non-zero, call QUIT periodically. */ |
601 | 601 |
602 static Bytebpos | 602 static Bytebpos |
603 bi_scan_buffer (struct buffer *buf, Emchar target, Bytebpos st, Bytebpos en, | 603 byte_scan_buffer (struct buffer *buf, Emchar target, Bytebpos st, Bytebpos en, |
604 EMACS_INT count, EMACS_INT *shortage, int allow_quit) | 604 EMACS_INT count, EMACS_INT *shortage, int allow_quit) |
605 { | 605 { |
606 /* This function has been Mule-ized. */ | |
607 Bytebpos lim = en > 0 ? en : | 606 Bytebpos lim = en > 0 ? en : |
608 ((count > 0) ? BI_BUF_ZV (buf) : BI_BUF_BEGV (buf)); | 607 ((count > 0) ? BYTE_BUF_ZV (buf) : BYTE_BUF_BEGV (buf)); |
609 | 608 |
610 /* #### newline cache stuff in this function not yet ported */ | 609 /* #### newline cache stuff in this function not yet ported */ |
611 | |
612 assert (count != 0); | 610 assert (count != 0); |
613 | 611 |
614 if (shortage) | 612 if (shortage) |
615 *shortage = 0; | 613 *shortage = 0; |
616 | 614 |
617 if (count > 0) | 615 if (count > 0) |
618 { | 616 { |
619 #ifdef MULE | 617 #ifdef MULE |
620 /* Due to the Mule representation of characters in a buffer, | 618 Internal_Format fmt = buf->text->format; |
621 we can simply search for characters in the range 0 - 127 | 619 /* Check for char that's unrepresentable in the buffer -- it |
622 directly. For other characters, we do it the "hard" way. | 620 certainly can't be there. */ |
623 Note that this way works for all characters but the other | 621 if (!emchar_fits_in_format (target, fmt, wrap_buffer (buf))) |
624 way is faster. */ | 622 { |
625 if (target >= 0200) | 623 *shortage = count; |
626 { | 624 return lim; |
625 } | |
626 /* Due to the Mule representation of characters in a buffer, we can | |
627 simply search for characters in the range 0 - 127 directly; for | |
628 8-bit-fixed, we can do this for all characters. In other cases, | |
629 we do it the "hard" way. Note that this way works for all | |
630 characters and all formats, but the other way is faster. */ | |
631 else if (! (fmt == FORMAT_8_BIT_FIXED || | |
632 (fmt == FORMAT_DEFAULT && emchar_ascii_p (target)))) | |
633 { | |
634 Raw_Emchar raw = emchar_to_raw (target, fmt, wrap_buffer (buf)); | |
627 while (st < lim && count > 0) | 635 while (st < lim && count > 0) |
628 { | 636 { |
629 if (BI_BUF_FETCH_CHAR (buf, st) == target) | 637 if (BYTE_BUF_FETCH_CHAR_RAW (buf, st) == raw) |
630 count--; | 638 count--; |
631 INC_BYTEBPOS (buf, st); | 639 INC_BYTEBPOS (buf, st); |
632 } | 640 } |
633 } | 641 } |
634 else | 642 else |
635 #endif | 643 #endif |
636 { | 644 { |
645 Raw_Emchar raw = emchar_to_raw (target, fmt, wrap_buffer (buf)); | |
637 while (st < lim && count > 0) | 646 while (st < lim && count > 0) |
638 { | 647 { |
639 Bytebpos ceil; | 648 Bytebpos ceil; |
640 Intbyte *bufptr; | 649 Intbyte *bufptr; |
641 | 650 |
642 ceil = BI_BUF_CEILING_OF (buf, st); | 651 ceil = BYTE_BUF_CEILING_OF (buf, st); |
643 ceil = min (lim, ceil); | 652 ceil = min (lim, ceil); |
644 bufptr = (Intbyte *) memchr (BI_BUF_BYTE_ADDRESS (buf, st), | 653 bufptr = (Intbyte *) memchr (BYTE_BUF_BYTE_ADDRESS (buf, st), |
645 (int) target, ceil - st); | 654 raw, ceil - st); |
646 if (bufptr) | 655 if (bufptr) |
647 { | 656 { |
648 count--; | 657 count--; |
649 st = BI_BUF_PTR_BYTE_POS (buf, bufptr) + 1; | 658 st = BYTE_BUF_PTR_BYTE_POS (buf, bufptr) + 1; |
650 } | 659 } |
651 else | 660 else |
652 st = ceil; | 661 st = ceil; |
653 } | 662 } |
654 } | 663 } |
660 return st; | 669 return st; |
661 } | 670 } |
662 else | 671 else |
663 { | 672 { |
664 #ifdef MULE | 673 #ifdef MULE |
665 if (target >= 0200) | 674 Internal_Format fmt = buf->text->format; |
666 { | 675 /* Check for char that's unrepresentable in the buffer -- it |
676 certainly can't be there. */ | |
677 if (!emchar_fits_in_format (target, fmt, wrap_buffer (buf))) | |
678 { | |
679 *shortage = -count; | |
680 return lim; | |
681 } | |
682 else if (! (fmt == FORMAT_8_BIT_FIXED || | |
683 (fmt == FORMAT_DEFAULT && emchar_ascii_p (target)))) | |
684 { | |
685 Raw_Emchar raw = emchar_to_raw (target, fmt, wrap_buffer (buf)); | |
667 while (st > lim && count < 0) | 686 while (st > lim && count < 0) |
668 { | 687 { |
669 DEC_BYTEBPOS (buf, st); | 688 DEC_BYTEBPOS (buf, st); |
670 if (BI_BUF_FETCH_CHAR (buf, st) == target) | 689 if (BYTE_BUF_FETCH_CHAR_RAW (buf, st) == raw) |
671 count++; | 690 count++; |
672 } | 691 } |
673 } | 692 } |
674 else | 693 else |
675 #endif | 694 #endif |
676 { | 695 { |
696 Raw_Emchar raw = emchar_to_raw (target, fmt, wrap_buffer (buf)); | |
677 while (st > lim && count < 0) | 697 while (st > lim && count < 0) |
678 { | 698 { |
679 Bytebpos floor; | 699 Bytebpos floor; |
680 Intbyte *bufptr; | 700 Intbyte *bufptr; |
681 Intbyte *floorptr; | 701 Intbyte *floorptr; |
682 | 702 |
683 floor = BI_BUF_FLOOR_OF (buf, st); | 703 floor = BYTE_BUF_FLOOR_OF (buf, st); |
684 floor = max (lim, floor); | 704 floor = max (lim, floor); |
685 /* No memrchr() ... */ | 705 /* No memrchr() ... */ |
686 bufptr = BI_BUF_BYTE_ADDRESS_BEFORE (buf, st); | 706 bufptr = BYTE_BUF_BYTE_ADDRESS_BEFORE (buf, st); |
687 floorptr = BI_BUF_BYTE_ADDRESS (buf, floor); | 707 floorptr = BYTE_BUF_BYTE_ADDRESS (buf, floor); |
688 while (bufptr >= floorptr) | 708 while (bufptr >= floorptr) |
689 { | 709 { |
690 st--; | 710 st--; |
691 /* At this point, both ST and BUFPTR refer to the same | 711 /* At this point, both ST and BUFPTR refer to the same |
692 character. When the loop terminates, ST will | 712 character. When the loop terminates, ST will |
693 always point to the last character we tried. */ | 713 always point to the last character we tried. */ |
694 if (* (unsigned char *) bufptr == (unsigned char) target) | 714 if (*bufptr == (Intbyte) raw) |
695 { | 715 { |
696 count++; | 716 count++; |
697 break; | 717 break; |
698 } | 718 } |
699 bufptr--; | 719 bufptr--; |
720 | 740 |
721 Charbpos | 741 Charbpos |
722 scan_buffer (struct buffer *buf, Emchar target, Charbpos start, Charbpos end, | 742 scan_buffer (struct buffer *buf, Emchar target, Charbpos start, Charbpos end, |
723 EMACS_INT count, EMACS_INT *shortage, int allow_quit) | 743 EMACS_INT count, EMACS_INT *shortage, int allow_quit) |
724 { | 744 { |
725 Bytebpos bi_retval; | 745 Bytebpos byte_retval; |
726 Bytebpos bi_start, bi_end; | 746 Bytebpos byte_start, byte_end; |
727 | 747 |
728 bi_start = charbpos_to_bytebpos (buf, start); | 748 byte_start = charbpos_to_bytebpos (buf, start); |
729 if (end) | 749 if (end) |
730 bi_end = charbpos_to_bytebpos (buf, end); | 750 byte_end = charbpos_to_bytebpos (buf, end); |
731 else | 751 else |
732 bi_end = 0; | 752 byte_end = 0; |
733 bi_retval = bi_scan_buffer (buf, target, bi_start, bi_end, count, | 753 byte_retval = byte_scan_buffer (buf, target, byte_start, byte_end, count, |
734 shortage, allow_quit); | 754 shortage, allow_quit); |
735 return bytebpos_to_charbpos (buf, bi_retval); | 755 return bytebpos_to_charbpos (buf, byte_retval); |
736 } | 756 } |
737 | 757 |
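Note on the fast path in byte_scan_buffer above: the forward memchr() loop, together with the COUNT and SHORTAGE contract, can be sketched on a plain byte array. This is an illustration under simplifying assumptions, not the XEmacs code: the text is one contiguous array (no buffer gap, so no CEILING_OF step), every character is one byte, and scan_forward is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Return the index just past the COUNT-th occurrence of TARGET in
   BUF[START..LIMIT), or LIMIT if fewer than COUNT occurrences exist.
   If SHORTAGE is non-null, store the number of occurrences left
   unfound.  Assumes START <= LIMIT and COUNT > 0. */
static size_t
scan_forward (const unsigned char *buf, size_t start, size_t limit,
              unsigned char target, long count, long *shortage)
{
  size_t st = start;

  while (st < limit && count > 0)
    {
      const unsigned char *hit =
        (const unsigned char *) memchr (buf + st, target, limit - st);
      if (hit)
        {
          count--;
          st = (size_t) (hit - buf) + 1;   /* resume after the hit */
        }
      else
        st = limit;                        /* nothing left to find */
    }
  if (shortage)
    *shortage = count;
  return st;
}
```

The sketch checks SHORTAGE for null before every store, since callers such as byte_find_next_newline_no_quit pass 0 for that argument.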
738 Bytebpos | 758 Bytebpos |
739 bi_find_next_newline_no_quit (struct buffer *buf, Bytebpos from, int count) | 759 byte_find_next_newline_no_quit (struct buffer *buf, Bytebpos from, int count) |
740 { | 760 { |
741 return bi_scan_buffer (buf, '\n', from, 0, count, 0, 0); | 761 return byte_scan_buffer (buf, '\n', from, 0, count, 0, 0); |
742 } | 762 } |
743 | 763 |
744 Charbpos | 764 Charbpos |
745 find_next_newline_no_quit (struct buffer *buf, Charbpos from, int count) | 765 find_next_newline_no_quit (struct buffer *buf, Charbpos from, int count) |
746 { | 766 { |
751 find_next_newline (struct buffer *buf, Charbpos from, int count) | 771 find_next_newline (struct buffer *buf, Charbpos from, int count) |
752 { | 772 { |
753 return scan_buffer (buf, '\n', from, 0, count, 0, 1); | 773 return scan_buffer (buf, '\n', from, 0, count, 0, 1); |
754 } | 774 } |
755 | 775 |
756 Bytebpos | 776 Bytecount |
757 bi_find_next_emchar_in_string (Lisp_Object str, Emchar target, Bytebpos st, | 777 byte_find_next_emchar_in_string (Lisp_Object str, Emchar target, Bytecount st, |
758 EMACS_INT count) | 778 EMACS_INT count) |
759 { | 779 { |
760 /* This function has been Mule-ized. */ | |
761 Bytebpos lim = XSTRING_LENGTH (str) -1; | 780 Bytebpos lim = XSTRING_LENGTH (str) -1; |
762 Intbyte *s = XSTRING_DATA (str); | 781 Intbyte *s = XSTRING_DATA (str); |
763 | 782 |
764 assert (count >= 0); | 783 assert (count >= 0); |
765 | 784 |
771 way is faster. */ | 790 way is faster. */ |
772 if (target >= 0200) | 791 if (target >= 0200) |
773 { | 792 { |
774 while (st < lim && count > 0) | 793 while (st < lim && count > 0) |
775 { | 794 { |
776 if (XSTRING_CHAR (str, st) == target) | 795 if (string_emchar (str, st) == target) |
777 count--; | 796 count--; |
778 INC_CHARBYTEBPOS (s, st); | 797 INC_BYTECOUNT (s, st); |
779 } | 798 } |
780 } | 799 } |
781 else | 800 else |
782 #endif | 801 #endif |
783 { | 802 { |
786 Intbyte *bufptr = (Intbyte *) memchr (charptr_n_addr (s, st), | 805 Intbyte *bufptr = (Intbyte *) memchr (charptr_n_addr (s, st), |
787 (int) target, lim - st); | 806 (int) target, lim - st); |
788 if (bufptr) | 807 if (bufptr) |
789 { | 808 { |
790 count--; | 809 count--; |
791 st = (Bytebpos)(bufptr - s) + 1; | 810 st = (Bytebpos) (bufptr - s) + 1; |
792 } | 811 } |
793 else | 812 else |
794 st = lim; | 813 st = lim; |
795 } | 814 } |
796 } | 815 } |
799 | 818 |
800 /* Like find_next_newline, but returns position before the newline, | 819 /* Like find_next_newline, but returns position before the newline, |
801 not after, and only search up to TO. This isn't just | 820 not after, and only search up to TO. This isn't just |
802 find_next_newline (...)-1, because you might hit TO. */ | 821 find_next_newline (...)-1, because you might hit TO. */ |
803 Charbpos | 822 Charbpos |
804 find_before_next_newline (struct buffer *buf, Charbpos from, Charbpos to, int count) | 823 find_before_next_newline (struct buffer *buf, Charbpos from, Charbpos to, |
824 int count) | |
805 { | 825 { |
806 EMACS_INT shortage; | 826 EMACS_INT shortage; |
807 Charbpos pos = scan_buffer (buf, '\n', from, to, count, &shortage, 1); | 827 Charbpos pos = scan_buffer (buf, '\n', from, to, count, &shortage, 1); |
808 | 828 |
809 if (shortage == 0) | 829 if (shortage == 0) |
814 | 834 |
815 static Lisp_Object | 835 static Lisp_Object |
816 skip_chars (struct buffer *buf, int forwardp, int syntaxp, | 836 skip_chars (struct buffer *buf, int forwardp, int syntaxp, |
817 Lisp_Object string, Lisp_Object lim) | 837 Lisp_Object string, Lisp_Object lim) |
818 { | 838 { |
819 /* This function has been Mule-ized. */ | |
820 REGISTER Intbyte *p, *pend; | 839 REGISTER Intbyte *p, *pend; |
821 REGISTER Emchar c; | 840 REGISTER Emchar c; |
822 /* We store the first 256 chars in an array here and the rest in | 841 /* We store the first 256 chars in an array here and the rest in |
823 a range table. */ | 842 a range table. */ |
824 unsigned char fastmap[0400]; | 843 unsigned char fastmap[0400]; |
825 int negate = 0; | 844 int negate = 0; |
826 REGISTER int i; | 845 REGISTER int i; |
827 #ifndef emacs | |
828 Lisp_Char_Table *syntax_table = XCHAR_TABLE (buf->mirror_syntax_table); | |
829 #endif | |
830 Charbpos limit; | 846 Charbpos limit; |
831 | 847 struct syntax_cache *scache; |
848 | |
832 if (NILP (lim)) | 849 if (NILP (lim)) |
833 limit = forwardp ? BUF_ZV (buf) : BUF_BEGV (buf); | 850 limit = forwardp ? BUF_ZV (buf) : BUF_BEGV (buf); |
834 else | 851 else |
835 { | 852 { |
836 CHECK_INT_COERCE_MARKER (lim); | 853 CHECK_INT_COERCE_MARKER (lim); |
920 { | 937 { |
921 Charbpos start_point = BUF_PT (buf); | 938 Charbpos start_point = BUF_PT (buf); |
922 | 939 |
923 if (syntaxp) | 940 if (syntaxp) |
924 { | 941 { |
925 SETUP_SYNTAX_CACHE_FOR_BUFFER (buf, BUF_PT (buf), forwardp ? 1 : -1); | 942 scache = setup_buffer_syntax_cache (buf, BUF_PT (buf), |
943 forwardp ? 1 : -1); | |
926 /* All syntax designators are normal chars so nothing strange | 944 /* All syntax designators are normal chars so nothing strange |
927 to worry about */ | 945 to worry about */ |
928 if (forwardp) | 946 if (forwardp) |
929 { | 947 { |
930 while (BUF_PT (buf) < limit | 948 while (BUF_PT (buf) < limit |
931 && fastmap[(unsigned char) | 949 && fastmap[(unsigned char) |
932 syntax_code_spec | 950 syntax_code_spec |
933 [(int) SYNTAX_FROM_CACHE (syntax_table, | 951 [(int) SYNTAX_FROM_CACHE |
934 BUF_FETCH_CHAR | 952 (scache, BUF_FETCH_CHAR (buf, BUF_PT (buf)))]]) |
935 (buf, BUF_PT (buf)))]]) | |
936 { | 953 { |
937 BUF_SET_PT (buf, BUF_PT (buf) + 1); | 954 BUF_SET_PT (buf, BUF_PT (buf) + 1); |
938 UPDATE_SYNTAX_CACHE_FORWARD (BUF_PT (buf)); | 955 UPDATE_SYNTAX_CACHE_FORWARD (scache, BUF_PT (buf)); |
939 } | 956 } |
940 } | 957 } |
941 else | 958 else |
942 { | 959 { |
943 while (BUF_PT (buf) > limit | 960 while (BUF_PT (buf) > limit |
944 && fastmap[(unsigned char) | 961 && fastmap[(unsigned char) |
945 syntax_code_spec | 962 syntax_code_spec |
946 [(int) SYNTAX_FROM_CACHE (syntax_table, | 963 [(int) SYNTAX_FROM_CACHE |
947 BUF_FETCH_CHAR | 964 (scache, |
948 (buf, BUF_PT (buf) - 1))]]) | 965 BUF_FETCH_CHAR (buf, BUF_PT (buf) - 1))]]) |
949 { | 966 { |
950 BUF_SET_PT (buf, BUF_PT (buf) - 1); | 967 BUF_SET_PT (buf, BUF_PT (buf) - 1); |
951 UPDATE_SYNTAX_CACHE_BACKWARD (BUF_PT (buf) - 1); | 968 UPDATE_SYNTAX_CACHE_BACKWARD (scache, BUF_PT (buf) - 1); |
952 } | 969 } |
953 } | 970 } |
954 } | 971 } |
955 else | 972 else |
956 { | 973 { |
1052 static Lisp_Object | 1069 static Lisp_Object |
1053 search_command (Lisp_Object string, Lisp_Object limit, Lisp_Object noerror, | 1070 search_command (Lisp_Object string, Lisp_Object limit, Lisp_Object noerror, |
1054 Lisp_Object count, Lisp_Object buffer, int direction, | 1071 Lisp_Object count, Lisp_Object buffer, int direction, |
1055 int RE, int posix) | 1072 int RE, int posix) |
1056 { | 1073 { |
1057 /* This function has been Mule-ized, except for the trt table handling. */ | |
1058 REGISTER Charbpos np; | 1074 REGISTER Charbpos np; |
1059 Charbpos lim; | 1075 Charbpos lim; |
1060 EMACS_INT n = direction; | 1076 EMACS_INT n = direction; |
1061 struct buffer *buf; | 1077 struct buffer *buf; |
1062 | 1078 |
1119 } | 1135 } |
1120 | 1136 |
1121 static int | 1137 static int |
1122 trivial_regexp_p (Lisp_Object regexp) | 1138 trivial_regexp_p (Lisp_Object regexp) |
1123 { | 1139 { |
1124 /* This function has been Mule-ized. */ | |
1125 Bytecount len = XSTRING_LENGTH (regexp); | 1140 Bytecount len = XSTRING_LENGTH (regexp); |
1126 Intbyte *s = XSTRING_DATA (regexp); | 1141 Intbyte *s = XSTRING_DATA (regexp); |
1127 while (--len >= 0) | 1142 while (--len >= 0) |
1128 { | 1143 { |
1129 switch (*s++) | 1144 switch (*s++) |
1170 static Charbpos | 1185 static Charbpos |
1171 search_buffer (struct buffer *buf, Lisp_Object string, Charbpos charbpos, | 1186 search_buffer (struct buffer *buf, Lisp_Object string, Charbpos charbpos, |
1172 Charbpos buflim, EMACS_INT n, int RE, Lisp_Object trt, | 1187 Charbpos buflim, EMACS_INT n, int RE, Lisp_Object trt, |
1173 Lisp_Object inverse_trt, int posix) | 1188 Lisp_Object inverse_trt, int posix) |
1174 { | 1189 { |
1175 /* This function has been Mule-ized, except for the trt table handling. */ | |
1176 Bytecount len = XSTRING_LENGTH (string); | 1190 Bytecount len = XSTRING_LENGTH (string); |
1177 Intbyte *base_pat = XSTRING_DATA (string); | 1191 Intbyte *base_pat = XSTRING_DATA (string); |
1178 REGISTER EMACS_INT i, j; | 1192 REGISTER EMACS_INT i, j; |
1179 Bytebpos p1, p2; | 1193 Bytebpos p1, p2; |
1180 Bytecount s1, s2; | 1194 Bytecount s1, s2; |
1181 Bytebpos pos, lim; | 1195 Bytebpos pos, lim; |
1182 int count; | |
1183 | 1196 |
1184 if (running_asynch_code) | 1197 if (running_asynch_code) |
1185 save_search_regs (); | 1198 save_search_regs (); |
1186 | 1199 |
1187 /* Null string is found at starting position. */ | 1200 /* Null string is found at starting position. */ |
1198 pos = charbpos_to_bytebpos (buf, charbpos); | 1211 pos = charbpos_to_bytebpos (buf, charbpos); |
1199 lim = charbpos_to_bytebpos (buf, buflim); | 1212 lim = charbpos_to_bytebpos (buf, buflim); |
1200 if (RE && !trivial_regexp_p (string)) | 1213 if (RE && !trivial_regexp_p (string)) |
1201 { | 1214 { |
1202 struct re_pattern_buffer *bufp; | 1215 struct re_pattern_buffer *bufp; |
1203 count = begin_regex_reentrancy (); | 1216 |
1204 | 1217 bufp = compile_pattern (string, &search_regs, trt, |
1205 bufp = compile_pattern (string, &search_regs, trt, posix, | 1218 wrap_buffer (buf), buf, posix, ERROR_ME); |
1206 ERROR_ME); | |
1207 | 1219 |
1208 /* Get pointers and sizes of the two strings | 1220 /* Get pointers and sizes of the two strings |
1209 that make up the visible portion of the buffer. */ | 1221 that make up the visible portion of the buffer. */ |
1210 | 1222 |
1211 p1 = BI_BUF_BEGV (buf); | 1223 p1 = BYTE_BUF_BEGV (buf); |
1212 p2 = BI_BUF_CEILING_OF (buf, p1); | 1224 p2 = BYTE_BUF_CEILING_OF (buf, p1); |
1213 s1 = p2 - p1; | 1225 s1 = p2 - p1; |
1214 s2 = BI_BUF_ZV (buf) - p2; | 1226 s2 = BYTE_BUF_ZV (buf) - p2; |
1215 regex_match_object = Qnil; | 1227 |
1216 | 1228 while (n != 0) |
1217 while (n < 0) | |
1218 { | 1229 { |
1219 Bytecount val; | 1230 Bytecount val; |
1231 struct syntax_cache scache_struct; | |
1232 struct syntax_cache *scache = &scache_struct; | |
1233 | |
1220 QUIT; | 1234 QUIT; |
1221 regex_emacs_buffer = buf; | 1235 /* By making the regex object, regex buffer, and syntax cache |
1236 arguments to re_{search,match}{,_2}, we've removed the need to | |
1237 do nasty things to deal with regex reentrancy. (See stack | |
1238 trace in signal.c for proof that this can happen.) | |
1239 | |
1240 #### there is still a potential problem with the regex cache -- | |
1241 the compiled regex could be overwritten. we'd need 20-fold | |
1242 reentrancy, though. Fix this. */ | |
1243 | |
1222 val = re_search_2 (bufp, | 1244 val = re_search_2 (bufp, |
1223 (char *) BI_BUF_BYTE_ADDRESS (buf, p1), s1, | 1245 (char *) BYTE_BUF_BYTE_ADDRESS (buf, p1), s1, |
1224 (char *) BI_BUF_BYTE_ADDRESS (buf, p2), s2, | 1246 (char *) BYTE_BUF_BYTE_ADDRESS (buf, p2), s2, |
1225 pos - BI_BUF_BEGV (buf), lim - pos, &search_regs, | 1247 pos - BYTE_BUF_BEGV (buf), lim - pos, &search_regs, |
1226 pos - BI_BUF_BEGV (buf)); | 1248 n > 0 ? lim - BYTE_BUF_BEGV (buf) : |
1249 pos - BYTE_BUF_BEGV (buf), wrap_buffer (buf), | |
1250 buf, scache); | |
1227 | 1251 |
1228 if (val == -2) | 1252 if (val == -2) |
1229 { | 1253 { |
1230 matcher_overflow (); | 1254 matcher_overflow (); |
1231 } | 1255 } |
1232 if (val >= 0) | 1256 if (val >= 0) |
1233 { | 1257 { |
1234 int num_regs = search_regs.num_regs; | 1258 int num_regs = search_regs.num_regs; |
1235 j = BI_BUF_BEGV (buf); | 1259 j = BYTE_BUF_BEGV (buf); |
1236 for (i = 0; i < num_regs; i++) | 1260 for (i = 0; i < num_regs; i++) |
1237 if (search_regs.start[i] >= 0) | 1261 if (search_regs.start[i] >= 0) |
1238 { | 1262 { |
1239 search_regs.start[i] += j; | 1263 search_regs.start[i] += j; |
1240 search_regs.end[i] += j; | 1264 search_regs.end[i] += j; |
1241 } | 1265 } |
1242 last_thing_searched = wrap_buffer (buf); | 1266 last_thing_searched = wrap_buffer (buf); |
1243 /* Set pos to the new position. */ | 1267 /* Set pos to the new position. */ |
1244 pos = search_regs.start[0]; | 1268 pos = n > 0 ? search_regs.end[0] : search_regs.start[0]; |
1245 fixup_search_regs_for_buffer (buf); | 1269 fixup_search_regs_for_buffer (buf); |
1246 /* And charbpos too. */ | 1270 /* And charbpos too. */ |
1247 charbpos = search_regs.start[0]; | 1271 charbpos = n > 0 ? search_regs.end[0] : search_regs.start[0]; |
1248 } | 1272 } |
1249 else | 1273 else |
1250 { | 1274 return (n > 0 ? 0 - n : n); |
1251 unbind_to (count); | 1275 if (n > 0) n--; else n++; |
1252 return n; | 1276 } |
1253 } | |
1254 n++; | |
1255 } | |
1256 while (n > 0) | |
1257 { | |
1258 Bytecount val; | |
1259 QUIT; | |
1260 regex_emacs_buffer = buf; | |
1261 val = re_search_2 (bufp, | |
1262 (char *) BI_BUF_BYTE_ADDRESS (buf, p1), s1, | |
1263 (char *) BI_BUF_BYTE_ADDRESS (buf, p2), s2, | |
1264 pos - BI_BUF_BEGV (buf), lim - pos, &search_regs, | |
1265 lim - BI_BUF_BEGV (buf)); | |
1266 if (val == -2) | |
1267 { | |
1268 matcher_overflow (); | |
1269 } | |
1270 if (val >= 0) | |
1271 { | |
1272 int num_regs = search_regs.num_regs; | |
1273 j = BI_BUF_BEGV (buf); | |
1274 for (i = 0; i < num_regs; i++) | |
1275 if (search_regs.start[i] >= 0) | |
1276 { | |
1277 search_regs.start[i] += j; | |
1278 search_regs.end[i] += j; | |
1279 } | |
1280 last_thing_searched = wrap_buffer (buf); | |
1281 /* Set pos to the new position. */ | |
1282 pos = search_regs.end[0]; | |
1283 fixup_search_regs_for_buffer (buf); | |
1284 /* And charbpos too. */ | |
1285 charbpos = search_regs.end[0]; | |
1286 } | |
1287 else | |
1288 { | |
1289 unbind_to (count); | |
1290 return 0 - n; | |
1291 } | |
1292 n--; | |
1293 } | |
1294 unbind_to (count); | |
1295 return charbpos; | 1277 return charbpos; |
1296 } | 1278 } |
1297 else /* non-RE case */ | 1279 else /* non-RE case */ |
1298 { | 1280 { |
1299 int charset_base = -1; | 1281 int charset_base = -1; |
1300 int boyer_moore_ok = 1; | 1282 int boyer_moore_ok = 1; |
1301 Intbyte *pat = 0; | 1283 Intbyte *pat = 0; |
1302 Intbyte *patbuf = alloca_array (Intbyte, len * MAX_EMCHAR_LEN); | 1284 Intbyte *patbuf = alloca_array (Intbyte, len * MAX_EMCHAR_LEN); |
1303 pat = patbuf; | 1285 pat = patbuf; |
1304 #ifdef MULE | 1286 #ifdef MULE |
1287 /* &&#### needs some 8-bit work here */ | |
1305 while (len > 0) | 1288 while (len > 0) |
1306 { | 1289 { |
1307 Intbyte tmp_str[MAX_EMCHAR_LEN]; | 1290 Intbyte tmp_str[MAX_EMCHAR_LEN]; |
1308 Emchar c, translated, inverse; | 1291 Emchar c, translated, inverse; |
1309 Bytecount orig_bytelen, new_bytelen, inv_bytelen; | 1292 Bytecount orig_bytelen, new_bytelen, inv_bytelen; |
1318 } | 1301 } |
1319 c = charptr_emchar (base_pat); | 1302 c = charptr_emchar (base_pat); |
1320 translated = TRANSLATE (trt, c); | 1303 translated = TRANSLATE (trt, c); |
1321 inverse = TRANSLATE (inverse_trt, c); | 1304 inverse = TRANSLATE (inverse_trt, c); |
1322 | 1305 |
1323 orig_bytelen = charcount_to_bytecount (base_pat, 1); | 1306 orig_bytelen = charptr_emchar_len (base_pat); |
1324 inv_bytelen = set_charptr_emchar (tmp_str, inverse); | 1307 inv_bytelen = set_charptr_emchar (tmp_str, inverse); |
1325 new_bytelen = set_charptr_emchar (tmp_str, translated); | 1308 new_bytelen = set_charptr_emchar (tmp_str, translated); |
1326 | |
1327 | 1309 |
1328 if (new_bytelen != orig_bytelen || inv_bytelen != orig_bytelen) | 1310 if (new_bytelen != orig_bytelen || inv_bytelen != orig_bytelen) |
1329 boyer_moore_ok = 0; | 1311 boyer_moore_ok = 0; |
1330 if (translated != c || inverse != c) | 1312 if (translated != c || inverse != c) |
1331 { | 1313 { |
1332 /* Keep track of which character set row | 1314 /* Keep track of which character set row |
1333 contains the characters that need translation. */ | 1315 contains the characters that need translation. */ |
1334 int charset_base_code = c & ~CHAR_FIELD3_MASK; | 1316 int charset_base_code = c & ~EMCHAR_FIELD3_MASK; |
1335 if (charset_base == -1) | 1317 if (charset_base == -1) |
1336 charset_base = charset_base_code; | 1318 charset_base = charset_base_code; |
1337 else if (charset_base != charset_base_code) | 1319 else if (charset_base != charset_base_code) |
1338 /* If two different rows appear, needing translation, | 1320 /* If two different rows appear, needing translation, |
1339 then we cannot use boyer_moore search. */ | 1321 then we cannot use boyer_moore search. */ |
1366 else | 1348 else |
1367 return simple_search (buf, base_pat, len, pos, lim, n, trt); | 1349 return simple_search (buf, base_pat, len, pos, lim, n, trt); |
1368 } | 1350 } |
1369 } | 1351 } |
1370 | 1352 |
1371 /* Do a simple string search N times for the string PAT, | 1353 /* Do a simple string search N times for the string PAT, whose length is |
1372 whose length is LEN/LEN_BYTE, | 1354 LEN/LEN_BYTE, from buffer position POS until LIM. TRT is the |
1373 from buffer position POS/POS_BYTE until LIM/LIM_BYTE. | 1355 translation table. |
1374 TRT is the translation table. | |
1375 | 1356 |
1376 Return the character position where the match is found. | 1357 Return the character position where the match is found. |
1377 Otherwise, if M matches remained to be found, return -M. | 1358 Otherwise, if M matches remained to be found, return -M. |
1378 | 1359 |
1379 This kind of search works regardless of what is in PAT and | 1360 This kind of search works regardless of what is in PAT and |
1380 regardless of what is in TRT. It is used in cases where | 1361 regardless of what is in TRT. It is used in cases where |
1381 boyer_moore cannot work. */ | 1362 boyer_moore cannot work. */ |
1382 | 1363 |
1383 static Charbpos | 1364 static Charbpos |
1384 simple_search (struct buffer *buf, Intbyte *base_pat, Bytecount len_byte, | 1365 simple_search (struct buffer *buf, Intbyte *base_pat, Bytecount len, |
1385 Bytebpos idx, Bytebpos lim, EMACS_INT n, Lisp_Object trt) | 1366 Bytebpos pos, Bytebpos lim, EMACS_INT n, Lisp_Object trt) |
1386 { | 1367 { |
1387 int forward = n > 0; | 1368 int forward = n > 0; |
1388 Bytecount buf_len = 0; /* Shut up compiler. */ | 1369 Bytecount buf_len = 0; /* Shut up compiler. */ |
1389 | 1370 |
1390 if (lim > idx) | 1371 if (lim > pos) |
1391 while (n > 0) | 1372 while (n > 0) |
1392 { | 1373 { |
1393 while (1) | 1374 while (1) |
1394 { | 1375 { |
1395 Bytecount this_len = len_byte; | 1376 Bytecount this_len = len; |
1396 Bytebpos this_idx = idx; | 1377 Bytebpos this_pos = pos; |
1397 Intbyte *p = base_pat; | 1378 Intbyte *p = base_pat; |
1398 if (idx >= lim) | 1379 if (pos >= lim) |
1399 goto stop; | 1380 goto stop; |
1400 | 1381 |
1401 while (this_len > 0) | 1382 while (this_len > 0) |
1402 { | 1383 { |
1403 Emchar pat_ch, buf_ch; | 1384 Emchar pat_ch, buf_ch; |
1404 Bytecount pat_len; | 1385 Bytecount pat_len; |
1405 | 1386 |
1406 pat_ch = charptr_emchar (p); | 1387 pat_ch = charptr_emchar (p); |
1407 buf_ch = BI_BUF_FETCH_CHAR (buf, this_idx); | 1388 buf_ch = BYTE_BUF_FETCH_CHAR (buf, this_pos); |
1408 | 1389 |
1409 buf_ch = TRANSLATE (trt, buf_ch); | 1390 buf_ch = TRANSLATE (trt, buf_ch); |
1410 | 1391 |
1411 if (buf_ch != pat_ch) | 1392 if (buf_ch != pat_ch) |
1412 break; | 1393 break; |
1413 | 1394 |
1414 pat_len = charcount_to_bytecount (p, 1); | 1395 pat_len = charptr_emchar_len (p); |
1415 p += pat_len; | 1396 p += pat_len; |
1416 this_len -= pat_len; | 1397 this_len -= pat_len; |
1417 INC_BYTEBPOS (buf, this_idx); | 1398 INC_BYTEBPOS (buf, this_pos); |
1418 } | 1399 } |
1419 if (this_len == 0) | 1400 if (this_len == 0) |
1420 { | 1401 { |
1421 buf_len = this_idx - idx; | 1402 buf_len = this_pos - pos; |
1422 idx = this_idx; | 1403 pos = this_pos; |
1423 break; | 1404 break; |
1424 } | 1405 } |
1425 INC_BYTEBPOS (buf, idx); | 1406 INC_BYTEBPOS (buf, pos); |
1426 } | 1407 } |
1427 n--; | 1408 n--; |
1428 } | 1409 } |
1429 else | 1410 else |
1430 while (n < 0) | 1411 while (n < 0) |
1431 { | 1412 { |
1432 while (1) | 1413 while (1) |
1433 { | 1414 { |
1434 Bytecount this_len = len_byte; | 1415 Bytecount this_len = len; |
1435 Bytebpos this_idx = idx; | 1416 Bytebpos this_pos = pos; |
1436 Intbyte *p; | 1417 Intbyte *p; |
1437 if (idx <= lim) | 1418 if (pos <= lim) |
1438 goto stop; | 1419 goto stop; |
1439 p = base_pat + len_byte; | 1420 p = base_pat + len; |
1440 | 1421 |
1441 while (this_len > 0) | 1422 while (this_len > 0) |
1442 { | 1423 { |
1443 Emchar pat_ch, buf_ch; | 1424 Emchar pat_ch, buf_ch; |
1444 | 1425 |
1445 DEC_CHARPTR (p); | 1426 DEC_CHARPTR (p); |
1446 DEC_BYTEBPOS (buf, this_idx); | 1427 DEC_BYTEBPOS (buf, this_pos); |
1447 pat_ch = charptr_emchar (p); | 1428 pat_ch = charptr_emchar (p); |
1448 buf_ch = BI_BUF_FETCH_CHAR (buf, this_idx); | 1429 buf_ch = BYTE_BUF_FETCH_CHAR (buf, this_pos); |
1449 | 1430 |
1450 buf_ch = TRANSLATE (trt, buf_ch); | 1431 buf_ch = TRANSLATE (trt, buf_ch); |
1451 | 1432 |
1452 if (buf_ch != pat_ch) | 1433 if (buf_ch != pat_ch) |
1453 break; | 1434 break; |
1454 | 1435 |
1455 this_len -= charcount_to_bytecount (p, 1); | 1436 this_len -= charptr_emchar_len (p); |
1456 } | 1437 } |
1457 if (this_len == 0) | 1438 if (this_len == 0) |
1458 { | 1439 { |
1459 buf_len = idx - this_idx; | 1440 buf_len = pos - this_pos; |
1460 idx = this_idx; | 1441 pos = this_pos; |
1461 break; | 1442 break; |
1462 } | 1443 } |
1463 DEC_BYTEBPOS (buf, idx); | 1444 DEC_BYTEBPOS (buf, pos); |
1464 } | 1445 } |
1465 n++; | 1446 n++; |
1466 } | 1447 } |
1467 stop: | 1448 stop: |
1468 if (n == 0) | 1449 if (n == 0) |
1469 { | 1450 { |
1470 Charbpos beg, end, retval; | 1451 Charbpos beg, end, retval; |
1471 if (forward) | 1452 if (forward) |
1472 { | 1453 { |
1473 beg = bytebpos_to_charbpos (buf, idx - buf_len); | 1454 beg = bytebpos_to_charbpos (buf, pos - buf_len); |
1474 retval = end = bytebpos_to_charbpos (buf, idx); | 1455 retval = end = bytebpos_to_charbpos (buf, pos); |
1475 } | 1456 } |
1476 else | 1457 else |
1477 { | 1458 { |
1478 retval = beg = bytebpos_to_charbpos (buf, idx); | 1459 retval = beg = bytebpos_to_charbpos (buf, pos); |
1479 end = bytebpos_to_charbpos (buf, idx + buf_len); | 1460 end = bytebpos_to_charbpos (buf, pos + buf_len); |
1480 } | 1461 } |
1481 set_search_regs (buf, beg, end - beg); | 1462 set_search_regs (buf, beg, end - beg); |
1482 | 1463 |
1483 return retval; | 1464 return retval; |
1484 } | 1465 } |
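Note on the return convention documented for simple_search above (the position on success, -M when M matches remain unfound): a forward-only sketch of that contract follows. It is a simplified illustration, not the XEmacs function. It assumes one byte per character, no translation table, and LEN > 0, and simple_search_forward is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Search forward for the N-th (N > 0) occurrence of PAT in
   TEXT[POS..LIM).  Return the index just past that occurrence.  If the
   text runs out while M occurrences are still unfound, return -M. */
static long
simple_search_forward (const char *text, size_t pos, size_t lim,
                       const char *pat, size_t len, long n)
{
  while (n > 0)
    {
      int found = 0;

      while (pos + len <= lim)
        {
          if (memcmp (text + pos, pat, len) == 0)
            {
              pos += len;   /* continue after this match */
              found = 1;
              break;
            }
          pos++;
        }
      if (!found)
        return -n;          /* N matches remained to be found */
      n--;
    }
  return (long) pos;
}
```

As in the diffed function, each successful match moves the search position to the end of the match, so repeated matches do not overlap.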
1504 static Charbpos | 1485 static Charbpos |
1505 boyer_moore (struct buffer *buf, Intbyte *base_pat, Bytecount len, | 1486 boyer_moore (struct buffer *buf, Intbyte *base_pat, Bytecount len, |
1506 Bytebpos pos, Bytebpos lim, EMACS_INT n, Lisp_Object trt, | 1487 Bytebpos pos, Bytebpos lim, EMACS_INT n, Lisp_Object trt, |
1507 Lisp_Object inverse_trt, int charset_base) | 1488 Lisp_Object inverse_trt, int charset_base) |
1508 { | 1489 { |
1490 /* &&#### needs some 8-bit work here */ | |
1509 /* #### Someone really really really needs to comment the workings | 1491 /* #### Someone really really really needs to comment the workings |
1510 of this junk somewhat better. | 1492 of this junk somewhat better. |
1511 | 1493 |
1512 BTW "BM" stands for Boyer-Moore, which is one of the standard | 1494 BTW "BM" stands for Boyer-Moore, which is one of the standard |
1513 string-searching algorithms. It's the best string-searching | 1495 string-searching algorithms. It's the best string-searching |
1622 #ifdef MULE | 1604 #ifdef MULE |
1623 Emchar ch, untranslated; | 1605 Emchar ch, untranslated; |
1624 int this_translated = 1; | 1606 int this_translated = 1; |
1625 | 1607 |
1626 /* Is *PTR the last byte of a character? */ | 1608 /* Is *PTR the last byte of a character? */ |
1627 if (pat_end - ptr == 1 || INTBYTE_FIRST_BYTE_P (ptr[1])) | 1609 if (pat_end - ptr == 1 || intbyte_first_byte_p (ptr[1])) |
1628 { | 1610 { |
1629 Intbyte *charstart = ptr; | 1611 Intbyte *charstart = ptr; |
1630 while (!INTBYTE_FIRST_BYTE_P (*charstart)) | 1612 while (!intbyte_first_byte_p (*charstart)) |
1631 charstart--; | 1613 charstart--; |
1632 untranslated = charptr_emchar (charstart); | 1614 untranslated = charptr_emchar (charstart); |
1633 if (charset_base == (untranslated & ~CHAR_FIELD3_MASK)) | 1615 if (charset_base == (untranslated & ~EMCHAR_FIELD3_MASK)) |
1634 { | 1616 { |
1635 ch = TRANSLATE (trt, untranslated); | 1617 ch = TRANSLATE (trt, untranslated); |
1636 if (!INTBYTE_FIRST_BYTE_P (*ptr)) | 1618 if (!intbyte_first_byte_p (*ptr)) |
1637 { | 1619 { |
1638 translate_prev_byte = ptr[-1]; | 1620 translate_prev_byte = ptr[-1]; |
1639 if (!INTBYTE_FIRST_BYTE_P (translate_prev_byte)) | 1621 if (!intbyte_first_byte_p (translate_prev_byte)) |
1640 translate_anteprev_byte = ptr[-2]; | 1622 translate_anteprev_byte = ptr[-2]; |
1641 } | 1623 } |
1642 } | 1624 } |
1643 else | 1625 else |
1644 { | 1626 { |
1658 | 1640 |
1659 if (i == infinity) | 1641 if (i == infinity) |
1660 stride_for_teases = BM_tab[j]; | 1642 stride_for_teases = BM_tab[j]; |
1661 BM_tab[j] = dirlen - i; | 1643 BM_tab[j] = dirlen - i; |
1662 /* A translation table is accompanied by its inverse -- | 1644 /* A translation table is accompanied by its inverse -- |
1663 see comment following downcase_table for details */ | 1645 see comment in casetab.c. */ |
1664 if (this_translated) | 1646 if (this_translated) |
1665 { | 1647 { |
1666 Emchar starting_ch = ch; | 1648 Emchar starting_ch = ch; |
1667 EMACS_INT starting_j = j; | 1649 EMACS_INT starting_j = j; |
1668 while (1) | 1650 while (1) |
1688 k = (j = TRANSLATE (trt, j)); | 1670 k = (j = TRANSLATE (trt, j)); |
1689 if (i == infinity) | 1671 if (i == infinity) |
1690 stride_for_teases = BM_tab[j]; | 1672 stride_for_teases = BM_tab[j]; |
1691 BM_tab[j] = dirlen - i; | 1673 BM_tab[j] = dirlen - i; |
1692 /* A translation table is accompanied by its inverse -- | 1674 /* A translation table is accompanied by its inverse -- |
1693 see comment following downcase_table for details */ | 1675 see comment in casetab.c. */ |
1694 | |
1695 while ((j = TRANSLATE (inverse_trt, j)) != k) | 1676 while ((j = TRANSLATE (inverse_trt, j)) != k) |
1696 { | 1677 { |
1697 simple_translate[j] = (Intbyte) k; | 1678 simple_translate[j] = (Intbyte) k; |
1698 BM_tab[j] = dirlen - i; | 1679 BM_tab[j] = dirlen - i; |
1699 } | 1680 } |
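Reviewer note on the hunk above: the `BM_tab[j] = dirlen - i` assignments, repeated for every `j` reached through `inverse_trt`, give each member of a translation equivalence class the same skip distance. A minimal standalone sketch of that idea follows. It assumes ASCII case folding via `tolower`/`toupper` in place of the XEmacs translation tables, and a forward Horspool-style scan rather than the full Boyer-Moore loop in `search.c`; `bm_search_nocase` is a hypothetical name, not a function from this changeset.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of search.c: forward Horspool-style
   search that ignores ASCII case.  Returns the byte offset of the
   first match, or -1 if there is none. */
static long
bm_search_nocase (const unsigned char *text, size_t text_len,
                  const unsigned char *pat, size_t pat_len)
{
  size_t shift[256];
  size_t i, pos;

  assert (pat_len > 0);
  if (pat_len > text_len)
    return -1;

  /* A byte that never occurs in the pattern allows a full-length jump. */
  for (i = 0; i < 256; i++)
    shift[i] = pat_len;

  /* Every member of a case-equivalence class gets the same shift,
     loosely mirroring how the diff stores dirlen - i for each j
     reached through inverse_trt.  The last pattern byte is skipped
     so that every shift stays at least 1. */
  for (i = 0; i + 1 < pat_len; i++)
    {
      shift[tolower (pat[i])] = pat_len - 1 - i;
      shift[toupper (pat[i])] = pat_len - 1 - i;
    }

  pos = 0;
  while (pos + pat_len <= text_len)
    {
      size_t k = pat_len;

      /* Compare right to left under case folding. */
      while (k > 0 && tolower (text[pos + k - 1]) == tolower (pat[k - 1]))
        k--;
      if (k == 0)
        return (long) pos;

      /* Skip according to the text byte aligned with the pattern end. */
      pos += shift[text[pos + pat_len - 1]];
    }
  return -1;
}
```

The real code additionally tracks `simple_translate`, handles multibyte characters under `MULE`, and searches in both directions; none of that is modeled here.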
1732 pat = base_pat; | 1713 pat = base_pat; |
1733 limit = pos - dirlen + direction; | 1714 limit = pos - dirlen + direction; |
1734 /* XEmacs change: definitions of CEILING_OF and FLOOR_OF | 1715 /* XEmacs change: definitions of CEILING_OF and FLOOR_OF |
1735 have changed. See buffer.h. */ | 1716 have changed. See buffer.h. */ |
1736 limit = ((direction > 0) | 1717 limit = ((direction > 0) |
1737 ? BI_BUF_CEILING_OF (buf, limit) - 1 | 1718 ? BYTE_BUF_CEILING_OF (buf, limit) - 1 |
1738 : BI_BUF_FLOOR_OF (buf, limit + 1)); | 1719 : BYTE_BUF_FLOOR_OF (buf, limit + 1)); |
1739 /* LIMIT is now the last (not beyond-last!) value POS can | 1720 /* LIMIT is now the last (not beyond-last!) value POS can |
1740 take on without hitting edge of buffer or the gap. */ | 1721 take on without hitting edge of buffer or the gap. */ |
1741 limit = ((direction > 0) | 1722 limit = ((direction > 0) |
1742 ? min (lim - 1, min (limit, pos + 20000)) | 1723 ? min (lim - 1, min (limit, pos + 20000)) |
1743 : max (lim, max (limit, pos - 20000))); | 1724 : max (lim, max (limit, pos - 20000))); |
1744 tail_end = BI_BUF_CEILING_OF (buf, pos); | 1725 tail_end = BYTE_BUF_CEILING_OF (buf, pos); |
1745 tail_end_ptr = BI_BUF_BYTE_ADDRESS (buf, tail_end); | 1726 tail_end_ptr = BYTE_BUF_BYTE_ADDRESS (buf, tail_end); |
1746 | 1727 |
1747 if ((limit - pos) * direction > 20) | 1728 if ((limit - pos) * direction > 20) |
1748 { | 1729 { |
1749 p_limit = BI_BUF_BYTE_ADDRESS (buf, limit); | 1730 /* We have to be careful because the code can generate addresses |
1750 ptr2 = (cursor = BI_BUF_BYTE_ADDRESS (buf, pos)); | 1731 that don't point to the beginning of characters. */ |
1732 p_limit = BYTE_BUF_BYTE_ADDRESS_NO_VERIFY (buf, limit); | |
1733 ptr2 = (cursor = BYTE_BUF_BYTE_ADDRESS_NO_VERIFY (buf, pos)); | |
1751 /* In this loop, pos + cursor - ptr2 is the surrogate | 1734 /* In this loop, pos + cursor - ptr2 is the surrogate |
1752 for pos */ | 1735 for pos */ |
1753 while (1) /* use one cursor setting as long as i can */ | 1736 while (1) /* use one cursor setting as long as i can */ |
1754 { | 1737 { |
1755 if (direction > 0) /* worth duplicating */ | 1738 if (direction > 0) /* worth duplicating */ |
1803 #ifdef MULE | 1786 #ifdef MULE |
1804 Emchar ch; | 1787 Emchar ch; |
1805 cursor -= direction; | 1788 cursor -= direction; |
1806 /* Translate only the last byte of a character. */ | 1789 /* Translate only the last byte of a character. */ |
1807 if ((cursor == tail_end_ptr | 1790 if ((cursor == tail_end_ptr |
1808 || INTBYTE_FIRST_BYTE_P (cursor[1])) | 1791 || intbyte_first_byte_p (cursor[1])) |
1809 && (INTBYTE_FIRST_BYTE_P (cursor[0]) | 1792 && (intbyte_first_byte_p (cursor[0]) |
1810 || (translate_prev_byte == cursor[-1] | 1793 || (translate_prev_byte == cursor[-1] |
1811 && (INTBYTE_FIRST_BYTE_P (translate_prev_byte) | 1794 && (intbyte_first_byte_p (translate_prev_byte) |
1812 || translate_anteprev_byte == cursor[-2])))) | 1795 || translate_anteprev_byte == cursor[-2])))) |
1813 ch = simple_translate[*cursor]; | 1796 ch = simple_translate[*cursor]; |
1814 else | 1797 else |
1815 ch = *cursor; | 1798 ch = *cursor; |
1816 if (pat[i] != ch) | 1799 if (pat[i] != ch) |
1858 way because it covers a discontinuity */ | 1841 way because it covers a discontinuity */ |
1859 { | 1842 { |
1860 /* XEmacs change: definitions of CEILING_OF and FLOOR_OF | 1843 /* XEmacs change: definitions of CEILING_OF and FLOOR_OF |
1861 have changed. See buffer.h. */ | 1844 have changed. See buffer.h. */ |
1862 limit = ((direction > 0) | 1845 limit = ((direction > 0) |
1863 ? BI_BUF_CEILING_OF (buf, pos - dirlen + 1) - 1 | 1846 ? BYTE_BUF_CEILING_OF (buf, pos - dirlen + 1) - 1 |
1864 : BI_BUF_FLOOR_OF (buf, pos - dirlen)); | 1847 : BYTE_BUF_FLOOR_OF (buf, pos - dirlen)); |
1865 limit = ((direction > 0) | 1848 limit = ((direction > 0) |
1866 ? min (limit + len, lim - 1) | 1849 ? min (limit + len, lim - 1) |
1867 : max (limit - len, lim)); | 1850 : max (limit - len, lim)); |
1868 /* LIMIT is now the last value POS can have | 1851 /* LIMIT is now the last value POS can have |
1869 and still be valid for a possible match. */ | 1852 and still be valid for a possible match. */ |
1872 /* This loop can be coded for space rather than | 1855 /* This loop can be coded for space rather than |
1873 speed because it will usually run only once. | 1856 speed because it will usually run only once. |
1874 (the reach is at most len + 21, and typically | 1857 (the reach is at most len + 21, and typically |
1875 does not exceed len) */ | 1858 does not exceed len) */ |
1876 while ((limit - pos) * direction >= 0) | 1859 while ((limit - pos) * direction >= 0) |
1877 /* *not* BI_BUF_FETCH_CHAR. We are working here | 1860 /* *not* BYTE_BUF_FETCH_CHAR. We are working here |
1878 with bytes, not characters. */ | 1861 with bytes, not characters. */ |
1879 pos += BM_tab[*BI_BUF_BYTE_ADDRESS (buf, pos)]; | 1862 pos += BM_tab[*BYTE_BUF_BYTE_ADDRESS_NO_VERIFY (buf, pos)]; |
1880 /* now run the same tests to distinguish going off | 1863 /* now run the same tests to distinguish going off |
1881 the end, a match or a phony match. */ | 1864 the end, a match or a phony match. */ |
1882 if ((pos - limit) * direction <= len) | 1865 if ((pos - limit) * direction <= len) |
1883 break; /* ran off the end */ | 1866 break; /* ran off the end */ |
1884 /* Found what might be a match. | 1867 /* Found what might be a match. |
1891 Emchar ch; | 1874 Emchar ch; |
1892 Intbyte *ptr; | 1875 Intbyte *ptr; |
1893 #endif | 1876 #endif |
1894 pos -= direction; | 1877 pos -= direction; |
1895 #ifdef MULE | 1878 #ifdef MULE |
1896 ptr = BI_BUF_BYTE_ADDRESS (buf, pos); | 1879 ptr = BYTE_BUF_BYTE_ADDRESS_NO_VERIFY (buf, pos); |
1897 if ((ptr == tail_end_ptr | 1880 if ((ptr == tail_end_ptr |
1898 || INTBYTE_FIRST_BYTE_P (ptr[1])) | 1881 || intbyte_first_byte_p (ptr[1])) |
1899 && (INTBYTE_FIRST_BYTE_P (ptr[0]) | 1882 && (intbyte_first_byte_p (ptr[0]) |
1900 || (translate_prev_byte == ptr[-1] | 1883 || (translate_prev_byte == ptr[-1] |
1901 && (INTBYTE_FIRST_BYTE_P (translate_prev_byte) | 1884 && (intbyte_first_byte_p (translate_prev_byte) |
1902 || translate_anteprev_byte == ptr[-2])))) | 1885 || translate_anteprev_byte == ptr[-2])))) |
1903 ch = simple_translate[*ptr]; | 1886 ch = simple_translate[*ptr]; |
1904 else | 1887 else |
1905 ch = *ptr; | 1888 ch = *ptr; |
1906 if (pat[i] != ch) | 1889 if (pat[i] != ch) |
1907 break; | 1890 break; |
1908 | 1891 |
1909 #else | 1892 #else |
1910 if (pat[i] != TRANSLATE (trt, | 1893 if (pat[i] != |
1911 *BI_BUF_BYTE_ADDRESS (buf, pos))) | 1894 TRANSLATE (trt, |
1895 *BYTE_BUF_BYTE_ADDRESS_NO_VERIFY (buf, pos))) | |
1912 break; | 1896 break; |
1913 #endif | 1897 #endif |
1914 } | 1898 } |
1915 /* Above loop has moved POS part or all the way back | 1899 /* Above loop has moved POS part or all the way back |
1916 to the first char pos (last char pos if reverse). | 1900 to the first char pos (last char pos if reverse). |
1952 for a match just found in the current buffer. */ | 1936 for a match just found in the current buffer. */ |
1953 | 1937 |
1954 static void | 1938 static void |
1955 set_search_regs (struct buffer *buf, Charbpos beg, Charcount len) | 1939 set_search_regs (struct buffer *buf, Charbpos beg, Charcount len) |
1956 { | 1940 { |
1957 /* This function has been Mule-ized. */ | |
1958 /* Make sure we have registers in which to store | 1941 /* Make sure we have registers in which to store |
1959 the match position. */ | 1942 the match position. */ |
1960 if (search_regs.num_regs == 0) | 1943 if (search_regs.num_regs == 0) |
1961 { | 1944 { |
1962 search_regs.start = xnew (regoff_t); | 1945 search_regs.start = xnew (regoff_t); |
1978 wordify (Lisp_Object buffer, Lisp_Object string) | 1961 wordify (Lisp_Object buffer, Lisp_Object string) |
1979 { | 1962 { |
1980 Charcount i, len; | 1963 Charcount i, len; |
1981 EMACS_INT punct_count = 0, word_count = 0; | 1964 EMACS_INT punct_count = 0, word_count = 0; |
1982 struct buffer *buf = decode_buffer (buffer, 0); | 1965 struct buffer *buf = decode_buffer (buffer, 0); |
1983 Lisp_Char_Table *syntax_table = XCHAR_TABLE (buf->mirror_syntax_table); | 1966 Lisp_Object syntax_table = buf->mirror_syntax_table; |
1984 | 1967 |
1985 CHECK_STRING (string); | 1968 CHECK_STRING (string); |
1986 len = XSTRING_CHAR_LENGTH (string); | 1969 len = string_char_length (string); |
1987 | 1970 |
1988 for (i = 0; i < len; i++) | 1971 for (i = 0; i < len; i++) |
1989 if (!WORD_SYNTAX_P (syntax_table, XSTRING_CHAR (string, i))) | 1972 if (!WORD_SYNTAX_P (syntax_table, string_emchar (string, i))) |
1990 { | 1973 { |
1991 punct_count++; | 1974 punct_count++; |
1992 if (i > 0 && WORD_SYNTAX_P (syntax_table, | 1975 if (i > 0 && WORD_SYNTAX_P (syntax_table, |
1993 XSTRING_CHAR (string, i - 1))) | 1976 string_emchar (string, i - 1))) |
1994 word_count++; | 1977 word_count++; |
1995 } | 1978 } |
1996 if (WORD_SYNTAX_P (syntax_table, XSTRING_CHAR (string, len - 1))) | 1979 if (WORD_SYNTAX_P (syntax_table, string_emchar (string, len - 1))) |
1997 word_count++; | 1980 word_count++; |
1998 if (!word_count) return build_string (""); | 1981 if (!word_count) return build_string (""); |
1999 | 1982 |
2000 { | 1983 { |
2001 /* The following value is an upper bound on the amount of storage we | 1984 /* The following value is an upper bound on the amount of storage we |
2008 *o++ = '\\'; | 1991 *o++ = '\\'; |
2009 *o++ = 'b'; | 1992 *o++ = 'b'; |
2010 | 1993 |
2011 for (i = 0; i < len; i++) | 1994 for (i = 0; i < len; i++) |
2012 { | 1995 { |
2013 Emchar ch = XSTRING_CHAR (string, i); | 1996 Emchar ch = string_emchar (string, i); |
2014 | 1997 |
2015 if (WORD_SYNTAX_P (syntax_table, ch)) | 1998 if (WORD_SYNTAX_P (syntax_table, ch)) |
2016 o += set_charptr_emchar (o, ch); | 1999 o += set_charptr_emchar (o, ch); |
2017 else if (i > 0 | 2000 else if (i > 0 |
2018 && WORD_SYNTAX_P (syntax_table, | 2001 && WORD_SYNTAX_P (syntax_table, |
2019 XSTRING_CHAR (string, i - 1)) | 2002 string_emchar (string, i - 1)) |
2020 && --word_count) | 2003 && --word_count) |
2021 { | 2004 { |
2022 *o++ = '\\'; | 2005 *o++ = '\\'; |
2023 *o++ = 'W'; | 2006 *o++ = 'W'; |
2024 *o++ = '\\'; | 2007 *o++ = '\\'; |
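Reviewer note on the `wordify` hunk above: the function counts words, emits `\b`, copies word-constituent characters, and writes a separator between words, skipping the separator after the last word through the `--word_count` test. The sketch below reproduces that control flow for plain ASCII. It assumes `isalnum` as a stand-in for `WORD_SYNTAX_P`, and it assumes the separator is `\W\W*` and that the output ends with `\b`; the hunk shows only the first three separator bytes and not the tail of the function. `wordify_ascii` and `is_word_char` are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ASCII-only stand-in for WORD_SYNTAX_P. */
static int
is_word_char (unsigned char c)
{
  return isalnum (c) != 0;
}

/* Hypothetical helper: write a regexp of the form \bw1\W\W*w2\b for
   the words in IN into OUT.  Returns the regexp length, or 0 if IN
   has no words or OUT is too small. */
static size_t
wordify_ascii (const char *in, char *out, size_t out_size)
{
  size_t len = strlen (in);
  size_t i, n = 0, words = 0;

  assert (out != NULL && out_size > 0);
  out[0] = '\0';

  /* A word ends where a word char is followed by a non-word char
     or by the end of the string. */
  for (i = 0; i < len; i++)
    if (is_word_char ((unsigned char) in[i])
        && (i + 1 == len || !is_word_char ((unsigned char) in[i + 1])))
      words++;
  if (words == 0)
    return 0;

  /* Upper bound: two \b markers (4 bytes), at most LEN word chars,
     5 bytes per separator, and the terminating NUL. */
  if (out_size < len + 5 * (words - 1) + 5)
    return 0;

  out[n++] = '\\';
  out[n++] = 'b';
  for (i = 0; i < len; i++)
    {
      if (is_word_char ((unsigned char) in[i]))
        out[n++] = in[i];
      else if (i > 0 && is_word_char ((unsigned char) in[i - 1]) && --words)
        {
          /* Separator between words; not emitted after the last word. */
          memcpy (out + n, "\\W\\W*", 5);
          n += 5;
        }
    }
  out[n++] = '\\';
  out[n++] = 'b';
  out[n] = '\0';
  return n;
}
```

Unlike the code in the diff, this version works on bytes rather than `Emchar` values and takes no buffer-specific syntax table, so it only illustrates the counting and separator logic.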
2297 whole match. This is useful only after a regular expression search or | 2280 whole match. This is useful only after a regular expression search or |
2298 match since only regular expressions have distinguished subexpressions. | 2281 match since only regular expressions have distinguished subexpressions. |
2299 */ | 2282 */ |
2300 (replacement, fixedcase, literal, string, strbuffer)) | 2283 (replacement, fixedcase, literal, string, strbuffer)) |
2301 { | 2284 { |
2302 /* This function has been Mule-ized. */ | |
2303 /* This function can GC */ | 2285 /* This function can GC */ |
2304 enum { nochange, all_caps, cap_initial } case_action; | 2286 enum { nochange, all_caps, cap_initial } case_action; |
2305 Charbpos pos, last; | 2287 Charbpos pos, last; |
2306 int some_multiletter_word; | 2288 int some_multiletter_word; |
2307 int some_lowercase; | 2289 int some_lowercase; |
2308 int some_uppercase; | 2290 int some_uppercase; |
2309 int some_nonuppercase_initial; | 2291 int some_nonuppercase_initial; |
2310 Emchar c, prevc; | 2292 Emchar c, prevc; |
2311 Charcount inslen; | 2293 Charcount inslen; |
2312 struct buffer *buf; | 2294 struct buffer *buf; |
2313 Lisp_Char_Table *syntax_table; | 2295 Lisp_Object syntax_table; |
2314 int mc_count; | 2296 int mc_count; |
2315 Lisp_Object buffer; | 2297 Lisp_Object buffer; |
2316 int_dynarr *ul_action_dynarr = 0; | 2298 int_dynarr *ul_action_dynarr = 0; |
2317 int_dynarr *ul_pos_dynarr = 0; | 2299 int_dynarr *ul_pos_dynarr = 0; |
2318 int sub = 0; | 2300 int sub = 0; |
2347 invalid_argument ("last thing matched was not a buffer", Qunbound); | 2329 invalid_argument ("last thing matched was not a buffer", Qunbound); |
2348 buffer = last_thing_searched; | 2330 buffer = last_thing_searched; |
2349 buf = XBUFFER (buffer); | 2331 buf = XBUFFER (buffer); |
2350 } | 2332 } |
2351 | 2333 |
2352 syntax_table = XCHAR_TABLE (buf->mirror_syntax_table); | 2334 syntax_table = buf->mirror_syntax_table; |
2353 | 2335 |
2354 case_action = nochange; /* We tried an initialization */ | 2336 case_action = nochange; /* We tried an initialization */ |
2355 /* but some C compilers blew it */ | 2337 /* but some C compilers blew it */ |
2356 | 2338 |
2357 if (search_regs.num_regs == 0) | 2339 if (search_regs.num_regs == 0) |
2358 signal_error (Qinvalid_operation, "replace-match called before any match found", Qunbound); | 2340 signal_error (Qinvalid_operation, |
2341 "replace-match called before any match found", Qunbound); | |
2359 | 2342 |
2360 if (NILP (string)) | 2343 if (NILP (string)) |
2361 { | 2344 { |
2362 if (search_regs.start[sub] < BUF_BEGV (buf) | 2345 if (search_regs.start[sub] < BUF_BEGV (buf) |
2363 || search_regs.start[sub] > search_regs.end[sub] | 2346 || search_regs.start[sub] > search_regs.end[sub] |
2367 } | 2350 } |
2368 else | 2351 else |
2369 { | 2352 { |
2370 if (search_regs.start[0] < 0 | 2353 if (search_regs.start[0] < 0 |
2371 || search_regs.start[0] > search_regs.end[0] | 2354 || search_regs.start[0] > search_regs.end[0] |
2372 || search_regs.end[0] > XSTRING_CHAR_LENGTH (string)) | 2355 || search_regs.end[0] > string_char_length (string)) |
2373 args_out_of_range (make_int (search_regs.start[0]), | 2356 args_out_of_range (make_int (search_regs.start[0]), |
2374 make_int (search_regs.end[0])); | 2357 make_int (search_regs.end[0])); |
2375 } | 2358 } |
2376 | 2359 |
2377 if (NILP (fixedcase)) | 2360 if (NILP (fixedcase)) |
2392 for (pos = search_regs.start[sub]; pos < last; pos++) | 2375 for (pos = search_regs.start[sub]; pos < last; pos++) |
2393 { | 2376 { |
2394 if (NILP (string)) | 2377 if (NILP (string)) |
2395 c = BUF_FETCH_CHAR (buf, pos); | 2378 c = BUF_FETCH_CHAR (buf, pos); |
2396 else | 2379 else |
2397 c = XSTRING_CHAR (string, pos); | 2380 c = string_emchar (string, pos); |
2398 | 2381 |
2399 if (LOWERCASEP (buf, c)) | 2382 if (LOWERCASEP (buf, c)) |
2400 { | 2383 { |
2401 /* Cannot be all caps if any original char is lower case */ | 2384 /* Cannot be all caps if any original char is lower case */ |
2402 | 2385 |
2450 after = Fsubstring (string, make_int (search_regs.end[0]), Qnil); | 2433 after = Fsubstring (string, make_int (search_regs.end[0]), Qnil); |
2451 | 2434 |
2452 /* Do case substitution into REPLACEMENT if desired. */ | 2435 /* Do case substitution into REPLACEMENT if desired. */ |
2453 if (NILP (literal)) | 2436 if (NILP (literal)) |
2454 { | 2437 { |
2455 Charcount stlen = XSTRING_CHAR_LENGTH (replacement); | 2438 Charcount stlen = string_char_length (replacement); |
2456 Charcount strpos; | 2439 Charcount strpos; |
2457 /* XEmacs change: rewrote this loop somewhat to make it | 2440 /* XEmacs change: rewrote this loop somewhat to make it |
2458 cleaner. Also added \U, \E, etc. */ | 2441 cleaner. Also added \U, \E, etc. */ |
2459 Charcount literal_start = 0; | 2442 Charcount literal_start = 0; |
2460 /* We build up the substituted string in ACCUM. */ | 2443 /* We build up the substituted string in ACCUM. */ |
2477 /* If SUBSTART is set, we need to also insert the | 2460 /* If SUBSTART is set, we need to also insert the |
2478 text from SUBSTART to SUBEND in the original string. */ | 2461 text from SUBSTART to SUBEND in the original string. */ |
2479 Charcount substart = -1; | 2462 Charcount substart = -1; |
2480 Charcount subend = -1; | 2463 Charcount subend = -1; |
2481 | 2464 |
2482 c = XSTRING_CHAR (replacement, strpos); | 2465 c = string_emchar (replacement, strpos); |
2483 if (c == '\\' && strpos < stlen - 1) | 2466 if (c == '\\' && strpos < stlen - 1) |
2484 { | 2467 { |
2485 c = XSTRING_CHAR (replacement, ++strpos); | 2468 c = string_emchar (replacement, ++strpos); |
2486 if (c == '&') | 2469 if (c == '&') |
2487 { | 2470 { |
2488 literal_end = strpos - 1; | 2471 literal_end = strpos - 1; |
2489 substart = search_regs.start[0]; | 2472 substart = search_regs.start[0]; |
2490 subend = search_regs.end[0]; | 2473 subend = search_regs.end[0]; |
2516 make_opaque_ptr (ul_action_dynarr))); | 2499 make_opaque_ptr (ul_action_dynarr))); |
2517 } | 2500 } |
2518 literal_end = strpos - 1; | 2501 literal_end = strpos - 1; |
2519 Dynarr_add (ul_pos_dynarr, | 2502 Dynarr_add (ul_pos_dynarr, |
2520 (!NILP (accum) | 2503 (!NILP (accum) |
2521 ? XSTRING_CHAR_LENGTH (accum) | 2504 ? string_char_length (accum) |
2522 : 0) + (literal_end - literal_start)); | 2505 : 0) + (literal_end - literal_start)); |
2523 Dynarr_add (ul_action_dynarr, c); | 2506 Dynarr_add (ul_action_dynarr, c); |
2524 } | 2507 } |
2525 else if (c == '\\') | 2508 else if (c == '\\') |
2526 /* So we get just one backslash. */ | 2509 /* So we get just one backslash. */ |
2565 /* Now finally, we need to process the \U's, \E's, etc. */ | 2548 /* Now finally, we need to process the \U's, \E's, etc. */ |
2566 if (ul_pos_dynarr) | 2549 if (ul_pos_dynarr) |
2567 { | 2550 { |
2568 int i = 0; | 2551 int i = 0; |
2569 int cur_action = 'E'; | 2552 int cur_action = 'E'; |
2570 Charcount stlen = XSTRING_CHAR_LENGTH (replacement); | 2553 Charcount stlen = string_char_length (replacement); |
2571 Charcount strpos; | 2554 Charcount strpos; |
2572 | 2555 |
2573 for (strpos = 0; strpos < stlen; strpos++) | 2556 for (strpos = 0; strpos < stlen; strpos++) |
2574 { | 2557 { |
2575 Emchar curchar = XSTRING_CHAR (replacement, strpos); | 2558 Emchar curchar = string_emchar (replacement, strpos); |
2576 Emchar newchar = -1; | 2559 Emchar newchar = -1; |
2577 if (i < Dynarr_length (ul_pos_dynarr) && | 2560 if (i < Dynarr_length (ul_pos_dynarr) && |
2578 strpos == Dynarr_at (ul_pos_dynarr, i)) | 2561 strpos == Dynarr_at (ul_pos_dynarr, i)) |
2579 { | 2562 { |
2580 int new_action = Dynarr_at (ul_action_dynarr, i); | 2563 int new_action = Dynarr_at (ul_action_dynarr, i); |
2619 BUF_SET_PT (buf, search_regs.start[sub]); | 2602 BUF_SET_PT (buf, search_regs.start[sub]); |
2620 if (!NILP (literal)) | 2603 if (!NILP (literal)) |
2621 Finsert (1, &replacement); | 2604 Finsert (1, &replacement); |
2622 else | 2605 else |
2623 { | 2606 { |
2624 Charcount stlen = XSTRING_CHAR_LENGTH (replacement); | 2607 Charcount stlen = string_char_length (replacement); |
2625 Charcount strpos; | 2608 Charcount strpos; |
2626 struct gcpro gcpro1; | 2609 struct gcpro gcpro1; |
2627 GCPRO1 (replacement); | 2610 GCPRO1 (replacement); |
2628 for (strpos = 0; strpos < stlen; strpos++) | 2611 for (strpos = 0; strpos < stlen; strpos++) |
2629 { | 2612 { |
2631 exactly complementing BUF_SET_PT() above. | 2614 exactly complementing BUF_SET_PT() above. |
2632 During the loop, it keeps track of the amount inserted. | 2615 During the loop, it keeps track of the amount inserted. |
2633 */ | 2616 */ |
2634 Charcount offset = BUF_PT (buf) - search_regs.start[sub]; | 2617 Charcount offset = BUF_PT (buf) - search_regs.start[sub]; |
2635 | 2618 |
2636 c = XSTRING_CHAR (replacement, strpos); | 2619 c = string_emchar (replacement, strpos); |
2637 if (c == '\\' && strpos < stlen - 1) | 2620 if (c == '\\' && strpos < stlen - 1) |
2638 { | 2621 { |
2639 /* XXX FIXME: replacing just a substring non-literally | 2622 /* XXX FIXME: replacing just a substring non-literally |
2640 using backslash refs to the match looks dangerous. But | 2623 using backslash refs to the match looks dangerous. But |
2641 <15366.18513.698042.156573@ns.caldera.de> from Torsten Duwe | 2624 <15366.18513.698042.156573@ns.caldera.de> from Torsten Duwe |
2642 <duwe@caldera.de> claims Finsert_buffer_substring already | 2625 <duwe@caldera.de> claims Finsert_buffer_substring already |
2643 handles this correctly. | 2626 handles this correctly. |
2644 */ | 2627 */ |
2645 c = XSTRING_CHAR (replacement, ++strpos); | 2628 c = string_emchar (replacement, ++strpos); |
2646 if (c == '&') | 2629 if (c == '&') |
2647 Finsert_buffer_substring | 2630 Finsert_buffer_substring |
2648 (buffer, | 2631 (buffer, |
2649 make_int (search_regs.start[0] + offset), | 2632 make_int (search_regs.start[0] + offset), |
2650 make_int (search_regs.end[0] + offset)); | 2633 make_int (search_regs.end[0] + offset)); |
2741 } | 2724 } |
2742 | 2725 |
2743 static Lisp_Object | 2726 static Lisp_Object |
2744 match_limit (Lisp_Object num, int beginningp) | 2727 match_limit (Lisp_Object num, int beginningp) |
2745 { | 2728 { |
2746 /* This function has been Mule-ized. */ | |
2747 int n; | 2729 int n; |
2748 | 2730 |
2749 CHECK_INT (num); | 2731 CHECK_INT (num); |
2750 n = XINT (num); | 2732 n = XINT (num); |
2751 if (n < 0 || n >= search_regs.num_regs) | 2733 if (n < 0 || n >= search_regs.num_regs) |
2790 If REUSE is a list, reuse it as part of the value. If REUSE is long enough | 2772 If REUSE is a list, reuse it as part of the value. If REUSE is long enough |
2791 to hold all the values, and if INTEGERS is non-nil, no consing is done. | 2773 to hold all the values, and if INTEGERS is non-nil, no consing is done. |
2792 */ | 2774 */ |
2793 (integers, reuse)) | 2775 (integers, reuse)) |
2794 { | 2776 { |
2795 /* This function has been Mule-ized. */ | |
2796 Lisp_Object tail, prev; | 2777 Lisp_Object tail, prev; |
2797 Lisp_Object *data; | 2778 Lisp_Object *data; |
2798 int i; | 2779 int i; |
2799 Charcount len; | 2780 Charcount len; |
2800 | 2781 |
2863 Set internal data on last search match from elements of LIST. | 2844 Set internal data on last search match from elements of LIST. |
2864 LIST should have been created by calling `match-data' previously. | 2845 LIST should have been created by calling `match-data' previously. |
2865 */ | 2846 */ |
2866 (list)) | 2847 (list)) |
2867 { | 2848 { |
2868 /* This function has been Mule-ized. */ | |
2869 REGISTER int i; | 2849 REGISTER int i; |
2870 REGISTER Lisp_Object marker; | 2850 REGISTER Lisp_Object marker; |
2871 int num_regs; | 2851 int num_regs; |
2872 int length; | 2852 int length; |
2873 | 2853 |