comparison src/search.c @ 665:fdefd0186b75

[xemacs-hg @ 2001-09-20 06:28:42 by ben]
The great integral types renaming.

The purpose of this is to rationalize the names used for various
integral types, so that they match their intended uses and follow
consistent conventions, and eliminate types that were not semantically
different from each other.

The conventions are:

-- All integral types that measure quantities of anything are signed.
   Some people disagree vociferously with this, but their arguments are
   mostly theoretical, and are vastly outweighed by the practical
   headaches of mixing signed and unsigned values, and more importantly
   by the far greater likelihood of inadvertent bugs: Because of the
   broken "viral" nature of unsigned quantities in C (operations
   involving mixed signed/unsigned are done unsigned, when exactly the
   opposite is nearly always wanted), even a single error in declaring a
   quantity unsigned that should be signed, or the even more subtle
   error of comparing signed and unsigned values and forgetting the
   necessary cast, can be catastrophic, as comparisons will yield wrong
   results. -Wsign-compare is turned on specifically to catch this, but
   this tends to result in a great number of warnings when mixing signed
   and unsigned, and the casts are annoying. More has been written on
   this elsewhere.

-- All such quantity types just mentioned boil down to EMACS_INT, which
   is 32 bits on 32-bit machines and 64 bits on 64-bit machines. This is
   guaranteed to be the same size as Lisp objects of type `int', and (as
   far as I can tell) of size_t (unsigned!) and ssize_t. The only type
   below that is not an EMACS_INT is Hashcode, which is an unsigned
   value of the same size as EMACS_INT.

-- Type names should be relatively short (no more than 10 characters or
   so), with the first letter capitalized and no underscores if they can
   at all be avoided.

-- "count" == a zero-based measurement of some quantity. Includes sizes,
   offsets, and indexes.

-- "bpos" == a one-based measurement of a position in a buffer.
   "Charbpos" and "Bytebpos" count text in the buffer, rather than bytes
   in memory; thus Bytebpos does not directly correspond to the memory
   representation. Use "Membpos" for this.

-- "Char" refers to internal-format characters, not to the C type
   "char", which is really a byte.

-- For the actual name changes, see the script below.

I ran the following script to do the conversion. (NOTE: This script is
idempotent. You can safely run it multiple times and it will not screw
up previous results -- in fact, it will do nothing if nothing has
changed. Thus, it can be run repeatedly as necessary to handle patches
coming in from old workspaces, or old branches.) There are two tags,
just before and just after the change: `pre-integral-type-rename' and
`post-integral-type-rename'. When merging code from the main trunk into
a branch, the best thing to do is first merge up to
`pre-integral-type-rename', then apply the script and associated
changes, then merge from `post-integral-type-rename' to the present.
(Alternatively, just do the merging in one operation; but you may then
have a lot of conflicts needing to be resolved by hand.)

Script `fixtypes.sh' follows:

----------------------------------- cut ------------------------------------
files="*.[ch] s/*.h m/*.h config.h.in ../configure.in Makefile.in.in ../lib-src/*.[ch] ../lwlib/*.[ch]"
gr Memory_Count Bytecount $files
gr Lstream_Data_Count Bytecount $files
gr Element_Count Elemcount $files
gr Hash_Code Hashcode $files
gr extcount bytecount $files
gr bufpos charbpos $files
gr bytind bytebpos $files
gr memind membpos $files
gr bufbyte intbyte $files
gr Extcount Bytecount $files
gr Bufpos Charbpos $files
gr Bytind Bytebpos $files
gr Memind Membpos $files
gr Bufbyte Intbyte $files
gr EXTCOUNT BYTECOUNT $files
gr BUFPOS CHARBPOS $files
gr BYTIND BYTEBPOS $files
gr MEMIND MEMBPOS $files
gr BUFBYTE INTBYTE $files
gr MEMORY_COUNT BYTECOUNT $files
gr LSTREAM_DATA_COUNT BYTECOUNT $files
gr ELEMENT_COUNT ELEMCOUNT $files
gr HASH_CODE HASHCODE $files
----------------------------------- cut ------------------------------------

`fixtypes.sh' is a Bourne-shell script; it uses 'gr':

----------------------------------- cut ------------------------------------
#!/bin/sh

# Usage is like this:

# gr FROM TO FILES ...

# globally replace FROM with TO in FILES. FROM and TO are regular expressions.
# backup files are stored in the `backup' directory.
from="$1"
to="$2"
shift 2
echo ${1+"$@"} | xargs global-replace "s/$from/$to/g"
----------------------------------- cut ------------------------------------

`gr' in turn uses a Perl script to do its real work, `global-replace',
which follows:

----------------------------------- cut ------------------------------------
: #-*- Perl -*-

### global-modify --- modify the contents of a file by a Perl expression

## Copyright (C) 1999 Martin Buchholz.
## Copyright (C) 2001 Ben Wing.

## Authors: Martin Buchholz <martin@xemacs.org>, Ben Wing <ben@xemacs.org>
## Maintainer: Ben Wing <ben@xemacs.org>
## Current Version: 1.0, May 5, 2001

# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2, or (at your option)
# any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with XEmacs; see the file COPYING. If not, write to the Free
# Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
# 02111-1307, USA.

eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0;

use strict;
use FileHandle;
use Carp;
use Getopt::Long;
use File::Basename;

(my $myName = $0) =~ s@.*/@@; my $usage="
Usage: $myName [--help] [--backup-dir=DIR] [--line-mode] [--hunk-mode]
       PERLEXPR FILE ...

Globally modify a file, either line by line or in one big hunk.

Typical usage is like this:

[with GNU print, GNU xargs: guaranteed to handle spaces, quotes, etc.
 in file names]

    find . -name '*.[ch]' -print0 | xargs -0 $0 's/\bCONST\b/const/g'\n

[with non-GNU print, xargs]

    find . -name '*.[ch]' -print | xargs $0 's/\bCONST\b/const/g'\n


The file is read in, either line by line (with --line-mode specified)
or in one big hunk (with --hunk-mode specified; it's the default), and
the Perl expression is then evalled with \$_ set to the line or hunk of
text, including the terminating newline if there is one. It should
destructively modify the value there, storing the changed result in \$_.

Files in which any modifications are made are backed up to the directory
specified using --backup-dir, or to `backup' by default. To disable this,
use --backup-dir= with no argument.

Hunk mode is the default because it is MUCH MUCH faster than line-by-line.
Use line-by-line only when it matters, e.g. you want to do a replacement
only once per line (the default without the `g' argument). Conversely,
when using hunk mode, *ALWAYS* use `g'; otherwise, you will only make one
replacement in the entire file!
";

my %options = ();
$Getopt::Long::ignorecase = 0;
&GetOptions (
  \%options,
  'help', 'backup-dir=s', 'line-mode', 'hunk-mode',
);

die $usage if $options{"help"} or @ARGV <= 1;
my $code = shift;

die $usage if grep (-d || ! -w, @ARGV);

sub SafeOpen {
  open ((my $fh = new FileHandle), $_[0]);
  confess "Can't open $_[0]: $!" if ! defined $fh;
  return $fh;
}

sub SafeClose {
  close $_[0] or confess "Can't close $_[0]: $!";
}

sub FileContents {
  my $fh = SafeOpen ("< $_[0]");
  my $olddollarslash = $/;
  local $/ = undef;
  my $contents = <$fh>;
  $/ = $olddollarslash;
  return $contents;
}

sub WriteStringToFile {
  my $fh = SafeOpen ("> $_[0]");
  binmode $fh;
  print $fh $_[1] or confess "$_[0]: $!\n";
  SafeClose $fh;
}

foreach my $file (@ARGV) {
  my $changed_p = 0;
  my $new_contents = "";
  if ($options{"line-mode"}) {
    my $fh = SafeOpen $file;
    while (<$fh>) {
      my $save_line = $_;
      eval $code;
      $changed_p = 1 if $save_line ne $_;
      $new_contents .= $_;
    }
  } else {
    my $orig_contents = $_ = FileContents $file;
    eval $code;
    if ($_ ne $orig_contents) {
      $changed_p = 1;
      $new_contents = $_;
    }
  }

  if ($changed_p) {
    my $backdir = $options{"backup-dir"};
    $backdir = "backup" if !defined ($backdir);
    if ($backdir) {
      my ($name, $path, $suffix) = fileparse ($file, "");
      my $backfulldir = $path . $backdir;
      my $backfile = "$backfulldir/$name";
      mkdir $backfulldir, 0755 unless -d $backfulldir;
      print "modifying $file (original saved in $backfile)\n";
      rename $file, $backfile;
    }
    WriteStringToFile ($file, $new_contents);
  }
}
----------------------------------- cut ------------------------------------

In addition to those programs, I needed to fix up a few other things,
particularly relating to the duplicate definitions of types, now that
some types merged with others. Specifically:

1. in lisp.h, removed duplicate declarations of Bytecount. The changed
   code should now look like this: (In each code snippet below, the
   first and last lines are the same as the original, as are all lines
   outside of those lines. That allows you to locate the section to be
   replaced, and replace the stuff in that section, verifying that there
   isn't anything new added that would need to be kept.)

--------------------------------- snip -------------------------------------
/* Counts of bytes or chars */
typedef EMACS_INT Bytecount;
typedef EMACS_INT Charcount;

/* Counts of elements */
typedef EMACS_INT Elemcount;

/* Hash codes */
typedef unsigned long Hashcode;

/* ------------------------ dynamic arrays ------------------- */
--------------------------------- snip -------------------------------------

2. in lstream.h, removed duplicate declaration of Bytecount. Rewrote the
   comment about this type. The changed code should now look like this:

--------------------------------- snip -------------------------------------
#endif

/* The have been some arguments over the what the type should be that
   specifies a count of bytes in a data block to be written out or read in,
   using Lstream_read(), Lstream_write(), and related functions.
   Originally it was long, which worked fine; Martin "corrected" these to
   size_t and ssize_t on the grounds that this is theoretically cleaner and
   is in keeping with the C standards.
   Unfortunately, this practice is horribly error-prone due to design
   flaws in the way that mixed signed/unsigned arithmetic happens. In
   fact, by doing this change, Martin introduced a subtle but fatal error
   that caused the operation of sending large mail messages to the SMTP
   server under Windows to fail. By putting all values back to be signed,
   avoiding any signed/unsigned mixing, the bug immediately went away.
   The type then in use was Lstream_Data_Count, so that it be reverted
   cleanly if a vote came to that. Now it is Bytecount.

   Some earlier comments about why the type must be signed: This MUST BE
   SIGNED, since it also is used in functions that return the number of
   bytes actually read to or written from in an operation, and these
   functions can return -1 to signal error.

   Note that the standard Unix read() and write() functions define the
   count going in as a size_t, which is UNSIGNED, and the count going
   out as an ssize_t, which is SIGNED. This is a horrible design flaw.
   Not only is it highly likely to lead to logic errors when a -1 gets
   interpreted as a large positive number, but operations are bound to
   fail in all sorts of horrible ways when a number in the upper-half of
   the size_t range is passed in -- this number is unrepresentable as an
   ssize_t, so code that checks to see how many bytes are actually
   written (which is mandatory if you are dealing with certain types of
   devices) will get completely screwed up.

   --ben
*/

typedef enum lstream_buffering
--------------------------------- snip -------------------------------------

3. in dumper.c, there are four places, all inside of switch()
   statements, where XD_BYTECOUNT appears twice as a case tag. In each
   case, the two case blocks contain identical code, and you should
   *REMOVE THE SECOND* and leave the first.
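
The two hazards the description keeps returning to (mixed signed/unsigned
comparisons, and a -1 error return landing in an unsigned count) can be
reproduced in a few lines of C. This is only a sketch: Demo_Bytecount is a
hypothetical stand-in for Bytecount, and demo_read() is an invented helper,
not Lstream_read() or any other XEmacs function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a signed quantity type such as Bytecount. */
typedef long Demo_Bytecount;

/* Mixed variant: LEN is unsigned, so C performs the comparison unsigned.
   A negative REMAINING is first converted to a huge positive value.
   (-Wsign-compare warns about this line.) */
int
demo_fits_mixed (Demo_Bytecount remaining, unsigned long len)
{
  return remaining >= len;
}

/* Variant following the convention: both quantities are signed. */
int
demo_fits_signed (Demo_Bytecount remaining, Demo_Bytecount len)
{
  assert (len >= 0);
  return remaining >= len;
}

/* Invented read-like helper: copy up to SIZE of the AVAIL bytes at SRC
   into BUF.  Returns the number of bytes copied, or -1 on error, which
   is only representable because the return type is signed. */
Demo_Bytecount
demo_read (const char *src, Demo_Bytecount avail,
           char *buf, Demo_Bytecount size)
{
  if (!src || !buf || avail < 0 || size < 0)
    return -1;
  if (size > avail)
    size = avail;
  memcpy (buf, src, (size_t) size);
  return size;
}

/* What happens when the caller stores the result in an unsigned count:
   the -1 becomes the largest size_t value and looks like a successful
   read of a great many bytes. */
int
demo_error_lost_when_unsigned (void)
{
  size_t n = (size_t) demo_read (NULL, 0, NULL, 0);
  return n > 0;
}
```

With these definitions, demo_fits_mixed (-1, 10) reports that a negative
amount of space "fits" 10 bytes, while demo_fits_signed (-1, 10) does not;
and demo_error_lost_when_unsigned () is true even though the underlying
call failed.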
author ben
date Thu, 20 Sep 2001 06:31:11 +0000
parents b39c14581166
children a307f9a2021d
664:6e99cc8c6ca5 665:fdefd0186b75
82 to call re_set_registers after compiling a new pattern or after 82 to call re_set_registers after compiling a new pattern or after
83 setting the match registers, so that the regex functions will be 83 setting the match registers, so that the regex functions will be
84 able to free or re-allocate it properly. */ 84 able to free or re-allocate it properly. */
85 85
86 /* Note: things get trickier under Mule because the values returned from 86 /* Note: things get trickier under Mule because the values returned from
87 the regexp routines are in Bytinds but we need them to be in Bufpos's. 87 the regexp routines are in Bytebposs but we need them to be in Charbpos's.
88 We take the easy way out for the moment and just convert them immediately. 88 We take the easy way out for the moment and just convert them immediately.
89 We could be more clever by not converting them until necessary, but 89 We could be more clever by not converting them until necessary, but
90 that gets real ugly real fast since the buffer might have changed and 90 that gets real ugly real fast since the buffer might have changed and
91 the positions might be out of sync or out of range. 91 the positions might be out of sync or out of range.
92 */ 92 */
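
The comment above turns on the difference between byte positions and
character positions in a multibyte buffer. The sketch below shows the kind
of conversion involved, under loud assumptions: it uses UTF-8 as a stand-in
encoding (the Mule internal format is different), works on a flat array
rather than a real buffer, and every demo_ name is hypothetical. It is not
the implementation of bytebpos_to_charbpos(). Positions are one-based, as
the "bpos" convention in the description requires.

```c
#include <assert.h>

typedef long Demo_Charbpos;   /* one-based character position */
typedef long Demo_Bytebpos;   /* one-based byte position */

/* Sample text "a", U+00E9, "b": 3 characters stored in 4 bytes. */
const unsigned char demo_text[4] = { 'a', 0xC3, 0xA9, 'b' };

/* Nonzero if BYTE starts a character, i.e. is not a UTF-8
   continuation byte of the form 10xxxxxx. */
static int
demo_leading_byte_p (unsigned char byte)
{
  return (byte & 0xC0) != 0x80;
}

/* Convert a byte position to a character position by counting the
   characters that start before it.  BPOS must be on a character
   boundary, with 1 <= BPOS <= LEN + 1. */
Demo_Charbpos
demo_bytebpos_to_charbpos (const unsigned char *text, long len,
                           Demo_Bytebpos bpos)
{
  Demo_Charbpos cpos = 1;
  long i;

  assert (bpos >= 1 && bpos <= len + 1);
  for (i = 0; i < bpos - 1; i++)
    if (demo_leading_byte_p (text[i]))
      cpos++;
  return cpos;
}

/* Convert a character position back to a byte position by stepping
   over whole characters.  Positions past the end clamp to LEN + 1. */
Demo_Bytebpos
demo_charbpos_to_bytebpos (const unsigned char *text, long len,
                           Demo_Charbpos cpos)
{
  long i = 0;
  Demo_Charbpos seen = 1;

  assert (cpos >= 1);
  while (i < len && seen < cpos)
    {
      i++;                      /* step over the leading byte */
      while (i < len && !demo_leading_byte_p (text[i]))
        i++;                    /* and over its continuation bytes */
      seen++;
    }
  return i + 1;
}
```

In the sample text the letter "b" sits at byte position 4 but character
position 3, which is exactly the kind of mismatch the search registers have
to be fixed up for after a regexp match.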
107 Fixnum warn_about_possibly_incompatible_back_references; 107 Fixnum warn_about_possibly_incompatible_back_references;
108 108
109 /* range table for use with skip_chars. Only needed for Mule. */ 109 /* range table for use with skip_chars. Only needed for Mule. */
110 Lisp_Object Vskip_chars_range_table; 110 Lisp_Object Vskip_chars_range_table;
111 111
112 static void set_search_regs (struct buffer *buf, Bufpos beg, Charcount len); 112 static void set_search_regs (struct buffer *buf, Charbpos beg, Charcount len);
113 static void save_search_regs (void); 113 static void save_search_regs (void);
114 static Bufpos simple_search (struct buffer *buf, Bufbyte *base_pat, 114 static Charbpos simple_search (struct buffer *buf, Intbyte *base_pat,
115 Bytecount len, Bytind pos, Bytind lim, 115 Bytecount len, Bytebpos pos, Bytebpos lim,
116 EMACS_INT n, Lisp_Object trt); 116 EMACS_INT n, Lisp_Object trt);
117 static Bufpos boyer_moore (struct buffer *buf, Bufbyte *base_pat, 117 static Charbpos boyer_moore (struct buffer *buf, Intbyte *base_pat,
118 Bytecount len, Bytind pos, Bytind lim, 118 Bytecount len, Bytebpos pos, Bytebpos lim,
119 EMACS_INT n, Lisp_Object trt, 119 EMACS_INT n, Lisp_Object trt,
120 Lisp_Object inverse_trt, int charset_base); 120 Lisp_Object inverse_trt, int charset_base);
121 static Bufpos search_buffer (struct buffer *buf, Lisp_Object str, 121 static Charbpos search_buffer (struct buffer *buf, Lisp_Object str,
122 Bufpos bufpos, Bufpos buflim, EMACS_INT n, int RE, 122 Charbpos charbpos, Charbpos buflim, EMACS_INT n, int RE,
123 Lisp_Object trt, Lisp_Object inverse_trt, 123 Lisp_Object trt, Lisp_Object inverse_trt,
124 int posix); 124 int posix);
125 125
126 static void 126 static void
127 matcher_overflow (void) 127 matcher_overflow (void)
227 for (;;) 227 for (;;)
228 Fsignal (Qsearch_failed, list1 (arg)); 228 Fsignal (Qsearch_failed, list1 (arg));
229 return Qnil; /* Not reached. */ 229 return Qnil; /* Not reached. */
230 } 230 }
231 231
232 /* Convert the search registers from Bytinds to Bufpos's. Needs to be 232 /* Convert the search registers from Bytebposs to Charbpos's. Needs to be
233 done after each regexp match that uses the search regs. 233 done after each regexp match that uses the search regs.
234 234
235 We could get a potential speedup by not converting the search registers 235 We could get a potential speedup by not converting the search registers
236 until it's really necessary, e.g. when match-data or replace-match is 236 until it's really necessary, e.g. when match-data or replace-match is
237 called. However, this complexifies the code a lot (e.g. the buffer 237 called. However, this complexifies the code a lot (e.g. the buffer
238 could have changed and the Bytinds stored might be invalid) and is 238 could have changed and the Bytebposs stored might be invalid) and is
239 probably not a great time-saver. */ 239 probably not a great time-saver. */
240 240
241 static void 241 static void
242 fixup_search_regs_for_buffer (struct buffer *buf) 242 fixup_search_regs_for_buffer (struct buffer *buf)
243 { 243 {
245 int num_regs = search_regs.num_regs; 245 int num_regs = search_regs.num_regs;
246 246
247 for (i = 0; i < num_regs; i++) 247 for (i = 0; i < num_regs; i++)
248 { 248 {
249 if (search_regs.start[i] >= 0) 249 if (search_regs.start[i] >= 0)
250 search_regs.start[i] = bytind_to_bufpos (buf, search_regs.start[i]); 250 search_regs.start[i] = bytebpos_to_charbpos (buf, search_regs.start[i]);
251 if (search_regs.end[i] >= 0) 251 if (search_regs.end[i] >= 0)
252 search_regs.end[i] = bytind_to_bufpos (buf, search_regs.end[i]); 252 search_regs.end[i] = bytebpos_to_charbpos (buf, search_regs.end[i]);
253 } 253 }
254 } 254 }
255 255
256 /* Similar but for strings. */ 256 /* Similar but for strings. */
257 static void 257 static void
288 static Lisp_Object 288 static Lisp_Object
289 looking_at_1 (Lisp_Object string, struct buffer *buf, int posix) 289 looking_at_1 (Lisp_Object string, struct buffer *buf, int posix)
290 { 290 {
291 /* This function has been Mule-ized, except for the trt table handling. */ 291 /* This function has been Mule-ized, except for the trt table handling. */
292 Lisp_Object val; 292 Lisp_Object val;
293 Bytind p1, p2; 293 Bytebpos p1, p2;
294 Bytecount s1, s2; 294 Bytecount s1, s2;
295 REGISTER int i; 295 REGISTER int i;
296 struct re_pattern_buffer *bufp; 296 struct re_pattern_buffer *bufp;
297 297
298 if (running_asynch_code) 298 if (running_asynch_code)
456 /* Match REGEXP against STRING, searching all of STRING, 456 /* Match REGEXP against STRING, searching all of STRING,
457 and return the index of the match, or negative on failure. 457 and return the index of the match, or negative on failure.
458 This does not clobber the match data. */ 458 This does not clobber the match data. */
459 459
460 Bytecount 460 Bytecount
461 fast_string_match (Lisp_Object regexp, const Bufbyte *nonreloc, 461 fast_string_match (Lisp_Object regexp, const Intbyte *nonreloc,
462 Lisp_Object reloc, Bytecount offset, 462 Lisp_Object reloc, Bytecount offset,
463 Bytecount length, int case_fold_search, 463 Bytecount length, int case_fold_search,
464 Error_Behavior errb, int no_quit) 464 Error_Behavior errb, int no_quit)
465 { 465 {
466 /* This function has been Mule-ized, except for the trt table handling. */ 466 /* This function has been Mule-ized, except for the trt table handling. */
467 Bytecount val; 467 Bytecount val;
468 Bufbyte *newnonreloc = (Bufbyte *) nonreloc; 468 Intbyte *newnonreloc = (Intbyte *) nonreloc;
469 struct re_pattern_buffer *bufp; 469 struct re_pattern_buffer *bufp;
470 470
471 bufp = compile_pattern (regexp, 0, 471 bufp = compile_pattern (regexp, 0,
472 (case_fold_search 472 (case_fold_search
473 ? XCASE_TABLE_DOWNCASE (current_buffer->case_table) 473 ? XCASE_TABLE_DOWNCASE (current_buffer->case_table)
489 else 489 else
490 { 490 {
491 /* QUIT could relocate RELOC. Therefore we must alloca() 491 /* QUIT could relocate RELOC. Therefore we must alloca()
492 and copy. No way around this except some serious 492 and copy. No way around this except some serious
493 rewriting of re_search(). */ 493 rewriting of re_search(). */
494 newnonreloc = (Bufbyte *) alloca (length); 494 newnonreloc = (Intbyte *) alloca (length);
495 memcpy (newnonreloc, XSTRING_DATA (reloc), length); 495 memcpy (newnonreloc, XSTRING_DATA (reloc), length);
496 } 496 }
497 } 497 }
498 498
499 /* #### evil current-buffer dependency */ 499 /* #### evil current-buffer dependency */
559 If we don't find COUNT instances before reaching END, set *SHORTAGE 559 If we don't find COUNT instances before reaching END, set *SHORTAGE
560 to the number of TARGETs left unfound, and return END. 560 to the number of TARGETs left unfound, and return END.
561 561
562 If ALLOW_QUIT is non-zero, call QUIT periodically. */ 562 If ALLOW_QUIT is non-zero, call QUIT periodically. */
563 563
564 static Bytind 564 static Bytebpos
565 bi_scan_buffer (struct buffer *buf, Emchar target, Bytind st, Bytind en, 565 bi_scan_buffer (struct buffer *buf, Emchar target, Bytebpos st, Bytebpos en,
566 EMACS_INT count, EMACS_INT *shortage, int allow_quit) 566 EMACS_INT count, EMACS_INT *shortage, int allow_quit)
567 { 567 {
568 /* This function has been Mule-ized. */ 568 /* This function has been Mule-ized. */
569 Bytind lim = en > 0 ? en : 569 Bytebpos lim = en > 0 ? en :
570 ((count > 0) ? BI_BUF_ZV (buf) : BI_BUF_BEGV (buf)); 570 ((count > 0) ? BI_BUF_ZV (buf) : BI_BUF_BEGV (buf));
571 571
572 /* #### newline cache stuff in this function not yet ported */ 572 /* #### newline cache stuff in this function not yet ported */
573 573
574 assert (count != 0); 574 assert (count != 0);
588 { 588 {
589 while (st < lim && count > 0) 589 while (st < lim && count > 0)
590 { 590 {
591 if (BI_BUF_FETCH_CHAR (buf, st) == target) 591 if (BI_BUF_FETCH_CHAR (buf, st) == target)
592 count--; 592 count--;
593 INC_BYTIND (buf, st); 593 INC_BYTEBPOS (buf, st);
594 } 594 }
595 } 595 }
596 else 596 else
597 #endif 597 #endif
598 { 598 {
599 while (st < lim && count > 0) 599 while (st < lim && count > 0)
600 { 600 {
601 Bytind ceil; 601 Bytebpos ceil;
602 Bufbyte *bufptr; 602 Intbyte *bufptr;
603 603
604 ceil = BI_BUF_CEILING_OF (buf, st); 604 ceil = BI_BUF_CEILING_OF (buf, st);
605 ceil = min (lim, ceil); 605 ceil = min (lim, ceil);
606 bufptr = (Bufbyte *) memchr (BI_BUF_BYTE_ADDRESS (buf, st), 606 bufptr = (Intbyte *) memchr (BI_BUF_BYTE_ADDRESS (buf, st),
607 (int) target, ceil - st); 607 (int) target, ceil - st);
608 if (bufptr) 608 if (bufptr)
609 { 609 {
610 count--; 610 count--;
611 st = BI_BUF_PTR_BYTE_POS (buf, bufptr) + 1; 611 st = BI_BUF_PTR_BYTE_POS (buf, bufptr) + 1;
626 #ifdef MULE 626 #ifdef MULE
627 if (target >= 0200) 627 if (target >= 0200)
628 { 628 {
629 while (st > lim && count < 0) 629 while (st > lim && count < 0)
630 { 630 {
631 DEC_BYTIND (buf, st); 631 DEC_BYTEBPOS (buf, st);
632 if (BI_BUF_FETCH_CHAR (buf, st) == target) 632 if (BI_BUF_FETCH_CHAR (buf, st) == target)
633 count++; 633 count++;
634 } 634 }
635 } 635 }
636 else 636 else
637 #endif 637 #endif
638 { 638 {
639 while (st > lim && count < 0) 639 while (st > lim && count < 0)
640 { 640 {
641 Bytind floor; 641 Bytebpos floor;
642 Bufbyte *bufptr; 642 Intbyte *bufptr;
643 Bufbyte *floorptr; 643 Intbyte *floorptr;
644 644
645 floor = BI_BUF_FLOOR_OF (buf, st); 645 floor = BI_BUF_FLOOR_OF (buf, st);
646 floor = max (lim, floor); 646 floor = max (lim, floor);
647 /* No memrchr() ... */ 647 /* No memrchr() ... */
648 bufptr = BI_BUF_BYTE_ADDRESS_BEFORE (buf, st); 648 bufptr = BI_BUF_BYTE_ADDRESS_BEFORE (buf, st);
672 else 672 else
673 { 673 {
674 /* We found the character we were looking for; we have to return 674 /* We found the character we were looking for; we have to return
675 the position *after* it due to the strange way that the return 675 the position *after* it due to the strange way that the return
676 value is defined. */ 676 value is defined. */
677 INC_BYTIND (buf, st); 677 INC_BYTEBPOS (buf, st);
678 return st; 678 return st;
679 } 679 }
680 } 680 }
681 } 681 }
682 682
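
The forward, non-Mule branch of bi_scan_buffer() above amounts to repeated
memchr() calls with a countdown. A simplified analogue on a flat array may
make the return-value contract easier to see. Assumptions: no buffer gap, no
QUIT handling, zero-based indexes instead of Bytebpos values, and
demo_scan_forward() is an invented name rather than anything in search.c.

```c
#include <assert.h>
#include <string.h>

/* Scan TEXT[START..END) forward for the COUNT-th occurrence of TARGET.
   Returns the index just after that occurrence.  If fewer than COUNT
   occurrences exist, stores the number left unfound in *SHORTAGE (when
   SHORTAGE is non-null) and returns END. */
long
demo_scan_forward (const char *text, long start, long end,
                   char target, long count, long *shortage)
{
  assert (count > 0 && start >= 0 && start <= end);

  while (start < end && count > 0)
    {
      const char *hit = (const char *) memchr (text + start, target,
                                               (size_t) (end - start));
      if (!hit)
        break;                  /* no further occurrence before END */
      count--;
      start = (long) (hit - text) + 1;   /* resume after the match */
    }

  if (shortage)
    *shortage = count;
  return count == 0 ? start : end;
}
```

Because every quantity here is signed, the `count > 0' loop test and the
leftover count stored in *SHORTAGE need no casts, which is the property the
renaming is meant to preserve.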
683 Bufpos 683 Charbpos
684 scan_buffer (struct buffer *buf, Emchar target, Bufpos start, Bufpos end, 684 scan_buffer (struct buffer *buf, Emchar target, Charbpos start, Charbpos end,
685 EMACS_INT count, EMACS_INT *shortage, int allow_quit) 685 EMACS_INT count, EMACS_INT *shortage, int allow_quit)
686 { 686 {
687 Bytind bi_retval; 687 Bytebpos bi_retval;
688 Bytind bi_start, bi_end; 688 Bytebpos bi_start, bi_end;
689 689
690 bi_start = bufpos_to_bytind (buf, start); 690 bi_start = charbpos_to_bytebpos (buf, start);
691 if (end) 691 if (end)
692 bi_end = bufpos_to_bytind (buf, end); 692 bi_end = charbpos_to_bytebpos (buf, end);
693 else 693 else
694 bi_end = 0; 694 bi_end = 0;
695 bi_retval = bi_scan_buffer (buf, target, bi_start, bi_end, count, 695 bi_retval = bi_scan_buffer (buf, target, bi_start, bi_end, count,
696 shortage, allow_quit); 696 shortage, allow_quit);
697 return bytind_to_bufpos (buf, bi_retval); 697 return bytebpos_to_charbpos (buf, bi_retval);
698 } 698 }
699 699
700 Bytind 700 Bytebpos
701 bi_find_next_newline_no_quit (struct buffer *buf, Bytind from, int count) 701 bi_find_next_newline_no_quit (struct buffer *buf, Bytebpos from, int count)
702 { 702 {
703 return bi_scan_buffer (buf, '\n', from, 0, count, 0, 0); 703 return bi_scan_buffer (buf, '\n', from, 0, count, 0, 0);
704 } 704 }
705 705
706 Bufpos 706 Charbpos
707 find_next_newline_no_quit (struct buffer *buf, Bufpos from, int count) 707 find_next_newline_no_quit (struct buffer *buf, Charbpos from, int count)
708 { 708 {
709 return scan_buffer (buf, '\n', from, 0, count, 0, 0); 709 return scan_buffer (buf, '\n', from, 0, count, 0, 0);
710 } 710 }
711 711
712 Bufpos 712 Charbpos
713 find_next_newline (struct buffer *buf, Bufpos from, int count) 713 find_next_newline (struct buffer *buf, Charbpos from, int count)
714 { 714 {
715 return scan_buffer (buf, '\n', from, 0, count, 0, 1); 715 return scan_buffer (buf, '\n', from, 0, count, 0, 1);
716 } 716 }
717 717
718 Bytind 718 Bytebpos
719 bi_find_next_emchar_in_string (Lisp_String* str, Emchar target, Bytind st, 719 bi_find_next_emchar_in_string (Lisp_String* str, Emchar target, Bytebpos st,
720 EMACS_INT count) 720 EMACS_INT count)
721 { 721 {
722 /* This function has been Mule-ized. */ 722 /* This function has been Mule-ized. */
723 Bytind lim = string_length (str) -1; 723 Bytebpos lim = string_length (str) -1;
724 Bufbyte* s = string_data (str); 724 Intbyte* s = string_data (str);
725 725
726 assert (count >= 0); 726 assert (count >= 0);
727 727
728 #ifdef MULE 728 #ifdef MULE
729 /* Due to the Mule representation of characters in a buffer, 729 /* Due to the Mule representation of characters in a buffer,
735 { 735 {
736 while (st < lim && count > 0) 736 while (st < lim && count > 0)
737 { 737 {
738 if (string_char (str, st) == target) 738 if (string_char (str, st) == target)
739 count--; 739 count--;
740 INC_CHARBYTIND (s, st); 740 INC_CHARBYTEBPOS (s, st);
741 } 741 }
742 } 742 }
743 else 743 else
744 #endif 744 #endif
745 { 745 {
746 while (st < lim && count > 0) 746 while (st < lim && count > 0)
747 { 747 {
748 Bufbyte *bufptr = (Bufbyte *) memchr (charptr_n_addr (s, st), 748 Intbyte *bufptr = (Intbyte *) memchr (charptr_n_addr (s, st),
749 (int) target, lim - st); 749 (int) target, lim - st);
750 if (bufptr) 750 if (bufptr)
751 { 751 {
752 count--; 752 count--;
753 st = (Bytind)(bufptr - s) + 1; 753 st = (Bytebpos)(bufptr - s) + 1;
754 } 754 }
755 else 755 else
756 st = lim; 756 st = lim;
757 } 757 }
758 } 758 }
760 } 760 }
761 761
762 /* Like find_next_newline, but returns position before the newline, 762 /* Like find_next_newline, but returns position before the newline,
763 not after, and only search up to TO. This isn't just 763 not after, and only search up to TO. This isn't just
764 find_next_newline (...)-1, because you might hit TO. */ 764 find_next_newline (...)-1, because you might hit TO. */
765 Bufpos 765 Charbpos
766 find_before_next_newline (struct buffer *buf, Bufpos from, Bufpos to, int count) 766 find_before_next_newline (struct buffer *buf, Charbpos from, Charbpos to, int count)
767 { 767 {
768 EMACS_INT shortage; 768 EMACS_INT shortage;
769 Bufpos pos = scan_buffer (buf, '\n', from, to, count, &shortage, 1); 769 Charbpos pos = scan_buffer (buf, '\n', from, to, count, &shortage, 1);
770 770
771 if (shortage == 0) 771 if (shortage == 0)
772 pos--; 772 pos--;
773 773
774 return pos; 774 return pos;
777 static Lisp_Object 777 static Lisp_Object
778 skip_chars (struct buffer *buf, int forwardp, int syntaxp, 778 skip_chars (struct buffer *buf, int forwardp, int syntaxp,
779 Lisp_Object string, Lisp_Object lim) 779 Lisp_Object string, Lisp_Object lim)
780 { 780 {
781 /* This function has been Mule-ized. */ 781 /* This function has been Mule-ized. */
782 REGISTER Bufbyte *p, *pend; 782 REGISTER Intbyte *p, *pend;
783 REGISTER Emchar c; 783 REGISTER Emchar c;
784 /* We store the first 256 chars in an array here and the rest in 784 /* We store the first 256 chars in an array here and the rest in
785 a range table. */ 785 a range table. */
786 unsigned char fastmap[0400]; 786 unsigned char fastmap[0400];
787 int negate = 0; 787 int negate = 0;
788 REGISTER int i; 788 REGISTER int i;
789 #ifndef emacs 789 #ifndef emacs
790 Lisp_Char_Table *syntax_table = XCHAR_TABLE (buf->mirror_syntax_table); 790 Lisp_Char_Table *syntax_table = XCHAR_TABLE (buf->mirror_syntax_table);
791 #endif 791 #endif
792 Bufpos limit; 792 Charbpos limit;
793 793
794 if (NILP (lim)) 794 if (NILP (lim))
795 limit = forwardp ? BUF_ZV (buf) : BUF_BEGV (buf); 795 limit = forwardp ? BUF_ZV (buf) : BUF_BEGV (buf);
796 else 796 else
797 { 797 {
878 if (negate) 878 if (negate)
879 for (i = 0; i < (int) (sizeof (fastmap)); i++) 879 for (i = 0; i < (int) (sizeof (fastmap)); i++)
880 fastmap[i] ^= 1; 880 fastmap[i] ^= 1;
881 881
882 { 882 {
883 Bufpos start_point = BUF_PT (buf); 883 Charbpos start_point = BUF_PT (buf);
884 884
885 if (syntaxp) 885 if (syntaxp)
886 { 886 {
887 SETUP_SYNTAX_CACHE_FOR_BUFFER (buf, BUF_PT (buf), forwardp ? 1 : -1); 887 SETUP_SYNTAX_CACHE_FOR_BUFFER (buf, BUF_PT (buf), forwardp ? 1 : -1);
888 /* All syntax designators are normal chars so nothing strange 888 /* All syntax designators are normal chars so nothing strange
1015 search_command (Lisp_Object string, Lisp_Object limit, Lisp_Object noerror, 1015 search_command (Lisp_Object string, Lisp_Object limit, Lisp_Object noerror,
1016 Lisp_Object count, Lisp_Object buffer, int direction, 1016 Lisp_Object count, Lisp_Object buffer, int direction,
1017 int RE, int posix) 1017 int RE, int posix)
1018 { 1018 {
1019 /* This function has been Mule-ized, except for the trt table handling. */ 1019 /* This function has been Mule-ized, except for the trt table handling. */
1020 REGISTER Bufpos np; 1020 REGISTER Charbpos np;
1021 Bufpos lim; 1021 Charbpos lim;
1022 EMACS_INT n = direction; 1022 EMACS_INT n = direction;
1023 struct buffer *buf; 1023 struct buffer *buf;
1024 1024
1025 if (!NILP (count)) 1025 if (!NILP (count))
1026 { 1026 {
1083 static int 1083 static int
1084 trivial_regexp_p (Lisp_Object regexp) 1084 trivial_regexp_p (Lisp_Object regexp)
1085 { 1085 {
1086 /* This function has been Mule-ized. */ 1086 /* This function has been Mule-ized. */
1087 Bytecount len = XSTRING_LENGTH (regexp); 1087 Bytecount len = XSTRING_LENGTH (regexp);
1088 Bufbyte *s = XSTRING_DATA (regexp); 1088 Intbyte *s = XSTRING_DATA (regexp);
1089 while (--len >= 0) 1089 while (--len >= 0)
1090 { 1090 {
1091 switch (*s++) 1091 switch (*s++)
1092 { 1092 {
1093 case '.': case '*': case '+': case '?': case '[': case '^': case '$': 1093 case '.': case '*': case '+': case '?': case '[': case '^': case '$':
1112 } 1112 }
1113 return 1; 1113 return 1;
1114 } 1114 }
1115 1115
1116 /* Search for the n'th occurrence of STRING in BUF, 1116 /* Search for the n'th occurrence of STRING in BUF,
1117 starting at position BUFPOS and stopping at position BUFLIM, 1117 starting at position CHARBPOS and stopping at position BUFLIM,
1118 treating PAT as a literal string if RE is false or as 1118 treating PAT as a literal string if RE is false or as
1119 a regular expression if RE is true. 1119 a regular expression if RE is true.
1120 1120
1121 If N is positive, searching is forward and BUFLIM must be greater 1121 If N is positive, searching is forward and BUFLIM must be greater
1122 than BUFPOS. 1122 than CHARBPOS.
1123 If N is negative, searching is backward and BUFLIM must be less 1123 If N is negative, searching is backward and BUFLIM must be less
1124 than BUFPOS. 1124 than CHARBPOS.
1125 1125
1126 Returns -x if only N-x occurrences found (x > 0), 1126 Returns -x if only N-x occurrences found (x > 0),
1127 or else the position at the beginning of the Nth occurrence 1127 or else the position at the beginning of the Nth occurrence
1128 (if searching backward) or the end (if searching forward). 1128 (if searching backward) or the end (if searching forward).
1129 1129
1130 POSIX is nonzero if we want full backtracking (POSIX style) 1130 POSIX is nonzero if we want full backtracking (POSIX style)
1131 for this pattern. 0 means backtrack only enough to get a valid match. */ 1131 for this pattern. 0 means backtrack only enough to get a valid match. */
1132 static Bufpos 1132 static Charbpos
1133 search_buffer (struct buffer *buf, Lisp_Object string, Bufpos bufpos, 1133 search_buffer (struct buffer *buf, Lisp_Object string, Charbpos charbpos,
1134 Bufpos buflim, EMACS_INT n, int RE, Lisp_Object trt, 1134 Charbpos buflim, EMACS_INT n, int RE, Lisp_Object trt,
1135 Lisp_Object inverse_trt, int posix) 1135 Lisp_Object inverse_trt, int posix)
1136 { 1136 {
1137 /* This function has been Mule-ized, except for the trt table handling. */ 1137 /* This function has been Mule-ized, except for the trt table handling. */
1138 Bytecount len = XSTRING_LENGTH (string); 1138 Bytecount len = XSTRING_LENGTH (string);
1139 Bufbyte *base_pat = XSTRING_DATA (string); 1139 Intbyte *base_pat = XSTRING_DATA (string);
1140 REGISTER EMACS_INT i, j; 1140 REGISTER EMACS_INT i, j;
1141 Bytind p1, p2; 1141 Bytebpos p1, p2;
1142 Bytecount s1, s2; 1142 Bytecount s1, s2;
1143 Bytind pos, lim; 1143 Bytebpos pos, lim;
1144 1144
1145 if (running_asynch_code) 1145 if (running_asynch_code)
1146 save_search_regs (); 1146 save_search_regs ();
1147 1147
1148 /* Null string is found at starting position. */ 1148 /* Null string is found at starting position. */
1149 if (len == 0) 1149 if (len == 0)
1150 { 1150 {
1151 set_search_regs (buf, bufpos, 0); 1151 set_search_regs (buf, charbpos, 0);
1152 return bufpos; 1152 return charbpos;
1153 } 1153 }
1154 1154
1155 /* Searching 0 times means don't move. */ 1155 /* Searching 0 times means don't move. */
1156 if (n == 0) 1156 if (n == 0)
1157 return bufpos; 1157 return charbpos;
1158 1158
1159 pos = bufpos_to_bytind (buf, bufpos); 1159 pos = charbpos_to_bytebpos (buf, charbpos);
1160 lim = bufpos_to_bytind (buf, buflim); 1160 lim = charbpos_to_bytebpos (buf, buflim);
1161 if (RE && !trivial_regexp_p (string)) 1161 if (RE && !trivial_regexp_p (string))
1162 { 1162 {
1163 struct re_pattern_buffer *bufp; 1163 struct re_pattern_buffer *bufp;
1164 1164
1165 bufp = compile_pattern (string, &search_regs, trt, posix, 1165 bufp = compile_pattern (string, &search_regs, trt, posix,
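
Note on the hunk above: the rename from Bufpos/Bytind/Bufbyte to Charbpos/Bytebpos/Intbyte makes the unit of each position visible in its type name, and charbpos_to_bytebpos is where search_buffer crosses from character positions to byte positions. The following is a minimal standalone sketch of that distinction, not the XEmacs implementation. The `toy_` names, the use of `long` for the integral types, and the rule that a byte below 0xA0 starts a character are all assumptions made for illustration.

```c
#include <assert.h>

/* Stand-ins for the renamed types.  The changeset description says the
   quantity types boil down to EMACS_INT; `long` here is an assumption. */
typedef long Charbpos;          /* one-based character position in a buffer */
typedef long Bytebpos;          /* one-based byte position in a buffer */
typedef long Bytecount;         /* zero-based count of bytes */
typedef unsigned char Intbyte;  /* one byte of internally encoded text */

/* Assumed encoding rule (hypothetical): a byte below 0xA0 starts a
   character, anything else is a continuation byte. */
static int
toy_first_byte_p (Intbyte b)
{
  return b < 0xA0;
}

/* Convert a one-based character position to a one-based byte position
   by walking the text from the start.  Linear time; a real buffer would
   presumably use something smarter. */
static Bytebpos
toy_charbpos_to_bytebpos (const Intbyte *text, Bytecount nbytes, Charbpos cpos)
{
  Bytecount i = 0;    /* zero-based byte offset of the current character */
  Charbpos seen = 1;  /* one-based position of the current character */

  assert (cpos >= 1);
  while (i < nbytes && seen < cpos)
    {
      i++;                                       /* step over the lead byte */
      while (i < nbytes && !toy_first_byte_p (text[i]))
        i++;                                     /* and its continuation bytes */
      seen++;
    }
  return i + 1;       /* back to a one-based position */
}
```

With a four-byte text holding the characters `a`, a two-byte character, and `b`, character position 3 maps to byte position 4. Mixing the two units would silently give wrong positions, which is why they get separate type names.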
1201 } 1201 }
1202 XSETBUFFER (last_thing_searched, buf); 1202 XSETBUFFER (last_thing_searched, buf);
1203 /* Set pos to the new position. */ 1203 /* Set pos to the new position. */
1204 pos = search_regs.start[0]; 1204 pos = search_regs.start[0];
1205 fixup_search_regs_for_buffer (buf); 1205 fixup_search_regs_for_buffer (buf);
1206 /* And bufpos too. */ 1206 /* And charbpos too. */
1207 bufpos = search_regs.start[0]; 1207 charbpos = search_regs.start[0];
1208 } 1208 }
1209 else 1209 else
1210 { 1210 {
1211 return n; 1211 return n;
1212 } 1212 }
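
Note on the return convention: the comment on search_buffer says it returns -x when only N-x occurrences were found, and otherwise a buffer position. The visible failure paths agree with that, since the backward loop returns `n` (already negative) and the forward loop returns `0 - n`. A small sketch of how a caller might tell the two cases apart, assuming positions are one-based so that a valid position is never zero or negative. The `toy_` names and the struct are hypothetical and do not appear in search.c.

```c
#include <assert.h>

typedef long Charbpos;   /* assumption: stand-in for the EMACS_INT-sized type */
typedef long ToyCount;   /* hypothetical name for the repeat count N */

struct toy_search_outcome
{
  int found_all;         /* nonzero if all N occurrences were found */
  Charbpos position;     /* valid only when found_all is nonzero */
  long missing;          /* the x in "-x": occurrences not found */
};

/* Failure code for a search that stopped with `remaining` occurrences
   still to find.  `remaining` is positive when searching forward and
   negative when searching backward, as in the loops above. */
static Charbpos
toy_failure_code (ToyCount remaining)
{
  assert (remaining != 0);
  return remaining > 0 ? 0 - remaining : remaining;
}

/* Split a search result into "position" and "how many were missing". */
static struct toy_search_outcome
toy_decode_search_result (Charbpos result)
{
  struct toy_search_outcome out;

  if (result > 0)
    {
      out.found_all = 1;
      out.position = result;
      out.missing = 0;
    }
  else
    {
      out.found_all = 0;
      out.position = 0;
      out.missing = -result;
    }
  return out;
}
```

Because the failure value is negative in both directions, the caller does not need to know which way the search ran in order to detect failure.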
1238 } 1238 }
1239 XSETBUFFER (last_thing_searched, buf); 1239 XSETBUFFER (last_thing_searched, buf);
1240 /* Set pos to the new position. */ 1240 /* Set pos to the new position. */
1241 pos = search_regs.end[0]; 1241 pos = search_regs.end[0];
1242 fixup_search_regs_for_buffer (buf); 1242 fixup_search_regs_for_buffer (buf);
1243 /* And bufpos too. */ 1243 /* And charbpos too. */
1244 bufpos = search_regs.end[0]; 1244 charbpos = search_regs.end[0];
1245 } 1245 }
1246 else 1246 else
1247 { 1247 {
1248 return 0 - n; 1248 return 0 - n;
1249 } 1249 }
1250 n--; 1250 n--;
1251 } 1251 }
1252 return bufpos; 1252 return charbpos;
1253 } 1253 }
1254 else /* non-RE case */ 1254 else /* non-RE case */
1255 { 1255 {
1256 int charset_base = -1; 1256 int charset_base = -1;
1257 int boyer_moore_ok = 1; 1257 int boyer_moore_ok = 1;
1258 Bufbyte *pat = 0; 1258 Intbyte *pat = 0;
1259 Bufbyte *patbuf = alloca_array (Bufbyte, len * MAX_EMCHAR_LEN); 1259 Intbyte *patbuf = alloca_array (Intbyte, len * MAX_EMCHAR_LEN);
1260 pat = patbuf; 1260 pat = patbuf;
1261 #ifdef MULE 1261 #ifdef MULE
1262 while (len > 0) 1262 while (len > 0)
1263 { 1263 {
1264 Bufbyte tmp_str[MAX_EMCHAR_LEN]; 1264 Intbyte tmp_str[MAX_EMCHAR_LEN];
1265 Emchar c, translated, inverse; 1265 Emchar c, translated, inverse;
1266 Bytecount orig_bytelen, new_bytelen, inv_bytelen; 1266 Bytecount orig_bytelen, new_bytelen, inv_bytelen;
1267 1267
1268 /* If we got here and the RE flag is set, it's because 1268 /* If we got here and the RE flag is set, it's because
1269 we're dealing with a regexp known to be trivial, so the 1269 we're dealing with a regexp known to be trivial, so the
1335 1335
1336 This kind of search works regardless of what is in PAT and 1336 This kind of search works regardless of what is in PAT and
1337 regardless of what is in TRT. It is used in cases where 1337 regardless of what is in TRT. It is used in cases where
1338 boyer_moore cannot work. */ 1338 boyer_moore cannot work. */
1339 1339
1340 static Bufpos 1340 static Charbpos
1341 simple_search (struct buffer *buf, Bufbyte *base_pat, Bytecount len_byte, 1341 simple_search (struct buffer *buf, Intbyte *base_pat, Bytecount len_byte,
1342 Bytind idx, Bytind lim, EMACS_INT n, Lisp_Object trt) 1342 Bytebpos idx, Bytebpos lim, EMACS_INT n, Lisp_Object trt)
1343 { 1343 {
1344 int forward = n > 0; 1344 int forward = n > 0;
1345 Bytecount buf_len = 0; /* Shut up compiler. */ 1345 Bytecount buf_len = 0; /* Shut up compiler. */
1346 1346
1347 if (lim > idx) 1347 if (lim > idx)
1348 while (n > 0) 1348 while (n > 0)
1349 { 1349 {
1350 while (1) 1350 while (1)
1351 { 1351 {
1352 Bytecount this_len = len_byte; 1352 Bytecount this_len = len_byte;
1353 Bytind this_idx = idx; 1353 Bytebpos this_idx = idx;
1354 Bufbyte *p = base_pat; 1354 Intbyte *p = base_pat;
1355 if (idx >= lim) 1355 if (idx >= lim)
1356 goto stop; 1356 goto stop;
1357 1357
1358 while (this_len > 0) 1358 while (this_len > 0)
1359 { 1359 {
1369 break; 1369 break;
1370 1370
1371 pat_len = charcount_to_bytecount (p, 1); 1371 pat_len = charcount_to_bytecount (p, 1);
1372 p += pat_len; 1372 p += pat_len;
1373 this_len -= pat_len; 1373 this_len -= pat_len;
1374 INC_BYTIND (buf, this_idx); 1374 INC_BYTEBPOS (buf, this_idx);
1375 } 1375 }
1376 if (this_len == 0) 1376 if (this_len == 0)
1377 { 1377 {
1378 buf_len = this_idx - idx; 1378 buf_len = this_idx - idx;
1379 idx = this_idx; 1379 idx = this_idx;
1380 break; 1380 break;
1381 } 1381 }
1382 INC_BYTIND (buf, idx); 1382 INC_BYTEBPOS (buf, idx);
1383 } 1383 }
1384 n--; 1384 n--;
1385 } 1385 }
1386 else 1386 else
1387 while (n < 0) 1387 while (n < 0)
1388 { 1388 {
1389 while (1) 1389 while (1)
1390 { 1390 {
1391 Bytecount this_len = len_byte; 1391 Bytecount this_len = len_byte;
1392 Bytind this_idx = idx; 1392 Bytebpos this_idx = idx;
1393 Bufbyte *p; 1393 Intbyte *p;
1394 if (idx <= lim) 1394 if (idx <= lim)
1395 goto stop; 1395 goto stop;
1396 p = base_pat + len_byte; 1396 p = base_pat + len_byte;
1397 1397
1398 while (this_len > 0) 1398 while (this_len > 0)
1399 { 1399 {
1400 Emchar pat_ch, buf_ch; 1400 Emchar pat_ch, buf_ch;
1401 1401
1402 DEC_CHARPTR (p); 1402 DEC_CHARPTR (p);
1403 DEC_BYTIND (buf, this_idx); 1403 DEC_BYTEBPOS (buf, this_idx);
1404 pat_ch = charptr_emchar (p); 1404 pat_ch = charptr_emchar (p);
1405 buf_ch = BI_BUF_FETCH_CHAR (buf, this_idx); 1405 buf_ch = BI_BUF_FETCH_CHAR (buf, this_idx);
1406 1406
1407 buf_ch = TRANSLATE (trt, buf_ch); 1407 buf_ch = TRANSLATE (trt, buf_ch);
1408 1408
1415 { 1415 {
1416 buf_len = idx - this_idx; 1416 buf_len = idx - this_idx;
1417 idx = this_idx; 1417 idx = this_idx;
1418 break; 1418 break;
1419 } 1419 }
1420 DEC_BYTIND (buf, idx); 1420 DEC_BYTEBPOS (buf, idx);
1421 } 1421 }
1422 n++; 1422 n++;
1423 } 1423 }
1424 stop: 1424 stop:
1425 if (n == 0) 1425 if (n == 0)
1426 { 1426 {
1427 Bufpos beg, end, retval; 1427 Charbpos beg, end, retval;
1428 if (forward) 1428 if (forward)
1429 { 1429 {
1430 beg = bytind_to_bufpos (buf, idx - buf_len); 1430 beg = bytebpos_to_charbpos (buf, idx - buf_len);
1431 retval = end = bytind_to_bufpos (buf, idx); 1431 retval = end = bytebpos_to_charbpos (buf, idx);
1432 } 1432 }
1433 else 1433 else
1434 { 1434 {
1435 retval = beg = bytind_to_bufpos (buf, idx); 1435 retval = beg = bytebpos_to_charbpos (buf, idx);
1436 end = bytind_to_bufpos (buf, idx + buf_len); 1436 end = bytebpos_to_charbpos (buf, idx + buf_len);
1437 } 1437 }
1438 set_search_regs (buf, beg, end - beg); 1438 set_search_regs (buf, beg, end - beg);
1439 1439
1440 return retval; 1440 return retval;
1441 } 1441 }
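
Note on simple_search: the forward branch above tries the pattern at the current byte index, moves past the match when it succeeds, and otherwise advances by one whole character (INC_BYTEBPOS). Below is a simplified standalone sketch of that loop shape. It swaps the per-character comparison with TRANSLATE for a plain `memcmp`, uses zero-based byte offsets instead of Bytebpos, and assumes a byte below 0xA0 starts a character. All `toy_` names are hypothetical.

```c
#include <assert.h>
#include <string.h>

typedef unsigned char Intbyte;  /* one byte of internally encoded text */

/* Assumed encoding rule (hypothetical): bytes below 0xA0 start a character. */
static int
toy_first_byte_p (Intbyte b)
{
  return b < 0xA0;
}

/* Find the Nth non-overlapping occurrence of PAT in TEXT, scanning
   forward one character at a time.  Returns the zero-based byte offset
   just past the Nth match, or -x if x occurrences are still missing
   when the text runs out. */
static long
toy_simple_search_forward (const Intbyte *text, long text_len,
                           const Intbyte *pat, long pat_len, long n)
{
  long idx = 0;

  assert (pat_len > 0 && n > 0);
  while (n > 0)
    {
      for (;;)
        {
          if (idx >= text_len)
            return -n;              /* ran out of text */
          if (idx + pat_len <= text_len
              && memcmp (text + idx, pat, (size_t) pat_len) == 0)
            {
              idx += pat_len;       /* continue after this match */
              break;
            }
          idx++;                    /* skip the lead byte ... */
          while (idx < text_len && !toy_first_byte_p (text[idx]))
            idx++;                  /* ... and its continuation bytes */
        }
      n--;
    }
  return idx;
}
```

Since `idx` only ever rests on a character boundary and the pattern consists of whole characters, the byte comparison cannot match in the middle of a multibyte character.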
1456 makes it possible to translate just the last byte of a character, 1456 makes it possible to translate just the last byte of a character,
1457 and do so after just a simple test of the context. 1457 and do so after just a simple test of the context.
1458 1458
1459 If that criterion is not satisfied, do not call this function. */ 1459 If that criterion is not satisfied, do not call this function. */
1460 1460
1461 static Bufpos 1461 static Charbpos
1462 boyer_moore (struct buffer *buf, Bufbyte *base_pat, Bytecount len, 1462 boyer_moore (struct buffer *buf, Intbyte *base_pat, Bytecount len,
1463 Bytind pos, Bytind lim, EMACS_INT n, Lisp_Object trt, 1463 Bytebpos pos, Bytebpos lim, EMACS_INT n, Lisp_Object trt,
1464 Lisp_Object inverse_trt, int charset_base) 1464 Lisp_Object inverse_trt, int charset_base)
1465 { 1465 {
1466 /* #### Someone really really really needs to comment the workings 1466 /* #### Someone really really really needs to comment the workings
1467 of this junk somewhat better. 1467 of this junk somewhat better.
1468 1468
1494 is what the BM_tab holds. */ 1494 is what the BM_tab holds. */
1495 REGISTER EMACS_INT *BM_tab; 1495 REGISTER EMACS_INT *BM_tab;
1496 EMACS_INT *BM_tab_base; 1496 EMACS_INT *BM_tab_base;
1497 REGISTER Bytecount dirlen; 1497 REGISTER Bytecount dirlen;
1498 EMACS_INT infinity; 1498 EMACS_INT infinity;
1499 Bytind limit; 1499 Bytebpos limit;
1500 Bytecount stride_for_teases = 0; 1500 Bytecount stride_for_teases = 0;
1501 REGISTER EMACS_INT i, j; 1501 REGISTER EMACS_INT i, j;
1502 Bufbyte *pat, *pat_end; 1502 Intbyte *pat, *pat_end;
1503 REGISTER Bufbyte *cursor, *p_limit, *ptr2; 1503 REGISTER Intbyte *cursor, *p_limit, *ptr2;
1504 Bufbyte simple_translate[0400]; 1504 Intbyte simple_translate[0400];
1505 REGISTER int direction = ((n > 0) ? 1 : -1); 1505 REGISTER int direction = ((n > 0) ? 1 : -1);
1506 #ifdef MULE 1506 #ifdef MULE
1507 Bufbyte translate_prev_byte = 0; 1507 Intbyte translate_prev_byte = 0;
1508 Bufbyte translate_anteprev_byte = 0; 1508 Intbyte translate_anteprev_byte = 0;
1509 #endif 1509 #endif
1510 #ifdef C_ALLOCA 1510 #ifdef C_ALLOCA
1511 EMACS_INT BM_tab_space[0400]; 1511 EMACS_INT BM_tab_space[0400];
1512 BM_tab = &BM_tab_space[0]; 1512 BM_tab = &BM_tab_space[0];
1513 #else 1513 #else
1564 /* We use this for translation, instead of TRT itself. We 1564 /* We use this for translation, instead of TRT itself. We
1565 fill this in to handle the characters that actually occur 1565 fill this in to handle the characters that actually occur
1566 in the pattern. Others don't matter anyway! */ 1566 in the pattern. Others don't matter anyway! */
1567 xzero (simple_translate); 1567 xzero (simple_translate);
1568 for (i = 0; i < 0400; i++) 1568 for (i = 0; i < 0400; i++)
1569 simple_translate[i] = (Bufbyte) i; 1569 simple_translate[i] = (Intbyte) i;
1570 i = 0; 1570 i = 0;
1571 while (i != infinity) 1571 while (i != infinity)
1572 { 1572 {
1573 Bufbyte *ptr = base_pat + i; 1573 Intbyte *ptr = base_pat + i;
1574 i += direction; 1574 i += direction;
1575 if (i == dirlen) 1575 if (i == dirlen)
1576 i = infinity; 1576 i = infinity;
1577 if (!NILP (trt)) 1577 if (!NILP (trt))
1578 { 1578 {
1579 #ifdef MULE 1579 #ifdef MULE
1580 Emchar ch, untranslated; 1580 Emchar ch, untranslated;
1581 int this_translated = 1; 1581 int this_translated = 1;
1582 1582
1583 /* Is *PTR the last byte of a character? */ 1583 /* Is *PTR the last byte of a character? */
1584 if (pat_end - ptr == 1 || BUFBYTE_FIRST_BYTE_P (ptr[1])) 1584 if (pat_end - ptr == 1 || INTBYTE_FIRST_BYTE_P (ptr[1]))
1585 { 1585 {
1586 Bufbyte *charstart = ptr; 1586 Intbyte *charstart = ptr;
1587 while (!BUFBYTE_FIRST_BYTE_P (*charstart)) 1587 while (!INTBYTE_FIRST_BYTE_P (*charstart))
1588 charstart--; 1588 charstart--;
1589 untranslated = charptr_emchar (charstart); 1589 untranslated = charptr_emchar (charstart);
1590 if (charset_base == (untranslated & ~CHAR_FIELD3_MASK)) 1590 if (charset_base == (untranslated & ~CHAR_FIELD3_MASK))
1591 { 1591 {
1592 ch = TRANSLATE (trt, untranslated); 1592 ch = TRANSLATE (trt, untranslated);
1593 if (!BUFBYTE_FIRST_BYTE_P (*ptr)) 1593 if (!INTBYTE_FIRST_BYTE_P (*ptr))
1594 { 1594 {
1595 translate_prev_byte = ptr[-1]; 1595 translate_prev_byte = ptr[-1];
1596 if (!BUFBYTE_FIRST_BYTE_P (translate_prev_byte)) 1596 if (!INTBYTE_FIRST_BYTE_P (translate_prev_byte))
1597 translate_anteprev_byte = ptr[-2]; 1597 translate_anteprev_byte = ptr[-2];
1598 } 1598 }
1599 } 1599 }
1600 else 1600 else
1601 { 1601 {
1649 /* A translation table is accompanied by its inverse -- 1649 /* A translation table is accompanied by its inverse --
1650 see comment following downcase_table for details */ 1650 see comment following downcase_table for details */
1651 1651
1652 while ((j = TRANSLATE (inverse_trt, j)) != k) 1652 while ((j = TRANSLATE (inverse_trt, j)) != k)
1653 { 1653 {
1654 simple_translate[j] = (Bufbyte) k; 1654 simple_translate[j] = (Intbyte) k;
1655 BM_tab[j] = dirlen - i; 1655 BM_tab[j] = dirlen - i;
1656 } 1656 }
1657 #endif 1657 #endif
1658 } 1658 }
1659 else 1659 else
1674 pos += dirlen - ((direction > 0) ? direction : 0); 1674 pos += dirlen - ((direction > 0) ? direction : 0);
1675 /* loop invariant - pos points at where last char (first char if 1675 /* loop invariant - pos points at where last char (first char if
1676 reverse) of pattern would align in a possible match. */ 1676 reverse) of pattern would align in a possible match. */
1677 while (n != 0) 1677 while (n != 0)
1678 { 1678 {
1679 Bytind tail_end; 1679 Bytebpos tail_end;
1680 Bufbyte *tail_end_ptr; 1680 Intbyte *tail_end_ptr;
1681 /* It's been reported that some (broken) compiler thinks 1681 /* It's been reported that some (broken) compiler thinks
1682 that Boolean expressions in an arithmetic context are 1682 that Boolean expressions in an arithmetic context are
1683 unsigned. Using an explicit ?1:0 prevents this. */ 1683 unsigned. Using an explicit ?1:0 prevents this. */
1684 if ((lim - pos - ((direction > 0) ? 1 : 0)) * direction < 0) 1684 if ((lim - pos - ((direction > 0) ? 1 : 0)) * direction < 0)
1685 return n * (0 - direction); 1685 return n * (0 - direction);
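
Note on the limit test: the expression `(lim - pos - ((direction > 0) ? 1 : 0)) * direction < 0` handles both search directions with one comparison by multiplying the signed distance to the limit by `direction`. The sketch below isolates just that arithmetic. The function name is hypothetical and `long` stands in for Bytebpos.

```c
#include <assert.h>

/* Return nonzero if POS has gone past LIM for a search moving in
   DIRECTION (+1 forward, -1 backward).  Mirrors the arithmetic of the
   test in the hunk above; the explicit ?1:0 keeps every operand signed. */
static int
toy_past_limit (long pos, long lim, int direction)
{
  assert (direction == 1 || direction == -1);
  return (lim - pos - ((direction > 0) ? 1 : 0)) * direction < 0;
}
```

Going forward, a position is usable only while it is strictly below the limit. Going backward, a position equal to the limit is still acceptable, which is what the extra `1` or `0` term encodes.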
1760 #ifdef MULE 1760 #ifdef MULE
1761 Emchar ch; 1761 Emchar ch;
1762 cursor -= direction; 1762 cursor -= direction;
1763 /* Translate only the last byte of a character. */ 1763 /* Translate only the last byte of a character. */
1764 if ((cursor == tail_end_ptr 1764 if ((cursor == tail_end_ptr
1765 || BUFBYTE_FIRST_BYTE_P (cursor[1])) 1765 || INTBYTE_FIRST_BYTE_P (cursor[1]))
1766 && (BUFBYTE_FIRST_BYTE_P (cursor[0]) 1766 && (INTBYTE_FIRST_BYTE_P (cursor[0])
1767 || (translate_prev_byte == cursor[-1] 1767 || (translate_prev_byte == cursor[-1]
1768 && (BUFBYTE_FIRST_BYTE_P (translate_prev_byte) 1768 && (INTBYTE_FIRST_BYTE_P (translate_prev_byte)
1769 || translate_anteprev_byte == cursor[-2])))) 1769 || translate_anteprev_byte == cursor[-2]))))
1770 ch = simple_translate[*cursor]; 1770 ch = simple_translate[*cursor];
1771 else 1771 else
1772 ch = *cursor; 1772 ch = *cursor;
1773 if (pat[i] != ch) 1773 if (pat[i] != ch)
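
Note on the "translate only the last byte" test: the condition in the hunk above decides whether a byte may be looked up in simple_translate. The byte must be the last byte of its character, and if the character is multibyte, its preceding byte or bytes must equal the recorded translate_prev_byte and translate_anteprev_byte. The sketch below restates that condition over an array index rather than a cursor pointer. It assumes well-formed text and that a byte below 0xA0 starts a character. The `toy_` names are hypothetical.

```c
#include <assert.h>

typedef unsigned char Intbyte;  /* one byte of internally encoded text */

/* Assumed encoding rule (hypothetical): bytes below 0xA0 start a character. */
static int
toy_first_byte_p (Intbyte b)
{
  return b < 0xA0;
}

/* Nonzero if text[i] is the last byte of a character whose earlier
   bytes match PREV (and ANTEPREV, when PREV is itself a continuation
   byte), so that translating just this one byte is safe. */
static int
toy_translatable_last_byte_p (const Intbyte *text, long len, long i,
                              Intbyte prev, Intbyte anteprev)
{
  /* Last byte of a character: end of text, or the next byte starts one. */
  int is_last = (i == len - 1) || toy_first_byte_p (text[i + 1]);

  if (!is_last)
    return 0;
  if (toy_first_byte_p (text[i]))
    return 1;                      /* single-byte character */
  assert (i >= 1);                 /* a continuation byte has a predecessor */
  if (text[i - 1] != prev)
    return 0;
  if (toy_first_byte_p (prev))
    return 1;                      /* two-byte character, lead byte matches */
  assert (i >= 2);
  return text[i - 2] == anteprev;  /* three-byte character */
}
```

The point of the context check is that the same final byte can end characters from different charsets, and only those sharing the pattern's leading bytes should be translated.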
1788 if (i + direction == 0) 1788 if (i + direction == 0)
1789 { 1789 {
1790 cursor -= direction; 1790 cursor -= direction;
1791 1791
1792 { 1792 {
1793 Bytind bytstart = (pos + cursor - ptr2 + 1793 Bytebpos bytstart = (pos + cursor - ptr2 +
1794 ((direction > 0) 1794 ((direction > 0)
1795 ? 1 - len : 0)); 1795 ? 1 - len : 0));
1796 Bufpos bufstart = bytind_to_bufpos (buf, bytstart); 1796 Charbpos bufstart = bytebpos_to_charbpos (buf, bytstart);
1797 Bufpos bufend = bytind_to_bufpos (buf, bytstart + len); 1797 Charbpos bufend = bytebpos_to_charbpos (buf, bytstart + len);
1798 1798
1799 set_search_regs (buf, bufstart, bufend - bufstart); 1799 set_search_regs (buf, bufstart, bufend - bufstart);
1800 } 1800 }
1801 1801
1802 if ((n -= direction) != 0) 1802 if ((n -= direction) != 0)
1844 i = dirlen - direction; 1844 i = dirlen - direction;
1845 while ((i -= direction) + direction != 0) 1845 while ((i -= direction) + direction != 0)
1846 { 1846 {
1847 #ifdef MULE 1847 #ifdef MULE
1848 Emchar ch; 1848 Emchar ch;
1849 Bufbyte *ptr; 1849 Intbyte *ptr;
1850 #endif 1850 #endif
1851 pos -= direction; 1851 pos -= direction;
1852 #ifdef MULE 1852 #ifdef MULE
1853 ptr = BI_BUF_BYTE_ADDRESS (buf, pos); 1853 ptr = BI_BUF_BYTE_ADDRESS (buf, pos);
1854 if ((ptr == tail_end_ptr 1854 if ((ptr == tail_end_ptr
1855 || BUFBYTE_FIRST_BYTE_P (ptr[1])) 1855 || INTBYTE_FIRST_BYTE_P (ptr[1]))
1856 && (BUFBYTE_FIRST_BYTE_P (ptr[0]) 1856 && (INTBYTE_FIRST_BYTE_P (ptr[0])
1857 || (translate_prev_byte == ptr[-1] 1857 || (translate_prev_byte == ptr[-1]
1858 && (BUFBYTE_FIRST_BYTE_P (translate_prev_byte) 1858 && (INTBYTE_FIRST_BYTE_P (translate_prev_byte)
1859 || translate_anteprev_byte == ptr[-2])))) 1859 || translate_anteprev_byte == ptr[-2]))))
1860 ch = simple_translate[*ptr]; 1860 ch = simple_translate[*ptr];
1861 else 1861 else
1862 ch = *ptr; 1862 ch = *ptr;
1863 if (pat[i] != ch) 1863 if (pat[i] != ch)
1877 if (i + direction == 0) 1877 if (i + direction == 0)
1878 { 1878 {
1879 pos -= direction; 1879 pos -= direction;
1880 1880
1881 { 1881 {
1882 Bytind bytstart = (pos + 1882 Bytebpos bytstart = (pos +
1883 ((direction > 0) 1883 ((direction > 0)
1884 ? 1 - len : 0)); 1884 ? 1 - len : 0));
1885 Bufpos bufstart = bytind_to_bufpos (buf, bytstart); 1885 Charbpos bufstart = bytebpos_to_charbpos (buf, bytstart);
1886 Bufpos bufend = bytind_to_bufpos (buf, bytstart + len); 1886 Charbpos bufend = bytebpos_to_charbpos (buf, bytstart + len);
1887 1887
1888 set_search_regs (buf, bufstart, bufend - bufstart); 1888 set_search_regs (buf, bufstart, bufend - bufstart);
1889 } 1889 }
1890 1890
1891 if ((n -= direction) != 0) 1891 if ((n -= direction) != 0)
1900 } 1900 }
1901 /* We have done one clump. Can we continue? */ 1901 /* We have done one clump. Can we continue? */
1902 if ((lim - pos) * direction < 0) 1902 if ((lim - pos) * direction < 0)
1903 return (0 - n) * direction; 1903 return (0 - n) * direction;
1904 } 1904 }
1905 return bytind_to_bufpos (buf, pos); 1905 return bytebpos_to_charbpos (buf, pos);
1906 } 1906 }
1907 1907
1908 /* Record beginning BEG and end BEG + LEN 1908 /* Record beginning BEG and end BEG + LEN
1909 for a match just found in the current buffer. */ 1909 for a match just found in the current buffer. */
1910 1910
1911 static void 1911 static void
1912 set_search_regs (struct buffer *buf, Bufpos beg, Charcount len) 1912 set_search_regs (struct buffer *buf, Charbpos beg, Charcount len)
1913 { 1913 {
1914 /* This function has been Mule-ized. */ 1914 /* This function has been Mule-ized. */
1915 /* Make sure we have registers in which to store 1915 /* Make sure we have registers in which to store
1916 the match position. */ 1916 the match position. */
1917 if (search_regs.num_regs == 0) 1917 if (search_regs.num_regs == 0)
1955 if (!word_count) return build_string (""); 1955 if (!word_count) return build_string ("");
1956 1956
1957 { 1957 {
1958 /* The following value is an upper bound on the amount of storage we 1958 /* The following value is an upper bound on the amount of storage we
1959 need. In non-Mule, it is exact. */ 1959 need. In non-Mule, it is exact. */
1960 Bufbyte *storage = 1960 Intbyte *storage =
1961 (Bufbyte *) alloca (XSTRING_LENGTH (string) - punct_count + 1961 (Intbyte *) alloca (XSTRING_LENGTH (string) - punct_count +
1962 5 * (word_count - 1) + 4); 1962 5 * (word_count - 1) + 4);
1963 Bufbyte *o = storage; 1963 Intbyte *o = storage;
1964 1964
1965 *o++ = '\\'; 1965 *o++ = '\\';
1966 *o++ = 'b'; 1966 *o++ = 'b';
1967 1967
1968 for (i = 0; i < len; i++) 1968 for (i = 0; i < len; i++)
2257 (replacement, fixedcase, literal, string, strbuffer)) 2257 (replacement, fixedcase, literal, string, strbuffer))
2258 { 2258 {
2259 /* This function has been Mule-ized. */ 2259 /* This function has been Mule-ized. */
2260 /* This function can GC */ 2260 /* This function can GC */
2261 enum { nochange, all_caps, cap_initial } case_action; 2261 enum { nochange, all_caps, cap_initial } case_action;
2262 Bufpos pos, last; 2262 Charbpos pos, last;
2263 int some_multiletter_word; 2263 int some_multiletter_word;
2264 int some_lowercase; 2264 int some_lowercase;
2265 int some_uppercase; 2265 int some_uppercase;
2266 int some_nonuppercase_initial; 2266 int some_nonuppercase_initial;
2267 Emchar c, prevc; 2267 Emchar c, prevc;
2646 2646
2647 /* Now go through and make all the case changes that were requested 2647 /* Now go through and make all the case changes that were requested
2648 in the replacement string. */ 2648 in the replacement string. */
2649 if (ul_pos_dynarr) 2649 if (ul_pos_dynarr)
2650 { 2650 {
2651 Bufpos eend = BUF_PT (buf); 2651 Charbpos eend = BUF_PT (buf);
2652 int i = 0; 2652 int i = 0;
2653 int cur_action = 'E'; 2653 int cur_action = 'E';
2654 2654
2655 for (pos = BUF_PT (buf) - inslen; pos < eend; pos++) 2655 for (pos = BUF_PT (buf) - inslen; pos < eend; pos++)
2656 { 2656 {
2754 data = alloca_array (Lisp_Object, 2 * search_regs.num_regs); 2754 data = alloca_array (Lisp_Object, 2 * search_regs.num_regs);
2755 2755
2756 len = -1; 2756 len = -1;
2757 for (i = 0; i < search_regs.num_regs; i++) 2757 for (i = 0; i < search_regs.num_regs; i++)
2758 { 2758 {
2759 Bufpos start = search_regs.start[i]; 2759 Charbpos start = search_regs.start[i];
2760 if (start >= 0) 2760 if (start >= 0)
2761 { 2761 {
2762 if (EQ (last_thing_searched, Qt) 2762 if (EQ (last_thing_searched, Qt)
2763 || !NILP (integers)) 2763 || !NILP (integers))
2764 { 2764 {
2931 DEFUN ("regexp-quote", Fregexp_quote, 1, 1, 0, /* 2931 DEFUN ("regexp-quote", Fregexp_quote, 1, 1, 0, /*
2932 Return a regexp string which matches exactly STRING and nothing else. 2932 Return a regexp string which matches exactly STRING and nothing else.
2933 */ 2933 */
2934 (string)) 2934 (string))
2935 { 2935 {
2936 REGISTER Bufbyte *in, *out, *end; 2936 REGISTER Intbyte *in, *out, *end;
2937 REGISTER Bufbyte *temp; 2937 REGISTER Intbyte *temp;
2938 2938
2939 CHECK_STRING (string); 2939 CHECK_STRING (string);
2940 2940
2941 temp = (Bufbyte *) alloca (XSTRING_LENGTH (string) * 2); 2941 temp = (Intbyte *) alloca (XSTRING_LENGTH (string) * 2);
2942 2942
2943 /* Now copy the data into the new string, inserting escapes. */ 2943 /* Now copy the data into the new string, inserting escapes. */
2944 2944
2945 in = XSTRING_DATA (string); 2945 in = XSTRING_DATA (string);
2946 end = in + XSTRING_LENGTH (string); 2946 end = in + XSTRING_LENGTH (string);
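
Note on regexp-quote: the function allocates twice the string length because, in the worst case, every input byte needs a backslash in front of it. The copy loop itself is outside this hunk, so the sketch below is only an illustration of the idea. The set of escaped characters is an assumption (the metacharacters listed in trivial_regexp_p above, plus the backslash), and `toy_regexp_quote` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy IN (LEN bytes) to OUT, inserting a backslash before each regexp
   metacharacter.  OUT must have room for 2 * LEN bytes, the worst case
   in which every byte is escaped.  Returns the number of bytes written.
   The metacharacter set is an assumption, not taken from the elided code. */
static size_t
toy_regexp_quote (const char *in, size_t len, char *out)
{
  size_t o = 0;
  size_t i;

  assert (in != NULL && out != NULL);
  for (i = 0; i < len; i++)
    {
      /* strchr would also match the terminating NUL, so exclude it. */
      if (in[i] != '\0' && strchr (".*+?[^$\\", in[i]) != NULL)
        out[o++] = '\\';
      out[o++] = in[i];
    }
  return o;
}
```

Quoting `a.b*` (4 bytes) yields `a\.b\*` (6 bytes), which fits comfortably in the 8 bytes the 2x rule reserves.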