Mercurial > hg > xemacs-beta
comparison src/insdel.c @ 771:943eaba38521
[xemacs-hg @ 2002-03-13 08:51:24 by ben]
The big ben-mule-21-5 check-in!
Various files were added and deleted. See CHANGES-ben-mule.
There are still some test suite failures. No crashes, though.
Many of the failures have to do with problems in the test suite itself
rather than in the actual code. I'll be addressing these in the next
day or so -- none of the test suite failures are at all critical.
Meanwhile I'll be trying to address the biggest issues -- i.e. build
or run failures, which will almost certainly happen on various platforms.
All comments should be sent to ben@xemacs.org -- use a Cc: if necessary
when sending to mailing lists. There will be pre- and post- tags,
something like
pre-ben-mule-21-5-merge-in, and
post-ben-mule-21-5-merge-in.
author | ben |
---|---|
date | Wed, 13 Mar 2002 08:54:06 +0000 |
parents | 760db937b9ee |
children | e38acbeb1cae |
comparison
equal
deleted
inserted
replaced
770:336a418893b5 | 771:943eaba38521 |
---|---|
1 /* Buffer insertion/deletion and gap motion for XEmacs. | 1 /* Buffer insertion/deletion and gap motion for XEmacs. |
2 Copyright (C) 1985, 1986, 1991, 1992, 1993, 1994, 1995 | 2 Copyright (C) 1985, 1986, 1991, 1992, 1993, 1994, 1995 |
3 Free Software Foundation, Inc. | 3 Free Software Foundation, Inc. |
4 Copyright (C) 1995 Sun Microsystems, Inc. | 4 Copyright (C) 1995 Sun Microsystems, Inc. |
5 Copyright (C) 2001 Ben Wing. | |
5 | 6 |
6 This file is part of XEmacs. | 7 This file is part of XEmacs. |
7 | 8 |
8 XEmacs is free software; you can redistribute it and/or modify it | 9 XEmacs is free software; you can redistribute it and/or modify it |
9 under the terms of the GNU General Public License as published by the | 10 under the terms of the GNU General Public License as published by the |
23 /* Synched up with: Mule 2.0, FSF 19.30. Diverges significantly. */ | 24 /* Synched up with: Mule 2.0, FSF 19.30. Diverges significantly. */ |
24 | 25 |
25 /* This file has been Mule-ized. */ | 26 /* This file has been Mule-ized. */ |
26 | 27 |
27 /* Overhauled by Ben Wing, December 1994, for Mule implementation. */ | 28 /* Overhauled by Ben Wing, December 1994, for Mule implementation. */ |
28 | |
29 /* | |
30 There are three possible ways to specify positions in a buffer. All | |
31 of these are one-based: the beginning of the buffer is position or | |
32 index 1, and 0 is not a valid position. | |
33 | |
34 As a "buffer position" (typedef Charbpos): | |
35 | |
36 This is an index specifying an offset in characters from the | |
37 beginning of the buffer. Note that buffer positions are | |
38 logically *between* characters, not on a character. The | |
39 difference between two buffer positions specifies the number of | |
40 characters between those positions. Buffer positions are the | |
41 only kind of position externally visible to the user. | |
42 | |
43 As a "byte index" (typedef Bytebpos): | |
44 | |
45 This is an index over the bytes used to represent the characters | |
46 in the buffer. If there is no Mule support, this is identical | |
47 to a buffer position, because each character is represented | |
48 using one byte. However, with Mule support, many characters | |
49 require two or more bytes for their representation, and so a | |
50 byte index may be greater than the corresponding buffer | |
51 position. | |
52 | |
53 As a "memory index" (typedef Membpos): | |
54 | |
55 This is the byte index adjusted for the gap. For positions | |
56 before the gap, this is identical to the byte index. For | |
57 positions after the gap, this is the byte index plus the gap | |
58 size. There are two possible memory indices for the gap | |
59 position; the memory index at the beginning of the gap should | |
60 always be used, except in code that deals with manipulating the | |
61 gap, where both indices may be seen. The address of the | |
62 character "at" (i.e. following) a particular position can be | |
63 obtained from the formula | |
64 | |
65 buffer_start_address + memory_index(position) - 1 | |
66 | |
67 except in the case of characters at the gap position. | |
68 | |
69 Other typedefs: | |
70 =============== | |
71 | |
72 Emchar: | |
73 ------- | |
74 This typedef represents a single Emacs character, which can be | |
75 ASCII, ISO-8859, or some extended character, as would typically | |
76 be used for Kanji. Note that the representation of a character | |
77 as an Emchar is *not* the same as the representation of that | |
78 same character in a string; thus, you cannot do the standard | |
79 C trick of passing a pointer to a character to a function that | |
80 expects a string. | |
81 | |
82 An Emchar takes up 19 bits of representation and (for code | |
83 compatibility and such) is compatible with an int. This | |
84 representation is visible on the Lisp level. The important | |
85 characteristics of the Emchar representation are | |
86 | |
87 -- values 0x00 - 0x7f represent ASCII. | |
88 -- values 0x80 - 0xff represent the right half of ISO-8859-1. | |
89 -- values 0x100 and up represent all other characters. | |
90 | |
91 This means that Emchar values are upwardly compatible with | |
92 the standard 8-bit representation of ASCII/ISO-8859-1. | |
93 | |
94 Intbyte: | |
95 -------- | |
96 The data in a buffer or string is logically made up of Intbyte | |
97 objects, where a Intbyte takes up the same amount of space as a | |
98 char. (It is declared differently, though, to catch invalid | |
99 usages.) Strings stored using Intbytes are said to be in | |
100 "internal format". The important characteristics of internal | |
101 format are | |
102 | |
103 -- ASCII characters are represented as a single Intbyte, | |
104 in the range 0 - 0x7f. | |
105 -- All other characters are represented as a Intbyte in | |
106 the range 0x80 - 0x9f followed by one or more Intbytes | |
107 in the range 0xa0 to 0xff. | |
108 | |
109 This leads to a number of desirable properties: | |
110 | |
111 -- Given the position of the beginning of a character, | |
112 you can find the beginning of the next or previous | |
113 character in constant time. | |
114 -- When searching for a substring or an ASCII character | |
115 within the string, you need merely use standard | |
116 searching routines. | |
117 | |
118 array of char: | |
119 -------------- | |
120 Strings that go in or out of Emacs are in "external format", | |
121 typedef'ed as an array of char or a char *. There is more | |
122 than one external format (JIS, EUC, etc.) but they all | |
123 have similar properties. They are modal encodings, | |
124 which is to say that the meaning of particular bytes is | |
125 not fixed but depends on what "mode" the string is currently | |
126 in (e.g. bytes in the range 0 - 0x7f might be | |
127 interpreted as ASCII, or as Hiragana, or as 2-byte Kanji, | |
128 depending on the current mode). The mode starts out in | |
129 ASCII/ISO-8859-1 and is switched using escape sequences -- | |
130 for example, in the JIS encoding, 'ESC $ B' switches to a | |
131 mode where pairs of bytes in the range 0 - 0x7f | |
132 are interpreted as Kanji characters. | |
133 | |
134 External-formatted data is generally desirable for passing | |
135 data between programs because it is upwardly compatible | |
136 with standard ASCII/ISO-8859-1 strings and may require | |
137 less space than internal encodings such as the one | |
138 described above. In addition, some encodings (e.g. JIS) | |
139 keep all characters (except the ESC used to switch modes) | |
140 in the printing ASCII range 0x20 - 0x7e, which results in | |
141 a much higher probability that the data will avoid being | |
142 garbled in transmission. Externally-formatted data is | |
143 generally not very convenient to work with, however, and | |
144 for this reason is usually converted to internal format | |
145 before any work is done on the string. | |
146 | |
147 NOTE: filenames need to be in external format so that | |
148 ISO-8859-1 characters come out correctly. | |
149 | |
150 Charcount: | |
151 ---------- | |
152 This typedef represents a count of characters, such as | |
153 a character offset into a string or the number of | |
154 characters between two positions in a buffer. The | |
155 difference between two Charbpos's is a Charcount, and | |
156 character positions in a string are represented using | |
157 a Charcount. | |
158 | |
159 Bytecount: | |
160 ---------- | |
161 Similar to a Charcount but represents a count of bytes. | |
162 The difference between two Bytebpos's is a Bytecount. | |
163 | |
164 | |
165 Usage of the various representations: | |
166 ===================================== | |
167 | |
168 Memory indices are used in low-level functions in insdel.c and for | |
169 extent endpoints and marker positions. The reason for this is that | |
170 this way, the extents and markers don't need to be updated for most | |
171 insertions, which merely shrink the gap and don't move any | |
172 characters around in memory. | |
173 | |
174 (The beginning-of-gap memory index simplifies insertions w.r.t. | |
175 markers, because text usually gets inserted after markers. For | |
176 extents, it is merely for consistency, because text can get | |
177 inserted either before or after an extent's endpoint depending on | |
178 the open/closedness of the endpoint.) | |
179 | |
180 Byte indices are used in other code that needs to be fast, | |
181 such as the searching, redisplay, and extent-manipulation code. | |
182 | |
183 Buffer positions are used in all other code. This is because this | |
184 representation is easiest to work with (especially since Lisp | |
185 code always uses buffer positions), necessitates the fewest | |
186 changes to existing code, and is the safest (e.g. if the text gets | |
187 shifted underneath a buffer position, it will still point to a | |
188 character; if text is shifted under a byte index, it might point | |
189 to the middle of a character, which would be bad). | |
190 | |
191 Similarly, Charcounts are used in all code that deals with strings | |
192 except for code that needs to be fast, which used Bytecounts. | |
193 | |
194 Strings are always passed around internally using internal format. | |
195 Conversions between external format are performed at the time | |
196 that the data goes in or out of Emacs. | |
197 | |
198 Working with the various representations: | |
199 ========================================= */ | |
200 | 29 |
201 #include <config.h> | 30 #include <config.h> |
202 #include "lisp.h" | 31 #include "lisp.h" |
203 | 32 |
204 #include "buffer.h" | 33 #include "buffer.h" |
207 #include "extents.h" | 36 #include "extents.h" |
208 #include "insdel.h" | 37 #include "insdel.h" |
209 #include "lstream.h" | 38 #include "lstream.h" |
210 #include "redisplay.h" | 39 #include "redisplay.h" |
211 #include "line-number.h" | 40 #include "line-number.h" |
212 | |
213 /* We write things this way because it's very important the | |
214 MAX_BYTEBPOS_GAP_SIZE_3 is a multiple of 3. (As it happens, | |
215 65535 is a multiple of 3, but this may not always be the | |
216 case.) */ | |
217 | |
218 #define MAX_CHARBPOS_GAP_SIZE_3 (65535/3) | |
219 #define MAX_BYTEBPOS_GAP_SIZE_3 (3 * MAX_CHARBPOS_GAP_SIZE_3) | |
220 | |
221 short three_to_one_table[1 + MAX_BYTEBPOS_GAP_SIZE_3]; | |
222 | 41 |
223 /* Various macros modelled along the lines of those in buffer.h. | 42 /* Various macros modelled along the lines of those in buffer.h. |
224 Purposefully omitted from buffer.h because files other than this | 43 Purposefully omitted from buffer.h because files other than this |
225 one should not be using them. */ | 44 one should not be using them. */ |
226 | 45 |
284 # define SET_END_SENTINEL(buf) | 103 # define SET_END_SENTINEL(buf) |
285 #endif | 104 #endif |
286 | 105 |
287 | 106 |
288 /************************************************************************/ | 107 /************************************************************************/ |
289 /* Charcount/Bytecount conversion */ | |
290 /************************************************************************/ | |
291 | |
292 /* Optimization. Do it. Live it. Love it. */ | |
293 | |
294 #ifdef MULE | |
295 | |
296 /* We include the basic functions here that require no specific | |
297 knowledge of how data is Mule-encoded into a buffer other | |
298 than the basic (00 - 7F), (80 - 9F), (A0 - FF) scheme. | |
299 Anything that requires more specific knowledge goes into | |
300 mule-charset.c. */ | |
301 | |
302 /* Given a pointer to a text string and a length in bytes, return | |
303 the equivalent length in characters. */ | |
304 | |
305 Charcount | |
306 bytecount_to_charcount (const Intbyte *ptr, Bytecount len) | |
307 { | |
308 Charcount count = 0; | |
309 const Intbyte *end = ptr + len; | |
310 | |
311 #if SIZEOF_LONG == 8 | |
312 # define STRIDE_TYPE long | |
313 # define HIGH_BIT_MASK 0x8080808080808080UL | |
314 #elif SIZEOF_LONG_LONG == 8 && !(defined (i386) || defined (__i386__)) | |
315 # define STRIDE_TYPE long long | |
316 # define HIGH_BIT_MASK 0x8080808080808080ULL | |
317 #elif SIZEOF_LONG == 4 | |
318 # define STRIDE_TYPE long | |
319 # define HIGH_BIT_MASK 0x80808080UL | |
320 #else | |
321 # error Add support for 128-bit systems here | |
322 #endif | |
323 | |
324 #define ALIGN_BITS ((EMACS_UINT) (ALIGNOF (STRIDE_TYPE) - 1)) | |
325 #define ALIGN_MASK (~ ALIGN_BITS) | |
326 #define ALIGNED(ptr) ((((EMACS_UINT) ptr) & ALIGN_BITS) == 0) | |
327 #define STRIDE sizeof (STRIDE_TYPE) | |
328 | |
329 while (ptr < end) | |
330 { | |
331 if (BYTE_ASCII_P (*ptr)) | |
332 { | |
333 /* optimize for long stretches of ASCII */ | |
334 if (! ALIGNED (ptr)) | |
335 ptr++, count++; | |
336 else | |
337 { | |
338 const unsigned STRIDE_TYPE *ascii_end = | |
339 (const unsigned STRIDE_TYPE *) ptr; | |
340 /* This loop screams, because we can typically | |
341 detect ASCII characters 8 at a time. */ | |
342 while ((const Intbyte *) ascii_end + STRIDE <= end | |
343 && !(*ascii_end & HIGH_BIT_MASK)) | |
344 ascii_end++; | |
345 if ((Intbyte *) ascii_end == ptr) | |
346 ptr++, count++; | |
347 else | |
348 { | |
349 count += (Intbyte *) ascii_end - ptr; | |
350 ptr = (Intbyte *) ascii_end; | |
351 } | |
352 } | |
353 } | |
354 else | |
355 { | |
356 /* optimize for successive characters from the same charset */ | |
357 Intbyte leading_byte = *ptr; | |
358 Bytecount bytes = REP_BYTES_BY_FIRST_BYTE (leading_byte); | |
359 while ((ptr < end) && (*ptr == leading_byte)) | |
360 ptr += bytes, count++; | |
361 } | |
362 } | |
363 | |
364 #ifdef ERROR_CHECK_CHARBPOS | |
365 /* Bomb out if the specified substring ends in the middle | |
366 of a character. Note that we might have already gotten | |
367 a core dump above from an invalid reference, but at least | |
368 we will get no farther than here. */ | |
369 assert (ptr == end); | |
370 #endif | |
371 | |
372 return count; | |
373 } | |
374 | |
375 /* Given a pointer to a text string and a length in characters, return | |
376 the equivalent length in bytes. */ | |
377 | |
378 Bytecount | |
379 charcount_to_bytecount (const Intbyte *ptr, Charcount len) | |
380 { | |
381 const Intbyte *newptr = ptr; | |
382 | |
383 while (len > 0) | |
384 { | |
385 INC_CHARPTR (newptr); | |
386 len--; | |
387 } | |
388 return newptr - ptr; | |
389 } | |
390 | |
391 /* The next two functions are the actual meat behind the | |
392 charbpos-to-bytebpos and bytebpos-to-charbpos conversions. Currently | |
393 the method they use is fairly unsophisticated; see buffer.h. | |
394 | |
395 Note that charbpos_to_bytebpos_func() is probably the most-called | |
396 function in all of XEmacs. Therefore, it must be FAST FAST FAST. | |
397 This is the reason why so much of the code is duplicated. | |
398 | |
399 Similar considerations apply to bytebpos_to_charbpos_func(), although | |
400 less so because the function is not called so often. | |
401 | |
402 #### At some point this should use a more sophisticated method; | |
403 see buffer.h. */ | |
404 | |
405 static int not_very_random_number; | |
406 | |
407 Bytebpos | |
408 charbpos_to_bytebpos_func (struct buffer *buf, Charbpos x) | |
409 { | |
410 Charbpos bufmin; | |
411 Charbpos bufmax; | |
412 Bytebpos bytmin; | |
413 Bytebpos bytmax; | |
414 int size; | |
415 int forward_p; | |
416 Bytebpos retval; | |
417 int diff_so_far; | |
418 int add_to_cache = 0; | |
419 | |
420 /* Check for some cached positions, for speed. */ | |
421 if (x == BUF_PT (buf)) | |
422 return BI_BUF_PT (buf); | |
423 if (x == BUF_ZV (buf)) | |
424 return BI_BUF_ZV (buf); | |
425 if (x == BUF_BEGV (buf)) | |
426 return BI_BUF_BEGV (buf); | |
427 | |
428 bufmin = buf->text->mule_bufmin; | |
429 bufmax = buf->text->mule_bufmax; | |
430 bytmin = buf->text->mule_bytmin; | |
431 bytmax = buf->text->mule_bytmax; | |
432 size = (1 << buf->text->mule_shifter) + !!buf->text->mule_three_p; | |
433 | |
434 /* The basic idea here is that we shift the "known region" up or down | |
435 until it overlaps the specified position. We do this by moving | |
436 the upper bound of the known region up one character at a time, | |
437 and moving the lower bound of the known region up as necessary | |
438 when the size of the character just seen changes. | |
439 | |
440 We optimize this, however, by first shifting the known region to | |
441 one of the cached points if it's close by. (We don't check BEG or | |
442 Z, even though they're cached; most of the time these will be the | |
443 same as BEGV and ZV, and when they're not, they're not likely | |
444 to be used.) */ | |
445 | |
446 if (x > bufmax) | |
447 { | |
448 Charbpos diffmax = x - bufmax; | |
449 Charbpos diffpt = x - BUF_PT (buf); | |
450 Charbpos diffzv = BUF_ZV (buf) - x; | |
451 /* #### This value could stand some more exploration. */ | |
452 Charcount heuristic_hack = (bufmax - bufmin) >> 2; | |
453 | |
454 /* Check if the position is closer to PT or ZV than to the | |
455 end of the known region. */ | |
456 | |
457 if (diffpt < 0) | |
458 diffpt = -diffpt; | |
459 if (diffzv < 0) | |
460 diffzv = -diffzv; | |
461 | |
462 /* But also implement a heuristic that favors the known region | |
463 over PT or ZV. The reason for this is that switching to | |
464 PT or ZV will wipe out the knowledge in the known region, | |
465 which might be annoying if the known region is large and | |
466 PT or ZV is not that much closer than the end of the known | |
467 region. */ | |
468 | |
469 diffzv += heuristic_hack; | |
470 diffpt += heuristic_hack; | |
471 if (diffpt < diffmax && diffpt <= diffzv) | |
472 { | |
473 bufmax = bufmin = BUF_PT (buf); | |
474 bytmax = bytmin = BI_BUF_PT (buf); | |
475 /* We set the size to 1 even though it doesn't really | |
476 matter because the new known region contains no | |
477 characters. We do this because this is the most | |
478 likely size of the characters around the new known | |
479 region, and we avoid potential yuckiness that is | |
480 done when size == 3. */ | |
481 size = 1; | |
482 } | |
483 if (diffzv < diffmax) | |
484 { | |
485 bufmax = bufmin = BUF_ZV (buf); | |
486 bytmax = bytmin = BI_BUF_ZV (buf); | |
487 size = 1; | |
488 } | |
489 } | |
490 #ifdef ERROR_CHECK_CHARBPOS | |
491 else if (x >= bufmin) | |
492 abort (); | |
493 #endif | |
494 else | |
495 { | |
496 Charbpos diffmin = bufmin - x; | |
497 Charbpos diffpt = BUF_PT (buf) - x; | |
498 Charbpos diffbegv = x - BUF_BEGV (buf); | |
499 /* #### This value could stand some more exploration. */ | |
500 Charcount heuristic_hack = (bufmax - bufmin) >> 2; | |
501 | |
502 if (diffpt < 0) | |
503 diffpt = -diffpt; | |
504 if (diffbegv < 0) | |
505 diffbegv = -diffbegv; | |
506 | |
507 /* But also implement a heuristic that favors the known region -- | |
508 see above. */ | |
509 | |
510 diffbegv += heuristic_hack; | |
511 diffpt += heuristic_hack; | |
512 | |
513 if (diffpt < diffmin && diffpt <= diffbegv) | |
514 { | |
515 bufmax = bufmin = BUF_PT (buf); | |
516 bytmax = bytmin = BI_BUF_PT (buf); | |
517 /* We set the size to 1 even though it doesn't really | |
518 matter because the new known region contains no | |
519 characters. We do this because this is the most | |
520 likely size of the characters around the new known | |
521 region, and we avoid potential yuckiness that is | |
522 done when size == 3. */ | |
523 size = 1; | |
524 } | |
525 if (diffbegv < diffmin) | |
526 { | |
527 bufmax = bufmin = BUF_BEGV (buf); | |
528 bytmax = bytmin = BI_BUF_BEGV (buf); | |
529 size = 1; | |
530 } | |
531 } | |
532 | |
533 diff_so_far = x > bufmax ? x - bufmax : bufmin - x; | |
534 if (diff_so_far > 50) | |
535 { | |
536 /* If we have to move more than a certain amount, then look | |
537 into our cache. */ | |
538 int minval = INT_MAX; | |
539 int found = 0; | |
540 int i; | |
541 | |
542 add_to_cache = 1; | |
543 /* I considered keeping the positions ordered. This would speed | |
544 up this loop, but updating the cache would take longer, so | |
545 it doesn't seem like it would really matter. */ | |
546 for (i = 0; i < 16; i++) | |
547 { | |
548 int diff = buf->text->mule_charbpos_cache[i] - x; | |
549 | |
550 if (diff < 0) | |
551 diff = -diff; | |
552 if (diff < minval) | |
553 { | |
554 minval = diff; | |
555 found = i; | |
556 } | |
557 } | |
558 | |
559 if (minval < diff_so_far) | |
560 { | |
561 bufmax = bufmin = buf->text->mule_charbpos_cache[found]; | |
562 bytmax = bytmin = buf->text->mule_bytebpos_cache[found]; | |
563 size = 1; | |
564 } | |
565 } | |
566 | |
567 /* It's conceivable that the caching above could lead to X being | |
568 the same as one of the range edges. */ | |
569 if (x >= bufmax) | |
570 { | |
571 Bytebpos newmax; | |
572 Bytecount newsize; | |
573 | |
574 forward_p = 1; | |
575 while (x > bufmax) | |
576 { | |
577 newmax = bytmax; | |
578 | |
579 INC_BYTEBPOS (buf, newmax); | |
580 newsize = newmax - bytmax; | |
581 if (newsize != size) | |
582 { | |
583 bufmin = bufmax; | |
584 bytmin = bytmax; | |
585 size = newsize; | |
586 } | |
587 bytmax = newmax; | |
588 bufmax++; | |
589 } | |
590 retval = bytmax; | |
591 | |
592 /* #### Should go past the found location to reduce the number | |
593 of times that this function is called */ | |
594 } | |
595 else /* x < bufmin */ | |
596 { | |
597 Bytebpos newmin; | |
598 Bytecount newsize; | |
599 | |
600 forward_p = 0; | |
601 while (x < bufmin) | |
602 { | |
603 newmin = bytmin; | |
604 | |
605 DEC_BYTEBPOS (buf, newmin); | |
606 newsize = bytmin - newmin; | |
607 if (newsize != size) | |
608 { | |
609 bufmax = bufmin; | |
610 bytmax = bytmin; | |
611 size = newsize; | |
612 } | |
613 bytmin = newmin; | |
614 bufmin--; | |
615 } | |
616 retval = bytmin; | |
617 | |
618 /* #### Should go past the found location to reduce the number | |
619 of times that this function is called | |
620 */ | |
621 } | |
622 | |
623 /* If size is three, than we have to max sure that the range we | |
624 discovered isn't too large, because we use a fixed-length | |
625 table to divide by 3. */ | |
626 | |
627 if (size == 3) | |
628 { | |
629 int gap = bytmax - bytmin; | |
630 buf->text->mule_three_p = 1; | |
631 buf->text->mule_shifter = 1; | |
632 | |
633 if (gap > MAX_BYTEBPOS_GAP_SIZE_3) | |
634 { | |
635 if (forward_p) | |
636 { | |
637 bytmin = bytmax - MAX_BYTEBPOS_GAP_SIZE_3; | |
638 bufmin = bufmax - MAX_CHARBPOS_GAP_SIZE_3; | |
639 } | |
640 else | |
641 { | |
642 bytmax = bytmin + MAX_BYTEBPOS_GAP_SIZE_3; | |
643 bufmax = bufmin + MAX_CHARBPOS_GAP_SIZE_3; | |
644 } | |
645 } | |
646 } | |
647 else | |
648 { | |
649 buf->text->mule_three_p = 0; | |
650 if (size == 4) | |
651 buf->text->mule_shifter = 2; | |
652 else | |
653 buf->text->mule_shifter = size - 1; | |
654 } | |
655 | |
656 buf->text->mule_bufmin = bufmin; | |
657 buf->text->mule_bufmax = bufmax; | |
658 buf->text->mule_bytmin = bytmin; | |
659 buf->text->mule_bytmax = bytmax; | |
660 | |
661 if (add_to_cache) | |
662 { | |
663 int replace_loc; | |
664 | |
665 /* We throw away a "random" cached value and replace it with | |
666 the new value. It doesn't actually have to be very random | |
667 at all, just evenly distributed. | |
668 | |
669 #### It would be better to use a least-recently-used algorithm | |
670 or something that tries to space things out, but I'm not sure | |
671 it's worth it to go to the trouble of maintaining that. */ | |
672 not_very_random_number += 621; | |
673 replace_loc = not_very_random_number & 15; | |
674 buf->text->mule_charbpos_cache[replace_loc] = x; | |
675 buf->text->mule_bytebpos_cache[replace_loc] = retval; | |
676 } | |
677 | |
678 return retval; | |
679 } | |
680 | |
681 /* The logic in this function is almost identical to the logic in | |
682 the previous function. */ | |
683 | |
684 Charbpos | |
685 bytebpos_to_charbpos_func (struct buffer *buf, Bytebpos x) | |
686 { | |
687 Charbpos bufmin; | |
688 Charbpos bufmax; | |
689 Bytebpos bytmin; | |
690 Bytebpos bytmax; | |
691 int size; | |
692 int forward_p; | |
693 Charbpos retval; | |
694 int diff_so_far; | |
695 int add_to_cache = 0; | |
696 | |
697 /* Check for some cached positions, for speed. */ | |
698 if (x == BI_BUF_PT (buf)) | |
699 return BUF_PT (buf); | |
700 if (x == BI_BUF_ZV (buf)) | |
701 return BUF_ZV (buf); | |
702 if (x == BI_BUF_BEGV (buf)) | |
703 return BUF_BEGV (buf); | |
704 | |
705 bufmin = buf->text->mule_bufmin; | |
706 bufmax = buf->text->mule_bufmax; | |
707 bytmin = buf->text->mule_bytmin; | |
708 bytmax = buf->text->mule_bytmax; | |
709 size = (1 << buf->text->mule_shifter) + !!buf->text->mule_three_p; | |
710 | |
711 /* The basic idea here is that we shift the "known region" up or down | |
712 until it overlaps the specified position. We do this by moving | |
713 the upper bound of the known region up one character at a time, | |
714 and moving the lower bound of the known region up as necessary | |
715 when the size of the character just seen changes. | |
716 | |
717 We optimize this, however, by first shifting the known region to | |
718 one of the cached points if it's close by. (We don't check BI_BEG or | |
719 BI_Z, even though they're cached; most of the time these will be the | |
720 same as BI_BEGV and BI_ZV, and when they're not, they're not likely | |
721 to be used.) */ | |
722 | |
723 if (x > bytmax) | |
724 { | |
725 Bytebpos diffmax = x - bytmax; | |
726 Bytebpos diffpt = x - BI_BUF_PT (buf); | |
727 Bytebpos diffzv = BI_BUF_ZV (buf) - x; | |
728 /* #### This value could stand some more exploration. */ | |
729 Bytecount heuristic_hack = (bytmax - bytmin) >> 2; | |
730 | |
731 /* Check if the position is closer to PT or ZV than to the | |
732 end of the known region. */ | |
733 | |
734 if (diffpt < 0) | |
735 diffpt = -diffpt; | |
736 if (diffzv < 0) | |
737 diffzv = -diffzv; | |
738 | |
739 /* But also implement a heuristic that favors the known region | |
740 over BI_PT or BI_ZV. The reason for this is that switching to | |
741 BI_PT or BI_ZV will wipe out the knowledge in the known region, | |
742 which might be annoying if the known region is large and | |
743 BI_PT or BI_ZV is not that much closer than the end of the known | |
744 region. */ | |
745 | |
746 diffzv += heuristic_hack; | |
747 diffpt += heuristic_hack; | |
748 if (diffpt < diffmax && diffpt <= diffzv) | |
749 { | |
750 bufmax = bufmin = BUF_PT (buf); | |
751 bytmax = bytmin = BI_BUF_PT (buf); | |
752 /* We set the size to 1 even though it doesn't really | |
753 matter because the new known region contains no | |
754 characters. We do this because this is the most | |
755 likely size of the characters around the new known | |
756 region, and we avoid potential yuckiness that is | |
757 done when size == 3. */ | |
758 size = 1; | |
759 } | |
760 if (diffzv < diffmax) | |
761 { | |
762 bufmax = bufmin = BUF_ZV (buf); | |
763 bytmax = bytmin = BI_BUF_ZV (buf); | |
764 size = 1; | |
765 } | |
766 } | |
767 #ifdef ERROR_CHECK_CHARBPOS | |
768 else if (x >= bytmin) | |
769 abort (); | |
770 #endif | |
771 else | |
772 { | |
773 Bytebpos diffmin = bytmin - x; | |
774 Bytebpos diffpt = BI_BUF_PT (buf) - x; | |
775 Bytebpos diffbegv = x - BI_BUF_BEGV (buf); | |
776 /* #### This value could stand some more exploration. */ | |
777 Bytecount heuristic_hack = (bytmax - bytmin) >> 2; | |
778 | |
779 if (diffpt < 0) | |
780 diffpt = -diffpt; | |
781 if (diffbegv < 0) | |
782 diffbegv = -diffbegv; | |
783 | |
784 /* But also implement a heuristic that favors the known region -- | |
785 see above. */ | |
786 | |
787 diffbegv += heuristic_hack; | |
788 diffpt += heuristic_hack; | |
789 | |
790 if (diffpt < diffmin && diffpt <= diffbegv) | |
791 { | |
792 bufmax = bufmin = BUF_PT (buf); | |
793 bytmax = bytmin = BI_BUF_PT (buf); | |
794 /* We set the size to 1 even though it doesn't really | |
795 matter because the new known region contains no | |
796 characters. We do this because this is the most | |
797 likely size of the characters around the new known | |
798 region, and we avoid potential yuckiness that is | |
799 done when size == 3. */ | |
800 size = 1; | |
801 } | |
802 if (diffbegv < diffmin) | |
803 { | |
804 bufmax = bufmin = BUF_BEGV (buf); | |
805 bytmax = bytmin = BI_BUF_BEGV (buf); | |
806 size = 1; | |
807 } | |
808 } | |
809 | |
810 diff_so_far = x > bytmax ? x - bytmax : bytmin - x; | |
811 if (diff_so_far > 50) | |
812 { | |
813 /* If we have to move more than a certain amount, then look | |
814 into our cache. */ | |
815 int minval = INT_MAX; | |
816 int found = 0; | |
817 int i; | |
818 | |
819 add_to_cache = 1; | |
820 /* I considered keeping the positions ordered. This would speed | |
821 up this loop, but updating the cache would take longer, so | |
822 it doesn't seem like it would really matter. */ | |
823 for (i = 0; i < 16; i++) | |
824 { | |
825 int diff = buf->text->mule_bytebpos_cache[i] - x; | |
826 | |
827 if (diff < 0) | |
828 diff = -diff; | |
829 if (diff < minval) | |
830 { | |
831 minval = diff; | |
832 found = i; | |
833 } | |
834 } | |
835 | |
836 if (minval < diff_so_far) | |
837 { | |
838 bufmax = bufmin = buf->text->mule_charbpos_cache[found]; | |
839 bytmax = bytmin = buf->text->mule_bytebpos_cache[found]; | |
840 size = 1; | |
841 } | |
842 } | |
843 | |
844 /* It's conceivable that the caching above could lead to X being | |
845 the same as one of the range edges. */ | |
846 if (x >= bytmax) | |
847 { | |
848 Bytebpos newmax; | |
849 Bytecount newsize; | |
850 | |
851 forward_p = 1; | |
852 while (x > bytmax) | |
853 { | |
854 newmax = bytmax; | |
855 | |
856 INC_BYTEBPOS (buf, newmax); | |
857 newsize = newmax - bytmax; | |
858 if (newsize != size) | |
859 { | |
860 bufmin = bufmax; | |
861 bytmin = bytmax; | |
862 size = newsize; | |
863 } | |
864 bytmax = newmax; | |
865 bufmax++; | |
866 } | |
867 retval = bufmax; | |
868 | |
869 /* #### Should go past the found location to reduce the number | |
870 of times that this function is called */ | |
871 } | |
872 else /* x <= bytmin */ | |
873 { | |
874 Bytebpos newmin; | |
875 Bytecount newsize; | |
876 | |
877 forward_p = 0; | |
878 while (x < bytmin) | |
879 { | |
880 newmin = bytmin; | |
881 | |
882 DEC_BYTEBPOS (buf, newmin); | |
883 newsize = bytmin - newmin; | |
884 if (newsize != size) | |
885 { | |
886 bufmax = bufmin; | |
887 bytmax = bytmin; | |
888 size = newsize; | |
889 } | |
890 bytmin = newmin; | |
891 bufmin--; | |
892 } | |
893 retval = bufmin; | |
894 | |
895 /* #### Should go past the found location to reduce the number | |
896 of times that this function is called | |
897 */ | |
898 } | |
899 | |
900 /* If size is three, than we have to max sure that the range we | |
901 discovered isn't too large, because we use a fixed-length | |
902 table to divide by 3. */ | |
903 | |
904 if (size == 3) | |
905 { | |
906 int gap = bytmax - bytmin; | |
907 buf->text->mule_three_p = 1; | |
908 buf->text->mule_shifter = 1; | |
909 | |
910 if (gap > MAX_BYTEBPOS_GAP_SIZE_3) | |
911 { | |
912 if (forward_p) | |
913 { | |
914 bytmin = bytmax - MAX_BYTEBPOS_GAP_SIZE_3; | |
915 bufmin = bufmax - MAX_CHARBPOS_GAP_SIZE_3; | |
916 } | |
917 else | |
918 { | |
919 bytmax = bytmin + MAX_BYTEBPOS_GAP_SIZE_3; | |
920 bufmax = bufmin + MAX_CHARBPOS_GAP_SIZE_3; | |
921 } | |
922 } | |
923 } | |
924 else | |
925 { | |
926 buf->text->mule_three_p = 0; | |
927 if (size == 4) | |
928 buf->text->mule_shifter = 2; | |
929 else | |
930 buf->text->mule_shifter = size - 1; | |
931 } | |
932 | |
933 buf->text->mule_bufmin = bufmin; | |
934 buf->text->mule_bufmax = bufmax; | |
935 buf->text->mule_bytmin = bytmin; | |
936 buf->text->mule_bytmax = bytmax; | |
937 | |
938 if (add_to_cache) | |
939 { | |
940 int replace_loc; | |
941 | |
942 /* We throw away a "random" cached value and replace it with | |
943 the new value. It doesn't actually have to be very random | |
944 at all, just evenly distributed. | |
945 | |
946 #### It would be better to use a least-recently-used algorithm | |
947 or something that tries to space things out, but I'm not sure | |
948 it's worth it to go to the trouble of maintaining that. */ | |
949 not_very_random_number += 621; | |
950 replace_loc = not_very_random_number & 15; | |
951 buf->text->mule_charbpos_cache[replace_loc] = retval; | |
952 buf->text->mule_bytebpos_cache[replace_loc] = x; | |
953 } | |
954 | |
955 return retval; | |
956 } | |
957 | |
958 /* Text of length BYTELENGTH and CHARLENGTH (in different units) | |
959 was inserted at charbpos START. */ | |
960 | |
961 static void | |
962 buffer_mule_signal_inserted_region (struct buffer *buf, Charbpos start, | |
963 Bytecount bytelength, | |
964 Charcount charlength) | |
965 { | |
966 int size = (1 << buf->text->mule_shifter) + !!buf->text->mule_three_p; | |
967 int i; | |
968 | |
969 /* Adjust the cache of known positions. */ | |
970 for (i = 0; i < 16; i++) | |
971 { | |
972 | |
973 if (buf->text->mule_charbpos_cache[i] > start) | |
974 { | |
975 buf->text->mule_charbpos_cache[i] += charlength; | |
976 buf->text->mule_bytebpos_cache[i] += bytelength; | |
977 } | |
978 } | |
979 | |
980 if (start >= buf->text->mule_bufmax) | |
981 return; | |
982 | |
983 /* The insertion is either before the known region, in which case | |
984 it shoves it forward; or within the known region, in which case | |
985 it shoves the end forward. (But it may make the known region | |
986 inconsistent, so we may have to shorten it.) */ | |
987 | |
988 if (start <= buf->text->mule_bufmin) | |
989 { | |
990 buf->text->mule_bufmin += charlength; | |
991 buf->text->mule_bufmax += charlength; | |
992 buf->text->mule_bytmin += bytelength; | |
993 buf->text->mule_bytmax += bytelength; | |
994 } | |
995 else | |
996 { | |
997 Charbpos end = start + charlength; | |
998 /* the insertion point divides the known region in two. | |
999 Keep the longer half, at least, and expand into the | |
1000 inserted chunk as much as possible. */ | |
1001 | |
1002 if (start - buf->text->mule_bufmin > buf->text->mule_bufmax - start) | |
1003 { | |
1004 Bytebpos bytestart = (buf->text->mule_bytmin | |
1005 + size * (start - buf->text->mule_bufmin)); | |
1006 Bytebpos bytenew; | |
1007 | |
1008 while (start < end) | |
1009 { | |
1010 bytenew = bytestart; | |
1011 INC_BYTEBPOS (buf, bytenew); | |
1012 if (bytenew - bytestart != size) | |
1013 break; | |
1014 start++; | |
1015 bytestart = bytenew; | |
1016 } | |
1017 if (start != end) | |
1018 { | |
1019 buf->text->mule_bufmax = start; | |
1020 buf->text->mule_bytmax = bytestart; | |
1021 } | |
1022 else | |
1023 { | |
1024 buf->text->mule_bufmax += charlength; | |
1025 buf->text->mule_bytmax += bytelength; | |
1026 } | |
1027 } | |
1028 else | |
1029 { | |
1030 Bytebpos byteend = (buf->text->mule_bytmin | |
1031 + size * (start - buf->text->mule_bufmin) | |
1032 + bytelength); | |
1033 Bytebpos bytenew; | |
1034 | |
1035 buf->text->mule_bufmax += charlength; | |
1036 buf->text->mule_bytmax += bytelength; | |
1037 | |
1038 while (end > start) | |
1039 { | |
1040 bytenew = byteend; | |
1041 DEC_BYTEBPOS (buf, bytenew); | |
1042 if (byteend - bytenew != size) | |
1043 break; | |
1044 end--; | |
1045 byteend = bytenew; | |
1046 } | |
1047 if (start != end) | |
1048 { | |
1049 buf->text->mule_bufmin = end; | |
1050 buf->text->mule_bytmin = byteend; | |
1051 } | |
1052 } | |
1053 } | |
1054 } | |
1055 | |
1056 /* Text from START to END (equivalent in Bytebposs: from BI_START to | |
1057 BI_END) was deleted. */ | |
1058 | |
1059 static void | |
1060 buffer_mule_signal_deleted_region (struct buffer *buf, Charbpos start, | |
1061 Charbpos end, Bytebpos bi_start, | |
1062 Bytebpos bi_end) | |
1063 { | |
1064 int i; | |
1065 | |
1066 /* Adjust the cache of known positions. */ | |
1067 for (i = 0; i < 16; i++) | |
1068 { | |
1069 /* After the end; gets shoved backward */ | |
1070 if (buf->text->mule_charbpos_cache[i] > end) | |
1071 { | |
1072 buf->text->mule_charbpos_cache[i] -= end - start; | |
1073 buf->text->mule_bytebpos_cache[i] -= bi_end - bi_start; | |
1074 } | |
1075 /* In the range; moves to start of range */ | |
1076 else if (buf->text->mule_charbpos_cache[i] > start) | |
1077 { | |
1078 buf->text->mule_charbpos_cache[i] = start; | |
1079 buf->text->mule_bytebpos_cache[i] = bi_start; | |
1080 } | |
1081 } | |
1082 | |
1083 /* We don't care about any text after the end of the known region. */ | |
1084 | |
1085 end = min (end, buf->text->mule_bufmax); | |
1086 bi_end = min (bi_end, buf->text->mule_bytmax); | |
1087 if (start >= end) | |
1088 return; | |
1089 | |
1090 /* The end of the known region offsets by the total amount of deletion, | |
1091 since it's all before it. */ | |
1092 | |
1093 buf->text->mule_bufmax -= end - start; | |
1094 buf->text->mule_bytmax -= bi_end - bi_start; | |
1095 | |
1096 /* Now we don't care about any text after the start of the known region. */ | |
1097 | |
1098 end = min (end, buf->text->mule_bufmin); | |
1099 bi_end = min (bi_end, buf->text->mule_bytmin); | |
1100 if (start >= end) | |
1101 return; | |
1102 | |
1103 buf->text->mule_bufmin -= end - start; | |
1104 buf->text->mule_bytmin -= bi_end - bi_start; | |
1105 } | |
1106 | |
1107 #endif /* MULE */ | |
1108 | |
1109 #ifdef ERROR_CHECK_CHARBPOS | |
1110 | |
1111 Bytebpos | |
1112 charbpos_to_bytebpos (struct buffer *buf, Charbpos x) | |
1113 { | |
1114 Bytebpos retval = real_charbpos_to_bytebpos (buf, x); | |
1115 ASSERT_VALID_BYTEBPOS_UNSAFE (buf, retval); | |
1116 return retval; | |
1117 } | |
1118 | |
1119 Charbpos | |
1120 bytebpos_to_charbpos (struct buffer *buf, Bytebpos x) | |
1121 { | |
1122 ASSERT_VALID_BYTEBPOS_UNSAFE (buf, x); | |
1123 return real_bytebpos_to_charbpos (buf, x); | |
1124 } | |
1125 | |
1126 #endif /* ERROR_CHECK_CHARBPOS */ | |
1127 | |
1128 | |
1129 /************************************************************************/ | |
1130 /* verifying buffer and string positions */ | |
1131 /************************************************************************/ | |
1132 | |
1133 /* Functions below are tagged with either _byte or _char indicating | |
1134 whether they return byte or character positions. For a buffer, | |
1135 a character position is a "Charbpos" and a byte position is a "Bytebpos". | |
1136 For strings, these are sometimes typed using "Charcount" and | |
1137 "Bytecount". */ | |
1138 | |
1139 /* Flags for the functions below are: | |
1140 | |
1141 GB_ALLOW_PAST_ACCESSIBLE | |
1142 | |
1143 Allow positions to range over the entire buffer (BUF_BEG to BUF_Z), | |
1144 rather than just the accessible portion (BUF_BEGV to BUF_ZV). | |
1145 For strings, this flag has no effect. | |
1146 | |
1147 GB_COERCE_RANGE | |
1148 | |
1149 If the position is outside the allowable range, return the lower | |
1150 or upper bound of the range, whichever is closer to the specified | |
1151 position. | |
1152 | |
1153 GB_NO_ERROR_IF_BAD | |
1154 | |
1155 If the position is outside the allowable range, return -1. | |
1156 | |
1157 GB_NEGATIVE_FROM_END | |
1158 | |
1159 If a value is negative, treat it as an offset from the end. | |
1160 Only applies to strings. | |
1161 | |
1162 The following additional flags apply only to the functions | |
1163 that return ranges: | |
1164 | |
1165 GB_ALLOW_NIL | |
1166 | |
1167 Either or both positions can be nil. If FROM is nil, | |
1168 FROM_OUT will contain the lower bound of the allowed range. | |
1169 If TO is nil, TO_OUT will contain the upper bound of the | |
1170 allowed range. | |
1171 | |
1172 GB_CHECK_ORDER | |
1173 | |
1174 FROM must contain the lower bound and TO the upper bound | |
1175 of the range. If the positions are reversed, an error is | |
1176 signalled. | |
1177 | |
1178 The following is a combination flag: | |
1179 | |
1180 GB_HISTORICAL_STRING_BEHAVIOR | |
1181 | |
1182 Equivalent to (GB_NEGATIVE_FROM_END | GB_ALLOW_NIL). | |
1183 */ | |
1184 | |
1185 /* Return a buffer position stored in a Lisp_Object. Full | |
1186 error-checking is done on the position. Flags can be specified to | |
1187 control the behavior of out-of-range values. The default behavior | |
1188 is to require that the position is within the accessible part of | |
1189 the buffer (BEGV and ZV), and to signal an error if the position is | |
1190 out of range. | |
1191 | |
1192 */ | |
1193 | |
1194 Charbpos | |
1195 get_buffer_pos_char (struct buffer *b, Lisp_Object pos, unsigned int flags) | |
1196 { | |
1197 /* Does not GC */ | |
1198 Charbpos ind; | |
1199 Charbpos min_allowed, max_allowed; | |
1200 | |
1201 CHECK_INT_COERCE_MARKER (pos); | |
1202 ind = XINT (pos); | |
1203 min_allowed = flags & GB_ALLOW_PAST_ACCESSIBLE ? BUF_BEG (b) : BUF_BEGV (b); | |
1204 max_allowed = flags & GB_ALLOW_PAST_ACCESSIBLE ? BUF_Z (b) : BUF_ZV (b); | |
1205 | |
1206 if (ind < min_allowed || ind > max_allowed) | |
1207 { | |
1208 if (flags & GB_COERCE_RANGE) | |
1209 ind = ind < min_allowed ? min_allowed : max_allowed; | |
1210 else if (flags & GB_NO_ERROR_IF_BAD) | |
1211 ind = -1; | |
1212 else | |
1213 { | |
1214 Lisp_Object buffer; | |
1215 XSETBUFFER (buffer, b); | |
1216 args_out_of_range (buffer, pos); | |
1217 } | |
1218 } | |
1219 | |
1220 return ind; | |
1221 } | |
1222 | |
1223 Bytebpos | |
1224 get_buffer_pos_byte (struct buffer *b, Lisp_Object pos, unsigned int flags) | |
1225 { | |
1226 Charbpos bpos = get_buffer_pos_char (b, pos, flags); | |
1227 if (bpos < 0) /* could happen with GB_NO_ERROR_IF_BAD */ | |
1228 return -1; | |
1229 return charbpos_to_bytebpos (b, bpos); | |
1230 } | |
1231 | |
1232 /* Return a pair of buffer positions representing a range of text, | |
1233 taken from a pair of Lisp_Objects. Full error-checking is | |
1234 done on the positions. Flags can be specified to control the | |
1235 behavior of out-of-range values. The default behavior is to | |
1236 allow the range bounds to be specified in either order | |
1237 (however, FROM_OUT will always be the lower bound of the range | |
1238 and TO_OUT the upper bound),to require that the positions | |
1239 are within the accessible part of the buffer (BEGV and ZV), | |
1240 and to signal an error if the positions are out of range. | |
1241 */ | |
1242 | |
1243 void | |
1244 get_buffer_range_char (struct buffer *b, Lisp_Object from, Lisp_Object to, | |
1245 Charbpos *from_out, Charbpos *to_out, unsigned int flags) | |
1246 { | |
1247 /* Does not GC */ | |
1248 Charbpos min_allowed, max_allowed; | |
1249 | |
1250 min_allowed = (flags & GB_ALLOW_PAST_ACCESSIBLE) ? | |
1251 BUF_BEG (b) : BUF_BEGV (b); | |
1252 max_allowed = (flags & GB_ALLOW_PAST_ACCESSIBLE) ? | |
1253 BUF_Z (b) : BUF_ZV (b); | |
1254 | |
1255 if (NILP (from) && (flags & GB_ALLOW_NIL)) | |
1256 *from_out = min_allowed; | |
1257 else | |
1258 *from_out = get_buffer_pos_char (b, from, flags | GB_NO_ERROR_IF_BAD); | |
1259 | |
1260 if (NILP (to) && (flags & GB_ALLOW_NIL)) | |
1261 *to_out = max_allowed; | |
1262 else | |
1263 *to_out = get_buffer_pos_char (b, to, flags | GB_NO_ERROR_IF_BAD); | |
1264 | |
1265 if ((*from_out < 0 || *to_out < 0) && !(flags & GB_NO_ERROR_IF_BAD)) | |
1266 { | |
1267 Lisp_Object buffer; | |
1268 XSETBUFFER (buffer, b); | |
1269 args_out_of_range_3 (buffer, from, to); | |
1270 } | |
1271 | |
1272 if (*from_out >= 0 && *to_out >= 0 && *from_out > *to_out) | |
1273 { | |
1274 if (flags & GB_CHECK_ORDER) | |
1275 invalid_argument_2 ("start greater than end", from, to); | |
1276 else | |
1277 { | |
1278 Charbpos temp = *from_out; | |
1279 *from_out = *to_out; | |
1280 *to_out = temp; | |
1281 } | |
1282 } | |
1283 } | |
1284 | |
1285 void | |
1286 get_buffer_range_byte (struct buffer *b, Lisp_Object from, Lisp_Object to, | |
1287 Bytebpos *from_out, Bytebpos *to_out, unsigned int flags) | |
1288 { | |
1289 Charbpos s, e; | |
1290 | |
1291 get_buffer_range_char (b, from, to, &s, &e, flags); | |
1292 if (s >= 0) | |
1293 *from_out = charbpos_to_bytebpos (b, s); | |
1294 else /* could happen with GB_NO_ERROR_IF_BAD */ | |
1295 *from_out = -1; | |
1296 if (e >= 0) | |
1297 *to_out = charbpos_to_bytebpos (b, e); | |
1298 else | |
1299 *to_out = -1; | |
1300 } | |
1301 | |
1302 static Charcount | |
1303 get_string_pos_char_1 (Lisp_Object string, Lisp_Object pos, unsigned int flags, | |
1304 Charcount known_length) | |
1305 { | |
1306 Charcount ccpos; | |
1307 Charcount min_allowed = 0; | |
1308 Charcount max_allowed = known_length; | |
1309 | |
1310 /* Computation of KNOWN_LENGTH is potentially expensive so we pass | |
1311 it in. */ | |
1312 CHECK_INT (pos); | |
1313 ccpos = XINT (pos); | |
1314 if (ccpos < 0 && flags & GB_NEGATIVE_FROM_END) | |
1315 ccpos += max_allowed; | |
1316 | |
1317 if (ccpos < min_allowed || ccpos > max_allowed) | |
1318 { | |
1319 if (flags & GB_COERCE_RANGE) | |
1320 ccpos = ccpos < min_allowed ? min_allowed : max_allowed; | |
1321 else if (flags & GB_NO_ERROR_IF_BAD) | |
1322 ccpos = -1; | |
1323 else | |
1324 args_out_of_range (string, pos); | |
1325 } | |
1326 | |
1327 return ccpos; | |
1328 } | |
1329 | |
1330 Charcount | |
1331 get_string_pos_char (Lisp_Object string, Lisp_Object pos, unsigned int flags) | |
1332 { | |
1333 return get_string_pos_char_1 (string, pos, flags, | |
1334 XSTRING_CHAR_LENGTH (string)); | |
1335 } | |
1336 | |
1337 Bytecount | |
1338 get_string_pos_byte (Lisp_Object string, Lisp_Object pos, unsigned int flags) | |
1339 { | |
1340 Charcount ccpos = get_string_pos_char (string, pos, flags); | |
1341 if (ccpos < 0) /* could happen with GB_NO_ERROR_IF_BAD */ | |
1342 return -1; | |
1343 return charcount_to_bytecount (XSTRING_DATA (string), ccpos); | |
1344 } | |
1345 | |
1346 void | |
1347 get_string_range_char (Lisp_Object string, Lisp_Object from, Lisp_Object to, | |
1348 Charcount *from_out, Charcount *to_out, | |
1349 unsigned int flags) | |
1350 { | |
1351 Charcount min_allowed = 0; | |
1352 Charcount max_allowed = XSTRING_CHAR_LENGTH (string); | |
1353 | |
1354 if (NILP (from) && (flags & GB_ALLOW_NIL)) | |
1355 *from_out = min_allowed; | |
1356 else | |
1357 *from_out = get_string_pos_char_1 (string, from, | |
1358 flags | GB_NO_ERROR_IF_BAD, | |
1359 max_allowed); | |
1360 | |
1361 if (NILP (to) && (flags & GB_ALLOW_NIL)) | |
1362 *to_out = max_allowed; | |
1363 else | |
1364 *to_out = get_string_pos_char_1 (string, to, | |
1365 flags | GB_NO_ERROR_IF_BAD, | |
1366 max_allowed); | |
1367 | |
1368 if ((*from_out < 0 || *to_out < 0) && !(flags & GB_NO_ERROR_IF_BAD)) | |
1369 args_out_of_range_3 (string, from, to); | |
1370 | |
1371 if (*from_out >= 0 && *to_out >= 0 && *from_out > *to_out) | |
1372 { | |
1373 if (flags & GB_CHECK_ORDER) | |
1374 invalid_argument_2 ("start greater than end", from, to); | |
1375 else | |
1376 { | |
1377 Charbpos temp = *from_out; | |
1378 *from_out = *to_out; | |
1379 *to_out = temp; | |
1380 } | |
1381 } | |
1382 } | |
1383 | |
1384 void | |
1385 get_string_range_byte (Lisp_Object string, Lisp_Object from, Lisp_Object to, | |
1386 Bytecount *from_out, Bytecount *to_out, | |
1387 unsigned int flags) | |
1388 { | |
1389 Charcount s, e; | |
1390 | |
1391 get_string_range_char (string, from, to, &s, &e, flags); | |
1392 if (s >= 0) | |
1393 *from_out = charcount_to_bytecount (XSTRING_DATA (string), s); | |
1394 else /* could happen with GB_NO_ERROR_IF_BAD */ | |
1395 *from_out = -1; | |
1396 if (e >= 0) | |
1397 *to_out = charcount_to_bytecount (XSTRING_DATA (string), e); | |
1398 else | |
1399 *to_out = -1; | |
1400 | |
1401 } | |
1402 | |
1403 Charbpos | |
1404 get_buffer_or_string_pos_char (Lisp_Object object, Lisp_Object pos, | |
1405 unsigned int flags) | |
1406 { | |
1407 return STRINGP (object) ? | |
1408 get_string_pos_char (object, pos, flags) : | |
1409 get_buffer_pos_char (XBUFFER (object), pos, flags); | |
1410 } | |
1411 | |
1412 Bytebpos | |
1413 get_buffer_or_string_pos_byte (Lisp_Object object, Lisp_Object pos, | |
1414 unsigned int flags) | |
1415 { | |
1416 return STRINGP (object) ? | |
1417 get_string_pos_byte (object, pos, flags) : | |
1418 get_buffer_pos_byte (XBUFFER (object), pos, flags); | |
1419 } | |
1420 | |
1421 void | |
1422 get_buffer_or_string_range_char (Lisp_Object object, Lisp_Object from, | |
1423 Lisp_Object to, Charbpos *from_out, | |
1424 Charbpos *to_out, unsigned int flags) | |
1425 { | |
1426 if (STRINGP (object)) | |
1427 get_string_range_char (object, from, to, from_out, to_out, flags); | |
1428 else | |
1429 get_buffer_range_char (XBUFFER (object), from, to, from_out, to_out, flags); | |
1430 } | |
1431 | |
1432 void | |
1433 get_buffer_or_string_range_byte (Lisp_Object object, Lisp_Object from, | |
1434 Lisp_Object to, Bytebpos *from_out, | |
1435 Bytebpos *to_out, unsigned int flags) | |
1436 { | |
1437 if (STRINGP (object)) | |
1438 get_string_range_byte (object, from, to, from_out, to_out, flags); | |
1439 else | |
1440 get_buffer_range_byte (XBUFFER (object), from, to, from_out, to_out, flags); | |
1441 } | |
1442 | |
1443 Charbpos | |
1444 buffer_or_string_accessible_begin_char (Lisp_Object object) | |
1445 { | |
1446 return STRINGP (object) ? 0 : BUF_BEGV (XBUFFER (object)); | |
1447 } | |
1448 | |
1449 Charbpos | |
1450 buffer_or_string_accessible_end_char (Lisp_Object object) | |
1451 { | |
1452 return STRINGP (object) ? | |
1453 XSTRING_CHAR_LENGTH (object) : BUF_ZV (XBUFFER (object)); | |
1454 } | |
1455 | |
1456 Bytebpos | |
1457 buffer_or_string_accessible_begin_byte (Lisp_Object object) | |
1458 { | |
1459 return STRINGP (object) ? 0 : BI_BUF_BEGV (XBUFFER (object)); | |
1460 } | |
1461 | |
1462 Bytebpos | |
1463 buffer_or_string_accessible_end_byte (Lisp_Object object) | |
1464 { | |
1465 return STRINGP (object) ? | |
1466 XSTRING_LENGTH (object) : BI_BUF_ZV (XBUFFER (object)); | |
1467 } | |
1468 | |
1469 Charbpos | |
1470 buffer_or_string_absolute_begin_char (Lisp_Object object) | |
1471 { | |
1472 return STRINGP (object) ? 0 : BUF_BEG (XBUFFER (object)); | |
1473 } | |
1474 | |
1475 Charbpos | |
1476 buffer_or_string_absolute_end_char (Lisp_Object object) | |
1477 { | |
1478 return STRINGP (object) ? | |
1479 XSTRING_CHAR_LENGTH (object) : BUF_Z (XBUFFER (object)); | |
1480 } | |
1481 | |
1482 Bytebpos | |
1483 buffer_or_string_absolute_begin_byte (Lisp_Object object) | |
1484 { | |
1485 return STRINGP (object) ? 0 : BI_BUF_BEG (XBUFFER (object)); | |
1486 } | |
1487 | |
1488 Bytebpos | |
1489 buffer_or_string_absolute_end_byte (Lisp_Object object) | |
1490 { | |
1491 return STRINGP (object) ? | |
1492 XSTRING_LENGTH (object) : BI_BUF_Z (XBUFFER (object)); | |
1493 } | |
1494 | |
1495 | |
1496 /************************************************************************/ | |
1497 /* point and marker adjustment */ | 108 /* point and marker adjustment */ |
1498 /************************************************************************/ | 109 /************************************************************************/ |
1499 | 110 |
1500 /* just_set_point() is the only place `PT' is an lvalue in all of emacs. | 111 /* just_set_point() is the only place `PT' is an lvalue in all of emacs. |
1501 This function is called from set_buffer_point(), which is the function | 112 This function is called from set_buffer_point(), which is the function |
1694 { | 305 { |
1695 adjust_markers (mbuf, pos, BI_BUF_GPT (mbuf), BUF_GAP_SIZE (mbuf)); | 306 adjust_markers (mbuf, pos, BI_BUF_GPT (mbuf), BUF_GAP_SIZE (mbuf)); |
1696 } | 307 } |
1697 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) | 308 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) |
1698 { | 309 { |
1699 adjust_extents (make_buffer (mbuf), pos, BI_BUF_GPT (mbuf), | 310 adjust_extents (wrap_buffer (mbuf), pos, BI_BUF_GPT (mbuf), |
1700 BUF_GAP_SIZE (mbuf)); | 311 BUF_GAP_SIZE (mbuf)); |
1701 } | 312 } |
1702 SET_BI_BUF_GPT (buf, pos); | 313 SET_BI_BUF_GPT (buf, pos); |
1703 SET_GAP_SENTINEL (buf); | 314 SET_GAP_SENTINEL (buf); |
1704 #ifdef ERROR_CHECK_EXTENTS | 315 #ifdef ERROR_CHECK_EXTENTS |
1705 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) | 316 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) |
1706 { | 317 { |
1707 sledgehammer_extent_check (make_buffer (mbuf)); | 318 sledgehammer_extent_check (wrap_buffer (mbuf)); |
1708 } | 319 } |
1709 #endif | 320 #endif |
1710 QUIT; | 321 QUIT; |
1711 } | 322 } |
1712 | 323 |
1764 { | 375 { |
1765 adjust_markers (mbuf, BI_BUF_GPT (mbuf) + gsize, pos + gsize, - gsize); | 376 adjust_markers (mbuf, BI_BUF_GPT (mbuf) + gsize, pos + gsize, - gsize); |
1766 } | 377 } |
1767 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) | 378 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) |
1768 { | 379 { |
1769 adjust_extents (make_buffer (mbuf), BI_BUF_GPT (mbuf) + gsize, | 380 adjust_extents (wrap_buffer (mbuf), BI_BUF_GPT (mbuf) + gsize, |
1770 pos + gsize, - gsize); | 381 pos + gsize, - gsize); |
1771 } | 382 } |
1772 SET_BI_BUF_GPT (buf, pos); | 383 SET_BI_BUF_GPT (buf, pos); |
1773 SET_GAP_SENTINEL (buf); | 384 SET_GAP_SENTINEL (buf); |
1774 #ifdef ERROR_CHECK_EXTENTS | 385 #ifdef ERROR_CHECK_EXTENTS |
1775 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) | 386 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) |
1776 { | 387 { |
1777 sledgehammer_extent_check (make_buffer (mbuf)); | 388 sledgehammer_extent_check (wrap_buffer (mbuf)); |
1778 } | 389 } |
1779 #endif | 390 #endif |
1780 } | 391 } |
1781 if (pos == BI_BUF_Z (buf)) | 392 if (pos == BI_BUF_Z (buf)) |
1782 { | 393 { |
2069 end_multiple_change (struct buffer *buf, int count) | 680 end_multiple_change (struct buffer *buf, int count) |
2070 { | 681 { |
2071 assert (buf->text->changes->in_multiple_change > 0); | 682 assert (buf->text->changes->in_multiple_change > 0); |
2072 buf->text->changes->in_multiple_change--; | 683 buf->text->changes->in_multiple_change--; |
2073 if (!buf->text->changes->in_multiple_change) | 684 if (!buf->text->changes->in_multiple_change) |
2074 unbind_to (count, Qnil); | 685 unbind_to (count); |
2075 } | 686 } |
2076 | 687 |
2077 static int inside_change_hook; | 688 static int inside_change_hook; |
2078 | 689 |
2079 static Lisp_Object | 690 static Lisp_Object |
2113 int speccount = specpdl_depth (); | 724 int speccount = specpdl_depth (); |
2114 record_unwind_protect (first_change_hook_restore, buffer); | 725 record_unwind_protect (first_change_hook_restore, buffer); |
2115 set_buffer_internal (buf); | 726 set_buffer_internal (buf); |
2116 in_first_change = 1; | 727 in_first_change = 1; |
2117 run_hook (Qfirst_change_hook); | 728 run_hook (Qfirst_change_hook); |
2118 unbind_to (speccount, Qnil); | 729 unbind_to (speccount); |
2119 } | 730 } |
2120 } | 731 } |
2121 } | 732 } |
2122 | 733 |
2123 /* Signal a change to the buffer immediately before it happens. | 734 /* Signal a change to the buffer immediately before it happens. |
2201 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) | 812 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) |
2202 { | 813 { |
2203 XSETBUFFER (buffer, mbuf); | 814 XSETBUFFER (buffer, mbuf); |
2204 report_extent_modification (buffer, start, end, 0); | 815 report_extent_modification (buffer, start, end, 0); |
2205 } | 816 } |
2206 unbind_to (speccount, Qnil); | 817 unbind_to (speccount); |
2207 | 818 |
2208 /* Only now do we indicate that the before-change-functions have | 819 /* Only now do we indicate that the before-change-functions have |
2209 been called, in case some function throws out. */ | 820 been called, in case some function throws out. */ |
2210 buf->text->changes->mc_begin_signaled = 1; | 821 buf->text->changes->mc_begin_signaled = 1; |
2211 } | 822 } |
2289 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) | 900 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) |
2290 { | 901 { |
2291 XSETBUFFER (buffer, mbuf); | 902 XSETBUFFER (buffer, mbuf); |
2292 report_extent_modification (buffer, start, new_end, 1); | 903 report_extent_modification (buffer, start, new_end, 1); |
2293 } | 904 } |
2294 unbind_to (speccount, Qnil); /* sets inside_change_hook back to 0 */ | 905 unbind_to (speccount); /* sets inside_change_hook back to 0 */ |
2295 } | 906 } |
2296 } | 907 } |
2297 | 908 |
2298 /* Call this if you're about to change the region of BUFFER from START | 909 /* Call this if you're about to change the region of BUFFER from START |
2299 to END. This checks the read-only properties of the region, calls | 910 to END. This checks the read-only properties of the region, calls |
2467 if (pos < BUF_BEGV (buf)) | 1078 if (pos < BUF_BEGV (buf)) |
2468 pos = BUF_BEGV (buf); | 1079 pos = BUF_BEGV (buf); |
2469 if (pos > BUF_ZV (buf)) | 1080 if (pos > BUF_ZV (buf)) |
2470 pos = BUF_ZV (buf); | 1081 pos = BUF_ZV (buf); |
2471 | 1082 |
1083 ind = charbpos_to_bytebpos (buf, pos); | |
1084 | |
2472 /* string may have been relocated up to this point */ | 1085 /* string may have been relocated up to this point */ |
2473 if (STRINGP (reloc)) | 1086 if (STRINGP (reloc)) |
2474 nonreloc = XSTRING_DATA (reloc); | 1087 { |
2475 | 1088 cclen = XSTRING_OFFSET_BYTE_TO_CHAR_LEN (reloc, offset, length); |
2476 ind = charbpos_to_bytebpos (buf, pos); | 1089 nonreloc = XSTRING_DATA (reloc); |
2477 cclen = bytecount_to_charcount (nonreloc + offset, length); | 1090 } |
1091 else | |
1092 cclen = bytecount_to_charcount (nonreloc + offset, length); | |
2478 | 1093 |
2479 if (ind != BI_BUF_GPT (buf)) | 1094 if (ind != BI_BUF_GPT (buf)) |
2480 /* #### if debug-on-quit is invoked and the user changes the | 1095 /* #### if debug-on-quit is invoked and the user changes the |
2481 buffer, bad things can happen. This is a rampant problem | 1096 buffer, bad things can happen. This is a rampant problem |
2482 in Emacs. */ | 1097 in Emacs. */ |
2511 { | 1126 { |
2512 SET_BOTH_BUF_ZV (mbuf, BUF_ZV (mbuf) + cclen, BI_BUF_ZV (mbuf) + length); | 1127 SET_BOTH_BUF_ZV (mbuf, BUF_ZV (mbuf) + cclen, BI_BUF_ZV (mbuf) + length); |
2513 } | 1128 } |
2514 SET_BOTH_BUF_Z (buf, BUF_Z (buf) + cclen, BI_BUF_Z (buf) + length); | 1129 SET_BOTH_BUF_Z (buf, BUF_Z (buf) + cclen, BI_BUF_Z (buf) + length); |
2515 SET_GAP_SENTINEL (buf); | 1130 SET_GAP_SENTINEL (buf); |
2516 | 1131 |
1132 | |
2517 #ifdef MULE | 1133 #ifdef MULE |
2518 buffer_mule_signal_inserted_region (buf, pos, length, cclen); | 1134 buffer_mule_signal_inserted_region (buf, pos, length, cclen); |
2519 #endif | 1135 #endif |
2520 | 1136 |
2521 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) | 1137 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) |
2522 { | 1138 { |
2523 process_extents_for_insertion (make_buffer (mbuf), ind, length); | 1139 process_extents_for_insertion (wrap_buffer (mbuf), ind, length); |
2524 } | 1140 } |
2525 | 1141 |
2526 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) | 1142 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) |
2527 { | 1143 { |
2528 /* We know the gap is at IND so the cast is OK. */ | 1144 /* We know the gap is at IND so the cast is OK. */ |
2736 extents will be present to be recorded, and BEFORE the gap | 1352 extents will be present to be recorded, and BEFORE the gap |
2737 size is increased, as otherwise we will be confused about | 1353 size is increased, as otherwise we will be confused about |
2738 where the extents end. */ | 1354 where the extents end. */ |
2739 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) | 1355 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) |
2740 { | 1356 { |
2741 process_extents_for_deletion (make_buffer (mbuf), bi_from, bi_to, 0); | 1357 process_extents_for_deletion (wrap_buffer (mbuf), bi_from, bi_to, 0); |
2742 } | 1358 } |
2743 | 1359 |
2744 /* Relocate all markers pointing into the new, larger gap to | 1360 /* Relocate all markers pointing into the new, larger gap to |
2745 point at the end of the text before the gap. */ | 1361 point at the end of the text before the gap. */ |
2746 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) | 1362 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) |
2752 } | 1368 } |
2753 | 1369 |
2754 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) | 1370 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) |
2755 { | 1371 { |
2756 /* Relocate any extent endpoints just like markers. */ | 1372 /* Relocate any extent endpoints just like markers. */ |
2757 adjust_extents_for_deletion (make_buffer (mbuf), bi_from, bi_to, | 1373 adjust_extents_for_deletion (wrap_buffer (mbuf), bi_from, bi_to, |
2758 BUF_GAP_SIZE (mbuf), bc_numdel, 0); | 1374 BUF_GAP_SIZE (mbuf), bc_numdel, 0); |
2759 } | 1375 } |
2760 | 1376 |
2761 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) | 1377 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) |
2762 { | 1378 { |
2810 This must come AFTER record_delete(), so that the appropriate extents | 1426 This must come AFTER record_delete(), so that the appropriate extents |
2811 will be present to be recorded, and BEFORE the gap size is increased, | 1427 will be present to be recorded, and BEFORE the gap size is increased, |
2812 as otherwise we will be confused about where the extents end. */ | 1428 as otherwise we will be confused about where the extents end. */ |
2813 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) | 1429 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) |
2814 { | 1430 { |
2815 process_extents_for_deletion (make_buffer (mbuf), bi_from, bi_to, 0); | 1431 process_extents_for_deletion (wrap_buffer (mbuf), bi_from, bi_to, 0); |
2816 } | 1432 } |
2817 | 1433 |
2818 /* Relocate all markers pointing into the new, larger gap to | 1434 /* Relocate all markers pointing into the new, larger gap to |
2819 point at the end of the text before the gap. */ | 1435 point at the end of the text before the gap. */ |
2820 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) | 1436 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) |
2826 } | 1442 } |
2827 | 1443 |
2828 /* Relocate any extent endpoints just like markers. */ | 1444 /* Relocate any extent endpoints just like markers. */ |
2829 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) | 1445 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) |
2830 { | 1446 { |
2831 adjust_extents_for_deletion (make_buffer (mbuf), bi_from, bi_to, | 1447 adjust_extents_for_deletion (wrap_buffer (mbuf), bi_from, bi_to, |
2832 BUF_GAP_SIZE (mbuf), | 1448 BUF_GAP_SIZE (mbuf), |
2833 bc_numdel, BUF_GAP_SIZE (mbuf)); | 1449 bc_numdel, BUF_GAP_SIZE (mbuf)); |
2834 } | 1450 } |
2835 | 1451 |
2836 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) | 1452 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) |
2862 #endif | 1478 #endif |
2863 | 1479 |
2864 #ifdef ERROR_CHECK_EXTENTS | 1480 #ifdef ERROR_CHECK_EXTENTS |
2865 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) | 1481 MAP_INDIRECT_BUFFERS (buf, mbuf, bufcons) |
2866 { | 1482 { |
2867 sledgehammer_extent_check (make_buffer (mbuf)); | 1483 sledgehammer_extent_check (wrap_buffer (mbuf)); |
2868 } | 1484 } |
2869 #endif | 1485 #endif |
2870 | 1486 |
2871 signal_after_change (buf, from, to, from); | 1487 signal_after_change (buf, from, to, from); |
2872 } | 1488 } |
3025 memcpy (dest, start1, len1); | 1641 memcpy (dest, start1, len1); |
3026 memcpy (dest + len1, start2, bi_len - len1); | 1642 memcpy (dest + len1, start2, bi_len - len1); |
3027 } | 1643 } |
3028 } | 1644 } |
3029 | 1645 |
1646 init_string_ascii_begin (val); | |
1647 sledgehammer_check_ascii_begin (val); | |
1648 | |
3030 UNGCPRO; | 1649 UNGCPRO; |
3031 return val; | 1650 return val; |
3032 } | 1651 } |
3033 | 1652 |
3034 Lisp_Object | 1653 Lisp_Object |
3070 charbpos_to_bytebpos (buf, to), | 1689 charbpos_to_bytebpos (buf, to), |
3071 iro); | 1690 iro); |
3072 } | 1691 } |
3073 } | 1692 } |
3074 | 1693 |
3075 void | |
3076 find_charsets_in_intbyte_string (unsigned char *charsets, const Intbyte *str, | |
3077 Bytecount len) | |
3078 { | |
3079 #ifndef MULE | |
3080 /* Telescope this. */ | |
3081 charsets[0] = 1; | |
3082 #else | |
3083 const Intbyte *strend = str + len; | |
3084 memset (charsets, 0, NUM_LEADING_BYTES); | |
3085 | |
3086 /* #### SJT doesn't like this. */ | |
3087 if (len == 0) | |
3088 { | |
3089 charsets[XCHARSET_LEADING_BYTE (Vcharset_ascii) - 128] = 1; | |
3090 return; | |
3091 } | |
3092 | |
3093 while (str < strend) | |
3094 { | |
3095 charsets[CHAR_LEADING_BYTE (charptr_emchar (str)) - 128] = 1; | |
3096 INC_CHARPTR (str); | |
3097 } | |
3098 #endif | |
3099 } | |
3100 | |
3101 void | |
3102 find_charsets_in_emchar_string (unsigned char *charsets, const Emchar *str, | |
3103 Charcount len) | |
3104 { | |
3105 #ifndef MULE | |
3106 /* Telescope this. */ | |
3107 charsets[0] = 1; | |
3108 #else | |
3109 int i; | |
3110 | |
3111 memset (charsets, 0, NUM_LEADING_BYTES); | |
3112 | |
3113 /* #### SJT doesn't like this. */ | |
3114 if (len == 0) | |
3115 { | |
3116 charsets[XCHARSET_LEADING_BYTE (Vcharset_ascii) - 128] = 1; | |
3117 return; | |
3118 } | |
3119 | |
3120 for (i = 0; i < len; i++) | |
3121 { | |
3122 charsets[CHAR_LEADING_BYTE (str[i]) - 128] = 1; | |
3123 } | |
3124 #endif | |
3125 } | |
3126 | |
3127 int | |
3128 intbyte_string_displayed_columns (const Intbyte *str, Bytecount len) | |
3129 { | |
3130 int cols = 0; | |
3131 const Intbyte *end = str + len; | |
3132 | |
3133 while (str < end) | |
3134 { | |
3135 #ifdef MULE | |
3136 Emchar ch = charptr_emchar (str); | |
3137 cols += XCHARSET_COLUMNS (CHAR_CHARSET (ch)); | |
3138 #else | |
3139 cols++; | |
3140 #endif | |
3141 INC_CHARPTR (str); | |
3142 } | |
3143 | |
3144 return cols; | |
3145 } | |
3146 | |
3147 int | |
3148 emchar_string_displayed_columns (const Emchar *str, Charcount len) | |
3149 { | |
3150 #ifdef MULE | |
3151 int cols = 0; | |
3152 int i; | |
3153 | |
3154 for (i = 0; i < len; i++) | |
3155 cols += XCHARSET_COLUMNS (CHAR_CHARSET (str[i])); | |
3156 | |
3157 return cols; | |
3158 #else /* not MULE */ | |
3159 return len; | |
3160 #endif | |
3161 } | |
3162 | |
3163 /* NOTE: Does not reset the Dynarr. */ | |
3164 | |
3165 void | |
3166 convert_intbyte_string_into_emchar_dynarr (const Intbyte *str, Bytecount len, | |
3167 Emchar_dynarr *dyn) | |
3168 { | |
3169 const Intbyte *strend = str + len; | |
3170 | |
3171 while (str < strend) | |
3172 { | |
3173 Emchar ch = charptr_emchar (str); | |
3174 Dynarr_add (dyn, ch); | |
3175 INC_CHARPTR (str); | |
3176 } | |
3177 } | |
3178 | |
3179 Charcount | |
3180 convert_intbyte_string_into_emchar_string (const Intbyte *str, Bytecount len, | |
3181 Emchar *arr) | |
3182 { | |
3183 const Intbyte *strend = str + len; | |
3184 Charcount newlen = 0; | |
3185 while (str < strend) | |
3186 { | |
3187 Emchar ch = charptr_emchar (str); | |
3188 arr[newlen++] = ch; | |
3189 INC_CHARPTR (str); | |
3190 } | |
3191 return newlen; | |
3192 } | |
3193 | |
3194 /* Convert an array of Emchars into the equivalent string representation. | |
3195 Store into the given Intbyte dynarr. Does not reset the dynarr. | |
3196 Does not add a terminating zero. */ | |
3197 | |
3198 void | |
3199 convert_emchar_string_into_intbyte_dynarr (Emchar *arr, int nels, | |
3200 Intbyte_dynarr *dyn) | |
3201 { | |
3202 Intbyte str[MAX_EMCHAR_LEN]; | |
3203 int i; | |
3204 | |
3205 for (i = 0; i < nels; i++) | |
3206 { | |
3207 Bytecount len = set_charptr_emchar (str, arr[i]); | |
3208 Dynarr_add_many (dyn, str, len); | |
3209 } | |
3210 } | |
3211 | |
3212 /* Convert an array of Emchars into the equivalent string representation. | |
3213 Malloc the space needed for this and return it. If LEN_OUT is not a | |
3214 NULL pointer, store into LEN_OUT the number of Intbytes in the | |
3215 malloc()ed string. Note that the actual number of Intbytes allocated | |
3216 is one more than this: the returned string is zero-terminated. */ | |
3217 | |
3218 Intbyte * | |
3219 convert_emchar_string_into_malloced_string (Emchar *arr, int nels, | |
3220 Bytecount *len_out) | |
3221 { | |
3222 /* Damn zero-termination. */ | |
3223 Intbyte *str = (Intbyte *) alloca (nels * MAX_EMCHAR_LEN + 1); | |
3224 Intbyte *strorig = str; | |
3225 Bytecount len; | |
3226 | |
3227 int i; | |
3228 | |
3229 for (i = 0; i < nels; i++) | |
3230 str += set_charptr_emchar (str, arr[i]); | |
3231 *str = '\0'; | |
3232 len = str - strorig; | |
3233 str = (Intbyte *) xmalloc (1 + len); | |
3234 memcpy (str, strorig, 1 + len); | |
3235 if (len_out) | |
3236 *len_out = len; | |
3237 return str; | |
3238 } | |
3239 | |
3240 | 1694 |
3241 /************************************************************************/ | 1695 /************************************************************************/ |
3242 /* initialization */ | 1696 /* initialization */ |
3243 /************************************************************************/ | 1697 /************************************************************************/ |
3244 | 1698 |
3245 void | 1699 void |
3246 reinit_vars_of_insdel (void) | 1700 reinit_vars_of_insdel (void) |
3247 { | 1701 { |
3248 int i; | |
3249 | |
3250 inside_change_hook = 0; | 1702 inside_change_hook = 0; |
3251 in_first_change = 0; | 1703 in_first_change = 0; |
3252 | |
3253 for (i = 0; i <= MAX_BYTEBPOS_GAP_SIZE_3; i++) | |
3254 three_to_one_table[i] = i / 3; | |
3255 } | 1704 } |
3256 | 1705 |
3257 void | 1706 void |
3258 vars_of_insdel (void) | 1707 vars_of_insdel (void) |
3259 { | 1708 { |
3281 | 1730 |
3282 b->text->mule_bufmin = b->text->mule_bufmax = 1; | 1731 b->text->mule_bufmin = b->text->mule_bufmax = 1; |
3283 b->text->mule_bytmin = b->text->mule_bytmax = 1; | 1732 b->text->mule_bytmin = b->text->mule_bytmax = 1; |
3284 b->text->mule_shifter = 0; | 1733 b->text->mule_shifter = 0; |
3285 b->text->mule_three_p = 0; | 1734 b->text->mule_three_p = 0; |
1735 b->text->entirely_ascii_p = 1; | |
3286 | 1736 |
3287 for (i = 0; i < 16; i++) | 1737 for (i = 0; i < 16; i++) |
3288 { | 1738 { |
3289 b->text->mule_charbpos_cache[i] = 1; | 1739 b->text->mule_charbpos_cache[i] = 1; |
3290 b->text->mule_bytebpos_cache[i] = 1; | 1740 b->text->mule_bytebpos_cache[i] = 1; |