annotate src/text.c @ 826:6728e641994e

[xemacs-hg @ 2002-05-05 11:30:15 by ben] syntax cache, 8-bit-format, lots of code cleanup README.packages: Update info about --package-path. i.c: Create an inheritable event and pass it on to XEmacs, so that ^C can be handled properly. Intercept ^C and signal the event. "Stop Build" in VC++ now works. bytecomp-runtime.el: Doc string changes. compat.el: Some attempts to redo this to make it truly useful and fix the "multiple versions interacting with each other" problem. Not yet done. Currently doesn't work. files.el: Use with-obsolete-variable to avoid warnings in new revert-buffer code. xemacs.mak: Split up CFLAGS into a version without flags specifying the C library. The problem seems to be that minitar depends on zlib, which depends specifically on libc.lib, not on any of the other C libraries. Unless you compile with libc.lib, you get errors -- specifically, no _errno in the other libraries, which must make it something other than an int. (#### But this doesn't seem to obtain in XEmacs, which also uses zlib, and can be linked with any of the C libraries. Maybe zlib is used differently and doesn't need errno, or maybe XEmacs provides an int errno; ... I don't understand. Makefile.in.in: Fix so that packages are around when testing. abbrev.c, alloc.c, buffer.c, buffer.h, bytecode.c, callint.c, casefiddle.c, casetab.c, casetab.h, charset.h, chartab.c, chartab.h, cmds.c, console-msw.h, console-stream.c, console-x.c, console.c, console.h, data.c, device-msw.c, device.c, device.h, dialog-msw.c, dialog-x.c, dired-msw.c, dired.c, doc.c, doprnt.c, dumper.c, editfns.c, elhash.c, emacs.c, eval.c, event-Xt.c, event-gtk.c, event-msw.c, event-stream.c, events.c, events.h, extents.c, extents.h, faces.c, file-coding.c, file-coding.h, fileio.c, fns.c, font-lock.c, frame-gtk.c, frame-msw.c, frame-x.c, frame.c, frame.h, glade.c, glyphs-gtk.c, glyphs-msw.c, glyphs-msw.h, glyphs-x.c, glyphs.c, glyphs.h, gui-msw.c, gui-x.c, gui.h, gutter.h, hash.h, indent.c, insdel.c, intl-win32.c, intl.c, keymap.c, lisp-disunion.h, lisp-union.h, lisp.h, lread.c, lrecord.h, lstream.c, lstream.h, marker.c, menubar-gtk.c, menubar-msw.c, menubar-x.c, menubar.c, minibuf.c, mule-ccl.c, mule-charset.c, mule-coding.c, mule-wnnfns.c, nas.c, objects-msw.c, objects-x.c, opaque.c, postgresql.c, print.c, process-nt.c, process-unix.c, process.c, process.h, profile.c, rangetab.c, redisplay-gtk.c, redisplay-msw.c, redisplay-output.c, redisplay-x.c, redisplay.c, redisplay.h, regex.c, regex.h, scrollbar-msw.c, search.c, select-x.c, specifier.c, specifier.h, symbols.c, symsinit.h, syntax.c, syntax.h, syswindows.h, tests.c, text.c, text.h, tooltalk.c, ui-byhand.c, ui-gtk.c, unicode.c, win32.c, window.c: Another big Ben patch. -- FUNCTIONALITY CHANGES: add partial support for 8-bit-fixed, 16-bit-fixed, and 32-bit-fixed formats. not quite done yet. (in particular, needs functions to actually convert the buffer.) NOTE: lots of changes to regex.c here. also, many new *_fmt() inline funs that take an Internal_Format argument. redo syntax cache code. make the cache per-buffer; keep the cache valid across calls to functions that use it. also keep it valid across insertions/deletions and extent changes, as much as is possible. eliminate the junky regex-reentrancy code by passing in the relevant lisp info to the regex routines as local vars. add general mechanism in extents code for signalling extent changes. fix numerous problems with the case-table implementation; yoshiki never properly transferred many algorithms from old-style to new-style case tables. redo char tables to support a default argument, so that mapping only occurs over changed args. change many chartab functions to accept Lisp_Object instead of Lisp_Char_Table *. comment out the code in font-lock.c by default, because font-lock.el no longer uses it. we should consider eliminating it entirely. Don't output bell as ^G in console-stream when not a TTY. add -mswindows-termination-handle to interface with i.c, so we can properly kill a build. add more error-checking to buffer/string macros. add some additional buffer_or_string_() funs. -- INTERFACE CHANGES AFFECTING MORE CODE: switch the arguments of write_c_string and friends to be consistent with write_fmt_string, which must have printcharfun first. change BI_* macros to BYTE_* for increased clarity; similarly for bi_* local vars. change VOID_TO_LISP to be a one-argument function. eliminate no-longer-needed CVOID_TO_LISP. -- char/string macro changes: rename MAKE_CHAR() to make_emchar() for slightly less confusion with make_char(). (The former generates an Emchar, the latter a Lisp object. Conceivably we should rename make_char() -> wrap_char() and similarly for make_int(), make_float().) Similar changes for other *CHAR* macros -- we now consistently use names with `emchar' whenever we are working with Emchars. Any remaining name with just `char' always refers to a Lisp object. rename macros with XSTRING_* to string_* except for those that reference actual fields in the Lisp_String object, following conventions used elsewhere. rename set_string_{data,length} macros (the only ones to work with a Lisp_String_* instead of a Lisp_Object) to set_lispstringp_* to make the difference clear. try to be consistent about caps vs. lowercase in macro/inline-fun names for chars and such, which wasn't the case before. we now reserve caps either for XFOO_ macros that reference object fields (e.g. XSTRING_DATA) or for things that have non-function semantics, e.g. directly modifying an arg (BREAKUP_EMCHAR) or evaluating an arg (any arg) more than once. otherwise, use lowercase. here is a summary of most of the macros/inline funs changed by all of the above changes: BYTE_*_P -> byte_*_p XSTRING_BYTE -> string_byte set_string_data/length -> set_lispstringp_data/length XSTRING_CHAR_LENGTH -> string_char_length XSTRING_CHAR -> string_emchar INTBYTE_FIRST_BYTE_P -> intbyte_first_byte_p INTBYTE_LEADING_BYTE_P -> intbyte_leading_byte_p charptr_copy_char -> charptr_copy_emchar LEADING_BYTE_* -> leading_byte_* CHAR_* -> EMCHAR_* *_CHAR_* -> *_EMCHAR_* *_CHAR -> *_EMCHAR CHARSET_BY_ -> charset_by_* BYTE_SHIFT_JIS* -> byte_shift_jis* BYTE_BIG5* -> byte_big5* REP_BYTES_BY_FIRST_BYTE -> rep_bytes_by_first_byte char_to_unicode -> emchar_to_unicode valid_char_p -> valid_emchar_p Change intbyte_strcmp -> qxestrcmp_c (duplicated functionality). -- INTERFACE CHANGES AFFECTING LESS CODE: use DECLARE_INLINE_HEADER in various places. remove '#ifdef emacs' from XEmacs-only files. eliminate CHAR_TABLE_VALUE(), which duplicated the functionality of get_char_table(). add BUFFER_TEXT_LOOP to simplify iterations over buffer text. define typedefs for signed and unsigned types of fixed sizes (INT_32_BIT, UINT_32_BIT, etc.). create ALIGN_FOR_TYPE as a higher-level interface onto ALIGN_SIZE; fix code to use it. add charptr_emchar_len to return the text length of the character pointed to by a ptr; use it in place of charcount_to_bytecount(..., 1). add emchar_len to return the text length of a given character. add types Bytexpos and Charxpos to generalize Bytebpos/Bytecount and Charbpos/Charcount, in code (particularly, the extents code and redisplay code) that works with either kind of index. rename redisplay struct params with names such as `charbpos' to e.g. `charpos' when they are e.g. a Charxpos, not a Charbpos. eliminate xxDEFUN in place of DEFUN; no longer necessary with changes awhile back to doc.c. split up big ugly combined list of EXFUNs in lisp.h on a file-by-file basis, since other prototypes are similarly split. rewrite some "*_UNSAFE" macros as inline funs and eliminate the _UNSAFE suffix. move most string code from lisp.h to text.h; the string code and text.h code is now intertwined in such a fashion that they need to be in the same place and partially interleaved. (you can't create forward references for inline funs) automated/lisp-tests.el, automated/symbol-tests.el, automated/test-harness.el: Fix test harness to output FAIL messages to stderr when in batch mode. Fix up some problems in lisp-tests/symbol-tests that were causing spurious failures.
author ben
date Sun, 05 May 2002 11:33:57 +0000
parents a634e3b7acc8
children 44478bd99873
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1 /* Buffer manipulation primitives for XEmacs.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2 Copyright (C) 1995 Sun Microsystems, Inc.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3 Copyright (C) 1995, 1996, 2000, 2001, 2002 Ben Wing.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
4 Copyright (C) 1999 Martin Buchholz.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
5
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
6 This file is part of XEmacs.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
7
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
8 XEmacs is free software; you can redistribute it and/or modify it
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
9 under the terms of the GNU General Public License as published by the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
10 Free Software Foundation; either version 2, or (at your option) any
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
11 later version.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
12
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
13 XEmacs is distributed in the hope that it will be useful, but WITHOUT
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
14 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
15 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
16 for more details.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
17
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
18 You should have received a copy of the GNU General Public License
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
19 along with XEmacs; see the file COPYING. If not, write to
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
20 the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
21 Boston, MA 02111-1307, USA. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
22
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
23 /* Synched up with: Not in FSF. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
24
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
25 /* Authorship:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
26 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
27
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
28 #include <config.h>
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
29 #include "lisp.h"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
30
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
31 #include "buffer.h"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
32 #include "charset.h"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
33 #include "file-coding.h"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
34 #include "lstream.h"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
35
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
36
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
37 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
38 /* long comments */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
39 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
40
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
41 /*
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
42 ==========================================================================
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
43 1. Character Sets
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
44 ==========================================================================
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
45
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
46 A character set (or "charset") is an ordered set of characters.
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
47
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
48 A character (which is, BTW, a surprisingly complex concept) is, in a
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
49 written representation of text, the most basic written unit that has a
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
50 meaning of its own. It's comparable to a phoneme when analyzing words
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
51 in spoken speech. Just like with a phoneme (which is an abstract
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
52 concept, and is represented in actual spoken speech by one or more
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
53 allophones, ...&&#### finish this., a character is actually an abstract
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
54 concept
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
55
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
56 A particular character in a charset is indexed using one or
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
57 more "position codes", which are non-negative integers.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
58 The number of position codes needed to identify a particular
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
59 character in a charset is called the "dimension" of the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
60 charset. In XEmacs/Mule, all charsets have 1 or 2 dimensions,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
61 and the size of all charsets (except for a few special cases)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
62 is either 94, 96, 94 by 94, or 96 by 96. The range of
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
63 position codes used to index characters from any of these
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
64 types of character sets is as follows:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
65
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
66 Charset type Position code 1 Position code 2
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
67 ------------------------------------------------------------
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
68 94 33 - 126 N/A
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
69 96 32 - 127 N/A
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
70 94x94 33 - 126 33 - 126
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
71 96x96 32 - 127 32 - 127
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
72
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
73 Note that in the above cases position codes do not start at
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
74 an expected value such as 0 or 1. The reason for this will
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
75 become clear later.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
76
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
77 For example, Latin-1 is a 96-character charset, and JISX0208
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
78 (the Japanese national character set) is a 94x94-character
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
79 charset.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
80
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
81 [Note that, although the ranges above define the *valid*
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
82 position codes for a charset, some of the slots in a particular
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
83 charset may in fact be empty. This is the case for JISX0208,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
84 for example, where (e.g.) all the slots whose first
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
85 position code is in the range 118 - 127 are empty.]
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
86
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
87 There are three charsets that do not follow the above rules.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
88 All of them have one dimension, and have ranges of position
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
89 codes as follows:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
90
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
91 Charset name Position code 1
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
92 ------------------------------------
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
93 ASCII 0 - 127
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
94 Control-1 0 - 31
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
95 Composite 0 - some large number
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
96
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
97 (The upper bound of the position code for composite characters
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
98 has not yet been determined, but it will probably be at
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
99 least 16,383).
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
100
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
101 ASCII is the union of two subsidiary character sets:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
102 Printing-ASCII (the printing ASCII character set,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
103 consisting of position codes 33 - 126, like for a standard
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
104 94-character charset) and Control-ASCII (the non-printing
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
105 characters that would appear in a binary file with codes 0
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
106 - 32 and 127).
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
107
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
108 Control-1 contains the non-printing characters that would
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
109 appear in a binary file with codes 128 - 159.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
110
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
111 Composite contains characters that are generated by
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
112 overstriking one or more characters from other charsets.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
113
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
114 Note that some characters in ASCII, and all characters
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
115 in Control-1, are "control" (non-printing) characters.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
116 These have no printed representation but instead control
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
117 some other function of the printing (e.g. TAB or 8 moves
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
118 the current character position to the next tab stop).
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
119 All other characters in all charsets are "graphic"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
120 (printing) characters.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
121
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
122 When a binary file is read in, the bytes in the file are
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
123 assigned to character sets as follows:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
124
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
125 Bytes Character set Range
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
126 --------------------------------------------------
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
127 0 - 127 ASCII 0 - 127
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
128 128 - 159 Control-1 0 - 31
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
129 160 - 255 Latin-1 32 - 127
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
130
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
131 This is a bit ad-hoc but gets the job done.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
132
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
133 ==========================================================================
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
134 2. Encodings
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
135 ==========================================================================
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
136
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
137 An "encoding" is a way of numerically representing
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
138 characters from one or more character sets. If an encoding
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
139 only encompasses one character set, then the position codes
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
140 for the characters in that character set could be used
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
141 directly. This is not possible, however, if more than one
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
142 character set is to be used in the encoding.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
143
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
144 For example, the conversion detailed above between bytes in
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
145 a binary file and characters is effectively an encoding
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
146 that encompasses the three character sets ASCII, Control-1,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
147 and Latin-1 in a stream of 8-bit bytes.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
148
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
149 Thus, an encoding can be viewed as a way of encoding
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
150 characters from a specified group of character sets using a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
151 stream of bytes, each of which contains a fixed number of
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
152 bits (but not necessarily 8, as in the common usage of
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
153 "byte").
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
154
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
155 Here are descriptions of a couple of common
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
156 encodings:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
157
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
158
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
159 A. Japanese EUC (Extended Unix Code)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
160
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
161 This encompasses the character sets:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
162 - Printing-ASCII,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
163 - Katakana-JISX0201 (half-width katakana, the right half of JISX0201).
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
164 - Japanese-JISX0208
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
165 - Japanese-JISX0212
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
166 It uses 8-bit bytes.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
167
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
168 Note that Printing-ASCII and Katakana-JISX0201 are 94-character
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
169 charsets, while Japanese-JISX0208 is a 94x94-character charset.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
170
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
171 The encoding is as follows:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
172
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
173 Character set Representation (PC == position-code)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
174 ------------- --------------
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
175 Printing-ASCII PC1
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
176 Japanese-JISX0208 PC1 + 0x80 | PC2 + 0x80
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
177 Katakana-JISX0201 0x8E | PC1 + 0x80
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
178
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
179
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
180 B. JIS7
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
181
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
182 This encompasses the character sets:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
183 - Printing-ASCII
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
184 - Latin-JISX0201 (the left half of JISX0201; this character set is
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
185 very similar to Printing-ASCII and is a 94-character charset)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
186 - Japanese-JISX0208
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
187 - Katakana-JISX0201
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
188 It uses 7-bit bytes.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
189
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
190 Unlike Japanese EUC, this is a "modal" encoding, which
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
191 means that there are multiple states that the encoding can
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
192 be in, which affect how the bytes are to be interpreted.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
193 Special sequences of bytes (called "escape sequences")
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
194 are used to change states.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
195
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
196 The encoding is as follows:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
197
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
198 Character set Representation
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
199 ------------- --------------
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
200 Printing-ASCII PC1
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
201 Latin-JISX0201 PC1
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
202 Katakana-JISX0201 PC1
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
203 Japanese-JISX0208 PC1 | PC2
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
204
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
205 Escape sequence ASCII equivalent Meaning
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
206 --------------- ---------------- -------
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
207 0x1B 0x28 0x42 ESC ( B invoke Printing-ASCII
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
208 0x1B 0x28 0x4A ESC ( J invoke Latin-JISX0201
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
209 0x1B 0x28 0x49 ESC ( I invoke Katakana-JISX0201
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
210 0x1B 0x24 0x42 ESC $ B invoke Japanese-JISX0208
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
211
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
212 Initially, Printing-ASCII is invoked.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
213
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
214 ==========================================================================
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
215 3. Internal Mule Encodings
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
216 ==========================================================================
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
217
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
218 In XEmacs/Mule, each character set is assigned a unique number,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
219 called a "leading byte". This is used in the encodings of a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
220 character. Leading bytes are in the range 0x80 - 0xFF
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
221 (except for ASCII, which has a leading byte of 0), although
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
222 some leading bytes are reserved.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
223
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
224 Charsets whose leading byte is in the range 0x80 - 0x9F are
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
225 called "official" and are used for built-in charsets.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
226 Other charsets are called "private" and have leading bytes
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
227 in the range 0xA0 - 0xFF; these are user-defined charsets.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
228
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
229 More specifically:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
230
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
231 Character set Leading byte
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
232 ------------- ------------
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
233 ASCII 0 (0x7F in arrays indexed by leading byte)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
234 Composite 0x8D
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
235 Dimension-1 Official 0x80 - 0x8C/0x8D
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
236 (0x8E is free)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
237 Control 0x8F
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
238 Dimension-2 Official 0x90 - 0x99
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
239 (0x9A - 0x9D are free)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
240 Dimension-1 Private Marker 0x9E
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
241 Dimension-2 Private Marker 0x9F
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
242 Dimension-1 Private 0xA0 - 0xEF
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
243 Dimension-2 Private 0xF0 - 0xFF
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
244
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
245 There are two internal encodings for characters in XEmacs/Mule.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
246 One is called "string encoding" and is an 8-bit encoding that
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
247 is used for representing characters in a buffer or string.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
248 It uses 1 to 4 bytes per character. The other is called
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
249 "character encoding" and is a 19-bit encoding that is used
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
250 for representing characters individually in a variable.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
251
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
252 (In the following descriptions, we'll ignore composite
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
253 characters for the moment. We also give a general (structural)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
254 overview first, followed later by the exact details.)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
255
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
256 A. Internal String Encoding
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
257
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
258 ASCII characters are encoded using their position code directly.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
259 Other characters are encoded using their leading byte followed
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
260 by their position code(s) with the high bit set. Characters
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
261 in private character sets have their leading byte prefixed with
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
262 a "leading byte prefix", which is either 0x9E or 0x9F. (No
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
263 character sets are ever assigned these leading bytes.) Specifically:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
264
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
265 Character set Encoding (PC == position-code)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
266 ------------- -------- (LB == leading-byte)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
267 ASCII PC1 |
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
268 Control-1 LB | PC1 + 0xA0
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
269 Dimension-1 official LB | PC1 + 0x80
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
270 Dimension-1 private 0x9E | LB | PC1 + 0x80
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
271 Dimension-2 official LB | PC1 | PC2 + 0x80
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
272 Dimension-2 private 0x9F | LB | PC1 + 0x80 | PC2 + 0x80
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
273
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
274 The basic characteristic of this encoding is that the first byte
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
275 of all characters is in the range 0x00 - 0x9F, and the second and
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
276 following bytes of all characters is in the range 0xA0 - 0xFF.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
277 This means that it is impossible to get out of sync, or more
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
278 specifically:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
279
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
280 1. Given any byte position, the beginning of the character it is
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
281 within can be determined in constant time.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
282 2. Given any byte position at the beginning of a character, the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
283 beginning of the next character can be determined in constant
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
284 time.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
285 3. Given any byte position at the beginning of a character, the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
286 beginning of the previous character can be determined in constant
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
287 time.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
288 4. Textual searches can simply treat encoded strings as if they
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
289 were encoded in a one-byte-per-character fashion rather than
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
290 the actual multi-byte encoding.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
291
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
292 None of the standard non-modal encodings meet all of these
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
293 conditions. For example, EUC satisfies only (2) and (3), while
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
294 Shift-JIS and Big5 (not yet described) satisfy only (2). (All
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
295 non-modal encodings must satisfy (2), in order to be unambiguous.)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
296
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
297 B. Internal Character Encoding
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
298
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
299 One 19-bit word represents a single character. The word is
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
300 separated into three fields:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
301
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
302 Bit number: 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
303 <------------> <------------------> <------------------>
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
304 Field: 1 2 3
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
305
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
306 Note that fields 2 and 3 hold 7 bits each, while field 1 holds 5 bits.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
307
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
308 Character set Field 1 Field 2 Field 3
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
309 ------------- ------- ------- -------
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
310 ASCII 0 0 PC1
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
311 range: (00 - 7F)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
312 Control-1 0 1 PC1
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
313 range: (00 - 1F)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
314 Dimension-1 official 0 LB - 0x7F PC1
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
315 range: (01 - 0D) (20 - 7F)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
316 Dimension-1 private 0 LB - 0x80 PC1
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
317 range: (20 - 6F) (20 - 7F)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
318 Dimension-2 official LB - 0x8F PC1 PC2
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
319 range: (01 - 0A) (20 - 7F) (20 - 7F)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
320 Dimension-2 private LB - 0xE1 PC1 PC2
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
321 range: (0F - 1E) (20 - 7F) (20 - 7F)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
322 Composite 0x1F ? ?
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
323
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
324 Note that character codes 0 - 255 are the same as the "binary encoding"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
325 described above.
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
326
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
327 Most of the code in XEmacs knows nothing of the representation of a
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
328 character other than that values 0 - 255 represent ASCII, Control 1,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
329 and Latin 1.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
330
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
331 WARNING WARNING WARNING: The Boyer-Moore code in search.c, and the
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
332 code in search_buffer() that determines whether that code can be used,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
333 knows that "field 3" in a character always corresponds to the last
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
334 byte in the textual representation of the character. (This is important
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
335 because the Boyer-Moore algorithm works by looking at the last byte
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
336 of the search string and &&#### finish this.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
337
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
338 ==========================================================================
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
339 4. Buffer Positions and Other Typedefs
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
340 ==========================================================================
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
341
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
342 A. Buffer Positions
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
343
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
344 There are three possible ways to specify positions in a buffer. All
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
345 of these are one-based: the beginning of the buffer is position or
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
346 index 1, and 0 is not a valid position.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
347
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
348 As a "buffer position" (typedef Charbpos):
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
349
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
350 This is an index specifying an offset in characters from the
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
351 beginning of the buffer. Note that buffer positions are
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
352 logically *between* characters, not on a character. The
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
353 difference between two buffer positions specifies the number of
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
354 characters between those positions. Buffer positions are the
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
355 only kind of position externally visible to the user.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
356
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
357 As a "byte index" (typedef Bytebpos):
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
358
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
359 This is an index over the bytes used to represent the characters
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
360 in the buffer. If there is no Mule support, this is identical
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
361 to a buffer position, because each character is represented
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
362 using one byte. However, with Mule support, many characters
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
363 require two or more bytes for their representation, and so a
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
364 byte index may be greater than the corresponding buffer
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
365 position.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
366
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
367 As a "memory index" (typedef Membpos):
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
368
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
369 This is the byte index adjusted for the gap. For positions
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
370 before the gap, this is identical to the byte index. For
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
371 positions after the gap, this is the byte index plus the gap
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
372 size. There are two possible memory indices for the gap
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
373 position; the memory index at the beginning of the gap should
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
374 always be used, except in code that deals with manipulating the
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
375 gap, where both indices may be seen. The address of the
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
376 character "at" (i.e. following) a particular position can be
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
377 obtained from the formula
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
378
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
379 buffer_start_address + memory_index(position) - 1
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
380
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
381 except in the case of characters at the gap position.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
382
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
383 B. Other Typedefs
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
384
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
385 Emchar:
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
386 -------
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
387 This typedef represents a single Emacs character, which can be
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
388 ASCII, ISO-8859, or some extended character, as would typically
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
389 be used for Kanji. Note that the representation of a character
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
390 as an Emchar is *not* the same as the representation of that
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
391 same character in a string; thus, you cannot do the standard
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
392 C trick of passing a pointer to a character to a function that
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
393 expects a string.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
394
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
395 An Emchar takes up 19 bits of representation and (for code
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
396 compatibility and such) is compatible with an int. This
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
397 representation is visible on the Lisp level. The important
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
398 characteristics of the Emchar representation are
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
399
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
400 -- values 0x00 - 0x7f represent ASCII.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
401 -- values 0x80 - 0xff represent the right half of ISO-8859-1.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
402 -- values 0x100 and up represent all other characters.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
403
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
404 This means that Emchar values are upwardly compatible with
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
405 the standard 8-bit representation of ASCII/ISO-8859-1.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
406
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
407 Intbyte:
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
408 --------
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
409 The data in a buffer or string is logically made up of Intbyte
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
410 objects, where a Intbyte takes up the same amount of space as a
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
411 char. (It is declared differently, though, to catch invalid
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
412 usages.) Strings stored using Intbytes are said to be in
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
413 "internal format". The important characteristics of internal
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
414 format are
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
415
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
416 -- ASCII characters are represented as a single Intbyte,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
417 in the range 0 - 0x7f.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
418 -- All other characters are represented as a Intbyte in
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
419 the range 0x80 - 0x9f followed by one or more Intbytes
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
420 in the range 0xa0 to 0xff.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
421
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
422 This leads to a number of desirable properties:
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
423
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
424 -- Given the position of the beginning of a character,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
425 you can find the beginning of the next or previous
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
426 character in constant time.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
427 -- When searching for a substring or an ASCII character
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
428 within the string, you need merely use standard
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
429 searching routines.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
430
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
431 array of char:
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
432 --------------
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
433 Strings that go in or out of Emacs are in "external format",
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
434 typedef'ed as an array of char or a char *. There is more
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
435 than one external format (JIS, EUC, etc.) but they all
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
436 have similar properties. They are modal encodings,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
437 which is to say that the meaning of particular bytes is
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
438 not fixed but depends on what "mode" the string is currently
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
439 in (e.g. bytes in the range 0 - 0x7f might be
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
440 interpreted as ASCII, or as Hiragana, or as 2-byte Kanji,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
441 depending on the current mode). The mode starts out in
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
442 ASCII/ISO-8859-1 and is switched using escape sequences --
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
443 for example, in the JIS encoding, 'ESC $ B' switches to a
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
444 mode where pairs of bytes in the range 0 - 0x7f
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
445 are interpreted as Kanji characters.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
446
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
447 External-formatted data is generally desirable for passing
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
448 data between programs because it is upwardly compatible
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
449 with standard ASCII/ISO-8859-1 strings and may require
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
450 less space than internal encodings such as the one
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
451 described above. In addition, some encodings (e.g. JIS)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
452 keep all characters (except the ESC used to switch modes)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
453 in the printing ASCII range 0x20 - 0x7e, which results in
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
454 a much higher probability that the data will avoid being
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
455 garbled in transmission. Externally-formatted data is
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
456 generally not very convenient to work with, however, and
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
457 for this reason is usually converted to internal format
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
458 before any work is done on the string.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
459
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
460 NOTE: filenames need to be in external format so that
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
461 ISO-8859-1 characters come out correctly.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
462
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
463 Charcount:
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
464 ----------
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
465 This typedef represents a count of characters, such as
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
466 a character offset into a string or the number of
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
467 characters between two positions in a buffer. The
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
468 difference between two Charbpos's is a Charcount, and
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
469 character positions in a string are represented using
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
470 a Charcount.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
471
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
472 Bytecount:
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
473 ----------
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
474 Similar to a Charcount but represents a count of bytes.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
475 The difference between two Bytebpos's is a Bytecount.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
476
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
477
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
478 C. Usage of the Various Representations
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
479
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
480 Memory indices are used in low-level functions in insdel.c and for
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
481 extent endpoints and marker positions. The reason for this is that
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
482 this way, the extents and markers don't need to be updated for most
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
483 insertions, which merely shrink the gap and don't move any
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
484 characters around in memory.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
485
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
486 (The beginning-of-gap memory index simplifies insertions w.r.t.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
487 markers, because text usually gets inserted after markers. For
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
488 extents, it is merely for consistency, because text can get
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
489 inserted either before or after an extent's endpoint depending on
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
490 the open/closedness of the endpoint.)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
491
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
492 Byte indices are used in other code that needs to be fast,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
493 such as the searching, redisplay, and extent-manipulation code.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
494
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
495 Buffer positions are used in all other code. This is because this
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
496 representation is easiest to work with (especially since Lisp
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
497 code always uses buffer positions), necessitates the fewest
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
498 changes to existing code, and is the safest (e.g. if the text gets
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
499 shifted underneath a buffer position, it will still point to a
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
500 character; if text is shifted under a byte index, it might point
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
501 to the middle of a character, which would be bad).
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
502
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
503 Similarly, Charcounts are used in all code that deals with strings
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
504 except for code that needs to be fast, which used Bytecounts.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
505
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
506 Strings are always passed around internally using internal format.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
507 Conversions between external format are performed at the time
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
508 that the data goes in or out of Emacs.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
509
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
510 D. Working With the Various Representations
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
511
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
512 We write things this way because it's very important the
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
513 MAX_BYTEBPOS_GAP_SIZE_3 is a multiple of 3. (As it happens,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
514 65535 is a multiple of 3, but this may not always be the
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
515 case. #### unfinished
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
516
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
517 ==========================================================================
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
518 5. Miscellaneous
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
519 ==========================================================================
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
520
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
521 A. Unicode Support
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
522
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
523 Adding Unicode support is very desirable. Unicode will likely be a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
524 very common representation in the future, and thus we should
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
525 represent Unicode characters using three bytes instead of four.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
526 This means we need to find leading bytes for Unicode. Given that
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
527 there are 65,536 characters in Unicode and we can attach 96x96 =
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
528 9,216 characters per leading byte, we need eight leading bytes for
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
529 Unicode. We currently have four free (0x9A - 0x9D), and with a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
530 little bit of rearranging we can get five: ASCII doesn't really
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
531 need to take up a leading byte. (We could just as well use 0x7F,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
532 with a little change to the functions that assume that 0x80 is the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
533 lowest leading byte.) This means we still need to dump three
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
534 leading bytes and move them into private space. The CNS charsets
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
535 are good candidates since they are rarely used, and
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
536 JAPANESE_JISX0208_1978 is becoming less and less used and could
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
537 also be dumped.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
538
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
539 B. Composite Characters
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
540
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
541 Composite characters are characters constructed by overstriking two
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
542 or more regular characters.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
543
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
544 1) The old Mule implementation involves storing composite characters
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
545 in a buffer as a tag followed by all of the actual characters
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
546 used to make up the composite character. I think this is a bad
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
547 idea; it greatly complicates code that wants to handle strings
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
548 one character at a time because it has to deal with the possibility
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
549 of great big ungainly characters. It's much more reasonable to
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
550 simply store an index into a table of composite characters.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
551
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
552 2) The current implementation only allows for 16,384 separate
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
553 composite characters over the lifetime of the XEmacs process.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
554 This could become a potential problem if the user
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
555 edited lots of different files that use composite characters.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
556 Due to FSF bogosity, increasing the number of allowable
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
557 composite characters under Mule would decrease the number
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
558 of possible faces that can exist. Mule already has shrunk
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
559 this to 2048, and further shrinkage would become uncomfortable.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
560 No such problems exist in XEmacs.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
561
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
562 Composite characters could be represented as 0x8D C1 C2 C3,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
563 where each C[1-3] is in the range 0xA0 - 0xFF. This allows
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
564 for slightly under 2^20 (one million) composite characters
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
565 over the XEmacs process lifetime, and you only need to
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
566 increase the size of a Mule character from 19 to 21 bits.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
567 Or you could use 0x8D C1 C2 C3 C4, allowing for about
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
568 85 million (slightly over 2^26) composite characters.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
569
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
570 */
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
571
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
572
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
573 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
574 /* declarations */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
575 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
576
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
577 Eistring the_eistring_zero_init, the_eistring_malloc_zero_init;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
578
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
579 #define MAX_CHARBPOS_GAP_SIZE_3 (65535/3)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
580 #define MAX_BYTEBPOS_GAP_SIZE_3 (3 * MAX_CHARBPOS_GAP_SIZE_3)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
581
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
582 short three_to_one_table[1 + MAX_BYTEBPOS_GAP_SIZE_3];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
583
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
584 #ifdef MULE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
585
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
586 /* Table of number of bytes in the string representation of a character
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
587 indexed by the first byte of that representation.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
588
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
589 rep_bytes_by_first_byte(c) is more efficient than the equivalent
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
590 canonical computation:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
591
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
592 XCHARSET_REP_BYTES (charset_by_leading_byte (c)) */
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
593
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
594 const Bytecount rep_bytes_by_first_byte[0xA0] =
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
595 { /* 0x00 - 0x7f are for straight ASCII */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
596 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
597 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
598 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
599 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
600 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
601 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
602 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
603 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
604 /* 0x80 - 0x8f are for Dimension-1 official charsets */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
605 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
606 /* 0x90 - 0x9d are for Dimension-2 official charsets */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
607 /* 0x9e is for Dimension-1 private charsets */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
608 /* 0x9f is for Dimension-2 private charsets */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
609 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
610 };
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
611
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
612 #ifdef ENABLE_COMPOSITE_CHARS
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
613
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
614 /* Hash tables for composite chars. One maps string representing
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
615 composed chars to their equivalent chars; one goes the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
616 other way. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
617 Lisp_Object Vcomposite_char_char2string_hash_table;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
618 Lisp_Object Vcomposite_char_string2char_hash_table;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
619
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
620 static int composite_char_row_next;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
621 static int composite_char_col_next;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
622
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
623 #endif /* ENABLE_COMPOSITE_CHARS */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
624
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
625 #endif /* MULE */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
626
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
627
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
628 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
629 /* qxestr***() functions */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
630 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
631
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
632 /* Most are inline functions in lisp.h */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
633
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
634 int
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
635 qxesprintf (Intbyte *buffer, const CIntbyte *format, ...)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
636 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
637 va_list args;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
638 int retval;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
639
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
640 va_start (args, format);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
641 retval = vsprintf ((char *) buffer, format, args);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
642 va_end (args);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
643
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
644 return retval;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
645 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
646
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
647 /* strcasecmp() implementation from BSD */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
648 static Intbyte strcasecmp_charmap[] = {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
649 '\000', '\001', '\002', '\003', '\004', '\005', '\006', '\007',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
650 '\010', '\011', '\012', '\013', '\014', '\015', '\016', '\017',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
651 '\020', '\021', '\022', '\023', '\024', '\025', '\026', '\027',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
652 '\030', '\031', '\032', '\033', '\034', '\035', '\036', '\037',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
653 '\040', '\041', '\042', '\043', '\044', '\045', '\046', '\047',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
654 '\050', '\051', '\052', '\053', '\054', '\055', '\056', '\057',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
655 '\060', '\061', '\062', '\063', '\064', '\065', '\066', '\067',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
656 '\070', '\071', '\072', '\073', '\074', '\075', '\076', '\077',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
657 '\100', '\141', '\142', '\143', '\144', '\145', '\146', '\147',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
658 '\150', '\151', '\152', '\153', '\154', '\155', '\156', '\157',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
659 '\160', '\161', '\162', '\163', '\164', '\165', '\166', '\167',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
660 '\170', '\171', '\172', '\133', '\134', '\135', '\136', '\137',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
661 '\140', '\141', '\142', '\143', '\144', '\145', '\146', '\147',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
662 '\150', '\151', '\152', '\153', '\154', '\155', '\156', '\157',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
663 '\160', '\161', '\162', '\163', '\164', '\165', '\166', '\167',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
664 '\170', '\171', '\172', '\173', '\174', '\175', '\176', '\177',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
665 '\200', '\201', '\202', '\203', '\204', '\205', '\206', '\207',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
666 '\210', '\211', '\212', '\213', '\214', '\215', '\216', '\217',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
667 '\220', '\221', '\222', '\223', '\224', '\225', '\226', '\227',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
668 '\230', '\231', '\232', '\233', '\234', '\235', '\236', '\237',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
669 '\240', '\241', '\242', '\243', '\244', '\245', '\246', '\247',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
670 '\250', '\251', '\252', '\253', '\254', '\255', '\256', '\257',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
671 '\260', '\261', '\262', '\263', '\264', '\265', '\266', '\267',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
672 '\270', '\271', '\272', '\273', '\274', '\275', '\276', '\277',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
673 '\300', '\301', '\302', '\303', '\304', '\305', '\306', '\307',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
674 '\310', '\311', '\312', '\313', '\314', '\315', '\316', '\317',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
675 '\320', '\321', '\322', '\323', '\324', '\325', '\326', '\327',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
676 '\330', '\331', '\332', '\333', '\334', '\335', '\336', '\337',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
677 '\340', '\341', '\342', '\343', '\344', '\345', '\346', '\347',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
678 '\350', '\351', '\352', '\353', '\354', '\355', '\356', '\357',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
679 '\360', '\361', '\362', '\363', '\364', '\365', '\366', '\367',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
680 '\370', '\371', '\372', '\373', '\374', '\375', '\376', '\377',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
681 };
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
682
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
683 /* A version that works like generic strcasecmp() -- only collapsing
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
684 case in ASCII A-Z/a-z. This is safe on Mule strings due to the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
685 current representation.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
686
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
687 This version was written by some Berkeley coder, favoring
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
688 nanosecond improvements over clarity. In all other versions below,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
689 we use symmetrical algorithms that may sacrifice a few machine
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
690 cycles but are MUCH MUCH clearer, which counts a lot more.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
691 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
692
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
693 int
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
694 qxestrcasecmp (const Intbyte *s1, const Intbyte *s2)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
695 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
696 Intbyte *cm = strcasecmp_charmap;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
697
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
698 while (cm[*s1] == cm[*s2++])
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
699 if (*s1++ == '\0')
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
700 return (0);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
701
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
702 return (cm[*s1] - cm[*--s2]);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
703 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
704
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
705 int
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
706 ascii_strcasecmp (const Char_ASCII *s1, const Char_ASCII *s2)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
707 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
708 return qxestrcasecmp ((const Intbyte *) s1, (const Intbyte *) s2);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
709 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
710
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
711 int
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
712 qxestrcasecmp_c (const Intbyte *s1, const Char_ASCII *s2)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
713 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
714 return qxestrcasecmp (s1, (const Intbyte *) s2);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
715 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
716
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
717 /* An internationalized version that collapses case in a general fashion.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
718 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
719
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
720 int
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
721 qxestrcasecmp_i18n (const Intbyte *s1, const Intbyte *s2)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
722 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
723 while (*s1 && *s2)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
724 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
725 if (DOWNCASE (0, charptr_emchar (s1)) !=
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
726 DOWNCASE (0, charptr_emchar (s2)))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
727 break;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
728 INC_CHARPTR (s1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
729 INC_CHARPTR (s2);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
730 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
731
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
732 return (DOWNCASE (0, charptr_emchar (s1)) -
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
733 DOWNCASE (0, charptr_emchar (s2)));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
734 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
735
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
736 /* The only difference between these next two and
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
737 qxememcasecmp()/qxememcasecmp_i18n() is that these two will stop if
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
738 both strings are equal and less than LEN in length, while
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
739 the mem...() versions would would run off the end. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
740
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
741 int
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
742 qxestrncasecmp (const Intbyte *s1, const Intbyte *s2, Bytecount len)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
743 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
744 Intbyte *cm = strcasecmp_charmap;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
745
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
746 while (len--)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
747 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
748 int diff = cm[*s1] - cm[*s2];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
749 if (diff != 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
750 return diff;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
751 if (!*s1)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
752 return 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
753 s1++, s2++;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
754 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
755
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
756 return 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
757 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
758
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
759 int
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
760 ascii_strncasecmp (const Char_ASCII *s1, const Char_ASCII *s2, Bytecount len)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
761 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
762 return qxestrncasecmp ((const Intbyte *) s1, (const Intbyte *) s2, len);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
763 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
764
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
765 int
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
766 qxestrncasecmp_c (const Intbyte *s1, const Char_ASCII *s2, Bytecount len)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
767 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
768 return qxestrncasecmp (s1, (const Intbyte *) s2, len);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
769 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
770
801
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
771 /* Compare LEN_FROM_S1 worth of characters from S1 with the same number of
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
772 characters from S2, case insensitive. NOTE: Downcasing can convert
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
773 characters from one length in bytes to another, so reversing S1 and S2
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
774 is *NOT* a symmetric operations! You must choose a length that agrees
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
775 with S1. */
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
776
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
777 int
801
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
778 qxestrncasecmp_i18n (const Intbyte *s1, const Intbyte *s2,
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
779 Bytecount len_from_s1)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
780 {
801
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
781 while (len_from_s1 > 0)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
782 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
783 const Intbyte *old_s1 = s1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
784 int diff = (DOWNCASE (0, charptr_emchar (s1)) -
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
785 DOWNCASE (0, charptr_emchar (s2)));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
786 if (diff != 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
787 return diff;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
788 if (!*s1)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
789 return 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
790 INC_CHARPTR (s1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
791 INC_CHARPTR (s2);
801
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
792 len_from_s1 -= s1 - old_s1;
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
793 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
794
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
795 return 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
796 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
797
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
798 int
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
799 qxememcmp (const Intbyte *s1, const Intbyte *s2, Bytecount len)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
800 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
801 return memcmp (s1, s2, len);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
802 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
803
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
804 int
801
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
805 qxememcmp4 (const Intbyte *s1, Bytecount len1,
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
806 const Intbyte *s2, Bytecount len2)
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
807 {
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
808 int retval = qxememcmp (s1, s2, min (len1, len2));
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
809 if (retval)
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
810 return retval;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
811 return len1 - len2;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
812 }
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
813
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
814 int
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
815 qxememcasecmp (const Intbyte *s1, const Intbyte *s2, Bytecount len)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
816 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
817 Intbyte *cm = strcasecmp_charmap;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
818
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
819 while (len--)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
820 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
821 int diff = cm[*s1] - cm[*s2];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
822 if (diff != 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
823 return diff;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
824 s1++, s2++;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
825 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
826
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
827 return 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
828 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
829
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
830 int
801
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
831 qxememcasecmp4 (const Intbyte *s1, Bytecount len1,
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
832 const Intbyte *s2, Bytecount len2)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
833 {
801
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
834 int retval = qxememcasecmp (s1, s2, min (len1, len2));
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
835 if (retval)
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
836 return retval;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
837 return len1 - len2;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
838 }
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
839
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
840 /* Do a character-by-character comparison, returning "which is greater" by
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
841 comparing the Emchar values. (#### Should have option to compare Unicode
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
842 points) */
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
843
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
844 int
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
845 qxetextcmp (const Intbyte *s1, Bytecount len1,
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
846 const Intbyte *s2, Bytecount len2)
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
847 {
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
848 while (len1 > 0 && len2 > 0)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
849 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
850 const Intbyte *old_s1 = s1;
801
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
851 const Intbyte *old_s2 = s2;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
852 int diff = charptr_emchar (s1) - charptr_emchar (s2);
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
853 if (diff != 0)
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
854 return diff;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
855 INC_CHARPTR (s1);
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
856 INC_CHARPTR (s2);
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
857 len1 -= s1 - old_s1;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
858 len2 -= s2 - old_s2;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
859 }
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
860
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
861 assert (len1 >= 0 && len2 >= 0);
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
862 return len1 - len2;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
863 }
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
864
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
865 int
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
866 qxetextcmp_matching (const Intbyte *s1, Bytecount len1,
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
867 const Intbyte *s2, Bytecount len2,
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
868 Charcount *matching)
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
869 {
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
870 *matching = 0;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
871 while (len1 > 0 && len2 > 0)
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
872 {
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
873 const Intbyte *old_s1 = s1;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
874 const Intbyte *old_s2 = s2;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
875 int diff = charptr_emchar (s1) - charptr_emchar (s2);
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
876 if (diff != 0)
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
877 return diff;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
878 INC_CHARPTR (s1);
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
879 INC_CHARPTR (s2);
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
880 len1 -= s1 - old_s1;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
881 len2 -= s2 - old_s2;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
882 (*matching)++;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
883 }
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
884
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
885 assert (len1 >= 0 && len2 >= 0);
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
886 return len1 - len2;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
887 }
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
888
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
889 /* Do a character-by-character comparison, returning "which is greater" by
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
890 comparing the Emchar values, case insensitively (by downcasing both
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
891 first). (#### Should have option to compare Unicode points)
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
892
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
893 In this case, both lengths must be specified becaused downcasing can
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
894 convert characters from one length in bytes to another; therefore, two
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
895 blocks of text of different length might be equal. If both compare
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
896 equal up to the limit in length of one but not the other, the longer one
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
897 is "greater". */
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
898
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
899 int
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
900 qxetextcasecmp (const Intbyte *s1, Bytecount len1,
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
901 const Intbyte *s2, Bytecount len2)
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
902 {
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
903 while (len1 > 0 && len2 > 0)
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
904 {
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
905 const Intbyte *old_s1 = s1;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
906 const Intbyte *old_s2 = s2;
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
907 int diff = (DOWNCASE (0, charptr_emchar (s1)) -
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
908 DOWNCASE (0, charptr_emchar (s2)));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
909 if (diff != 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
910 return diff;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
911 INC_CHARPTR (s1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
912 INC_CHARPTR (s2);
801
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
913 len1 -= s1 - old_s1;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
914 len2 -= s2 - old_s2;
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
915 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
916
801
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
917 assert (len1 >= 0 && len2 >= 0);
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
918 return len1 - len2;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
919 }
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
920
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
921 /* Like qxetextcasecmp() but also return number of characters at
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
922 beginning that match. */
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
923
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
924 int
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
925 qxetextcasecmp_matching (const Intbyte *s1, Bytecount len1,
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
926 const Intbyte *s2, Bytecount len2,
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
927 Charcount *matching)
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
928 {
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
929 *matching = 0;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
930 while (len1 > 0 && len2 > 0)
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
931 {
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
932 const Intbyte *old_s1 = s1;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
933 const Intbyte *old_s2 = s2;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
934 int diff = (DOWNCASE (0, charptr_emchar (s1)) -
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
935 DOWNCASE (0, charptr_emchar (s2)));
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
936 if (diff != 0)
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
937 return diff;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
938 INC_CHARPTR (s1);
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
939 INC_CHARPTR (s2);
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
940 len1 -= s1 - old_s1;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
941 len2 -= s2 - old_s2;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
942 (*matching)++;
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
943 }
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
944
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
945 assert (len1 >= 0 && len2 >= 0);
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
946 return len1 - len2;
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
947 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
948
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
949 int
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
950 lisp_strcasecmp (Lisp_Object s1, Lisp_Object s2)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
951 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
952 Intbyte *cm = strcasecmp_charmap;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
953 Intbyte *p1 = XSTRING_DATA (s1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
954 Intbyte *p2 = XSTRING_DATA (s2);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
955 Intbyte *e1 = p1 + XSTRING_LENGTH (s1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
956 Intbyte *e2 = p2 + XSTRING_LENGTH (s2);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
957
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
958 /* again, we use a symmetric algorithm and favor clarity over
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
959 nanosecond improvements. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
960 while (1)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
961 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
962 /* if we reached the end of either string, compare lengths.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
963 do NOT compare the final null byte against anything, in case
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
964 the other string also has a null byte at that position. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
965 if (p1 == e1 || p2 == e2)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
966 return e1 - e2;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
967 if (cm[*p1] != cm[*p2])
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
968 return cm[*p1] - cm[*p2];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
969 p1++, p2++;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
970 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
971 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
972
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
973 int
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
974 lisp_strcasecmp_i18n (Lisp_Object s1, Lisp_Object s2)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
975 {
801
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
976 return qxetextcasecmp (XSTRING_DATA (s1), XSTRING_LENGTH (s1),
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
977 XSTRING_DATA (s2), XSTRING_LENGTH (s2));
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
978 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
979
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
980
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
981 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
982 /* conversion between textual representations */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
983 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
984
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
985 /* NOTE: Does not reset the Dynarr. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
986
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
987 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
988 convert_intbyte_string_into_emchar_dynarr (const Intbyte *str, Bytecount len,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
989 Emchar_dynarr *dyn)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
990 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
991 const Intbyte *strend = str + len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
992
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
993 while (str < strend)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
994 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
995 Emchar ch = charptr_emchar (str);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
996 Dynarr_add (dyn, ch);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
997 INC_CHARPTR (str);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
998 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
999 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1000
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1001 Charcount
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1002 convert_intbyte_string_into_emchar_string (const Intbyte *str, Bytecount len,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1003 Emchar *arr)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1004 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1005 const Intbyte *strend = str + len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1006 Charcount newlen = 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1007 while (str < strend)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1008 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1009 Emchar ch = charptr_emchar (str);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1010 arr[newlen++] = ch;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1011 INC_CHARPTR (str);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1012 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1013 return newlen;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1014 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1015
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1016 /* Convert an array of Emchars into the equivalent string representation.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1017 Store into the given Intbyte dynarr. Does not reset the dynarr.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1018 Does not add a terminating zero. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1019
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1020 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1021 convert_emchar_string_into_intbyte_dynarr (Emchar *arr, int nels,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1022 Intbyte_dynarr *dyn)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1023 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1024 Intbyte str[MAX_EMCHAR_LEN];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1025 int i;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1026
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1027 for (i = 0; i < nels; i++)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1028 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1029 Bytecount len = set_charptr_emchar (str, arr[i]);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1030 Dynarr_add_many (dyn, str, len);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1031 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1032 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1033
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1034 /* Convert an array of Emchars into the equivalent string representation.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1035 Malloc the space needed for this and return it. If LEN_OUT is not a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1036 NULL pointer, store into LEN_OUT the number of Intbytes in the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1037 malloc()ed string. Note that the actual number of Intbytes allocated
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1038 is one more than this: the returned string is zero-terminated. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1039
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1040 Intbyte *
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1041 convert_emchar_string_into_malloced_string (Emchar *arr, int nels,
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1042 Bytecount *len_out)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1043 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1044 /* Damn zero-termination. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1045 Intbyte *str = (Intbyte *) alloca (nels * MAX_EMCHAR_LEN + 1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1046 Intbyte *strorig = str;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1047 Bytecount len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1048
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1049 int i;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1050
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1051 for (i = 0; i < nels; i++)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1052 str += set_charptr_emchar (str, arr[i]);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1053 *str = '\0';
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1054 len = str - strorig;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1055 str = (Intbyte *) xmalloc (1 + len);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1056 memcpy (str, strorig, 1 + len);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1057 if (len_out)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1058 *len_out = len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1059 return str;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1060 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1061
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1062 #define COPY_TEXT_BETWEEN_FORMATS(srcfmt, dstfmt) \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1063 do \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1064 { \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1065 if (dst) \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1066 { \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1067 Intbyte *dstend = dst + dstlen; \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1068 Intbyte *dstp = dst; \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1069 const Intbyte *srcend = src + srclen; \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1070 const Intbyte *srcp = src; \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1071 \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1072 while (srcp < srcend) \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1073 { \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1074 Emchar ch = charptr_emchar_fmt (srcp, srcfmt, srcobj); \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1075 Bytecount len = emchar_len_fmt (ch, dstfmt); \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1076 \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1077 if (dstp + len <= dstend) \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1078 { \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1079 set_charptr_emchar_fmt (dstp, ch, dstfmt, dstobj); \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1080 dstp += len; \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1081 } \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1082 else \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1083 break; \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1084 INC_CHARPTR_FMT (srcp, srcfmt); \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1085 } \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1086 text_checking_assert (srcp <= srcend); \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1087 if (src_used) \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1088 *src_used = srcp - src; \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1089 return dstp - dst; \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1090 } \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1091 else \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1092 { \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1093 const Intbyte *srcend = src + srclen; \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1094 const Intbyte *srcp = src; \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1095 Bytecount total = 0; \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1096 \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1097 while (srcp < srcend) \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1098 { \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1099 total += emchar_len_fmt (charptr_emchar_fmt (srcp, srcfmt, \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1100 srcobj), dstfmt); \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1101 INC_CHARPTR_FMT (srcp, srcfmt); \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1102 } \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1103 text_checking_assert (srcp == srcend); \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1104 if (src_used) \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1105 *src_used = srcp - src; \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1106 return total; \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1107 } \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1108 } \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1109 while (0)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1110
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1111 /* Copy as much text from SRC/SRCLEN to DST/DSTLEN as will fit, converting
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1112 from SRCFMT/SRCOBJ to DSTFMT/DSTOBJ. Return number of bytes stored into
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1113 DST as return value, and number of bytes copied from SRC through
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1114 SRC_USED (if not NULL). If DST is NULL, don't actually store anything
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1115 and just return the size needed to store all the text. Will not copy
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1116 partial characters into DST. */
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1117
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1118 Bytecount
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1119 copy_text_between_formats (const Intbyte *src, Bytecount srclen,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1120 Internal_Format srcfmt,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1121 Lisp_Object srcobj,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1122 Intbyte *dst, Bytecount dstlen,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1123 Internal_Format dstfmt,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1124 Lisp_Object dstobj,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1125 Bytecount *src_used)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1126 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1127 if (srcfmt == dstfmt &&
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1128 objects_have_same_internal_representation (srcobj, dstobj))
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1129 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1130 if (dst)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1131 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1132 srclen = min (srclen, dstlen);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1133 srclen = validate_intbyte_string_backward (src, srclen);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1134 memcpy (dst, src, srclen);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1135 if (src_used)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1136 *src_used = srclen;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1137 return srclen;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1138 }
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1139 else
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1140 return srclen;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1141 }
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1142 /* Everything before the final else statement is an optimization.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1143 The inner loops inside COPY_TEXT_BETWEEN_FORMATS() have a number
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1144 of calls to *_fmt(), each of which has a switch statement in it.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1145 By using constants as the FMT argument, these switch statements
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1146 will be optimized out of existence. */
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1147 #define ELSE_FORMATS(fmt1, fmt2) \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1148 else if (srcfmt == fmt1 && dstfmt == fmt2) \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1149 COPY_TEXT_BETWEEN_FORMATS (fmt1, fmt2)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1150 ELSE_FORMATS (FORMAT_DEFAULT, FORMAT_8_BIT_FIXED);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1151 ELSE_FORMATS (FORMAT_8_BIT_FIXED, FORMAT_DEFAULT);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1152 ELSE_FORMATS (FORMAT_DEFAULT, FORMAT_32_BIT_FIXED);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1153 ELSE_FORMATS (FORMAT_32_BIT_FIXED, FORMAT_DEFAULT);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1154 else
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1155 COPY_TEXT_BETWEEN_FORMATS (srcfmt, dstfmt);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1156 #undef ELSE_FORMATS
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1157 }
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1158
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1159 /* Copy as much buffer text in BUF, starting at POS, of length LEN, as will
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1160 fit into DST/DSTLEN, converting to DSTFMT. Return number of bytes
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1161 stored into DST as return value, and number of bytes copied from BUF
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1162 through SRC_USED (if not NULL). If DST is NULL, don't actually store
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1163 anything and just return the size needed to store all the text. */
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1164
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1165 Bytecount
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1166 copy_buffer_text_out (struct buffer *buf, Bytebpos pos,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1167 Bytecount len, Intbyte *dst, Bytecount dstlen,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1168 Internal_Format dstfmt, Lisp_Object dstobj,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1169 Bytecount *src_used)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1170 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1171 Bytecount dst_used = 0;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1172 if (src_used)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1173 *src_used = 0;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1174
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1175 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1176 BUFFER_TEXT_LOOP (buf, pos, len, runptr, runlen)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1177 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1178 Bytecount the_src_used, the_dst_used;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1179
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1180 the_dst_used = copy_text_between_formats (runptr, runlen,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1181 BUF_FORMAT (buf),
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1182 wrap_buffer (buf),
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1183 dst, dstlen, dstfmt,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1184 dstobj, &the_src_used);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1185 dst_used += the_dst_used;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1186 if (src_used)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1187 *src_used += the_src_used;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1188 if (dst)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1189 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1190 dst += the_dst_used;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1191 dstlen -= the_dst_used;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1192 if (!dstlen)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1193 break;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1194 }
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1195 }
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1196 }
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1197
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1198 return dst_used;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1199 }
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1200
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1201
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1202 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1203 /* charset properties of strings */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1204 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1205
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1206 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1207 find_charsets_in_intbyte_string (unsigned char *charsets, const Intbyte *str,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1208 Bytecount len)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1209 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1210 #ifndef MULE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1211 /* Telescope this. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1212 charsets[0] = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1213 #else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1214 const Intbyte *strend = str + len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1215 memset (charsets, 0, NUM_LEADING_BYTES);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1216
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1217 /* #### SJT doesn't like this. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1218 if (len == 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1219 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1220 charsets[XCHARSET_LEADING_BYTE (Vcharset_ascii) - MIN_LEADING_BYTE] = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1221 return;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1222 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1223
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1224 while (str < strend)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1225 {
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1226 charsets[emchar_leading_byte (charptr_emchar (str)) - MIN_LEADING_BYTE] =
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1227 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1228 INC_CHARPTR (str);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1229 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1230 #endif
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1231 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1232
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1233 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1234 find_charsets_in_emchar_string (unsigned char *charsets, const Emchar *str,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1235 Charcount len)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1236 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1237 #ifndef MULE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1238 /* Telescope this. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1239 charsets[0] = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1240 #else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1241 int i;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1242
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1243 memset (charsets, 0, NUM_LEADING_BYTES);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1244
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1245 /* #### SJT doesn't like this. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1246 if (len == 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1247 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1248 charsets[XCHARSET_LEADING_BYTE (Vcharset_ascii) - MIN_LEADING_BYTE] = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1249 return;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1250 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1251
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1252 for (i = 0; i < len; i++)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1253 {
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1254 charsets[emchar_leading_byte (str[i]) - MIN_LEADING_BYTE] = 1;
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1255 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1256 #endif
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1257 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1258
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1259 int
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1260 intbyte_string_displayed_columns (const Intbyte *str, Bytecount len)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1261 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1262 int cols = 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1263 const Intbyte *end = str + len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1264
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1265 while (str < end)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1266 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1267 #ifdef MULE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1268 Emchar ch = charptr_emchar (str);
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1269 cols += XCHARSET_COLUMNS (emchar_charset (ch));
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1270 #else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1271 cols++;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1272 #endif
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1273 INC_CHARPTR (str);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1274 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1275
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1276 return cols;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1277 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1278
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1279 int
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1280 emchar_string_displayed_columns (const Emchar *str, Charcount len)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1281 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1282 #ifdef MULE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1283 int cols = 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1284 int i;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1285
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1286 for (i = 0; i < len; i++)
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1287 cols += XCHARSET_COLUMNS (emchar_charset (str[i]));
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1288
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1289 return cols;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1290 #else /* not MULE */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1291 return len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1292 #endif
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1293 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1294
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1295 Charcount
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1296 intbyte_string_nonascii_chars (const Intbyte *str, Bytecount len)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1297 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1298 #ifdef MULE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1299 const Intbyte *end = str + len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1300 Charcount retval = 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1301
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1302 while (str < end)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1303 {
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1304 if (!byte_ascii_p (*str))
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1305 retval++;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1306 INC_CHARPTR (str);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1307 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1308
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1309 return retval;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1310 #else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1311 return 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1312 #endif
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1313 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1314
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1315
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1316 /***************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1317 /* Eistring helper functions */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1318 /***************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1319
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1320 int
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1321 eistr_casefiddle_1 (Intbyte *olddata, Bytecount len, Intbyte *newdata,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1322 int downp)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1323 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1324 Intbyte *endp = olddata + len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1325 Intbyte *newp = newdata;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1326 int changedp = 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1327
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1328 while (olddata < endp)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1329 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1330 Emchar c = charptr_emchar (olddata);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1331 Emchar newc;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1332
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1333 if (downp)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1334 newc = DOWNCASE (0, c);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1335 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1336 newc = UPCASE (0, c);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1337
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1338 if (c != newc)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1339 changedp = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1340
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1341 newp += set_charptr_emchar (newp, newc);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1342 INC_CHARPTR (olddata);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1343 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1344
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1345 *newp = '\0';
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1346
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1347 return changedp ? newp - newdata : 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1348 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1349
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1350 int
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1351 eifind_large_enough_buffer (int oldbufsize, int needed_size)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1352 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1353 while (oldbufsize < needed_size)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1354 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1355 oldbufsize = oldbufsize * 3 / 2;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1356 oldbufsize = max (oldbufsize, 32);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1357 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1358
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1359 return oldbufsize;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1360 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1361
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1362 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1363 eito_malloc_1 (Eistring *ei)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1364 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1365 if (ei->mallocp_)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1366 return;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1367 ei->mallocp_ = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1368 if (ei->data_)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1369 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1370 Intbyte *newdata;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1371
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1372 ei->max_size_allocated_ =
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1373 eifind_large_enough_buffer (0, ei->bytelen_ + 1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1374 newdata = (Intbyte *) xmalloc (ei->max_size_allocated_);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1375 memcpy (newdata, ei->data_, ei->bytelen_ + 1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1376 ei->data_ = newdata;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1377 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1378
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1379 if (ei->extdata_)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1380 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1381 Extbyte *newdata = (Extbyte *) xmalloc (ei->extlen_ + 2);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1382
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1383 memcpy (newdata, ei->extdata_, ei->extlen_);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1384 /* Double null-terminate in case of Unicode data */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1385 newdata[ei->extlen_] = '\0';
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1386 newdata[ei->extlen_ + 1] = '\0';
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1387 ei->extdata_ = newdata;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1388 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1389 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1390
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1391 int
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1392 eicmp_1 (Eistring *ei, Bytecount off, Charcount charoff,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1393 Bytecount len, Charcount charlen, const Intbyte *data,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1394 const Eistring *ei2, int is_c, int fold_case)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1395 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1396 assert ((off < 0) != (charoff < 0));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1397 if (off < 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1398 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1399 off = charcount_to_bytecount (ei->data_, charoff);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1400 if (charlen < 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1401 len = -1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1402 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1403 len = charcount_to_bytecount (ei->data_ + off, charlen);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1404 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1405 if (len < 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1406 len = ei->bytelen_ - off;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1407
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1408 assert (off >= 0 && off <= ei->bytelen_);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1409 assert (len >= 0 && off + len <= ei->bytelen_);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1410 assert ((data == 0) != (ei == 0));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1411 assert ((is_c != 0) == (data != 0));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1412 assert (fold_case >= 0 && fold_case <= 2);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1413
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1414 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1415 Bytecount dstlen;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1416 const Intbyte *src = ei->data_, *dst;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1417
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1418 if (data)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1419 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1420 dst = data;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1421 dstlen = qxestrlen (data);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1422 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1423 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1424 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1425 dst = ei2->data_;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1426 dstlen = ei2->bytelen_;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1427 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1428
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1429 if (is_c)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1430 EI_ASSERT_ASCII ((Char_ASCII *) dst, dstlen);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1431
801
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
1432 return (fold_case == 0 ? qxememcmp4 (src, len, dst, dstlen) :
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
1433 fold_case == 1 ? qxememcasecmp4 (src, len, dst, dstlen) :
2b676dc88c66 [xemacs-hg @ 2002-04-01 03:58:02 by ben]
ben
parents: 800
diff changeset
1434 qxetextcasecmp (src, len, dst, dstlen));
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1435 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1436 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1437
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1438 Intbyte *
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1439 eicpyout_malloc_fmt (Eistring *eistr, Bytecount *len_out, Internal_Format fmt,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1440 Lisp_Object object)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1441 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1442 Intbyte *ptr;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1443
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1444 assert (fmt == FORMAT_DEFAULT);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1445 ptr = xnew_array (Intbyte, eistr->bytelen_ + 1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1446 if (len_out)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1447 *len_out = eistr->bytelen_;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1448 memcpy (ptr, eistr->data_, eistr->bytelen_ + 1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1449 return ptr;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1450 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1451
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1452
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1453 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1454 /* Charcount/Bytecount conversion */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1455 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1456
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1457 /* Optimization. Do it. Live it. Love it. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1458
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1459 #ifdef MULE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1460
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1461 /* Skip as many ASCII bytes as possible in the memory block [PTR, END).
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1462 Return pointer to the first non-ASCII byte. optimized for long
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1463 stretches of ASCII. */
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1464 inline static const Intbyte *
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1465 skip_ascii (const Intbyte *ptr, const Intbyte *end)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1466 {
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1467 #ifdef EFFICIENT_INT_128_BIT
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1468 # define STRIDE_TYPE INT_128_BIT
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1469 # define HIGH_BIT_MASK \
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1470 MAKE_128_BIT_UNSIGNED_CONSTANT (0x80808080808080808080808080808080)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1471 #elif defined (EFFICIENT_INT_64_BIT)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1472 # define STRIDE_TYPE INT_64_BIT
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1473 # define HIGH_BIT_MASK MAKE_64_BIT_UNSIGNED_CONSTANT (0x8080808080808080)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1474 #else
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1475 # define STRIDE_TYPE INT_32_BIT
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1476 # define HIGH_BIT_MASK MAKE_32_BIT_UNSIGNED_CONSTANT (0x80808080)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1477 #endif
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1478
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1479 #define ALIGN_BITS ((EMACS_UINT) (ALIGNOF (STRIDE_TYPE) - 1))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1480 #define ALIGN_MASK (~ ALIGN_BITS)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1481 #define ALIGNED(ptr) ((((EMACS_UINT) ptr) & ALIGN_BITS) == 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1482 #define STRIDE sizeof (STRIDE_TYPE)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1483
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1484 const unsigned STRIDE_TYPE *ascii_end;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1485
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1486 /* Need to do in 3 sections -- before alignment start, aligned chunk,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1487 after alignment end. */
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1488 while (!ALIGNED (ptr))
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1489 {
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1490 if (ptr == end || !byte_ascii_p (*ptr))
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1491 return ptr;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1492 ptr++;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1493 }
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1494 ascii_end = (const unsigned STRIDE_TYPE *) ptr;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1495 /* This loop screams, because we can detect ASCII
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1496 characters 4 or 8 at a time. */
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1497 while ((const Intbyte *) ascii_end + STRIDE <= end
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1498 && !(*ascii_end & HIGH_BIT_MASK))
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1499 ascii_end++;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1500 ptr = (Intbyte *) ascii_end;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1501 while (ptr < end && byte_ascii_p (*ptr))
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1502 ptr++;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1503 return ptr;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1504 }
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1505
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1506 /* Function equivalents of bytecount_to_charcount/charcount_to_bytecount.
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1507 These work on strings of all sizes but are more efficient than a simple
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1508 loop on large strings and probably less efficient on sufficiently small
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1509 strings. */
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1510
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1511 Charcount
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1512 bytecount_to_charcount_fun (const Intbyte *ptr, Bytecount len)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1513 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1514 Charcount count = 0;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1515 const Intbyte *end = ptr + len;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1516 while (1)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1517 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1518 const Intbyte *newptr = skip_ascii (ptr, end);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1519 count += newptr - ptr;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1520 ptr = newptr;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1521 if (ptr == end)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1522 break;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1523 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1524 /* Optimize for successive characters from the same charset */
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1525 Intbyte leading_byte = *ptr;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1526 int bytes = rep_bytes_by_first_byte (leading_byte);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1527 while (ptr < end && *ptr == leading_byte)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1528 ptr += bytes, count++;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1529 }
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1530 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1531
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1532 /* Bomb out if the specified substring ends in the middle
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1533 of a character. Note that we might have already gotten
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1534 a core dump above from an invalid reference, but at least
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1535 we will get no farther than here.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1536
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1537 This also catches len < 0. */
800
a5954632b187 [xemacs-hg @ 2002-03-31 08:27:14 by ben]
ben
parents: 793
diff changeset
1538 text_checking_assert (ptr == end);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1539
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1540 return count;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1541 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1542
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1543 Bytecount
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1544 charcount_to_bytecount_fun (const Intbyte *ptr, Charcount len)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1545 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1546 const Intbyte *newptr = ptr;
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1547 while (1)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1548 {
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1549 const Intbyte *newnewptr = skip_ascii (newptr, newptr + len);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1550 len -= newnewptr - newptr;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1551 newptr = newnewptr;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1552 if (!len)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1553 break;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1554 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1555 /* Optimize for successive characters from the same charset */
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1556 Intbyte leading_byte = *newptr;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1557 int bytes = rep_bytes_by_first_byte (leading_byte);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1558 while (len > 0 && *newptr == leading_byte)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1559 newptr += bytes, len--;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1560 }
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1561 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1562 return newptr - ptr;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1563 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1564
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1565 /* The next two functions are the actual meat behind the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1566 charbpos-to-bytebpos and bytebpos-to-charbpos conversions. Currently
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1567 the method they use is fairly unsophisticated; see buffer.h.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1568
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1569 Note that charbpos_to_bytebpos_func() is probably the most-called
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1570 function in all of XEmacs. Therefore, it must be FAST FAST FAST.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1571 This is the reason why so much of the code is duplicated.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1572
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1573 Similar considerations apply to bytebpos_to_charbpos_func(), although
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1574 less so because the function is not called so often.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1575
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1576 #### At some point this should use a more sophisticated method;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1577 see buffer.h. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1578
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1579 static int not_very_random_number;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1580
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1581 Bytebpos
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1582 charbpos_to_bytebpos_func (struct buffer *buf, Charbpos x)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1583 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1584 Charbpos bufmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1585 Charbpos bufmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1586 Bytebpos bytmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1587 Bytebpos bytmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1588 int size;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1589 int forward_p;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1590 Bytebpos retval;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1591 int diff_so_far;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1592 int add_to_cache = 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1593
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1594 /* Check for some cached positions, for speed. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1595 if (x == BUF_PT (buf))
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1596 return BYTE_BUF_PT (buf);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1597 if (x == BUF_ZV (buf))
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1598 return BYTE_BUF_ZV (buf);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1599 if (x == BUF_BEGV (buf))
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1600 return BYTE_BUF_BEGV (buf);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1601
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1602 bufmin = buf->text->mule_bufmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1603 bufmax = buf->text->mule_bufmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1604 bytmin = buf->text->mule_bytmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1605 bytmax = buf->text->mule_bytmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1606 size = (1 << buf->text->mule_shifter) + !!buf->text->mule_three_p;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1607
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1608 /* The basic idea here is that we shift the "known region" up or down
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1609 until it overlaps the specified position. We do this by moving
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1610 the upper bound of the known region up one character at a time,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1611 and moving the lower bound of the known region up as necessary
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1612 when the size of the character just seen changes.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1613
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1614 We optimize this, however, by first shifting the known region to
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1615 one of the cached points if it's close by. (We don't check BEG or
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1616 Z, even though they're cached; most of the time these will be the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1617 same as BEGV and ZV, and when they're not, they're not likely
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1618 to be used.) */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1619
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1620 if (x > bufmax)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1621 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1622 Charbpos diffmax = x - bufmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1623 Charbpos diffpt = x - BUF_PT (buf);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1624 Charbpos diffzv = BUF_ZV (buf) - x;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1625 /* #### This value could stand some more exploration. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1626 Charcount heuristic_hack = (bufmax - bufmin) >> 2;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1627
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1628 /* Check if the position is closer to PT or ZV than to the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1629 end of the known region. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1630
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1631 if (diffpt < 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1632 diffpt = -diffpt;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1633 if (diffzv < 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1634 diffzv = -diffzv;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1635
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1636 /* But also implement a heuristic that favors the known region
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1637 over PT or ZV. The reason for this is that switching to
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1638 PT or ZV will wipe out the knowledge in the known region,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1639 which might be annoying if the known region is large and
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1640 PT or ZV is not that much closer than the end of the known
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1641 region. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1642
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1643 diffzv += heuristic_hack;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1644 diffpt += heuristic_hack;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1645 if (diffpt < diffmax && diffpt <= diffzv)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1646 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1647 bufmax = bufmin = BUF_PT (buf);
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1648 bytmax = bytmin = BYTE_BUF_PT (buf);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1649 /* We set the size to 1 even though it doesn't really
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1650 matter because the new known region contains no
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1651 characters. We do this because this is the most
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1652 likely size of the characters around the new known
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1653 region, and we avoid potential yuckiness that is
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1654 done when size == 3. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1655 size = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1656 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1657 if (diffzv < diffmax)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1658 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1659 bufmax = bufmin = BUF_ZV (buf);
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1660 bytmax = bytmin = BYTE_BUF_ZV (buf);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1661 size = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1662 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1663 }
800
a5954632b187 [xemacs-hg @ 2002-03-31 08:27:14 by ben]
ben
parents: 793
diff changeset
1664 #ifdef ERROR_CHECK_TEXT
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1665 else if (x >= bufmin)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1666 abort ();
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1667 #endif
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1668 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1669 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1670 Charbpos diffmin = bufmin - x;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1671 Charbpos diffpt = BUF_PT (buf) - x;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1672 Charbpos diffbegv = x - BUF_BEGV (buf);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1673 /* #### This value could stand some more exploration. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1674 Charcount heuristic_hack = (bufmax - bufmin) >> 2;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1675
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1676 if (diffpt < 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1677 diffpt = -diffpt;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1678 if (diffbegv < 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1679 diffbegv = -diffbegv;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1680
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1681 /* But also implement a heuristic that favors the known region --
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1682 see above. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1683
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1684 diffbegv += heuristic_hack;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1685 diffpt += heuristic_hack;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1686
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1687 if (diffpt < diffmin && diffpt <= diffbegv)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1688 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1689 bufmax = bufmin = BUF_PT (buf);
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1690 bytmax = bytmin = BYTE_BUF_PT (buf);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1691 /* We set the size to 1 even though it doesn't really
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1692 matter because the new known region contains no
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1693 characters. We do this because this is the most
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1694 likely size of the characters around the new known
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1695 region, and we avoid potential yuckiness that is
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1696 done when size == 3. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1697 size = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1698 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1699 if (diffbegv < diffmin)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1700 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1701 bufmax = bufmin = BUF_BEGV (buf);
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1702 bytmax = bytmin = BYTE_BUF_BEGV (buf);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1703 size = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1704 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1705 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1706
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1707 diff_so_far = x > bufmax ? x - bufmax : bufmin - x;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1708 if (diff_so_far > 50)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1709 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1710 /* If we have to move more than a certain amount, then look
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1711 into our cache. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1712 int minval = INT_MAX;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1713 int found = 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1714 int i;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1715
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1716 add_to_cache = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1717 /* I considered keeping the positions ordered. This would speed
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1718 up this loop, but updating the cache would take longer, so
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1719 it doesn't seem like it would really matter. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1720 for (i = 0; i < 16; i++)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1721 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1722 int diff = buf->text->mule_charbpos_cache[i] - x;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1723
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1724 if (diff < 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1725 diff = -diff;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1726 if (diff < minval)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1727 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1728 minval = diff;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1729 found = i;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1730 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1731 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1732
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1733 if (minval < diff_so_far)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1734 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1735 bufmax = bufmin = buf->text->mule_charbpos_cache[found];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1736 bytmax = bytmin = buf->text->mule_bytebpos_cache[found];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1737 size = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1738 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1739 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1740
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1741 /* It's conceivable that the caching above could lead to X being
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1742 the same as one of the range edges. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1743 if (x >= bufmax)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1744 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1745 Bytebpos newmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1746 Bytecount newsize;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1747
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1748 forward_p = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1749 while (x > bufmax)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1750 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1751 newmax = bytmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1752
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1753 INC_BYTEBPOS (buf, newmax);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1754 newsize = newmax - bytmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1755 if (newsize != size)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1756 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1757 bufmin = bufmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1758 bytmin = bytmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1759 size = newsize;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1760 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1761 bytmax = newmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1762 bufmax++;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1763 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1764 retval = bytmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1765
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1766 /* #### Should go past the found location to reduce the number
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1767 of times that this function is called */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1768 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1769 else /* x < bufmin */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1770 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1771 Bytebpos newmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1772 Bytecount newsize;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1773
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1774 forward_p = 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1775 while (x < bufmin)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1776 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1777 newmin = bytmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1778
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1779 DEC_BYTEBPOS (buf, newmin);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1780 newsize = bytmin - newmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1781 if (newsize != size)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1782 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1783 bufmax = bufmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1784 bytmax = bytmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1785 size = newsize;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1786 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1787 bytmin = newmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1788 bufmin--;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1789 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1790 retval = bytmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1791
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1792 /* #### Should go past the found location to reduce the number
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1793 of times that this function is called
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1794 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1795 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1796
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1797 /* If size is three, than we have to max sure that the range we
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1798 discovered isn't too large, because we use a fixed-length
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1799 table to divide by 3. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1800
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1801 if (size == 3)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1802 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1803 int gap = bytmax - bytmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1804 buf->text->mule_three_p = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1805 buf->text->mule_shifter = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1806
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1807 if (gap > MAX_BYTEBPOS_GAP_SIZE_3)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1808 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1809 if (forward_p)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1810 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1811 bytmin = bytmax - MAX_BYTEBPOS_GAP_SIZE_3;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1812 bufmin = bufmax - MAX_CHARBPOS_GAP_SIZE_3;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1813 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1814 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1815 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1816 bytmax = bytmin + MAX_BYTEBPOS_GAP_SIZE_3;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1817 bufmax = bufmin + MAX_CHARBPOS_GAP_SIZE_3;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1818 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1819 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1820 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1821 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1822 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1823 buf->text->mule_three_p = 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1824 if (size == 4)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1825 buf->text->mule_shifter = 2;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1826 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1827 buf->text->mule_shifter = size - 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1828 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1829
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1830 buf->text->mule_bufmin = bufmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1831 buf->text->mule_bufmax = bufmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1832 buf->text->mule_bytmin = bytmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1833 buf->text->mule_bytmax = bytmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1834
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1835 if (add_to_cache)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1836 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1837 int replace_loc;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1838
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1839 /* We throw away a "random" cached value and replace it with
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1840 the new value. It doesn't actually have to be very random
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1841 at all, just evenly distributed.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1842
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1843 #### It would be better to use a least-recently-used algorithm
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1844 or something that tries to space things out, but I'm not sure
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1845 it's worth it to go to the trouble of maintaining that. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1846 not_very_random_number += 621;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1847 replace_loc = not_very_random_number & 15;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1848 buf->text->mule_charbpos_cache[replace_loc] = x;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1849 buf->text->mule_bytebpos_cache[replace_loc] = retval;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1850 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1851
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1852 return retval;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1853 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1854
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1855 /* The logic in this function is almost identical to the logic in
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1856 the previous function. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1857
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1858 Charbpos
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1859 bytebpos_to_charbpos_func (struct buffer *buf, Bytebpos x)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1860 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1861 Charbpos bufmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1862 Charbpos bufmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1863 Bytebpos bytmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1864 Bytebpos bytmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1865 int size;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1866 int forward_p;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1867 Charbpos retval;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1868 int diff_so_far;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1869 int add_to_cache = 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1870
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1871 /* Check for some cached positions, for speed. */
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1872 if (x == BYTE_BUF_PT (buf))
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1873 return BUF_PT (buf);
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1874 if (x == BYTE_BUF_ZV (buf))
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1875 return BUF_ZV (buf);
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1876 if (x == BYTE_BUF_BEGV (buf))
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1877 return BUF_BEGV (buf);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1878
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1879 bufmin = buf->text->mule_bufmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1880 bufmax = buf->text->mule_bufmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1881 bytmin = buf->text->mule_bytmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1882 bytmax = buf->text->mule_bytmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1883 size = (1 << buf->text->mule_shifter) + !!buf->text->mule_three_p;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1884
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1885 /* The basic idea here is that we shift the "known region" up or down
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1886 until it overlaps the specified position. We do this by moving
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1887 the upper bound of the known region up one character at a time,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1888 and moving the lower bound of the known region up as necessary
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1889 when the size of the character just seen changes.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1890
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1891 We optimize this, however, by first shifting the known region to
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1892 one of the cached points if it's close by. (We don't check BYTE_BEG or
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1893 BYTE_Z, even though they're cached; most of the time these will be the
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1894 same as BYTE_BEGV and BYTE_ZV, and when they're not, they're not likely
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1895 to be used.) */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1896
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1897 if (x > bytmax)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1898 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1899 Bytebpos diffmax = x - bytmax;
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1900 Bytebpos diffpt = x - BYTE_BUF_PT (buf);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1901 Bytebpos diffzv = BYTE_BUF_ZV (buf) - x;
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1902 /* #### This value could stand some more exploration. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1903 Bytecount heuristic_hack = (bytmax - bytmin) >> 2;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1904
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1905 /* Check if the position is closer to PT or ZV than to the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1906 end of the known region. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1907
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1908 if (diffpt < 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1909 diffpt = -diffpt;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1910 if (diffzv < 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1911 diffzv = -diffzv;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1912
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1913 /* But also implement a heuristic that favors the known region
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1914 over BYTE_PT or BYTE_ZV. The reason for this is that switching to
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1915 BYTE_PT or BYTE_ZV will wipe out the knowledge in the known region,
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1916 which might be annoying if the known region is large and
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1917 BYTE_PT or BYTE_ZV is not that much closer than the end of the known
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1918 region. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1919
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1920 diffzv += heuristic_hack;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1921 diffpt += heuristic_hack;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1922 if (diffpt < diffmax && diffpt <= diffzv)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1923 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1924 bufmax = bufmin = BUF_PT (buf);
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1925 bytmax = bytmin = BYTE_BUF_PT (buf);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1926 /* We set the size to 1 even though it doesn't really
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1927 matter because the new known region contains no
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1928 characters. We do this because this is the most
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1929 likely size of the characters around the new known
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1930 region, and we avoid potential yuckiness that is
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1931 done when size == 3. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1932 size = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1933 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1934 if (diffzv < diffmax)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1935 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1936 bufmax = bufmin = BUF_ZV (buf);
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1937 bytmax = bytmin = BYTE_BUF_ZV (buf);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1938 size = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1939 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1940 }
800
a5954632b187 [xemacs-hg @ 2002-03-31 08:27:14 by ben]
ben
parents: 793
diff changeset
1941 #ifdef ERROR_CHECK_TEXT
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1942 else if (x >= bytmin)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1943 abort ();
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1944 #endif
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1945 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1946 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1947 Bytebpos diffmin = bytmin - x;
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1948 Bytebpos diffpt = BYTE_BUF_PT (buf) - x;
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1949 Bytebpos diffbegv = x - BYTE_BUF_BEGV (buf);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1950 /* #### This value could stand some more exploration. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1951 Bytecount heuristic_hack = (bytmax - bytmin) >> 2;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1952
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1953 if (diffpt < 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1954 diffpt = -diffpt;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1955 if (diffbegv < 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1956 diffbegv = -diffbegv;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1957
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1958 /* But also implement a heuristic that favors the known region --
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1959 see above. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1960
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1961 diffbegv += heuristic_hack;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1962 diffpt += heuristic_hack;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1963
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1964 if (diffpt < diffmin && diffpt <= diffbegv)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1965 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1966 bufmax = bufmin = BUF_PT (buf);
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1967 bytmax = bytmin = BYTE_BUF_PT (buf);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1968 /* We set the size to 1 even though it doesn't really
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1969 matter because the new known region contains no
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1970 characters. We do this because this is the most
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1971 likely size of the characters around the new known
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1972 region, and we avoid potential yuckiness that is
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1973 done when size == 3. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1974 size = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1975 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1976 if (diffbegv < diffmin)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1977 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1978 bufmax = bufmin = BUF_BEGV (buf);
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
1979 bytmax = bytmin = BYTE_BUF_BEGV (buf);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1980 size = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1981 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1982 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1983
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1984 diff_so_far = x > bytmax ? x - bytmax : bytmin - x;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1985 if (diff_so_far > 50)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1986 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1987 /* If we have to move more than a certain amount, then look
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1988 into our cache. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1989 int minval = INT_MAX;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1990 int found = 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1991 int i;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1992
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1993 add_to_cache = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1994 /* I considered keeping the positions ordered. This would speed
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1995 up this loop, but updating the cache would take longer, so
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1996 it doesn't seem like it would really matter. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1997 for (i = 0; i < 16; i++)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1998 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
1999 int diff = buf->text->mule_bytebpos_cache[i] - x;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2000
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2001 if (diff < 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2002 diff = -diff;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2003 if (diff < minval)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2004 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2005 minval = diff;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2006 found = i;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2007 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2008 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2009
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2010 if (minval < diff_so_far)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2011 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2012 bufmax = bufmin = buf->text->mule_charbpos_cache[found];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2013 bytmax = bytmin = buf->text->mule_bytebpos_cache[found];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2014 size = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2015 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2016 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2017
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2018 /* It's conceivable that the caching above could lead to X being
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2019 the same as one of the range edges. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2020 if (x >= bytmax)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2021 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2022 Bytebpos newmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2023 Bytecount newsize;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2024
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2025 forward_p = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2026 while (x > bytmax)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2027 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2028 newmax = bytmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2029
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2030 INC_BYTEBPOS (buf, newmax);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2031 newsize = newmax - bytmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2032 if (newsize != size)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2033 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2034 bufmin = bufmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2035 bytmin = bytmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2036 size = newsize;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2037 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2038 bytmax = newmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2039 bufmax++;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2040 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2041 retval = bufmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2042
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2043 /* #### Should go past the found location to reduce the number
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2044 of times that this function is called */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2045 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2046 else /* x <= bytmin */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2047 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2048 Bytebpos newmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2049 Bytecount newsize;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2050
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2051 forward_p = 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2052 while (x < bytmin)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2053 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2054 newmin = bytmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2055
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2056 DEC_BYTEBPOS (buf, newmin);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2057 newsize = bytmin - newmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2058 if (newsize != size)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2059 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2060 bufmax = bufmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2061 bytmax = bytmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2062 size = newsize;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2063 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2064 bytmin = newmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2065 bufmin--;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2066 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2067 retval = bufmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2068
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2069 /* #### Should go past the found location to reduce the number
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2070 of times that this function is called
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2071 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2072 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2073
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2074 /* If size is three, than we have to max sure that the range we
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2075 discovered isn't too large, because we use a fixed-length
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2076 table to divide by 3. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2077
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2078 if (size == 3)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2079 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2080 int gap = bytmax - bytmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2081 buf->text->mule_three_p = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2082 buf->text->mule_shifter = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2083
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2084 if (gap > MAX_BYTEBPOS_GAP_SIZE_3)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2085 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2086 if (forward_p)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2087 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2088 bytmin = bytmax - MAX_BYTEBPOS_GAP_SIZE_3;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2089 bufmin = bufmax - MAX_CHARBPOS_GAP_SIZE_3;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2090 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2091 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2092 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2093 bytmax = bytmin + MAX_BYTEBPOS_GAP_SIZE_3;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2094 bufmax = bufmin + MAX_CHARBPOS_GAP_SIZE_3;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2095 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2096 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2097 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2098 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2099 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2100 buf->text->mule_three_p = 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2101 if (size == 4)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2102 buf->text->mule_shifter = 2;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2103 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2104 buf->text->mule_shifter = size - 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2105 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2106
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2107 buf->text->mule_bufmin = bufmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2108 buf->text->mule_bufmax = bufmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2109 buf->text->mule_bytmin = bytmin;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2110 buf->text->mule_bytmax = bytmax;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2111
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2112 if (add_to_cache)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2113 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2114 int replace_loc;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2115
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2116 /* We throw away a "random" cached value and replace it with
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2117 the new value. It doesn't actually have to be very random
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2118 at all, just evenly distributed.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2119
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2120 #### It would be better to use a least-recently-used algorithm
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2121 or something that tries to space things out, but I'm not sure
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2122 it's worth it to go to the trouble of maintaining that. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2123 not_very_random_number += 621;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2124 replace_loc = not_very_random_number & 15;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2125 buf->text->mule_charbpos_cache[replace_loc] = retval;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2126 buf->text->mule_bytebpos_cache[replace_loc] = x;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2127 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2128
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2129 return retval;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2130 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2131
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2132 /* Text of length BYTELENGTH and CHARLENGTH (in different units)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2133 was inserted at charbpos START. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2134
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2135 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2136 buffer_mule_signal_inserted_region (struct buffer *buf, Charbpos start,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2137 Bytecount bytelength,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2138 Charcount charlength)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2139 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2140 int size = (1 << buf->text->mule_shifter) + !!buf->text->mule_three_p;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2141 int i;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2142
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2143 /* Adjust the cache of known positions. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2144 for (i = 0; i < 16; i++)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2145 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2146
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2147 if (buf->text->mule_charbpos_cache[i] > start)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2148 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2149 buf->text->mule_charbpos_cache[i] += charlength;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2150 buf->text->mule_bytebpos_cache[i] += bytelength;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2151 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2152 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2153
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2154 if (start >= buf->text->mule_bufmax)
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2155 return;
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2156
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2157 /* The insertion is either before the known region, in which case
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2158 it shoves it forward; or within the known region, in which case
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2159 it shoves the end forward. (But it may make the known region
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2160 inconsistent, so we may have to shorten it.) */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2161
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2162 if (start <= buf->text->mule_bufmin)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2163 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2164 buf->text->mule_bufmin += charlength;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2165 buf->text->mule_bufmax += charlength;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2166 buf->text->mule_bytmin += bytelength;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2167 buf->text->mule_bytmax += bytelength;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2168 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2169 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2170 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2171 Charbpos end = start + charlength;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2172 /* the insertion point divides the known region in two.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2173 Keep the longer half, at least, and expand into the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2174 inserted chunk as much as possible. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2175
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2176 if (start - buf->text->mule_bufmin > buf->text->mule_bufmax - start)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2177 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2178 Bytebpos bytestart = (buf->text->mule_bytmin
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2179 + size * (start - buf->text->mule_bufmin));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2180 Bytebpos bytenew;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2181
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2182 while (start < end)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2183 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2184 bytenew = bytestart;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2185 INC_BYTEBPOS (buf, bytenew);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2186 if (bytenew - bytestart != size)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2187 break;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2188 start++;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2189 bytestart = bytenew;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2190 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2191 if (start != end)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2192 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2193 buf->text->mule_bufmax = start;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2194 buf->text->mule_bytmax = bytestart;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2195 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2196 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2197 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2198 buf->text->mule_bufmax += charlength;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2199 buf->text->mule_bytmax += bytelength;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2200 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2201 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2202 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2203 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2204 Bytebpos byteend = (buf->text->mule_bytmin
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2205 + size * (start - buf->text->mule_bufmin)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2206 + bytelength);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2207 Bytebpos bytenew;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2208
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2209 buf->text->mule_bufmax += charlength;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2210 buf->text->mule_bytmax += bytelength;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2211
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2212 while (end > start)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2213 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2214 bytenew = byteend;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2215 DEC_BYTEBPOS (buf, bytenew);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2216 if (byteend - bytenew != size)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2217 break;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2218 end--;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2219 byteend = bytenew;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2220 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2221 if (start != end)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2222 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2223 buf->text->mule_bufmin = end;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2224 buf->text->mule_bytmin = byteend;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2225 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2226 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2227 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2228 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2229
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2230 /* Text from START to END (equivalent in Bytebpos's: from BYTE_START to
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2231 BYTE_END) was deleted. */
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2232
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2233 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2234 buffer_mule_signal_deleted_region (struct buffer *buf, Charbpos start,
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2235 Charbpos end, Bytebpos byte_start,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2236 Bytebpos byte_end)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2237 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2238 int i;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2239
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2240 /* Adjust the cache of known positions. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2241 for (i = 0; i < 16; i++)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2242 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2243 /* After the end; gets shoved backward */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2244 if (buf->text->mule_charbpos_cache[i] > end)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2245 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2246 buf->text->mule_charbpos_cache[i] -= end - start;
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2247 buf->text->mule_bytebpos_cache[i] -= byte_end - byte_start;
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2248 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2249 /* In the range; moves to start of range */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2250 else if (buf->text->mule_charbpos_cache[i] > start)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2251 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2252 buf->text->mule_charbpos_cache[i] = start;
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2253 buf->text->mule_bytebpos_cache[i] = byte_start;
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2254 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2255 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2256
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2257 /* We don't care about any text after the end of the known region. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2258
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2259 end = min (end, buf->text->mule_bufmax);
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2260 byte_end = min (byte_end, buf->text->mule_bytmax);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2261 if (start >= end)
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2262 return;
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2263
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2264 /* The end of the known region offsets by the total amount of deletion,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2265 since it's all before it. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2266
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2267 buf->text->mule_bufmax -= end - start;
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2268 buf->text->mule_bytmax -= byte_end - byte_start;
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2269
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2270 /* Now we don't care about any text after the start of the known region. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2271
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2272 end = min (end, buf->text->mule_bufmin);
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2273 byte_end = min (byte_end, buf->text->mule_bytmin);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2274 if (start < end)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2275 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2276 buf->text->mule_bufmin -= end - start;
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2277 buf->text->mule_bytmin -= byte_end - byte_start;
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2278 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2279 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2280
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2281 #endif /* MULE */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2282
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2283
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2284 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2285 /* verifying buffer and string positions */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2286 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2287
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2288 /* Functions below are tagged with either _byte or _char indicating
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2289 whether they return byte or character positions. For a buffer,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2290 a character position is a "Charbpos" and a byte position is a "Bytebpos".
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2291 For strings, these are sometimes typed using "Charcount" and
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2292 "Bytecount". */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2293
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2294 /* Flags for the functions below are:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2295
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2296 GB_ALLOW_PAST_ACCESSIBLE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2297
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2298 Allow positions to range over the entire buffer (BUF_BEG to BUF_Z),
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2299 rather than just the accessible portion (BUF_BEGV to BUF_ZV).
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2300 For strings, this flag has no effect.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2301
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2302 GB_COERCE_RANGE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2303
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2304 If the position is outside the allowable range, return the lower
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2305 or upper bound of the range, whichever is closer to the specified
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2306 position.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2307
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2308 GB_NO_ERROR_IF_BAD
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2309
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2310 If the position is outside the allowable range, return -1.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2311
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2312 GB_NEGATIVE_FROM_END
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2313
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2314 If a value is negative, treat it as an offset from the end.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2315 Only applies to strings.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2316
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2317 The following additional flags apply only to the functions
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2318 that return ranges:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2319
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2320 GB_ALLOW_NIL
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2321
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2322 Either or both positions can be nil. If FROM is nil,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2323 FROM_OUT will contain the lower bound of the allowed range.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2324 If TO is nil, TO_OUT will contain the upper bound of the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2325 allowed range.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2326
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2327 GB_CHECK_ORDER
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2328
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2329 FROM must contain the lower bound and TO the upper bound
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2330 of the range. If the positions are reversed, an error is
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2331 signalled.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2332
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2333 The following is a combination flag:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2334
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2335 GB_HISTORICAL_STRING_BEHAVIOR
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2336
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2337 Equivalent to (GB_NEGATIVE_FROM_END | GB_ALLOW_NIL).
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2338 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2339
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2340 /* Return a buffer position stored in a Lisp_Object. Full
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2341 error-checking is done on the position. Flags can be specified to
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2342 control the behavior of out-of-range values. The default behavior
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2343 is to require that the position is within the accessible part of
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2344 the buffer (BEGV and ZV), and to signal an error if the position is
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2345 out of range.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2346
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2347 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2348
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2349 Charbpos
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2350 get_buffer_pos_char (struct buffer *b, Lisp_Object pos, unsigned int flags)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2351 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2352 /* Does not GC */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2353 Charbpos ind;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2354 Charbpos min_allowed, max_allowed;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2355
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2356 CHECK_INT_COERCE_MARKER (pos);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2357 ind = XINT (pos);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2358 min_allowed = flags & GB_ALLOW_PAST_ACCESSIBLE ? BUF_BEG (b) : BUF_BEGV (b);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2359 max_allowed = flags & GB_ALLOW_PAST_ACCESSIBLE ? BUF_Z (b) : BUF_ZV (b);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2360
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2361 if (ind < min_allowed || ind > max_allowed)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2362 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2363 if (flags & GB_COERCE_RANGE)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2364 ind = ind < min_allowed ? min_allowed : max_allowed;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2365 else if (flags & GB_NO_ERROR_IF_BAD)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2366 ind = -1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2367 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2368 {
793
e38acbeb1cae [xemacs-hg @ 2002-03-29 04:46:17 by ben]
ben
parents: 788
diff changeset
2369 Lisp_Object buffer = wrap_buffer (b);
e38acbeb1cae [xemacs-hg @ 2002-03-29 04:46:17 by ben]
ben
parents: 788
diff changeset
2370
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2371 args_out_of_range (buffer, pos);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2372 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2373 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2374
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2375 return ind;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2376 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2377
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2378 Bytebpos
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2379 get_buffer_pos_byte (struct buffer *b, Lisp_Object pos, unsigned int flags)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2380 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2381 Charbpos bpos = get_buffer_pos_char (b, pos, flags);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2382 if (bpos < 0) /* could happen with GB_NO_ERROR_IF_BAD */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2383 return -1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2384 return charbpos_to_bytebpos (b, bpos);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2385 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2386
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2387 /* Return a pair of buffer positions representing a range of text,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2388 taken from a pair of Lisp_Objects. Full error-checking is
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2389 done on the positions. Flags can be specified to control the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2390 behavior of out-of-range values. The default behavior is to
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2391 allow the range bounds to be specified in either order
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2392 (however, FROM_OUT will always be the lower bound of the range
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2393 and TO_OUT the upper bound),to require that the positions
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2394 are within the accessible part of the buffer (BEGV and ZV),
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2395 and to signal an error if the positions are out of range.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2396 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2397
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2398 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2399 get_buffer_range_char (struct buffer *b, Lisp_Object from, Lisp_Object to,
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2400 Charbpos *from_out, Charbpos *to_out,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2401 unsigned int flags)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2402 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2403 /* Does not GC */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2404 Charbpos min_allowed, max_allowed;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2405
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2406 min_allowed = (flags & GB_ALLOW_PAST_ACCESSIBLE) ?
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2407 BUF_BEG (b) : BUF_BEGV (b);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2408 max_allowed = (flags & GB_ALLOW_PAST_ACCESSIBLE) ?
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2409 BUF_Z (b) : BUF_ZV (b);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2410
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2411 if (NILP (from) && (flags & GB_ALLOW_NIL))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2412 *from_out = min_allowed;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2413 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2414 *from_out = get_buffer_pos_char (b, from, flags | GB_NO_ERROR_IF_BAD);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2415
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2416 if (NILP (to) && (flags & GB_ALLOW_NIL))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2417 *to_out = max_allowed;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2418 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2419 *to_out = get_buffer_pos_char (b, to, flags | GB_NO_ERROR_IF_BAD);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2420
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2421 if ((*from_out < 0 || *to_out < 0) && !(flags & GB_NO_ERROR_IF_BAD))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2422 {
793
e38acbeb1cae [xemacs-hg @ 2002-03-29 04:46:17 by ben]
ben
parents: 788
diff changeset
2423 Lisp_Object buffer = wrap_buffer (b);
e38acbeb1cae [xemacs-hg @ 2002-03-29 04:46:17 by ben]
ben
parents: 788
diff changeset
2424
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2425 args_out_of_range_3 (buffer, from, to);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2426 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2427
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2428 if (*from_out >= 0 && *to_out >= 0 && *from_out > *to_out)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2429 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2430 if (flags & GB_CHECK_ORDER)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2431 invalid_argument_2 ("start greater than end", from, to);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2432 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2433 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2434 Charbpos temp = *from_out;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2435 *from_out = *to_out;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2436 *to_out = temp;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2437 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2438 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2439 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2440
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2441 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2442 get_buffer_range_byte (struct buffer *b, Lisp_Object from, Lisp_Object to,
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2443 Bytebpos *from_out, Bytebpos *to_out,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2444 unsigned int flags)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2445 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2446 Charbpos s, e;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2447
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2448 get_buffer_range_char (b, from, to, &s, &e, flags);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2449 if (s >= 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2450 *from_out = charbpos_to_bytebpos (b, s);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2451 else /* could happen with GB_NO_ERROR_IF_BAD */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2452 *from_out = -1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2453 if (e >= 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2454 *to_out = charbpos_to_bytebpos (b, e);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2455 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2456 *to_out = -1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2457 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2458
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2459 static Charcount
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2460 get_string_pos_char_1 (Lisp_Object string, Lisp_Object pos, unsigned int flags,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2461 Charcount known_length)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2462 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2463 Charcount ccpos;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2464 Charcount min_allowed = 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2465 Charcount max_allowed = known_length;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2466
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2467 /* Computation of KNOWN_LENGTH is potentially expensive so we pass
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2468 it in. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2469 CHECK_INT (pos);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2470 ccpos = XINT (pos);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2471 if (ccpos < 0 && flags & GB_NEGATIVE_FROM_END)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2472 ccpos += max_allowed;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2473
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2474 if (ccpos < min_allowed || ccpos > max_allowed)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2475 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2476 if (flags & GB_COERCE_RANGE)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2477 ccpos = ccpos < min_allowed ? min_allowed : max_allowed;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2478 else if (flags & GB_NO_ERROR_IF_BAD)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2479 ccpos = -1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2480 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2481 args_out_of_range (string, pos);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2482 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2483
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2484 return ccpos;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2485 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2486
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2487 Charcount
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2488 get_string_pos_char (Lisp_Object string, Lisp_Object pos, unsigned int flags)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2489 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2490 return get_string_pos_char_1 (string, pos, flags,
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2491 string_char_length (string));
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2492 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2493
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2494 Bytecount
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2495 get_string_pos_byte (Lisp_Object string, Lisp_Object pos, unsigned int flags)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2496 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2497 Charcount ccpos = get_string_pos_char (string, pos, flags);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2498 if (ccpos < 0) /* could happen with GB_NO_ERROR_IF_BAD */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2499 return -1;
793
e38acbeb1cae [xemacs-hg @ 2002-03-29 04:46:17 by ben]
ben
parents: 788
diff changeset
2500 return string_index_char_to_byte (string, ccpos);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2501 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2502
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2503 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2504 get_string_range_char (Lisp_Object string, Lisp_Object from, Lisp_Object to,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2505 Charcount *from_out, Charcount *to_out,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2506 unsigned int flags)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2507 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2508 Charcount min_allowed = 0;
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2509 Charcount max_allowed = string_char_length (string);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2510
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2511 if (NILP (from) && (flags & GB_ALLOW_NIL))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2512 *from_out = min_allowed;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2513 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2514 *from_out = get_string_pos_char_1 (string, from,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2515 flags | GB_NO_ERROR_IF_BAD,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2516 max_allowed);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2517
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2518 if (NILP (to) && (flags & GB_ALLOW_NIL))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2519 *to_out = max_allowed;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2520 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2521 *to_out = get_string_pos_char_1 (string, to,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2522 flags | GB_NO_ERROR_IF_BAD,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2523 max_allowed);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2524
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2525 if ((*from_out < 0 || *to_out < 0) && !(flags & GB_NO_ERROR_IF_BAD))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2526 args_out_of_range_3 (string, from, to);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2527
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2528 if (*from_out >= 0 && *to_out >= 0 && *from_out > *to_out)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2529 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2530 if (flags & GB_CHECK_ORDER)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2531 invalid_argument_2 ("start greater than end", from, to);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2532 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2533 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2534 Charbpos temp = *from_out;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2535 *from_out = *to_out;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2536 *to_out = temp;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2537 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2538 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2539 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2540
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2541 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2542 get_string_range_byte (Lisp_Object string, Lisp_Object from, Lisp_Object to,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2543 Bytecount *from_out, Bytecount *to_out,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2544 unsigned int flags)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2545 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2546 Charcount s, e;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2547
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2548 get_string_range_char (string, from, to, &s, &e, flags);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2549 if (s >= 0)
793
e38acbeb1cae [xemacs-hg @ 2002-03-29 04:46:17 by ben]
ben
parents: 788
diff changeset
2550 *from_out = string_index_char_to_byte (string, s);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2551 else /* could happen with GB_NO_ERROR_IF_BAD */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2552 *from_out = -1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2553 if (e >= 0)
793
e38acbeb1cae [xemacs-hg @ 2002-03-29 04:46:17 by ben]
ben
parents: 788
diff changeset
2554 *to_out = string_index_char_to_byte (string, e);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2555 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2556 *to_out = -1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2557
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2558 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2559
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2560 Charxpos
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2561 get_buffer_or_string_pos_char (Lisp_Object object, Lisp_Object pos,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2562 unsigned int flags)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2563 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2564 return STRINGP (object) ?
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2565 get_string_pos_char (object, pos, flags) :
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2566 get_buffer_pos_char (XBUFFER (object), pos, flags);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2567 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2568
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2569 Bytexpos
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2570 get_buffer_or_string_pos_byte (Lisp_Object object, Lisp_Object pos,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2571 unsigned int flags)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2572 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2573 return STRINGP (object) ?
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2574 get_string_pos_byte (object, pos, flags) :
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2575 get_buffer_pos_byte (XBUFFER (object), pos, flags);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2576 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2577
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2578 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2579 get_buffer_or_string_range_char (Lisp_Object object, Lisp_Object from,
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2580 Lisp_Object to, Charxpos *from_out,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2581 Charxpos *to_out, unsigned int flags)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2582 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2583 if (STRINGP (object))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2584 get_string_range_char (object, from, to, from_out, to_out, flags);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2585 else
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2586 get_buffer_range_char (XBUFFER (object), from, to, from_out, to_out,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2587 flags);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2588 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2589
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2590 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2591 get_buffer_or_string_range_byte (Lisp_Object object, Lisp_Object from,
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2592 Lisp_Object to, Bytexpos *from_out,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2593 Bytexpos *to_out, unsigned int flags)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2594 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2595 if (STRINGP (object))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2596 get_string_range_byte (object, from, to, from_out, to_out, flags);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2597 else
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2598 get_buffer_range_byte (XBUFFER (object), from, to, from_out, to_out,
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2599 flags);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2600 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2601
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2602 Charxpos
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2603 buffer_or_string_accessible_begin_char (Lisp_Object object)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2604 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2605 return STRINGP (object) ? 0 : BUF_BEGV (XBUFFER (object));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2606 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2607
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2608 Charxpos
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2609 buffer_or_string_accessible_end_char (Lisp_Object object)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2610 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2611 return STRINGP (object) ?
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2612 string_char_length (object) : BUF_ZV (XBUFFER (object));
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2613 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2614
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2615 Bytexpos
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2616 buffer_or_string_accessible_begin_byte (Lisp_Object object)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2617 {
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2618 return STRINGP (object) ? 0 : BYTE_BUF_BEGV (XBUFFER (object));
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2619 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2620
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2621 Bytexpos
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2622 buffer_or_string_accessible_end_byte (Lisp_Object object)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2623 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2624 return STRINGP (object) ?
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2625 XSTRING_LENGTH (object) : BYTE_BUF_ZV (XBUFFER (object));
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2626 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2627
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2628 Charxpos
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2629 buffer_or_string_absolute_begin_char (Lisp_Object object)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2630 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2631 return STRINGP (object) ? 0 : BUF_BEG (XBUFFER (object));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2632 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2633
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2634 Charxpos
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2635 buffer_or_string_absolute_end_char (Lisp_Object object)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2636 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2637 return STRINGP (object) ?
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2638 string_char_length (object) : BUF_Z (XBUFFER (object));
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2639 }
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2640
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2641 Bytexpos
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2642 buffer_or_string_absolute_begin_byte (Lisp_Object object)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2643 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2644 return STRINGP (object) ? 0 : BYTE_BUF_BEG (XBUFFER (object));
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2645 }
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2646
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2647 Bytexpos
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2648 buffer_or_string_absolute_end_byte (Lisp_Object object)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2649 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2650 return STRINGP (object) ?
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2651 XSTRING_LENGTH (object) : BYTE_BUF_Z (XBUFFER (object));
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2652 }
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2653
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2654 Charbpos
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2655 charbpos_clip_to_bounds (Charbpos lower, Charbpos num, Charbpos upper)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2656 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2657 return (num < lower ? lower :
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2658 num > upper ? upper :
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2659 num);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2660 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2661
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2662 Bytebpos
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2663 bytebpos_clip_to_bounds (Bytebpos lower, Bytebpos num, Bytebpos upper)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2664 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2665 return (num < lower ? lower :
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2666 num > upper ? upper :
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2667 num);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2668 }
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2669
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2670 Charxpos
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2671 charxpos_clip_to_bounds (Charxpos lower, Charxpos num, Charxpos upper)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2672 {
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2673 return (num < lower ? lower :
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2674 num > upper ? upper :
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2675 num);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2676 }
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2677
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2678 Bytexpos
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2679 bytexpos_clip_to_bounds (Bytexpos lower, Bytexpos num, Bytexpos upper)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2680 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2681 return (num < lower ? lower :
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2682 num > upper ? upper :
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2683 num);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2684 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2685
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2686 /* These could be implemented in terms of the get_buffer_or_string()
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2687 functions above, but those are complicated and handle lots of weird
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2688 cases stemming from uncertain external input. */
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2689
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2690 Charxpos
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2691 buffer_or_string_clip_to_accessible_char (Lisp_Object object, Charxpos pos)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2692 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2693 return (charxpos_clip_to_bounds
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2694 (pos, buffer_or_string_accessible_begin_char (object),
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2695 buffer_or_string_accessible_end_char (object)));
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2696 }
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2697
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2698 Bytexpos
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2699 buffer_or_string_clip_to_accessible_byte (Lisp_Object object, Bytexpos pos)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2700 {
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2701 return (bytexpos_clip_to_bounds
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2702 (pos, buffer_or_string_accessible_begin_byte (object),
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2703 buffer_or_string_accessible_end_byte (object)));
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2704 }
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2705
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2706 Charxpos
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2707 buffer_or_string_clip_to_absolute_char (Lisp_Object object, Charxpos pos)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2708 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2709 return (charxpos_clip_to_bounds
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2710 (pos, buffer_or_string_absolute_begin_char (object),
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2711 buffer_or_string_absolute_end_char (object)));
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2712 }
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2713
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2714 Bytexpos
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2715 buffer_or_string_clip_to_absolute_byte (Lisp_Object object, Bytexpos pos)
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2716 {
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2717 return (bytexpos_clip_to_bounds
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2718 (pos, buffer_or_string_absolute_begin_byte (object),
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2719 buffer_or_string_absolute_end_byte (object)));
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2720 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2721
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2722
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2723 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2724 /* Implement TO_EXTERNAL_FORMAT, TO_INTERNAL_FORMAT */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2725 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2726
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2727 typedef struct
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2728 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2729 Dynarr_declare (Intbyte_dynarr *);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2730 } Intbyte_dynarr_dynarr;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2731
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2732 typedef struct
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2733 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2734 Dynarr_declare (Extbyte_dynarr *);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2735 } Extbyte_dynarr_dynarr;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2736
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2737 static Extbyte_dynarr_dynarr *conversion_out_dynarr_list;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2738 static Intbyte_dynarr_dynarr *conversion_in_dynarr_list;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2739
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2740 static int dfc_convert_to_external_format_in_use;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2741 static int dfc_convert_to_internal_format_in_use;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2742
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2743 static Lisp_Object
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2744 dfc_convert_to_external_format_reset_in_use (Lisp_Object value)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2745 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2746 dfc_convert_to_external_format_in_use = XINT (value);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2747 return Qnil;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2748 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2749
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2750 static Lisp_Object
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2751 dfc_convert_to_internal_format_reset_in_use (Lisp_Object value)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2752 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2753 dfc_convert_to_internal_format_in_use = XINT (value);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2754 return Qnil;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2755 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2756
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2757 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2758 dfc_convert_to_external_format (dfc_conversion_type source_type,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2759 dfc_conversion_data *source,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2760 Lisp_Object coding_system,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2761 dfc_conversion_type sink_type,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2762 dfc_conversion_data *sink)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2763 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2764 /* It's guaranteed that many callers are not prepared for GC here,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2765 esp. given that this code conversion occurs in many very hidden
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2766 places. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2767 int count = begin_gc_forbidden ();
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2768 Extbyte_dynarr *conversion_out_dynarr;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2769
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2770 type_checking_assert
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2771 (((source_type == DFC_TYPE_DATA) ||
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2772 (source_type == DFC_TYPE_LISP_LSTREAM && LSTREAMP (source->lisp_object)) ||
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2773 (source_type == DFC_TYPE_LISP_STRING && STRINGP (source->lisp_object)))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2774 &&
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2775 ((sink_type == DFC_TYPE_DATA) ||
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2776 (sink_type == DFC_TYPE_LISP_LSTREAM && LSTREAMP (source->lisp_object))));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2777
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2778 record_unwind_protect (dfc_convert_to_external_format_reset_in_use,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2779 make_int (dfc_convert_to_external_format_in_use));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2780 if (Dynarr_length (conversion_out_dynarr_list) <=
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2781 dfc_convert_to_external_format_in_use)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2782 Dynarr_add (conversion_out_dynarr_list, Dynarr_new (Extbyte));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2783 conversion_out_dynarr = Dynarr_at (conversion_out_dynarr_list,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2784 dfc_convert_to_external_format_in_use);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2785 dfc_convert_to_external_format_in_use++;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2786 Dynarr_reset (conversion_out_dynarr);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2787
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2788 coding_system = get_coding_system_for_text_file (coding_system, 0);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2789
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2790 /* Here we optimize in the case where the coding system does no
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2791 conversion. However, we don't want to optimize in case the source
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2792 or sink is an lstream, since writing to an lstream can cause a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2793 garbage collection, and this could be problematic if the source
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2794 is a lisp string. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2795 if (source_type != DFC_TYPE_LISP_LSTREAM &&
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2796 sink_type != DFC_TYPE_LISP_LSTREAM &&
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2797 coding_system_is_binary (coding_system))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2798 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2799 const Intbyte *ptr;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2800 Bytecount len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2801
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2802 if (source_type == DFC_TYPE_LISP_STRING)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2803 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2804 ptr = XSTRING_DATA (source->lisp_object);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2805 len = XSTRING_LENGTH (source->lisp_object);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2806 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2807 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2808 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2809 ptr = (Intbyte *) source->data.ptr;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2810 len = source->data.len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2811 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2812
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2813 #ifdef MULE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2814 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2815 const Intbyte *end;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2816 for (end = ptr + len; ptr < end;)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2817 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2818 Intbyte c =
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2819 (byte_ascii_p (*ptr)) ? *ptr :
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2820 (*ptr == LEADING_BYTE_CONTROL_1) ? (*(ptr+1) - 0x20) :
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2821 (*ptr == LEADING_BYTE_LATIN_ISO8859_1) ? (*(ptr+1)) :
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2822 '~';
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2823
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2824 Dynarr_add (conversion_out_dynarr, (Extbyte) c);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2825 INC_CHARPTR (ptr);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2826 }
800
a5954632b187 [xemacs-hg @ 2002-03-31 08:27:14 by ben]
ben
parents: 793
diff changeset
2827 text_checking_assert (ptr == end);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2828 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2829 #else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2830 Dynarr_add_many (conversion_out_dynarr, ptr, len);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2831 #endif
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2832
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2833 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2834 #ifdef HAVE_WIN32_CODING_SYSTEMS
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2835 /* Optimize the common case involving Unicode where only ASCII is involved */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2836 else if (source_type != DFC_TYPE_LISP_LSTREAM &&
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2837 sink_type != DFC_TYPE_LISP_LSTREAM &&
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2838 dfc_coding_system_is_unicode (coding_system))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2839 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2840 const Intbyte *ptr, *p;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2841 Bytecount len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2842 const Intbyte *end;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2843
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2844 if (source_type == DFC_TYPE_LISP_STRING)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2845 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2846 ptr = XSTRING_DATA (source->lisp_object);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2847 len = XSTRING_LENGTH (source->lisp_object);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2848 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2849 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2850 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2851 ptr = (Intbyte *) source->data.ptr;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2852 len = source->data.len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2853 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2854 end = ptr + len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2855
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2856 for (p = ptr; p < end; p++)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2857 {
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2858 if (!byte_ascii_p (*p))
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2859 goto the_hard_way;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2860 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2861
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2862 for (p = ptr; p < end; p++)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2863 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2864 Dynarr_add (conversion_out_dynarr, (Extbyte) (*p));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2865 Dynarr_add (conversion_out_dynarr, (Extbyte) '\0');
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2866 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2867 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2868 #endif /* HAVE_WIN32_CODING_SYSTEMS */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2869 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2870 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2871 Lisp_Object streams_to_delete[3];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2872 int delete_count;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2873 Lisp_Object instream, outstream;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2874 Lstream *reader, *writer;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2875 struct gcpro gcpro1, gcpro2;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2876
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2877 #ifdef HAVE_WIN32_CODING_SYSTEMS
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2878 the_hard_way:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2879 #endif /* HAVE_WIN32_CODING_SYSTEMS */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2880 delete_count = 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2881 if (source_type == DFC_TYPE_LISP_LSTREAM)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2882 instream = source->lisp_object;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2883 else if (source_type == DFC_TYPE_DATA)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2884 streams_to_delete[delete_count++] = instream =
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2885 make_fixed_buffer_input_stream (source->data.ptr, source->data.len);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2886 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2887 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2888 type_checking_assert (source_type == DFC_TYPE_LISP_STRING);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2889 streams_to_delete[delete_count++] = instream =
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2890 /* This will GCPRO the Lisp string */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2891 make_lisp_string_input_stream (source->lisp_object, 0, -1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2892 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2893
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2894 if (sink_type == DFC_TYPE_LISP_LSTREAM)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2895 outstream = sink->lisp_object;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2896 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2897 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2898 type_checking_assert (sink_type == DFC_TYPE_DATA);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2899 streams_to_delete[delete_count++] = outstream =
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2900 make_dynarr_output_stream
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2901 ((unsigned_char_dynarr *) conversion_out_dynarr);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2902 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2903
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2904 streams_to_delete[delete_count++] = outstream =
800
a5954632b187 [xemacs-hg @ 2002-03-31 08:27:14 by ben]
ben
parents: 793
diff changeset
2905 make_coding_output_stream (XLSTREAM (outstream), coding_system,
a5954632b187 [xemacs-hg @ 2002-03-31 08:27:14 by ben]
ben
parents: 793
diff changeset
2906 CODING_ENCODE, 0);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2907
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2908 reader = XLSTREAM (instream);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2909 writer = XLSTREAM (outstream);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2910 /* decoding_stream will gc-protect outstream */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2911 GCPRO2 (instream, outstream);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2912
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2913 while (1)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2914 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2915 Bytecount size_in_bytes;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2916 char tempbuf[1024]; /* some random amount */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2917
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2918 size_in_bytes = Lstream_read (reader, tempbuf, sizeof (tempbuf));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2919
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2920 if (size_in_bytes == 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2921 break;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2922 else if (size_in_bytes < 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2923 signal_error (Qtext_conversion_error,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2924 "Error converting to external format", Qunbound);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2925
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2926 if (Lstream_write (writer, tempbuf, size_in_bytes) < 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2927 signal_error (Qtext_conversion_error,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2928 "Error converting to external format", Qunbound);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2929 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2930
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2931 /* Closing writer will close any stream at the other end of writer. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2932 Lstream_close (writer);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2933 Lstream_close (reader);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2934 UNGCPRO;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2935
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2936 /* The idea is that this function will create no garbage. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2937 while (delete_count)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2938 Lstream_delete (XLSTREAM (streams_to_delete [--delete_count]));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2939 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2940
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2941 unbind_to (count);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2942
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2943 if (sink_type != DFC_TYPE_LISP_LSTREAM)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2944 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2945 sink->data.len = Dynarr_length (conversion_out_dynarr);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2946 /* double zero-extend because we may be dealing with Unicode data */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2947 Dynarr_add (conversion_out_dynarr, '\0');
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2948 Dynarr_add (conversion_out_dynarr, '\0');
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2949 sink->data.ptr = Dynarr_atp (conversion_out_dynarr, 0);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2950 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2951 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2952
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2953 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2954 dfc_convert_to_internal_format (dfc_conversion_type source_type,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2955 dfc_conversion_data *source,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2956 Lisp_Object coding_system,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2957 dfc_conversion_type sink_type,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2958 dfc_conversion_data *sink)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2959 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2960 /* It's guaranteed that many callers are not prepared for GC here,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2961 esp. given that this code conversion occurs in many very hidden
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2962 places. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2963 int count = begin_gc_forbidden ();
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2964 Intbyte_dynarr *conversion_in_dynarr;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2965
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2966 type_checking_assert
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2967 ((source_type == DFC_TYPE_DATA ||
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2968 source_type == DFC_TYPE_LISP_LSTREAM)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2969 &&
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2970 (sink_type == DFC_TYPE_DATA ||
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2971 sink_type == DFC_TYPE_LISP_LSTREAM));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2972
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2973 record_unwind_protect (dfc_convert_to_internal_format_reset_in_use,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2974 make_int (dfc_convert_to_internal_format_in_use));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2975 if (Dynarr_length (conversion_in_dynarr_list) <=
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2976 dfc_convert_to_internal_format_in_use)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2977 Dynarr_add (conversion_in_dynarr_list, Dynarr_new (Intbyte));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2978 conversion_in_dynarr = Dynarr_at (conversion_in_dynarr_list,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2979 dfc_convert_to_internal_format_in_use);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2980 dfc_convert_to_internal_format_in_use++;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2981 Dynarr_reset (conversion_in_dynarr);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2982
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2983 coding_system = get_coding_system_for_text_file (coding_system, 1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2984
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2985 if (source_type != DFC_TYPE_LISP_LSTREAM &&
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2986 sink_type != DFC_TYPE_LISP_LSTREAM &&
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2987 coding_system_is_binary (coding_system))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2988 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2989 #ifdef MULE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2990 const Intbyte *ptr = (const Intbyte *) source->data.ptr;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2991 Bytecount len = source->data.len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2992 const Intbyte *end = ptr + len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2993
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2994 for (; ptr < end; ptr++)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2995 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2996 Intbyte c = *ptr;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2997
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
2998 if (byte_ascii_p (c))
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
2999 Dynarr_add (conversion_in_dynarr, c);
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3000 else if (byte_c1_p (c))
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3001 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3002 Dynarr_add (conversion_in_dynarr, LEADING_BYTE_CONTROL_1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3003 Dynarr_add (conversion_in_dynarr, c + 0x20);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3004 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3005 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3006 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3007 Dynarr_add (conversion_in_dynarr, LEADING_BYTE_LATIN_ISO8859_1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3008 Dynarr_add (conversion_in_dynarr, c);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3009 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3010 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3011 #else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3012 Dynarr_add_many (conversion_in_dynarr, source->data.ptr, source->data.len);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3013 #endif
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3014 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3015 #ifdef HAVE_WIN32_CODING_SYSTEMS
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3016 /* Optimize the common case involving Unicode where only ASCII/Latin-1 is involved */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3017 else if (source_type != DFC_TYPE_LISP_LSTREAM &&
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3018 sink_type != DFC_TYPE_LISP_LSTREAM &&
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3019 dfc_coding_system_is_unicode (coding_system))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3020 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3021 const Intbyte *ptr = (const Intbyte *) source->data.ptr + 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3022 Bytecount len = source->data.len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3023 const Intbyte *end = ptr + len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3024
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3025 if (len & 1)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3026 goto the_hard_way;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3027
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3028 for (; ptr < end; ptr += 2)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3029 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3030 if (*ptr)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3031 goto the_hard_way;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3032 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3033
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3034 ptr = (const Intbyte *) source->data.ptr;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3035 end = ptr + len;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3036
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3037 for (; ptr < end; ptr += 2)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3038 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3039 Intbyte c = *ptr;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3040
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3041 if (byte_ascii_p (c))
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3042 Dynarr_add (conversion_in_dynarr, c);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3043 #ifdef MULE
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3044 else if (byte_c1_p (c))
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3045 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3046 Dynarr_add (conversion_in_dynarr, LEADING_BYTE_CONTROL_1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3047 Dynarr_add (conversion_in_dynarr, c + 0x20);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3048 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3049 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3050 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3051 Dynarr_add (conversion_in_dynarr, LEADING_BYTE_LATIN_ISO8859_1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3052 Dynarr_add (conversion_in_dynarr, c);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3053 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3054 #endif /* MULE */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3055 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3056 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3057 #endif /* HAVE_WIN32_CODING_SYSTEMS */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3058 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3059 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3060 Lisp_Object streams_to_delete[3];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3061 int delete_count;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3062 Lisp_Object instream, outstream;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3063 Lstream *reader, *writer;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3064 struct gcpro gcpro1, gcpro2;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3065
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3066 #ifdef HAVE_WIN32_CODING_SYSTEMS
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3067 the_hard_way:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3068 #endif /* HAVE_WIN32_CODING_SYSTEMS */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3069 delete_count = 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3070 if (source_type == DFC_TYPE_LISP_LSTREAM)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3071 instream = source->lisp_object;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3072 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3073 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3074 type_checking_assert (source_type == DFC_TYPE_DATA);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3075 streams_to_delete[delete_count++] = instream =
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3076 make_fixed_buffer_input_stream (source->data.ptr, source->data.len);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3077 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3078
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3079 if (sink_type == DFC_TYPE_LISP_LSTREAM)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3080 outstream = sink->lisp_object;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3081 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3082 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3083 type_checking_assert (sink_type == DFC_TYPE_DATA);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3084 streams_to_delete[delete_count++] = outstream =
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3085 make_dynarr_output_stream
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3086 ((unsigned_char_dynarr *) conversion_in_dynarr);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3087 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3088
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3089 streams_to_delete[delete_count++] = outstream =
800
a5954632b187 [xemacs-hg @ 2002-03-31 08:27:14 by ben]
ben
parents: 793
diff changeset
3090 make_coding_output_stream (XLSTREAM (outstream), coding_system,
a5954632b187 [xemacs-hg @ 2002-03-31 08:27:14 by ben]
ben
parents: 793
diff changeset
3091 CODING_DECODE, 0);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3092
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3093 reader = XLSTREAM (instream);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3094 writer = XLSTREAM (outstream);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3095 /* outstream will gc-protect its sink stream, if necessary */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3096 GCPRO2 (instream, outstream);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3097
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3098 while (1)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3099 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3100 Bytecount size_in_bytes;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3101 char tempbuf[1024]; /* some random amount */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3102
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3103 size_in_bytes = Lstream_read (reader, tempbuf, sizeof (tempbuf));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3104
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3105 if (size_in_bytes == 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3106 break;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3107 else if (size_in_bytes < 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3108 signal_error (Qtext_conversion_error,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3109 "Error converting to internal format", Qunbound);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3110
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3111 if (Lstream_write (writer, tempbuf, size_in_bytes) < 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3112 signal_error (Qtext_conversion_error,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3113 "Error converting to internal format", Qunbound);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3114 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3115
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3116 /* Closing writer will close any stream at the other end of writer. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3117 Lstream_close (writer);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3118 Lstream_close (reader);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3119 UNGCPRO;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3120
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3121 /* The idea is that this function will create no garbage. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3122 while (delete_count)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3123 Lstream_delete (XLSTREAM (streams_to_delete [--delete_count]));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3124 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3125
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3126 unbind_to (count);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3127
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3128 if (sink_type != DFC_TYPE_LISP_LSTREAM)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3129 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3130 sink->data.len = Dynarr_length (conversion_in_dynarr);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3131 Dynarr_add (conversion_in_dynarr, '\0'); /* remember to NUL-terminate! */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3132 /* The macros don't currently distinguish between internal and
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3133 external sinks, and allocate and copy two extra bytes in both
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3134 cases. So we add a second zero, just like for external data
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3135 (in that case, because we may be converting to Unicode). */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3136 Dynarr_add (conversion_in_dynarr, '\0');
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3137 sink->data.ptr = Dynarr_atp (conversion_in_dynarr, 0);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3138 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3139 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3140
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3141
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3142 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3143 /* Basic Emchar functions */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3144 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3145
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3146 #ifdef MULE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3147
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3148 /* Convert a non-ASCII Mule character C into a one-character Mule-encoded
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3149 string in STR. Returns the number of bytes stored.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3150 Do not call this directly. Use the macro set_charptr_emchar() instead.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3151 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3152
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3153 Bytecount
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3154 non_ascii_set_charptr_emchar (Intbyte *str, Emchar c)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3155 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3156 Intbyte *p;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3157 Intbyte lb;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3158 int c1, c2;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3159 Lisp_Object charset;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3160
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3161 p = str;
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3162 BREAKUP_EMCHAR (c, charset, c1, c2);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3163 lb = emchar_leading_byte (c);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3164 if (leading_byte_private_p (lb))
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3165 *p++ = private_leading_byte_prefix (lb);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3166 *p++ = lb;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3167 if (EQ (charset, Vcharset_control_1))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3168 c1 += 0x20;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3169 *p++ = c1 | 0x80;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3170 if (c2)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3171 *p++ = c2 | 0x80;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3172
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3173 return (p - str);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3174 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3175
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3176 /* Return the first character from a Mule-encoded string in STR,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3177 assuming it's non-ASCII. Do not call this directly.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3178 Use the macro charptr_emchar() instead. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3179
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3180 Emchar
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3181 non_ascii_charptr_emchar (const Intbyte *str)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3182 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3183 Intbyte i0 = *str, i1, i2 = 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3184 Lisp_Object charset;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3185
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3186 if (i0 == LEADING_BYTE_CONTROL_1)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3187 return (Emchar) (*++str - 0x20);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3188
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3189 if (leading_byte_prefix_p (i0))
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3190 i0 = *++str;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3191
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3192 i1 = *++str & 0x7F;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3193
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3194 charset = charset_by_leading_byte (i0);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3195 if (XCHARSET_DIMENSION (charset) == 2)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3196 i2 = *++str & 0x7F;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3197
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3198 return make_emchar (charset, i1, i2);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3199 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3200
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3201 /* Return whether CH is a valid Emchar, assuming it's non-ASCII.
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3202 Do not call this directly. Use the macro valid_emchar_p() instead. */
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3203
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3204 int
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3205 non_ascii_valid_emchar_p (Emchar ch)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3206 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3207 int f1, f2, f3;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3208
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3209 /* Must have only lowest 19 bits set */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3210 if (ch & ~0x7FFFF)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3211 return 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3212
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3213 f1 = emchar_field1 (ch);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3214 f2 = emchar_field2 (ch);
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3215 f3 = emchar_field3 (ch);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3216
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3217 if (f1 == 0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3218 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3219 /* dimension-1 char */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3220 Lisp_Object charset;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3221
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3222 /* leading byte must be correct */
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3223 if (f2 < MIN_EMCHAR_FIELD2_OFFICIAL ||
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3224 (f2 > MAX_EMCHAR_FIELD2_OFFICIAL && f2 < MIN_EMCHAR_FIELD2_PRIVATE) ||
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3225 f2 > MAX_EMCHAR_FIELD2_PRIVATE)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3226 return 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3227 /* octet not out of range */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3228 if (f3 < 0x20)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3229 return 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3230 /* charset exists */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3231 /*
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3232 NOTE: This takes advantage of the fact that
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3233 FIELD2_TO_OFFICIAL_LEADING_BYTE and
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3234 FIELD2_TO_PRIVATE_LEADING_BYTE are the same.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3235 */
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3236 charset = charset_by_leading_byte (f2 + FIELD2_TO_OFFICIAL_LEADING_BYTE);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3237 if (EQ (charset, Qnil))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3238 return 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3239 /* check range as per size (94 or 96) of charset */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3240 return ((f3 > 0x20 && f3 < 0x7f) || XCHARSET_CHARS (charset) == 96);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3241 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3242 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3243 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3244 /* dimension-2 char */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3245 Lisp_Object charset;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3246
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3247 /* leading byte must be correct */
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3248 if (f1 < MIN_EMCHAR_FIELD1_OFFICIAL ||
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3249 (f1 > MAX_EMCHAR_FIELD1_OFFICIAL && f1 < MIN_EMCHAR_FIELD1_PRIVATE) ||
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3250 f1 > MAX_EMCHAR_FIELD1_PRIVATE)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3251 return 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3252 /* octets not out of range */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3253 if (f2 < 0x20 || f3 < 0x20)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3254 return 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3255
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3256 #ifdef ENABLE_COMPOSITE_CHARS
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3257 if (f1 + FIELD1_TO_OFFICIAL_LEADING_BYTE == LEADING_BYTE_COMPOSITE)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3258 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3259 if (UNBOUNDP (Fgethash (make_int (ch),
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3260 Vcomposite_char_char2string_hash_table,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3261 Qunbound)))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3262 return 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3263 return 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3264 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3265 #endif /* ENABLE_COMPOSITE_CHARS */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3266
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3267 /* charset exists */
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3268 if (f1 <= MAX_EMCHAR_FIELD1_OFFICIAL)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3269 charset =
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3270 charset_by_leading_byte (f1 + FIELD1_TO_OFFICIAL_LEADING_BYTE);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3271 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3272 charset =
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3273 charset_by_leading_byte (f1 + FIELD1_TO_PRIVATE_LEADING_BYTE);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3274
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3275 if (EQ (charset, Qnil))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3276 return 0;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3277 /* check range as per size (94x94 or 96x96) of charset */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3278 return ((f2 != 0x20 && f2 != 0x7F && f3 != 0x20 && f3 != 0x7F) ||
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3279 XCHARSET_CHARS (charset) == 96);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3280 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3281 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3282
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3283 /* Copy the character pointed to by SRC into DST. Do not call this
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3284 directly. Use the macro charptr_copy_emchar() instead.
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3285 Return the number of bytes copied. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3286
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3287 Bytecount
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3288 non_ascii_charptr_copy_emchar (const Intbyte *src, Intbyte *dst)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3289 {
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3290 Bytecount bytes = rep_bytes_by_first_byte (*src);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3291 Bytecount i;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3292 for (i = bytes; i; i--, dst++, src++)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3293 *dst = *src;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3294 return bytes;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3295 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3296
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3297 #endif /* MULE */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3298
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3299
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3300 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3301 /* streams of Emchars */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3302 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3303
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3304 #ifdef MULE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3305
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3306 /* Treat a stream as a stream of Emchar's rather than a stream of bytes.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3307 The functions below are not meant to be called directly; use
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3308 the macros in insdel.h. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3309
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3310 Emchar
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3311 Lstream_get_emchar_1 (Lstream *stream, int ch)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3312 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3313 Intbyte str[MAX_EMCHAR_LEN];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3314 Intbyte *strptr = str;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3315 Bytecount bytes;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3316
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3317 str[0] = (Intbyte) ch;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3318
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3319 for (bytes = rep_bytes_by_first_byte (ch) - 1; bytes; bytes--)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3320 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3321 int c = Lstream_getc (stream);
800
a5954632b187 [xemacs-hg @ 2002-03-31 08:27:14 by ben]
ben
parents: 793
diff changeset
3322 text_checking_assert (c >= 0);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3323 *++strptr = (Intbyte) c;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3324 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3325 return charptr_emchar (str);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3326 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3327
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3328 int
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3329 Lstream_fput_emchar (Lstream *stream, Emchar ch)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3330 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3331 Intbyte str[MAX_EMCHAR_LEN];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3332 Bytecount len = set_charptr_emchar (str, ch);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3333 return Lstream_write (stream, str, len);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3334 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3335
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3336 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3337 Lstream_funget_emchar (Lstream *stream, Emchar ch)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3338 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3339 Intbyte str[MAX_EMCHAR_LEN];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3340 Bytecount len = set_charptr_emchar (str, ch);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3341 Lstream_unread (stream, str, len);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3342 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3343
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3344 #endif /* MULE */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3345
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3346
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3347 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3348 /* Lisp primitives for working with characters */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3349 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3350
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3351 DEFUN ("make-char", Fmake_char, 2, 3, 0, /*
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3352 Make a character from CHARSET and octets ARG1 and ARG2.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3353 ARG2 is required only for characters from two-dimensional charsets.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3354
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3355 Each octet should be in the range 32 through 127 for a 96 or 96x96
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3356 charset and 33 through 126 for a 94 or 94x94 charset. (Most charsets
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3357 are either 96 or 94x94.) Note that this is 32 more than the values
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3358 typically given for 94x94 charsets. When two octets are required, the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3359 order is "standard" -- the same as appears in ISO-2022 encodings,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3360 reference tables, etc.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3361
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3362 \(Note the following non-obvious result: Computerized translation
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3363 tables often encode the two octets as the high and low bytes,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3364 respectively, of a hex short, while when there's only one octet, it
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3365 goes in the low byte. When decoding such a value, you need to treat
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3366 the two cases differently when calling make-char: One is (make-char
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3367 CHARSET HIGH LOW), the other is (make-char CHARSET LOW).)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3368
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3369 For example, (make-char 'latin-iso8859-2 185) or (make-char
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3370 'latin-iso8859-2 57) will return the Latin 2 character s with caron.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3371
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3372 As another example, the Japanese character for "kawa" (stream), which
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3373 looks something like this:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3374
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3375 | |
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3376 | | |
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3377 | | |
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3378 | | |
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3379 / |
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3380
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3381 appears in the Unicode Standard (version 2.0) on page 7-287 with the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3382 following values (see also page 7-4):
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3383
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3384 U 5DDD (Unicode)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3385 G 0-2008 (GB 2312-80)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3386 J 0-3278 (JIS X 0208-1990)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3387 K 0-8425 (KS C 5601-1987)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3388 B A474 (Big Five)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3389 C 1-4455 (CNS 11643-1986 (1st plane))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3390 A 213C34 (ANSI Z39.64-1989)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3391
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3392 These are equivalent to:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3393
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3394 \(make-char 'chinese-gb2312 52 40)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3395 \(make-char 'japanese-jisx0208 64 110)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3396 \(make-char 'korean-ksc5601 116 57)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3397 \(make-char 'chinese-cns11643-1 76 87)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3398 \(decode-big5-char '(164 . 116))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3399
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3400 \(All codes above are two decimal numbers except for Big Five and ANSI
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3401 Z39.64, which we don't support. We add 32 to each of the decimal
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3402 numbers. Big Five is split in a rather hackish fashion into two
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3403 charsets, `big5-1' and `big5-2', due to its excessive size -- 94x157,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3404 with the first codepoint in the range 0xA1 to 0xFE and the second in
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3405 the range 0x40 to 0x7E or 0xA1 to 0xFE. `decode-big5-char' is used to
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3406 generate the char from its codes, and `encode-big5-char' extracts the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3407 codes.)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3408
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3409 When compiled without MULE, this function does not do much, but it's
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3410 provided for compatibility. In this case, the following CHARSET symbols
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3411 are allowed:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3412
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3413 `ascii' -- ARG1 should be in the range 0 through 127.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3414 `control-1' -- ARG1 should be in the range 128 through 159.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3415 else -- ARG1 is coerced to be between 0 and 255, and then the high
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3416 bit is set.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3417
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3418 `int-to-char of the resulting ARG1' is returned, and ARG2 is always ignored.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3419 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3420 (charset, arg1, arg2))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3421 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3422 #ifdef MULE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3423 Lisp_Charset *cs;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3424 int a1, a2;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3425 int lowlim, highlim;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3426
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3427 charset = Fget_charset (charset);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3428 cs = XCHARSET (charset);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3429
788
026c5bf9c134 [xemacs-hg @ 2002-03-21 07:29:57 by ben]
ben
parents: 771
diff changeset
3430 get_charset_limits (charset, &lowlim, &highlim);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3431
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3432 CHECK_INT (arg1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3433 /* It is useful (and safe, according to Olivier Galibert) to strip
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3434 the 8th bit off ARG1 and ARG2 because it allows programmers to
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3435 write (make-char 'latin-iso8859-2 CODE) where code is the actual
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3436 Latin 2 code of the character. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3437 a1 = XINT (arg1) & 0x7f;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3438 if (a1 < lowlim || a1 > highlim)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3439 args_out_of_range_3 (arg1, make_int (lowlim), make_int (highlim));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3440
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3441 if (CHARSET_DIMENSION (cs) == 1)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3442 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3443 if (!NILP (arg2))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3444 invalid_argument
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3445 ("Charset is of dimension one; second octet must be nil", arg2);
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3446 return make_char (make_emchar (charset, a1, 0));
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3447 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3448
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3449 CHECK_INT (arg2);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3450 a2 = XINT (arg2) & 0x7f;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3451 if (a2 < lowlim || a2 > highlim)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3452 args_out_of_range_3 (arg2, make_int (lowlim), make_int (highlim));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3453
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3454 return make_char (make_emchar (charset, a1, a2));
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3455 #else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3456 int a1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3457 int lowlim, highlim;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3458
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3459 if (EQ (charset, Qascii)) lowlim = 0, highlim = 127;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3460 else if (EQ (charset, Qcontrol_1)) lowlim = 0, highlim = 31;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3461 else lowlim = 0, highlim = 127;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3462
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3463 CHECK_INT (arg1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3464 /* It is useful (and safe, according to Olivier Galibert) to strip
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3465 the 8th bit off ARG1 and ARG2 because it allows programmers to
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3466 write (make-char 'latin-iso8859-2 CODE) where code is the actual
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3467 Latin 2 code of the character. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3468 a1 = XINT (arg1) & 0x7f;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3469 if (a1 < lowlim || a1 > highlim)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3470 args_out_of_range_3 (arg1, make_int (lowlim), make_int (highlim));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3471
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3472 if (EQ (charset, Qascii))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3473 return make_char (a1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3474 return make_char (a1 + 128);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3475 #endif /* MULE */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3476 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3477
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3478 #ifdef MULE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3479
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3480 DEFUN ("char-charset", Fchar_charset, 1, 1, 0, /*
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3481 Return the character set of char CH.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3482 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3483 (ch))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3484 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3485 CHECK_CHAR_COERCE_INT (ch);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3486
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3487 return XCHARSET_NAME (charset_by_leading_byte
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3488 (emchar_leading_byte (XCHAR (ch))));
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3489 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3490
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3491 DEFUN ("char-octet", Fchar_octet, 1, 2, 0, /*
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3492 Return the octet numbered N (should be 0 or 1) of char CH.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3493 N defaults to 0 if omitted.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3494 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3495 (ch, n))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3496 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3497 Lisp_Object charset;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3498 int octet0, octet1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3499
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3500 CHECK_CHAR_COERCE_INT (ch);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3501
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3502 BREAKUP_EMCHAR (XCHAR (ch), charset, octet0, octet1);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3503
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3504 if (NILP (n) || EQ (n, Qzero))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3505 return make_int (octet0);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3506 else if (EQ (n, make_int (1)))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3507 return make_int (octet1);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3508 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3509 invalid_constant ("Octet number must be 0 or 1", n);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3510 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3511
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3512 DEFUN ("split-char", Fsplit_char, 1, 1, 0, /*
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3513 Return list of charset and one or two position-codes of CHAR.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3514 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3515 (character))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3516 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3517 /* This function can GC */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3518 struct gcpro gcpro1, gcpro2;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3519 Lisp_Object charset = Qnil;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3520 Lisp_Object rc = Qnil;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3521 int c1, c2;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3522
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3523 GCPRO2 (charset, rc);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3524 CHECK_CHAR_COERCE_INT (character);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3525
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3526 BREAKUP_EMCHAR (XCHAR (character), charset, c1, c2);
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3527
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3528 if (XCHARSET_DIMENSION (Fget_charset (charset)) == 2)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3529 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3530 rc = list3 (XCHARSET_NAME (charset), make_int (c1), make_int (c2));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3531 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3532 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3533 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3534 rc = list2 (XCHARSET_NAME (charset), make_int (c1));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3535 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3536 UNGCPRO;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3537
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3538 return rc;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3539 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3540
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3541 #endif /* MULE */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3542
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3543
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3544 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3545 /* composite character functions */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3546 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3547
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3548 #ifdef ENABLE_COMPOSITE_CHARS
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3549
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3550 Emchar
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3551 lookup_composite_char (Intbyte *str, int len)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3552 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3553 Lisp_Object lispstr = make_string (str, len);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3554 Lisp_Object ch = Fgethash (lispstr,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3555 Vcomposite_char_string2char_hash_table,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3556 Qunbound);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3557 Emchar emch;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3558
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3559 if (UNBOUNDP (ch))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3560 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3561 if (composite_char_row_next >= 128)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3562 invalid_operation ("No more composite chars available", lispstr);
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3563 emch = make_emchar (Vcharset_composite, composite_char_row_next,
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3564 composite_char_col_next);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3565 Fputhash (make_char (emch), lispstr,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3566 Vcomposite_char_char2string_hash_table);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3567 Fputhash (lispstr, make_char (emch),
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3568 Vcomposite_char_string2char_hash_table);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3569 composite_char_col_next++;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3570 if (composite_char_col_next >= 128)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3571 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3572 composite_char_col_next = 32;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3573 composite_char_row_next++;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3574 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3575 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3576 else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3577 emch = XCHAR (ch);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3578 return emch;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3579 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3580
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3581 Lisp_Object
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3582 composite_char_string (Emchar ch)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3583 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3584 Lisp_Object str = Fgethash (make_char (ch),
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3585 Vcomposite_char_char2string_hash_table,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3586 Qunbound);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3587 assert (!UNBOUNDP (str));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3588 return str;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3589 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3590
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3591 DEFUN ("make-composite-char", Fmake_composite_char, 1, 1, 0, /*
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3592 Convert a string into a single composite character.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3593 The character is the result of overstriking all the characters in
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3594 the string.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3595 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3596 (string))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3597 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3598 CHECK_STRING (string);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3599 return make_char (lookup_composite_char (XSTRING_DATA (string),
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3600 XSTRING_LENGTH (string)));
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3601 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3602
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3603 DEFUN ("composite-char-string", Fcomposite_char_string, 1, 1, 0, /*
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3604 Return a string of the characters comprising a composite character.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3605 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3606 (ch))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3607 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3608 Emchar emch;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3609
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3610 CHECK_CHAR (ch);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3611 emch = XCHAR (ch);
826
6728e641994e [xemacs-hg @ 2002-05-05 11:30:15 by ben]
ben
parents: 814
diff changeset
3612 if (emchar_leading_byte (emch) != LEADING_BYTE_COMPOSITE)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3613 invalid_argument ("Must be composite char", ch);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3614 return composite_char_string (emch);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3615 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3616 #endif /* ENABLE_COMPOSITE_CHARS */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3617
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3618
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3619 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3620 /* initialization */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3621 /************************************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3622
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3623 void
814
a634e3b7acc8 [xemacs-hg @ 2002-04-14 12:41:59 by ben]
ben
parents: 801
diff changeset
3624 reinit_eistring_once_early (void)
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3625 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3626 the_eistring_malloc_zero_init = the_eistring_zero_init;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3627 the_eistring_malloc_zero_init.mallocp_ = 1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3628 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3629
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3630 void
814
a634e3b7acc8 [xemacs-hg @ 2002-04-14 12:41:59 by ben]
ben
parents: 801
diff changeset
3631 init_eistring_once_early (void)
a634e3b7acc8 [xemacs-hg @ 2002-04-14 12:41:59 by ben]
ben
parents: 801
diff changeset
3632 {
a634e3b7acc8 [xemacs-hg @ 2002-04-14 12:41:59 by ben]
ben
parents: 801
diff changeset
3633 reinit_eistring_once_early ();
a634e3b7acc8 [xemacs-hg @ 2002-04-14 12:41:59 by ben]
ben
parents: 801
diff changeset
3634 }
a634e3b7acc8 [xemacs-hg @ 2002-04-14 12:41:59 by ben]
ben
parents: 801
diff changeset
3635
a634e3b7acc8 [xemacs-hg @ 2002-04-14 12:41:59 by ben]
ben
parents: 801
diff changeset
3636 void
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3637 syms_of_text (void)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3638 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3639 DEFSUBR (Fmake_char);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3640
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3641 #ifdef MULE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3642 DEFSUBR (Fchar_charset);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3643 DEFSUBR (Fchar_octet);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3644 DEFSUBR (Fsplit_char);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3645
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3646 #ifdef ENABLE_COMPOSITE_CHARS
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3647 DEFSUBR (Fmake_composite_char);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3648 DEFSUBR (Fcomposite_char_string);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3649 #endif
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3650 #endif /* MULE */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3651 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3652
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3653 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3654 reinit_vars_of_text (void)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3655 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3656 int i;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3657
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3658 conversion_in_dynarr_list = Dynarr_new2 (Intbyte_dynarr_dynarr,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3659 Intbyte_dynarr *);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3660 conversion_out_dynarr_list = Dynarr_new2 (Extbyte_dynarr_dynarr,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3661 Extbyte_dynarr *);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3662
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3663 /* #### Olivier, why does this need to be reinitted? */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3664 for (i = 0; i <= MAX_BYTEBPOS_GAP_SIZE_3; i++)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3665 three_to_one_table[i] = i / 3;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3666 }
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3667
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3668 void
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3669 vars_of_text (void)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3670 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3671 reinit_vars_of_text ();
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3672
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3673 #ifdef ENABLE_COMPOSITE_CHARS
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3674 /* #### not dumped properly */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3675 composite_char_row_next = 32;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3676 composite_char_col_next = 32;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3677
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3678 Vcomposite_char_string2char_hash_table =
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3679 make_lisp_hash_table (500, HASH_TABLE_NON_WEAK, HASH_TABLE_EQUAL);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3680 Vcomposite_char_char2string_hash_table =
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3681 make_lisp_hash_table (500, HASH_TABLE_NON_WEAK, HASH_TABLE_EQ);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3682 staticpro (&Vcomposite_char_string2char_hash_table);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3683 staticpro (&Vcomposite_char_char2string_hash_table);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3684 #endif /* ENABLE_COMPOSITE_CHARS */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents:
diff changeset
3685 }