annotate src/text.h @ 5879:77d7b77909c2
Move extents.c to working in byte positions only; fix a bug, extent_detach()
src/ChangeLog addition:
2015-03-27 Aidan Kehoe <kehoea@parhasard.net>
Fix a small bug, extent_detach(); minimise needless char-byte
conversion, extents.c, sticking to byte positions in general in
this file.
* extents.c:
* extents.c (signal_single_extent_changed):
Pass byte endpoints to
gutter_extent_signal_changed_region_maybe(),
buffer_extent_signal_changed_region().
* extents.c (extent_detach):
Call signal_extent_changed() correctly, pass both extent endpoints
rather than just the byte and character variants of the start.
* extents.c (struct report_extent_modification_closure):
Do this in terms of byte positions.
* extents.c (report_extent_modification_mapper):
Use byte positions, only converting to characters when we are
definitely calling Lisp.
* extents.c (report_extent_modification):
Use byte positions in this API, move the byte-char conversion to
our callers, simplifying extents.c (it all now works in byte
positions).
* extents.h:
Update report_extent_modification's prototype.
* gutter.c (gutter_extent_signal_changed_region_maybe):
Use byte positions here, avoids needless byte-char conversion.
* gutter.h:
Update the prototype here.
* insdel.c:
* insdel.c (buffer_extent_signal_changed_region):
Implement this in terms of byte positions.
* insdel.c (signal_before_change):
* insdel.c (signal_after_change):
Call report_extent_modification() with byte positions, doing the
char->byte conversion here rather than leaving it to extents.c.
* insdel.h:
* insdel.h (struct each_buffer_change_data):
The extent unchanged info now describes bytecounts.
| author | Aidan Kehoe <kehoea@parhasard.net> |
|---|---|
| date | Fri, 27 Mar 2015 23:39:49 +0000 |
| parents | 15041705c196 |
| children |
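
The ChangeLog's motivation for staying in byte positions can be illustrated with a minimal sketch. In a variable-width encoding, turning a character position into a byte position means walking the text, so deferring that work until Lisp is actually called avoids repeated scans. The sketch below uses UTF-8 as a stand-in for the Mule internal encoding, and `char_to_byte_pos` is a hypothetical name, not an XEmacs function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of XEmacs: convert a character position
   to a byte position in UTF-8 text by walking from the start.  The cost
   grows with CHARPOS, which is why callers prefer to keep byte positions
   around instead of converting back and forth.  */
static size_t
char_to_byte_pos (const unsigned char *text, size_t nbytes, size_t charpos)
{
  size_t bytepos = 0;

  assert (text != NULL);
  while (charpos > 0 && bytepos < nbytes)
    {
      /* Step over the first byte of the character, then over any
         continuation bytes (those of the form 10xxxxxx).  */
      bytepos++;
      while (bytepos < nbytes && (text[bytepos] & 0xC0) == 0x80)
        bytepos++;
      charpos--;
    }
  return bytepos;
}
```

For the four-byte text `a`, `é` (two bytes), `z`, character position 2 maps to byte position 3. A byte position, by contrast, can be used to index the text directly.
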
| rev | line source |
|---|---|
| 771 | 1 /* Header file for text manipulation primitives and macros. |
| 2 Copyright (C) 1985-1995 Free Software Foundation, Inc. | |
| 3 Copyright (C) 1995 Sun Microsystems, Inc. | |
|
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4853
diff
changeset
|
4 Copyright (C) 2000, 2001, 2002, 2003, 2004, 2005, 2010 Ben Wing. |
| 771 | 5 |
| 6 This file is part of XEmacs. | |
| 7 | |
|
5402
308d34e9f07d
Changed bulk of GPLv2 or later files identified by script
Mats Lidell <matsl@xemacs.org>
parents:
5254
diff
changeset
|
8 XEmacs is free software: you can redistribute it and/or modify it |
| 771 | 9 under the terms of the GNU General Public License as published by the |
| 5402 | 10 Free Software Foundation, either version 3 of the License, or (at your |
| 11 option) any later version. | |
| 771 | 12 |
| 13 XEmacs is distributed in the hope that it will be useful, but WITHOUT | |
| 14 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or | |
| 15 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License | |
| 16 for more details. | |
| 17 | |
| 18 You should have received a copy of the GNU General Public License | |
| 5402 | 19 along with XEmacs. If not, see <http://www.gnu.org/licenses/>. */ |
| 771 | 20 |
| 21 /* Synched up with: FSF 19.30. */ | |
| 22 | |
| 23 /* Authorship: | |
| 24 | |
| 25 Mostly written by Ben Wing, starting around 1995. | |
| 26 Current TO_IN/EXTERNAL_FORMAT macros written by Martin Buchholz, | |
| 27 designed by Ben Wing based on earlier macros by Ben Wing. | |
| 28 Separated out June 18, 2000 from buffer.h into text.h. | |
| 29 */ | |
| 30 | |
| 31 #ifndef INCLUDED_text_h_ | |
| 32 #define INCLUDED_text_h_ | |
| 33 | |
| 912 | 34 #ifdef HAVE_WCHAR_H |
| 771 | 35 #include <wchar.h> |
| 912 | 36 #else |
| 1257 | 37 size_t wcslen (const wchar_t *); |
| 912 | 38 #endif |
| 1204 | 39 #ifndef HAVE_STRLWR |
| 1257 | 40 char *strlwr (char *); |
| 1204 | 41 #endif |
| 42 #ifndef HAVE_STRUPR | |
| 1257 | 43 char *strupr (char *); |
| 1204 | 44 #endif |
| 771 | 45 |
| 1743 | 46 BEGIN_C_DECLS |
| 1650 | 47 |
|
5200
70ed8a0d8da8
port Mule-ization of mule-wnnfns.c from ben-unicode-internal
Ben Wing <ben@xemacs.org>
parents:
5169
diff
changeset
|
48 /* Forward compatibility from ben-unicode-internal: Following used for |
| 49 functions that do character conversion and need to handle errors. */ | |
| 50 | |
| 51 enum converr | |
| 52 { | |
| 53 /* ---- Basic actions ---- */ | |
| 54 | |
| 55 /* Do nothing upon failure and return a failure indication. | |
| 56 Same as what happens when the *_raw() version is called. */ | |
| 57 CONVERR_FAIL, | |
| 58 /* abort() on failure, i.e. crash. */ | |
| 59 CONVERR_ABORT, | |
| 60 /* Signal a Lisp error. */ | |
| 61 CONVERR_ERROR, | |
| 62 /* Try to "recover" and continue processing. Currently this is always | |
| 63 the same as CONVERR_SUBSTITUTE, where one of the substitution | |
| 64 characters defined below (CANT_CONVERT_*) is used. */ | |
| 65 CONVERR_SUCCEED, | |
| 66 | |
| 67 /* ---- More specific actions ---- */ | |
| 68 | |
| 69 /* Substitute something (0xFFFD, the Unicode replacement character, | |
| 70 when converting to Unicode or to a Unicode-internal Ichar, JISX0208 | |
| 71 GETA mark when converting to non-Mule Ichar). */ | |
| 72 CONVERR_SUBSTITUTE, | |
| 73 /* Use private Unicode space when converting to Unicode. */ | |
| 74 CONVERR_USE_PRIVATE | |
| 75 }; | |
| 76 | |
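
As a rough illustration of how a conversion routine might dispatch on an error policy like `enum converr`, here is a minimal sketch. Everything in it is an assumption made for illustration: the enum is a cut-down copy (the Lisp-error and private-space cases are omitted because they need the rest of XEmacs), `code_to_ascii` is a hypothetical function, and `'?'` stands in for a real substitution character.

```c
#include <assert.h>
#include <stdlib.h>

/* Cut-down copy of the enum above, for illustration only.  */
enum converr
{
  CONVERR_FAIL,
  CONVERR_ABORT,
  CONVERR_SUCCEED,
  CONVERR_SUBSTITUTE
};

/* Hypothetical converter, not an XEmacs function: map a code point to a
   7-bit ASCII byte.  Returns the byte, or -1 as the failure indication.  */
static int
code_to_ascii (long code, enum converr fail)
{
  if (code >= 0 && code < 0x80)
    return (int) code;

  switch (fail)
    {
    case CONVERR_ABORT:
      abort ();                 /* crash on failure */
    case CONVERR_SUCCEED:       /* currently the same as substitution */
    case CONVERR_SUBSTITUTE:
      return '?';               /* stand-in substitution character */
    case CONVERR_FAIL:
    default:
      return -1;                /* do nothing, report failure */
    }
}
```

With this shape the caller picks the policy per call: `code_to_ascii (0xE9, CONVERR_FAIL)` reports failure, while the same input with `CONVERR_SUBSTITUTE` yields the substitution character.
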
|
5092
3aa3888729c3
move inclusion point of text.h to clean things up a bit
Ben Wing <ben@xemacs.org>
parents:
5027
diff
changeset
|
77 /************************************************************************/ |
| 78 /* A short intro to the format of text and of characters */ | |
| 79 /************************************************************************/ | |
| 80 | |
| 81 /* | |
| 82 "internally formatted text" and the term "internal format" in | |
| 83 general are likely to refer to the format of text in buffers and | |
| 84 strings; "externally formatted text" and the term "external format" | |
| 85 refer to any text format used in the O.S. or elsewhere outside of | |
| 86 XEmacs. The format of text and of a character are related and | |
| 87 there must be a one-to-one relationship (hopefully through a | |
| 88 relatively simple algorithmic means of conversion) between a string | |
| 89 of text and an equivalent array of characters, but the conversion | |
| 90 between the two is NOT necessarily trivial. | |
| 91 | |
| 92 In a non-Mule XEmacs, allowed characters are numbered 0 through | |
| 93 255, where no fixed meaning is assigned to them, but (when | |
| 94 representing text, rather than bytes in a binary file) in practice | |
| 95 the lower half represents ASCII and the upper half some other 8-bit | |
| 96 character set (chosen by setting the font, case tables, syntax | |
| 97 tables, etc. appropriately for the character set through ad-hoc | |
| 98 means such as the `iso-8859-1' file and the | |
| 99 `standard-display-european' function). | |
| 100 | |
| 101 For more info, see `text.c' and the Internals Manual. | |
| 102 */ | |
| 103 | |
| 771 | 104 /* ---------------------------------------------------------------------- */ |
| 105 /* Super-basic character properties */ | |
| 106 /* ---------------------------------------------------------------------- */ | |
| 107 | |
| 108 /* These properties define the specifics of how our current encoding fits | |
| 109 in the basic model used for the encoding. Because this model is the same | |
| 110 as is used for UTF-8, all these properties could be defined for it, too. | |
| 111 This would instantly make the rest of this file work with UTF-8 (with | |
| 112 the exception of a few called functions that would need to be redefined). | |
| 113 | |
| 114 (UTF-2000 implementers, take note!) | |
| 115 */ | |
| 116 | |
| 117 /* If you want more than this, you need to include charset.h */ | |
| 118 | |
| 119 #ifndef MULE | |
| 120 | |
| 826 | 121 #define rep_bytes_by_first_byte(fb) 1 |
| 122 #define byte_ascii_p(byte) 1 | |
| 867 | 123 #define MAX_ICHAR_LEN 1 |
|
5863
15041705c196
Provide `char-code-limit', implement the GNU equivalent in terms of it.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5820
diff
changeset
|
124 /* Exclusive upper bound on character codes. */ |
| 125 #define CHAR_CODE_LIMIT 0x100 | |
| 771 | 126 |
| 127 #else /* MULE */ | |
| 128 | |
| 129 /* These are carefully designed to work if BYTE is signed or unsigned. */ | |
| 130 /* Note that SPC and DEL are considered ASCII, not control. */ | |
| 131 | |
| 826 | 132 #define byte_ascii_p(byte) (((byte) & ~0x7f) == 0) |
| 133 #define byte_c0_p(byte) (((byte) & ~0x1f) == 0) | |
| 134 #define byte_c1_p(byte) (((byte) & ~0x1f) == 0x80) | |
| 771 | 135 |
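
The masking trick in these predicates can be checked with a small sketch. The macro bodies are copied from the lines above; the two wrapper functions are hypothetical and exist only to force a signed or an unsigned argument type.

```c
#include <assert.h>

/* Same definitions as in the MULE branch above.  */
#define byte_ascii_p(byte) (((byte) & ~0x7f) == 0)
#define byte_c0_p(byte)    (((byte) & ~0x1f) == 0)
#define byte_c1_p(byte)    (((byte) & ~0x1f) == 0x80)

/* Hypothetical wrappers, used only to fix the argument type.  A negative
   signed char promotes to a negative int, whose high bits survive the
   mask, so the comparison with 0 still fails as it should.  */
static int ascii_signed_p (signed char b) { return byte_ascii_p (b); }
static int ascii_unsigned_p (unsigned char b) { return byte_ascii_p (b); }
```

Masking off the low bits and comparing with zero gives the same answer for the byte 0xFF whether it arrives as -1 or as 255. `byte_c1_p` compares against 0x80, so this sketch only exercises it with values in the range 0 to 255; a negative signed value would not match.
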
| 136 /* Does BYTE represent the first byte of a character? */ | |
| 137 | |
| 826 | 138 #ifdef ERROR_CHECK_TEXT |
| 139 | |
| 140 DECLARE_INLINE_HEADER ( | |
| 141 int | |
| 867 | 142 ibyte_first_byte_p_1 (int byte, const char *file, int line) |
| 826 | 143 ) |
| 144 { | |
| 145 assert_at_line (byte >= 0 && byte < 256, file, line); | |
| 146 return byte < 0xA0; | |
| 147 } | |
| 148 | |
| 867 | 149 #define ibyte_first_byte_p(byte) \ |
| 150 ibyte_first_byte_p_1 (byte, __FILE__, __LINE__) | |
| 826 | 151 |
| 152 #else | |
| 153 | |
| 867 | 154 #define ibyte_first_byte_p(byte) ((byte) < 0xA0) |
| 826 | 155 |
| 156 #endif | |
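
One use of a first-byte test like this is finding the start of the character that contains an arbitrary byte offset. The sketch below assumes the non-error-checking definition shown above; `char_start_offset` is a hypothetical helper, not an XEmacs function, and the sample bytes in the note after it are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Non-error-checking definition from above.  */
#define ibyte_first_byte_p(byte) ((byte) < 0xA0)

/* Hypothetical helper, not an XEmacs function: return the offset of the
   first byte of the character that contains offset POS.  */
static size_t
char_start_offset (const unsigned char *text, size_t pos)
{
  assert (text != NULL);
  /* Non-first bytes are all >= 0xA0, so step back until we see a byte
     below that.  Offset 0 is assumed to start a character.  */
  while (pos > 0 && !ibyte_first_byte_p (text[pos]))
    pos--;
  return pos;
}
```

Given the bytes `'a'`, 0x81, 0xE9, `'b'`, offset 2 (a trailing byte) resolves to offset 1, where the leading byte 0x81 sits.
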
| 157 | |
| 158 #ifdef ERROR_CHECK_TEXT | |
| 771 | 159 |
| 160 /* Does BYTE represent the first byte of a multi-byte character? */ | |
| 161 | |
| 826 | 162 DECLARE_INLINE_HEADER ( |
| 163 int | |
| 867 | 164 ibyte_leading_byte_p_1 (int byte, const char *file, int line) |
| 826 | 165 ) |
| 166 { | |
| 167 assert_at_line (byte >= 0 && byte < 256, file, line); | |
| 168 return byte_c1_p (byte); | |
| 169 } | |
| 170 | |
| 867 | 171 #define ibyte_leading_byte_p(byte) \ |
| 172 ibyte_leading_byte_p_1 (byte, __FILE__, __LINE__) | |
| 826 | 173 |
| 174 #else | |
| 175 | |
| 867 | 176 #define ibyte_leading_byte_p(byte) byte_c1_p (byte) |
| 826 | 177 |
| 178 #endif | |
| 771 | 179 |
| 180 /* Table of number of bytes in the string representation of a character | |
| 181 indexed by the first byte of that representation. | |
| 182 | |
| 183 This value can be derived in other ways -- e.g. something like | |
| 826 | 184 XCHARSET_REP_BYTES (charset_by_leading_byte (first_byte)) |
| 771 | 185 but it's faster this way. */ |
| 1632 | 186 extern MODULE_API const Bytecount rep_bytes_by_first_byte[0xA0]; |
| 771 | 187 |
| 188 /* Number of bytes in the string representation of a character. */ | |
| 788 | 189 |
| 800 | 190 #ifdef ERROR_CHECK_TEXT |
| 788 | 191 |
| 826 | 192 DECLARE_INLINE_HEADER ( |
| 193 Bytecount | |
| 194 rep_bytes_by_first_byte_1 (int fb, const char *file, int line) | |
| 195 ) | |
| 771 | 196 { |
| 826 | 197 assert_at_line (fb >= 0 && fb < 0xA0, file, line); |
| 771 | 198 return rep_bytes_by_first_byte[fb]; |
| 199 } | |
| 200 | |
| 826 | 201 #define rep_bytes_by_first_byte(fb) \ |
| 202 rep_bytes_by_first_byte_1 (fb, __FILE__, __LINE__) | |
| 788 | 203 |
| 800 | 204 #else /* ERROR_CHECK_TEXT */ |
| 788 | 205 |
| 826 | 206 #define rep_bytes_by_first_byte(fb) (rep_bytes_by_first_byte[fb]) |
| 788 | 207 |
| 800 | 208 #endif /* ERROR_CHECK_TEXT */ |
| 788 | 209 |
| 826 | 210 /* Is this character represented by more than one byte in a string in the |
| 211 default format? */ | |
| 212 | |
| 867 | 213 #define ichar_multibyte_p(c) ((c) >= 0x80) |
| 214 | |
| 215 #define ichar_ascii_p(c) (!ichar_multibyte_p (c)) | |
| 826 | 216 |
| 5863 | 217 /* Maximum number of bytes per Ichar when represented as text. */ |
| 867 | 218 #define MAX_ICHAR_LEN 4 |
| 771 | 219 |
| 5863 | 220 /* Exclusive upper bound on char codes. */ |
| 221 #define CHAR_CODE_LIMIT 0x200000 | |
| 222 | |
| 826 | 223 #endif /* not MULE */ |
| 224 | |
| 5092 | 225 #ifdef MULE |
| 226 | |
| 227 MODULE_API int non_ascii_valid_ichar_p (Ichar ch); | |
| 228 | |
| 229 /* Return whether the given Ichar is valid. | |
| 230 */ | |
| 231 | |
| 232 DECLARE_INLINE_HEADER ( | |
| 233 int | |
| 234 valid_ichar_p (Ichar ch) | |
| 235 ) | |
| 236 { | |
| 237 return (! (ch & ~0xFF)) || non_ascii_valid_ichar_p (ch); | |
| 238 } | |
| 239 | |
| 240 #else /* not MULE */ | |
| 241 | |
| 242 /* This works when CH is negative, and correctly returns non-zero only when CH | |
| 243 is in the range [0, 255], inclusive. */ | |
| 5863 | 244 #define valid_ichar_p(ch) (! ((ch) & ~(CHAR_CODE_LIMIT - 1))) |
| 5092 | 245 |
| 246 #endif /* not MULE */ | |
| 247 | |
| 2367 | 248 /* For more discussion, see text.c, "handling non-default formats" */ |
| 249 | |
| 826 | 250 typedef enum internal_format |
| 251 { | |
| 252 FORMAT_DEFAULT, | |
| 253 FORMAT_8_BIT_FIXED, | |
| 254 FORMAT_16_BIT_FIXED, /* not implemented */ | |
| 255 FORMAT_32_BIT_FIXED /* not implemented */ | |
| 256 } Internal_Format; | |
| 257 | |
| 258 #ifdef MULE | |
| 259 /* "OBJECT" below will usually be a buffer, string, or nil. This needs to | |
| 260 be passed in because the interpretation of 8-bit-fixed and 16-bit-fixed | |
| 261 values may depend on the buffer, e.g. depending on what language the | |
| 262 text in the buffer is in. */ | |
| 263 | |
| 867 | 264 /* True if Ichar CH can be represented in 8-bit-fixed format. */ |
| 265 #define ichar_8_bit_fixed_p(ch, object) (((ch) & ~0xff) == 0) | |
| 266 /* Convert Ichar CH to an 8-bit int, as will be stored in the buffer. */ | |
| 267 #define ichar_to_raw_8_bit_fixed(ch, object) ((Ibyte) (ch)) | |
| 826 | 268 /* Convert the other way. */ |
| 867 | 269 #define raw_8_bit_fixed_to_ichar(ch, object) ((Ichar) (ch)) |
| 270 | |
| 271 #define ichar_16_bit_fixed_p(ch, object) (((ch) & ~0xffff) == 0) | |
| 272 /* Convert Ichar CH to a 16-bit int, as will be stored in the buffer. */ | |
| 273 #define ichar_to_raw_16_bit_fixed(ch, object) ((UINT_16_BIT) (ch)) | |
| 826 | 274 /* Convert the other way. */ |
| 867 | 275 #define raw_16_bit_fixed_to_ichar(ch, object) ((Ichar) (ch)) |
| 276 | |
| 277 /* Convert Ichar CH to a 32-bit int, as will be stored in the buffer. */ | |
| 278 #define ichar_to_raw_32_bit_fixed(ch, object) ((UINT_32_BIT) (ch)) | |
| 826 | 279 /* Convert the other way. */ |
| 867 | 280 #define raw_32_bit_fixed_to_ichar(ch, object) ((Ichar) (ch)) |
| 826 | 281 |
| 282 /* Return the "raw value" of a character as stored in the buffer. In the | |
| 283 default format, this is just the same as the character. In fixed-width | |
| 284 formats, this is the actual value in the buffer, which will be limited | |
| 285 to the range as established by the format. This is used when searching | |
| 286 for a character in a buffer -- it's faster to convert the character to | |
| 287 the raw value and look for that, than repeatedly convert each raw value | |
| 288 in the buffer into a character. */ | |
| 289 | |
| 290 DECLARE_INLINE_HEADER ( | |
| 867 | 291 Raw_Ichar |
| 2286 | 292 ichar_to_raw (Ichar ch, Internal_Format fmt, |
| 293 Lisp_Object UNUSED (object)) | |
| 826 | 294 ) |
| 295 { | |
| 296 switch (fmt) | |
| 297 { | |
| 298 case FORMAT_DEFAULT: | |
| 867 | 299 return (Raw_Ichar) ch; |
| 826 | 300 case FORMAT_16_BIT_FIXED: |
| 867 | 301 text_checking_assert (ichar_16_bit_fixed_p (ch, object)); |
| 302 return (Raw_Ichar) ichar_to_raw_16_bit_fixed (ch, object); | |
| 826 | 303 case FORMAT_32_BIT_FIXED: |
| 867 | 304 return (Raw_Ichar) ichar_to_raw_32_bit_fixed (ch, object); |
| 826 | 305 default: |
| 306 text_checking_assert (fmt == FORMAT_8_BIT_FIXED); | |
| 867 | 307 text_checking_assert (ichar_8_bit_fixed_p (ch, object)); |
| 308 return (Raw_Ichar) ichar_to_raw_8_bit_fixed (ch, object); | |
| 826 | 309 } |
| 310 } | |
| 311 | |
| 312 /* Return whether CH is representable in the given format in the given | |
| 313 object. */ | |
| 314 | |
| 315 DECLARE_INLINE_HEADER ( | |
| 316 int | |
| 2286 | 317 ichar_fits_in_format (Ichar ch, Internal_Format fmt, |
| 318 Lisp_Object UNUSED (object)) | |
| 826 | 319 ) |
| 320 { | |
| 321 switch (fmt) | |
| 322 { | |
| 323 case FORMAT_DEFAULT: | |
| 324 return 1; | |
| 325 case FORMAT_16_BIT_FIXED: | |
| 867 | 326 return ichar_16_bit_fixed_p (ch, object); |
| 826 | 327 case FORMAT_32_BIT_FIXED: |
| 328 return 1; | |
| 329 default: | |
| 330 text_checking_assert (fmt == FORMAT_8_BIT_FIXED); | |
| 867 | 331 return ichar_8_bit_fixed_p (ch, object); |
| 826 | 332 } |
| 333 } | |
| 334 | |
| 335 /* Assuming the formats are the same, return whether the two objects | |
| 336 represent text in exactly the same way. */ | |
| 337 | |
| 338 DECLARE_INLINE_HEADER ( | |
| 339 int | |
| 2286 | 340 objects_have_same_internal_representation (Lisp_Object UNUSED (srcobj), |
| 341 Lisp_Object UNUSED (dstobj)) | |
| 826 | 342 ) |
| 343 { | |
| 344 /* &&#### implement this properly when we allow per-object format | |
| 345 differences */ | |
| 346 return 1; | |
| 347 } | |
| 348 | |
| 349 #else | |
| 350 | |
| 867 | 351 #define ichar_to_raw(ch, fmt, object) ((Raw_Ichar) (ch)) |
| 352 #define ichar_fits_in_format(ch, fmt, object) 1 | |
| 826 | 353 #define objects_have_same_internal_representation(srcobj, dstobj) 1 |
| 354 | |
| 771 | 355 #endif /* MULE */ |
| 356 | |
| 1632 | 357 MODULE_API int dfc_coding_system_is_unicode (Lisp_Object codesys); |
| 771 | 358 |
| 359 DECLARE_INLINE_HEADER ( | |
| 360 Bytecount dfc_external_data_len (const void *ptr, Lisp_Object codesys) | |
| 361 ) | |
| 362 { | |
| 363 if (dfc_coding_system_is_unicode (codesys)) | |
| 364 return sizeof (wchar_t) * wcslen ((wchar_t *) ptr); | |
| 365 else | |
| 366 return strlen ((char *) ptr); | |
| 367 } | |
| 368 | |
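
The length computation in `dfc_external_data_len()` can be tried outside XEmacs with a small stand-in. In this sketch `external_data_len` and its `is_wide` flag are hypothetical; the flag replaces the `dfc_coding_system_is_unicode (codesys)` check, which needs a real coding system object.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for dfc_external_data_len(): IS_WIDE replaces
   the coding-system check.  The result is a length in bytes and does
   not include the terminator.  */
static size_t
external_data_len (const void *ptr, int is_wide)
{
  assert (ptr != NULL);
  if (is_wide)
    return sizeof (wchar_t) * wcslen ((const wchar_t *) ptr);
  else
    return strlen ((const char *) ptr);
}
```

For the narrow string `"abc"` this gives 3; for the wide string `L"abc"` it gives three times `sizeof (wchar_t)`, which varies by platform.
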
| 369 | |
| 370 /************************************************************************/ | |
| 371 /* */ | |
| 372 /* working with raw internal-format data */ | |
| 373 /* */ | |
| 374 /************************************************************************/ | |
| 375 | |
| 826 | 376 /* |
| 377 Use the following functions/macros on contiguous text in any of the | |
| 378 internal formats. Those that take a format arg work on all internal | |
| 379 formats; the others work only on the default (variable-width under Mule) | |
| 380 format. If the text you're operating on is known to come from a buffer, | |
| 381 use the buffer-level functions in buffer.h, which automatically know the | |
| 382 correct format and handle the gap. | |
| 383 | |
| 384 Some terminology: | |
| 385 | |
| 867 | 386 "itext" appearing in the macros means "internal-format text" -- type |
| 387 `Ibyte *'. Operations on such pointers themselves, rather than on the | |
| 388 text being pointed to, have "ibyteptr" instead of "itext" in the macro | |
| 389 name. "ichar" in the macro names means an Ichar -- the representation | |
| 826 | 390 of a character as a single integer rather than a series of bytes, as part |
| 867 | 391 of "itext". Many of the macros below are for converting between the |
| 826 | 392 two representations of characters. |
| 393 | |
| 867 | 394 Note also that we try to consistently distinguish between an "Ichar" and |
| 826 | 395 a Lisp character. Stuff working with Lisp characters often just says |
| 867 | 396 "char", so we consistently use "Ichar" when that's what we're working |
| 826 | 397 with. */ |
| 398 | |
| 399 /* The three golden rules of macros: | |
| 771 | 400 |
| 401 1) Anything that's an lvalue can be evaluated more than once. | |
| 826 | 402 |
| 403 2) Macros where anything else can be evaluated more than once should | |
| 404 have the word "unsafe" in their name (exceptions may be made for | |
| 405 large sets of macros that evaluate arguments of certain types more | |
| 406 than once, e.g. struct buffer * arguments, when clearly indicated in | |
| 407 the macro documentation). These macros are generally meant to be | |
| 408 called only by other macros that have already stored the calling | |
| 409 values in temporary variables. | |
| 410 | |
| 411 3) Nothing else can be evaluated more than once. Use inline | |
| 771 | 412 functions, if necessary, to prevent multiple evaluation. |
| 826 | 413 |
| 414 NOTE: The functions and macros below are given full prototypes in their | |
| 415 docs, even when the implementation is a macro. In such cases, passing | |
| 416 an argument of a type other than expected will produce undefined | |
| 417 results. Also, given that macros can do things functions can't (in | |
| 418 particular, directly modify arguments as if they were passed by | |
| 419 reference), the declaration syntax has been extended to include the | |
| 420 call-by-reference syntax from C++, where an & after a type indicates | |
| 421 that the argument is an lvalue and is passed by reference, i.e. the | |
| 422 function can modify its value. (This is equivalent in C to passing a | |
| 423 pointer to the argument, but without the need to explicitly worry about | |
| 424 pointers.) | |
| 425 | |
| 426 When to capitalize macros: | |
| 427 | |
| 428 -- Capitalize macros doing stuff obviously impossible with (C) | |
| 429 functions, e.g. directly modifying arguments as if they were passed by | |
| 430 reference. | |
| 431 | |
| 432 -- Capitalize macros that evaluate *any* argument more than once regardless | |
| 433 of whether that's "allowed" (e.g. buffer arguments). | |
| 434 | |
| 435 -- Capitalize macros that directly access a field in a Lisp_Object or | |
| 436 its equivalent underlying structure. In such cases, access through the | |
| 437 Lisp_Object precedes the macro with an X, and access through the underlying | |
| 438 structure doesn't. | |
| 439 | |
| 440 -- Capitalize certain other basic macros relating to Lisp_Objects; e.g. | |
| 441 FRAMEP, CHECK_FRAME, etc. | |
| 442 | |
| 443 -- Try to avoid capitalizing any other macros. | |
| 771 | 444 */ |
| 445 | |
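The rules in the comment above can be illustrated with a small sketch. Every name in it (`SQUARE_UNSAFE`, `square`, `BUMP`, `next_value`) is hypothetical and does not appear in text.h; the sketch only shows why multiple evaluation matters and what a capitalized, argument-modifying macro looks like.

```c
#include <assert.h>

/* Hypothetical examples only; none of these names exist in text.h. */

/* Evaluates X twice, so rule 2 requires "unsafe" in the name. */
#define SQUARE_UNSAFE(x) ((x) * (x))

/* Rule 3's suggested remedy: an inline function evaluates its
   argument exactly once. */
static inline int
square (int x)
{
  return x * x;
}

/* Capitalized because it modifies its argument in place, which a C
   function taking the value could not do.  In the header's extended
   prototype notation this would read "void BUMP (int &n)".  N is an
   lvalue, so rule 1 allows evaluating it more than once. */
#define BUMP(n) do { (n) = (n) + 1; } while (0)

/* A function with a visible side effect, used to count evaluations. */
static int calls;

static int
next_value (void)
{
  return ++calls;
}
```

With these definitions, `square (next_value ())` calls `next_value()` once, whereas `SQUARE_UNSAFE (next_value ())` calls it twice and multiplies two different values.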
| 446 /* ---------------------------------------------------------------------- */ | |
| 867 | 447 /* Working with itext's (pointers to internally-formatted text) */ |
| 771 | 448 /* ---------------------------------------------------------------------- */ |
| 449 | |
| 867 | 450 /* Given an itext, does it point to the beginning of a character? |
| 826 | 451 */ |
| 452 | |
| 771 | 453 #ifdef MULE |
| 867 | 454 # define valid_ibyteptr_p(ptr) ibyte_first_byte_p (* (ptr)) |
| 771 | 455 #else |
| 867 | 456 # define valid_ibyteptr_p(ptr) 1 |
| 771 | 457 #endif |
| 458 | |
| 867 | 459 /* If error-checking is enabled, assert that the given itext points to |
| 826 | 460 the beginning of a character. Otherwise, do nothing. |
| 461 */ | |
| 462 | |
| 867 | 463 #define assert_valid_ibyteptr(ptr) text_checking_assert (valid_ibyteptr_p (ptr)) |
| 464 | |
| 465 /* Given an itext (assumed to point at the beginning of a character), | |
| 826 | 466 modify that pointer so it points to the beginning of the next character. |
| 467 | |
| 867 | 468 Note that INC_IBYTEPTR() and DEC_IBYTEPTR() have to be written in |
| 469 completely separate ways. INC_IBYTEPTR() cannot use the DEC_IBYTEPTR() | |
| 771 | 470 trick of looking for a valid first byte because it might run off |
| 867 | 471 the end of the string. DEC_IBYTEPTR() can't use the INC_IBYTEPTR() |
| 771 | 472 method because it doesn't have easy access to the first byte of |
| 473 the character it's moving over. */ | |
| 474 | |
| 867 | 475 #define INC_IBYTEPTR(ptr) do { \ |
| 476 assert_valid_ibyteptr (ptr); \ | |
| 826 | 477 (ptr) += rep_bytes_by_first_byte (* (ptr)); \ |
| 478 } while (0) | |
| 479 | |
| 1204 | 480 #define INC_IBYTEPTR_FMT(ptr, fmt) \ |
| 481 do { \ | |
| 482 Internal_Format __icf_fmt = (fmt); \ | |
| 483 switch (__icf_fmt) \ | |
| 484 { \ | |
| 485 case FORMAT_DEFAULT: \ | |
| 486 INC_IBYTEPTR (ptr); \ | |
| 487 break; \ | |
| 488 case FORMAT_16_BIT_FIXED: \ | |
| 489 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_16_BIT)); \ | |
| 490 (ptr) += 2; \ | |
| 491 break; \ | |
| 492 case FORMAT_32_BIT_FIXED: \ | |
| 493 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_32_BIT)); \ | |
| 494 (ptr) += 4; \ | |
| 495 break; \ | |
| 496 default: \ | |
| 497 text_checking_assert (__icf_fmt == FORMAT_8_BIT_FIXED); \ | |
| 498 (ptr)++; \ | |
| 499 break; \ | |
| 500 } \ | |
| 826 | 501 } while (0) |
| 502 | |
| 867 | 503 /* Given an itext (assumed to point at the beginning of a character or at |
| 826 | 504 the very end of the text), modify that pointer so it points to the |
| 505 beginning of the previous character. | |
| 506 */ | |
| 771 | 507 |
| 800 | 508 #ifdef ERROR_CHECK_TEXT |
| 826 | 509 /* We use a separate definition to avoid warnings about unused dc_ptr1 */ |
| 867 | 510 #define DEC_IBYTEPTR(ptr) do { \ |
| 1333 | 511 const Ibyte *dc_ptr1 = (ptr); \ |
| 826 | 512 do { \ |
| 513 (ptr)--; \ | |
| 867 | 514 } while (!valid_ibyteptr_p (ptr)); \ |
| 826 | 515 text_checking_assert (dc_ptr1 - (ptr) == rep_bytes_by_first_byte (*(ptr))); \ |
| 771 | 516 } while (0) |
| 826 | 517 #else |
| 867 | 518 #define DEC_IBYTEPTR(ptr) do { \ |
| 826 | 519 do { \ |
| 520 (ptr)--; \ | |
| 867 | 521 } while (!valid_ibyteptr_p (ptr)); \ |
| 771 | 522 } while (0) |
| 826 | 523 #endif /* ERROR_CHECK_TEXT */ |
| 524 | |
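The asymmetry between moving forward and moving backward can be sketched as follows. This is not the Mule internal encoding: `ibyte_first_byte_p()` and `rep_bytes_by_first_byte()` are defined elsewhere, so the sketch assumes UTF-8 as a stand-in, and the names `Byte`, `first_byte_p`, `bytes_by_first_byte`, `INC_PTR`, `DEC_PTR` and `count_chars` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char Byte;	/* hypothetical stand-in for Ibyte */

/* Assumption: UTF-8, where continuation bytes look like 10xxxxxx and
   every other byte starts a character. */
static int
first_byte_p (Byte b)
{
  return (b & 0xC0) != 0x80;
}

/* Length in bytes of the character whose first byte is B. */
static int
bytes_by_first_byte (Byte b)
{
  if (b < 0x80)
    return 1;
  if ((b & 0xE0) == 0xC0)
    return 2;
  if ((b & 0xF0) == 0xE0)
    return 3;
  return 4;
}

/* Forward: the first byte says how far to jump, so no scanning is
   needed and nothing past the current character is read. */
#define INC_PTR(ptr) do {			\
  assert (first_byte_p (*(ptr)));		\
  (ptr) += bytes_by_first_byte (*(ptr));	\
} while (0)

/* Backward: the length is unknown until the first byte is found, so
   step back one byte at a time until a first byte appears. */
#define DEC_PTR(ptr) do {			\
  do {						\
    (ptr)--;					\
  } while (!first_byte_p (*(ptr)));		\
} while (0)

/* Count the characters in [S, S + LEN) by repeated forward steps. */
static size_t
count_chars (const Byte *s, size_t len)
{
  const Byte *end = s + len;
  size_t n = 0;

  while (s < end)
    {
      INC_PTR (s);
      n++;
    }
  return n;
}
```

Both macros evaluate their argument more than once, which the golden rules permit because the argument is an lvalue.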
| 1204 | 525 #define DEC_IBYTEPTR_FMT(ptr, fmt) \ |
| 526 do { \ | |
| 527 Internal_Format __icf_fmt = (fmt); \ | |
| 528 switch (__icf_fmt) \ | |
| 529 { \ | |
| 530 case FORMAT_DEFAULT: \ | |
| 531 DEC_IBYTEPTR (ptr); \ | |
| 532 break; \ | |
| 533 case FORMAT_16_BIT_FIXED: \ | |
| 534 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_16_BIT)); \ | |
| 535 (ptr) -= 2; \ | |
| 536 break; \ | |
| 537 case FORMAT_32_BIT_FIXED: \ | |
| 538 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_32_BIT)); \ | |
| 539 (ptr) -= 4; \ | |
| 540 break; \ | |
| 541 default: \ | |
| 542 text_checking_assert (__icf_fmt == FORMAT_8_BIT_FIXED); \ | |
| 543 (ptr)--; \ | |
| 544 break; \ | |
| 545 } \ | |
| 771 | 546 } while (0) |
| 547 | |
| 548 #ifdef MULE | |
| 549 | |
| 826 | 550 /* Make sure that PTR is pointing to the beginning of a character. If not, |
| 551 back up until this is the case. Note that there are not too many places | |
| 552 where it is legitimate to do this sort of thing. It's an error if | |
| 553 you're passed an "invalid" char * pointer. NOTE: PTR *must* be pointing | |
| 554 to a valid part of the string (i.e. not the very end, unless the string | |
| 555 is zero-terminated or something) in order for this function to not cause | |
| 556 crashes. | |
| 557 */ | |
| 558 | |
| 771 | 559 /* Note that this reads the byte at *PTR! */ |
| 560 | |
| 867 | 561 #define VALIDATE_IBYTEPTR_BACKWARD(ptr) do { \ |
| 5026 | 562 while (!valid_ibyteptr_p (ptr)) (ptr)--; \ |
| 771 | 563 } while (0) |
| 564 | |
| 826 | 565 /* Make sure that PTR is pointing to the beginning of a character. If not, |
| 566 move forward until this is the case. Note that there are not too many | |
| 567 places where it is legitimate to do this sort of thing. It's an error | |
| 568 if you're passed an "invalid" char * pointer. | |
| 569 */ | |
| 771 | 570 |
| 867 | 571 /* This needs to be trickier than VALIDATE_IBYTEPTR_BACKWARD() to avoid the |
| 771 | 572 possibility of running off the end of the string. */ |
| 573 | |
| 867 | 574 #define VALIDATE_IBYTEPTR_FORWARD(ptr) do { \ |
| 575 Ibyte *vcf_ptr = (ptr); \ | |
| 576 VALIDATE_IBYTEPTR_BACKWARD (vcf_ptr); \ | |
| 771 | 577 if (vcf_ptr != (ptr)) \ |
| 578 { \ | |
| 579 (ptr) = vcf_ptr; \ | |
| 867 | 580 INC_IBYTEPTR (ptr); \ |
| 771 | 581 } \ |
| 582 } while (0) | |
| 583 | |
| 584 #else /* not MULE */ | |
| 867 | 585 #define VALIDATE_IBYTEPTR_BACKWARD(ptr) |
| 586 #define VALIDATE_IBYTEPTR_FORWARD(ptr) | |
| 826 | 587 #endif /* not MULE */ |
| 588 | |
| 589 #ifdef MULE | |
| 590 | |
| 867 | 591 /* Given an Ibyte string at PTR of size N, possibly with a partial |
| 826 | 592 character at the end, return the size of the longest substring of |
| 593 complete characters. Does not assume that the byte at *(PTR + N) is | |
| 594 readable. Note that there are not too many places where it is | |
| 595 legitimate to do this sort of thing. It's an error if you're passed an | |
| 596 "invalid" offset. */ | |
| 597 | |
| 598 DECLARE_INLINE_HEADER ( | |
| 599 Bytecount | |
| 867 | 600 validate_ibyte_string_backward (const Ibyte *ptr, Bytecount n) |
| 826 | 601 ) |
| 602 { | |
| 867 | 603 const Ibyte *ptr2; |
| 826 | 604 |
| 605 if (n == 0) | |
| 606 return n; | |
| 607 ptr2 = ptr + n - 1; | |
| 867 | 608 VALIDATE_IBYTEPTR_BACKWARD (ptr2); |
| 826 | 609 if (ptr2 + rep_bytes_by_first_byte (*ptr2) != ptr + n) |
| 610 return ptr2 - ptr; | |
| 611 return n; | |
| 612 } | |
| 613 | |
| 614 #else | |
| 615 | |
| 867 | 616 #define validate_ibyte_string_backward(ptr, n) (n) |
| 826 | 617 |
| 618 #endif /* MULE */ | |
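The idea behind validate_ibyte_string_backward() is to trim a trailing partial character from a byte range. A minimal sketch follows, again assuming UTF-8 in place of the real internal encoding and using hypothetical names (`Byte`, `first_byte_p`, `bytes_by_first_byte`, `complete_prefix_len`). Unlike the header's version, it also stops backing up at the start of the buffer, which is an added safeguard rather than something the original does.

```c
#include <stddef.h>

typedef unsigned char Byte;	/* hypothetical stand-in for Ibyte */

/* Assumption: UTF-8 byte classification, not the Mule encoding. */
static int
first_byte_p (Byte b)
{
  return (b & 0xC0) != 0x80;
}

static int
bytes_by_first_byte (Byte b)
{
  if (b < 0x80)
    return 1;
  if ((b & 0xE0) == 0xC0)
    return 2;
  if ((b & 0xF0) == 0xE0)
    return 3;
  return 4;
}

/* Return the size of the longest prefix of [S, S + N) that consists
   only of complete characters.  Never reads the byte at S + N. */
static size_t
complete_prefix_len (const Byte *s, size_t n)
{
  const Byte *last;

  if (n == 0)
    return 0;
  /* Back up from the final byte to the first byte of the last
     (possibly partial) character. */
  last = s + n - 1;
  while (last > s && !first_byte_p (*last))
    last--;
  /* If that character does not end exactly at S + N, drop it. */
  if (last + bytes_by_first_byte (*last) != s + n)
    return (size_t) (last - s);
  return n;
}
```

For the six-byte string "a", U+00E9, U+20AC in UTF-8, a cut after five bytes falls inside the three-byte character, so only the first three bytes form complete characters.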
| 771 | 619 |
| 4952 | 620 /* ASSERT_ASCTEXT_ASCII(ptr): Check that an Ascbyte * pointer points to |
| 621 purely ASCII text. Useful for checking that putatively ASCII strings | |
| 622 (i.e. declared as Ascbyte * or const Ascbyte *) are actually ASCII. | |
| 623 This is important because otherwise we need to worry about what | |
| 624 encoding they are in -- internal or some external encoding. | |
| 625 | |
| 626 ASSERT_ASCTEXT_ASCII_LEN(ptr, len): Same as ASSERT_ASCTEXT_ASCII() | |
| 627 but where the length has been explicitly given. Useful if the string | |
| 628 may contain embedded zeroes. | |
| 629 */ | |
| 630 | |
| 2367 | 631 #ifdef ERROR_CHECK_TEXT |
| 632 #define ASSERT_ASCTEXT_ASCII_LEN(ptr, len) \ | |
| 633 do { \ | |
| 634 int aia2; \ | |
| 635 const Ascbyte *aia2ptr = (ptr); \ | |
| 636 int aia2len = (len); \ | |
| 637 \ | |
| 638 for (aia2 = 0; aia2 < aia2len; aia2++) \ | |
| 639 assert (aia2ptr[aia2] >= 0x00 && aia2ptr[aia2] <= 0x7F); \ | |
| 640 } while (0) | |
| 641 #define ASSERT_ASCTEXT_ASCII(ptr) \ | |
| 642 do { \ | |
| 643 const Ascbyte *aiaz2 = (ptr); \ | |
| 644 ASSERT_ASCTEXT_ASCII_LEN (aiaz2, strlen (aiaz2)); \ | |
| 645 } while (0) | |
| 646 #else | |
| 5820 | 647 #define ASSERT_ASCTEXT_ASCII_LEN(ptr, len) DO_NOTHING |
| 648 #define ASSERT_ASCTEXT_ASCII(ptr) DO_NOTHING | |
| 2367 | 649 #endif |
| 650 | |
| 771 | 651 /* -------------------------------------------------------------- */ |
| 826 | 652 /* Working with the length (in bytes and characters) of a */ |
| 653 /* section of internally-formatted text */ | |
| 771 | 654 /* -------------------------------------------------------------- */ |
| 655 | |
| 826 | 656 #ifdef MULE |
| 657 | |
| 1632 | 658 MODULE_API Charcount |
| 659 bytecount_to_charcount_fun (const Ibyte *ptr, Bytecount len); | |
| 660 MODULE_API Bytecount | |
| 661 charcount_to_bytecount_fun (const Ibyte *ptr, Charcount len); | |
| 826 | 662 |
| 663 /* Given a pointer to a text string and a length in bytes, return | |
| 664 the equivalent length in characters. */ | |
| 665 | |
| 666 DECLARE_INLINE_HEADER ( | |
| 667 Charcount | |
| 867 | 668 bytecount_to_charcount (const Ibyte *ptr, Bytecount len) |
| 826 | 669 ) |
| 670 { | |
| 671 if (len < 20) /* Just a random guess, but it should be more or less correct. | |
| 672 If number of bytes is small, just do a simple loop, | |
| 673 which should be more efficient. */ | |
| 674 { | |
| 675 Charcount count = 0; | |
| 867 | 676 const Ibyte *end = ptr + len; |
| 826 | 677 while (ptr < end) |
| 678 { | |
| 867 | 679 INC_IBYTEPTR (ptr); |
| 826 | 680 count++; |
| 681 } | |
| 682 /* Bomb out if the specified substring ends in the middle | |
| 683 of a character. Note that we might have already gotten | |
| 684 a core dump above from an invalid reference, but at least | |
| 685 we will get no farther than here. | |
| 686 | |
| 687 This also catches len < 0. */ | |
| 688 text_checking_assert (ptr == end); | |
| 689 | |
| 690 return count; | |
| 691 } | |
| 692 else | |
| 693 return bytecount_to_charcount_fun (ptr, len); | |
| 694 } | |
| 695 | |
| 696 /* Given a pointer to a text string and a length in characters, return the | |
| 697 equivalent length in bytes. | |
| 698 */ | |
| 699 | |
| 700 DECLARE_INLINE_HEADER ( | |
| 701 Bytecount | |
| 867 | 702 charcount_to_bytecount (const Ibyte *ptr, Charcount len) |
| 826 | 703 ) |
| 704 { | |
| 705 text_checking_assert (len >= 0); | |
| 706 if (len < 20) /* See above */ | |
| 707 { | |
| 867 | 708 const Ibyte *newptr = ptr; |
| 826 | 709 while (len > 0) |
| 710 { | |
| 867 | 711 INC_IBYTEPTR (newptr); |
| 826 | 712 len--; |
| 713 } | |
| 714 return newptr - ptr; | |
| 715 } | |
| 716 else | |
| 717 return charcount_to_bytecount_fun (ptr, len); | |
| 718 } | |
| 719 | |
| 2367 | 720 MODULE_API Bytecount |
| 721 charcount_to_bytecount_down_fun (const Ibyte *ptr, Charcount len); | |
| 722 | |
| 723 /* Given a pointer to a text string and a length in bytes, return | |
| 724 the equivalent length in characters of the stretch [PTR - LEN, PTR). */ | |
| 725 | |
| 726 DECLARE_INLINE_HEADER ( | |
| 727 Charcount | |
| 728 bytecount_to_charcount_down (const Ibyte *ptr, Bytecount len) | |
| 729 ) | |
| 730 { | |
| 731 /* No need to be clever here */ | |
| 732 return bytecount_to_charcount (ptr - len, len); | |
| 733 } | |
| 734 | |
| 735 /* Given a pointer to a text string and a length in characters, return the | |
| 736 equivalent length in bytes of the stretch of characters of that length | |
| 737 BEFORE the pointer. | |
| 738 */ | |
| 739 | |
| 740 DECLARE_INLINE_HEADER ( | |
| 741 Bytecount | |
| 742 charcount_to_bytecount_down (const Ibyte *ptr, Charcount len) | |
| 743 ) | |
| 744 { | |
| 745 #define SLEDGEHAMMER_CHECK_TEXT | |
| 746 #ifdef SLEDGEHAMMER_CHECK_TEXT | |
| 747 Charcount len1 = len; | |
| 748 Bytecount ret1, ret2; | |
| 749 | |
| 750 /* To test the correctness of the function version, always do the | |
| 751 calculation both ways and check that the values are the same. */ | |
| 752 text_checking_assert (len >= 0); | |
| 753 { | |
| 754 const Ibyte *newptr = ptr; | |
| 755 while (len1 > 0) | |
| 756 { | |
| 757 DEC_IBYTEPTR (newptr); | |
| 758 len1--; | |
| 759 } | |
| 760 ret1 = ptr - newptr; | |
| 761 } | |
| 762 ret2 = charcount_to_bytecount_down_fun (ptr, len); | |
| 763 text_checking_assert (ret1 == ret2); | |
| 764 return ret1; | |
| 765 #else | |
| 766 text_checking_assert (len >= 0); | |
| 767 if (len < 20) /* See above */ | |
| 768 { | |
| 769 const Ibyte *newptr = ptr; | |
| 770 while (len > 0) | |
| 771 { | |
| 772 DEC_IBYTEPTR (newptr); | |
| 773 len--; | |
| 774 } | |
| 775 return ptr - newptr; | |
| 776 } | |
| 777 else | |
| 778 return charcount_to_bytecount_down_fun (ptr, len); | |
| 779 #endif /* SLEDGEHAMMER_CHECK_TEXT */ | |
| 780 } | |
| 781 | |
| 826 | 782 /* Given a pointer to a text string in the specified format and a length in |
| 783 bytes, return the equivalent length in characters. | |
| 784 */ | |
| 785 | |
| 786 DECLARE_INLINE_HEADER ( | |
| 787 Charcount | |
| 867 | 788 bytecount_to_charcount_fmt (const Ibyte *ptr, Bytecount len, |
| 826 | 789 Internal_Format fmt) |
| 790 ) | |
| 791 { | |
| 792 switch (fmt) | |
| 793 { | |
| 794 case FORMAT_DEFAULT: | |
| 795 return bytecount_to_charcount (ptr, len); | |
| 796 case FORMAT_16_BIT_FIXED: | |
| 1204 | 797 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_16_BIT)); |
| 826 | 798 return (Charcount) (len >> 1); |
| 799 case FORMAT_32_BIT_FIXED: | |
| 1204 | 800 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_32_BIT)); |
| 826 | 801 return (Charcount) (len >> 2); |
| 802 default: | |
| 803 text_checking_assert (fmt == FORMAT_8_BIT_FIXED); | |
| 804 return (Charcount) len; | |
| 805 } | |
| 806 } | |
| 807 | |
| 808 /* Given a pointer to a text string in the specified format and a length in | |
| 809 characters, return the equivalent length in bytes. | |
| 810 */ | |
| 811 | |
| 812 DECLARE_INLINE_HEADER ( | |
| 813 Bytecount | |
| 867 | 814 charcount_to_bytecount_fmt (const Ibyte *ptr, Charcount len, |
| 826 | 815 Internal_Format fmt) |
| 816 ) | |
| 817 { | |
| 818 switch (fmt) | |
| 819 { | |
| 820 case FORMAT_DEFAULT: | |
| 821 return charcount_to_bytecount (ptr, len); | |
| 822 case FORMAT_16_BIT_FIXED: | |
| 1204 | 823 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_16_BIT)); |
| 826 | 824 text_checking_assert (len >= 0); |
| 825 return (Bytecount) (len << 1); | |
| 826 case FORMAT_32_BIT_FIXED: | |
| 827 text_checking_assert (len >= 0); | |
| 1204 | 828 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_32_BIT)); |
| 826 | 829 return (Bytecount) (len << 2); |
| 830 default: | |
| 831 text_checking_assert (fmt == FORMAT_8_BIT_FIXED); | |
| 832 return (Bytecount) len; | |
| 833 } | |
| 834 } | |
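In the fixed-width formats every character occupies the same number of bytes (itext_ichar_len_fmt() returns 2 and 4 for the 16-bit and 32-bit formats), so the conversion reduces to a division or a multiplication by that width. A minimal sketch, with a hypothetical `enum fixed_format` standing in for the Internal_Format constants and hypothetical function names:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the fixed-width Internal_Format values;
   each enumerator's value is the width of one character in bytes. */
enum fixed_format
{
  FIXED_8_BIT = 1,
  FIXED_16_BIT = 2,
  FIXED_32_BIT = 4
};

/* Bytes to characters: divide by the character width.  A valid byte
   length is always a whole number of characters. */
static ptrdiff_t
fixed_bytes_to_chars (ptrdiff_t nbytes, enum fixed_format fmt)
{
  assert (nbytes >= 0 && nbytes % (int) fmt == 0);
  return nbytes / (int) fmt;
}

/* Characters to bytes: multiply by the character width. */
static ptrdiff_t
fixed_chars_to_bytes (ptrdiff_t nchars, enum fixed_format fmt)
{
  assert (nchars >= 0);
  return nchars * (int) fmt;
}
```

The two functions are inverses of each other for any valid byte length, which makes a quick round-trip check possible.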
| 835 | |
| 5774 | 836 #ifdef EFFICIENT_INT_128_BIT |
| 837 # define STRIDE_TYPE INT_128_BIT | |
| 838 # define HIGH_BIT_MASK \ | |
| 839 MAKE_128_BIT_UNSIGNED_CONSTANT (0x80808080808080808080808080808080) | |
| 840 #elif defined (EFFICIENT_INT_64_BIT) | |
| 841 # define STRIDE_TYPE INT_64_BIT | |
| 842 # define HIGH_BIT_MASK MAKE_64_BIT_UNSIGNED_CONSTANT (0x8080808080808080) | |
| 843 #else | |
| 844 # define STRIDE_TYPE INT_32_BIT | |
| 845 # define HIGH_BIT_MASK MAKE_32_BIT_UNSIGNED_CONSTANT (0x80808080) | |
| 846 #endif | |
| 847 | |
| 848 #define ALIGN_BITS ((EMACS_UINT) (ALIGNOF (STRIDE_TYPE) - 1)) | |
| 849 #define ALIGN_MASK (~ ALIGN_BITS) | |
| 850 #define ALIGNED(ptr) ((((EMACS_UINT) ptr) & ALIGN_BITS) == 0) | |
| 851 #define STRIDE sizeof (STRIDE_TYPE) | |
| 852 | |
| 853 /* Skip as many ASCII bytes as possible in the memory block [PTR, END). | |
| 854 Return a pointer to the first non-ASCII byte, or END if there is none. | |
| 855 Optimized for long stretches of ASCII. */ | |
| 856 DECLARE_INLINE_HEADER ( | |
| 857 const Ibyte * | |
| 858 skip_ascii (const Ibyte *ptr, const Ibyte *end) | |
| 859 ) | |
| 860 { | |
| 861 const unsigned STRIDE_TYPE *ascii_end; | |
| 862 | |
| 863 /* Need to do in 3 sections -- before alignment start, aligned chunk, | |
| 864 after alignment end. */ | |
| 865 while (!ALIGNED (ptr)) | |
| 866 { | |
| 867 if (ptr == end || !byte_ascii_p (*ptr)) | |
| 868 return ptr; | |
| 869 ptr++; | |
| 870 } | |
| 871 ascii_end = (const unsigned STRIDE_TYPE *) ptr; | |
| 872 /* This loop screams, because we can detect ASCII | |
| 873 characters STRIDE (4, 8 or 16) at a time. */ | |
| 874 while ((const Ibyte *) ascii_end + STRIDE <= end | |
| 875 && !(*ascii_end & HIGH_BIT_MASK)) | |
| 876 ascii_end++; | |
| 877 ptr = (const Ibyte *) ascii_end; | |
| 878 while (ptr < end && byte_ascii_p (*ptr)) | |
| 879 ptr++; | |
| 880 return ptr; | |
| 881 } | |
| 882 | |
| 883 /* Skip as many ASCII bytes as possible in the memory block [END, PTR), | |
| 884 going downwards. Return pointer to the location above the first | |
| 885 non-ASCII byte. Optimized for long stretches of ASCII. */ | |
| 886 DECLARE_INLINE_HEADER ( | |
| 887 const Ibyte * | |
| 888 skip_ascii_down (const Ibyte *ptr, const Ibyte *end) | |
| 889 ) | |
| 890 { | |
| 891 const unsigned STRIDE_TYPE *ascii_end; | |
| 892 | |
| 893 /* Need to do in 3 sections -- before alignment start, aligned chunk, | |
| 894 after alignment end. */ | |
| 895 while (!ALIGNED (ptr)) | |
| 896 { | |
| 897 if (ptr == end || !byte_ascii_p (*(ptr - 1))) | |
| 898 return ptr; | |
| 899 ptr--; | |
| 900 } | |
| 901 ascii_end = (const unsigned STRIDE_TYPE *) ptr - 1; | |
| 902 /* This loop screams, because we can detect ASCII | |
| 903 characters STRIDE (4, 8 or 16) at a time. */ | |
| 904 while ((const Ibyte *) ascii_end >= end | |
| 905 && !(*ascii_end & HIGH_BIT_MASK)) | |
| 906 ascii_end--; | |
| 907 ptr = (const Ibyte *) (ascii_end + 1); | |
| 908 while (ptr > end && byte_ascii_p (*(ptr - 1))) | |
| 909 ptr--; | |
| 910 return ptr; | |
| 911 } | |
| 912 | |
| 5784 | 913 /* Return the character count of an lstream or coding buffer of internal |
| 914 format text, counting partial characters at the beginning of the buffer | |
| 915 as whole characters, and *not* counting partial characters at the end of | |
| 916 the buffer. */ | |
| 917 Charcount buffered_bytecount_to_charcount (const Ibyte *, Bytecount len); | |
| 918 | |
| 826 | 919 #else |
| 920 | |
| 921 #define bytecount_to_charcount(ptr, len) ((Charcount) (len)) | |
| 922 #define bytecount_to_charcount_fmt(ptr, len, fmt) ((Charcount) (len)) | |
| 923 #define charcount_to_bytecount(ptr, len) ((Bytecount) (len)) | |
| 924 #define charcount_to_bytecount_fmt(ptr, len, fmt) ((Bytecount) (len)) | |
|
5774
7a538e1a4676
Use skip_ascii() in no_conversion_convert() when encoding.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5402
diff
changeset
|
925 #define skip_ascii(ptr, end) end |
926 #define skip_ascii_down(ptr, end) end |
|
5786
6355de501637
Correct buffered_bytecount_to_charcount() on non-Mule.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5784
diff
changeset
|
927 #define buffered_bytecount_to_charcount(ptr, len) (len) |
| 826 | 928 |
| 929 #endif /* MULE */ | |
| 930 | |
| 931 /* Return the length of the first character at PTR. Equivalent to | |
| 932 charcount_to_bytecount (ptr, 1). | |
| 933 | |
| 934 [Since charcount_to_bytecount() is written as inline, a smart compiler | |
| 935 should really optimize charcount_to_bytecount (ptr, 1) to the same as | |
| 936 the following, with no error checking. But since this idiom occurs so | |
| 937 often, we'll be helpful and define a special macro for it.] | |
| 938 */ | |
| 939 | |
| 867 | 940 #define itext_ichar_len(ptr) rep_bytes_by_first_byte (*(ptr)) |
| 826 | 941 |
| 942 /* Return the length of the first character at PTR, which is in the | |
| 943 specified internal format. Equivalent to charcount_to_bytecount_fmt | |
| 944 (ptr, 1, fmt). | |
| 945 */ | |
| 946 | |
| 947 DECLARE_INLINE_HEADER ( | |
| 948 Bytecount | |
| 4853 | 949 itext_ichar_len_fmt (const Ibyte *ptr, Internal_Format fmt) |
| 826 | 950 ) |
| 951 { | |
| 952 switch (fmt) | |
| 953 { | |
| 954 case FORMAT_DEFAULT: | |
| 867 | 955 return itext_ichar_len (ptr); |
| 826 | 956 case FORMAT_16_BIT_FIXED: |
| 1204 | 957 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_16_BIT)); |
| 826 | 958 return 2; |
| 959 case FORMAT_32_BIT_FIXED: | |
| 1204 | 960 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_32_BIT)); |
| 826 | 961 return 4; |
| 962 default: | |
| 963 text_checking_assert (fmt == FORMAT_8_BIT_FIXED); | |
| 964 return 1; | |
| 965 } | |
| 966 } | |
| 967 | |
| 968 /* Return a pointer to the beginning of the character offset N (in | |
| 969 characters) from PTR. | |
| 970 */ | |
| 971 | |
| 972 DECLARE_INLINE_HEADER ( | |
| 867 | 973 const Ibyte * |
| 974 itext_n_addr (const Ibyte *ptr, Charcount offset) | |
| 826 | 975 ) |
| 771 | 976 { |
| 977 return ptr + charcount_to_bytecount (ptr, offset); | |
| 978 } | |
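The relationship between itext_ichar_len(), charcount_to_bytecount() and itext_n_addr() can be sketched in isolation. This is a minimal, hypothetical sketch: it assumes UTF-8 as a stand-in for the internal text format (the real per-character lengths come from rep_bytes_by_first_byte(), which is not shown in this part of the file), and every demo_* name is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for rep_bytes_by_first_byte(): the length of a
   character is decided by its first byte alone.  ASSUMPTION: UTF-8 lead
   bytes, not the actual internal format. */
static size_t
demo_char_len (unsigned char first)
{
  if (first < 0x80)
    return 1;
  if ((first & 0xE0) == 0xC0)
    return 2;
  if ((first & 0xF0) == 0xE0)
    return 3;
  return 4;
}

/* Analogue of charcount_to_bytecount(): number of bytes spanned by the
   first NCHARS characters starting at PTR. */
static size_t
demo_charcount_to_bytecount (const unsigned char *ptr, size_t nchars)
{
  size_t bytes = 0;
  assert (ptr != NULL);
  while (nchars-- > 0)
    bytes += demo_char_len (ptr[bytes]);
  return bytes;
}

/* Analogue of itext_n_addr(): pointer to the start of character number
   NCHARS (counting from zero) after PTR. */
static const unsigned char *
demo_n_addr (const unsigned char *ptr, size_t nchars)
{
  return ptr + demo_charcount_to_bytecount (ptr, nchars);
}
```

With the three-character text 'a', U+00E9, 'b' (four bytes in UTF-8), demo_n_addr (text, 2) points at the 'b', three bytes in. Because the length depends only on the first byte, asking for the length of one character needs no loop, which is the shortcut itext_ichar_len() provides.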
| 979 | |
| 867 | 980 /* Given an itext and an offset into the text pointed to by the itext, |
| 826 | 981 modify the offset so it points to the beginning of the next character. |
| 982 */ | |
| 983 | |
| 984 #define INC_BYTECOUNT(ptr, pos) do { \ | |
| 867 | 985 assert_valid_ibyteptr (ptr); \ |
| 826 | 986 (pos += rep_bytes_by_first_byte (* ((ptr) + (pos)))); \ |
| 987 } while (0) | |
| 988 | |
| 771 | 989 /* -------------------------------------------------------------------- */ |
| 867 | 990 /* Retrieving or changing the character pointed to by an itext */ |
| 771 | 991 /* -------------------------------------------------------------------- */ |
| 992 | |
| 867 | 993 #define simple_itext_ichar(ptr) ((Ichar) (ptr)[0]) |
| 994 #define simple_set_itext_ichar(ptr, x) \ | |
| 995 ((ptr)[0] = (Ibyte) (x), (Bytecount) 1) | |
| 996 #define simple_itext_copy_ichar(src, dst) \ | |
| 814 | 997 ((dst)[0] = *(src), (Bytecount) 1) |
| 771 | 998 |
| 999 #ifdef MULE | |
| 1000 | |
| 1632 | 1001 MODULE_API Ichar non_ascii_itext_ichar (const Ibyte *ptr); |
| 1002 MODULE_API Bytecount non_ascii_set_itext_ichar (Ibyte *ptr, Ichar c); | |
| 1003 MODULE_API Bytecount non_ascii_itext_copy_ichar (const Ibyte *src, Ibyte *dst); | |
| 867 | 1004 |
| 1005 /* Retrieve the character pointed to by PTR as an Ichar. */ | |
| 826 | 1006 |
| 1007 DECLARE_INLINE_HEADER ( | |
| 867 | 1008 Ichar |
| 1009 itext_ichar (const Ibyte *ptr) | |
| 826 | 1010 ) |
| 771 | 1011 { |
| 826 | 1012 return byte_ascii_p (*ptr) ? |
| 867 | 1013 simple_itext_ichar (ptr) : |
| 1014 non_ascii_itext_ichar (ptr); | |
| 771 | 1015 } |
| 1016 | |
| 826 | 1017 /* Retrieve the character pointed to by PTR (a pointer to text in the |
| 1018 format FMT, coming from OBJECT [a buffer, string?, or nil]) as an | |
| 867 | 1019 Ichar. |
| 826 | 1020 |
| 1021 Note: For these and other *_fmt() functions, if you pass in a constant | |
| 1022 FMT, the switch will be optimized out of existence. Therefore, there is | |
| 1023 no need to create separate versions for the various formats for | |
| 867 | 1024 "efficiency reasons". In fact, we don't really need itext_ichar() |
| 826 | 1025 and such written separately, but they are used often so it's simpler |
| 1026 that way. */ | |
| 1027 | |
| 1028 DECLARE_INLINE_HEADER ( | |
| 867 | 1029 Ichar |
| 1030 itext_ichar_fmt (const Ibyte *ptr, Internal_Format fmt, | |
| 2286 | 1031 Lisp_Object UNUSED (object)) |
| 826 | 1032 ) |
| 1033 { | |
| 1034 switch (fmt) | |
| 1035 { | |
| 1036 case FORMAT_DEFAULT: | |
| 867 | 1037 return itext_ichar (ptr); |
| 826 | 1038 case FORMAT_16_BIT_FIXED: |
| 1204 | 1039 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_16_BIT)); |
| 867 | 1040 return raw_16_bit_fixed_to_ichar (* (UINT_16_BIT *) ptr, object); |
| 826 | 1041 case FORMAT_32_BIT_FIXED: |
| 1204 | 1042 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_32_BIT)); |
| 867 | 1043 return raw_32_bit_fixed_to_ichar (* (UINT_32_BIT *) ptr, object); |
| 826 | 1044 default: |
| 1045 text_checking_assert (fmt == FORMAT_8_BIT_FIXED); | |
| 867 | 1046 return raw_8_bit_fixed_to_ichar (*ptr, object); |
| 826 | 1047 } |
| 1048 } | |
| 1049 | |
| 1050 /* Return the character at PTR (which is in format FMT), suitable for | |
| 1051 comparison with an ASCII character. This guarantees that if the | |
| 1052 character at PTR is ASCII (range 0 - 127), that character will be | |
| 1053 returned; otherwise, some character outside of the ASCII range will be | |
| 1054 returned, but not necessarily the character actually at PTR. This will | |
| 867 | 1055 be faster than itext_ichar_fmt() for some formats -- in particular, |
| 826 | 1056 FORMAT_DEFAULT. */ |
| 1057 | |
| 1058 DECLARE_INLINE_HEADER ( | |
| 867 | 1059 Ichar |
| 1060 itext_ichar_ascii_fmt (const Ibyte *ptr, Internal_Format fmt, | |
| 2286 | 1061 Lisp_Object UNUSED (object)) |
| 826 | 1062 ) |
| 1063 { | |
| 1064 switch (fmt) | |
| 1065 { | |
| 1066 case FORMAT_DEFAULT: | |
| 867 | 1067 return (Ichar) *ptr; |
| 826 | 1068 case FORMAT_16_BIT_FIXED: |
| 1204 | 1069 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_16_BIT)); |
| 867 | 1070 return raw_16_bit_fixed_to_ichar (* (UINT_16_BIT *) ptr, object); |
| 826 | 1071 case FORMAT_32_BIT_FIXED: |
| 1204 | 1072 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_32_BIT)); |
| 867 | 1073 return raw_32_bit_fixed_to_ichar (* (UINT_32_BIT *) ptr, object); |
| 826 | 1074 default: |
| 1075 text_checking_assert (fmt == FORMAT_8_BIT_FIXED); | |
| 867 | 1076 return raw_8_bit_fixed_to_ichar (*ptr, object); |
| 826 | 1077 } |
| 1078 } | |
| 1079 | |
| 1080 /* Return the "raw value" of the character at PTR, in format FMT. This is | |
| 1081 useful when searching for a character; convert the character using | |
| 867 | 1082 ichar_to_raw(). */ |
| 826 | 1083 |
| 1084 DECLARE_INLINE_HEADER ( | |
| 867 | 1085 Raw_Ichar |
| 1086 itext_ichar_raw_fmt (const Ibyte *ptr, Internal_Format fmt) | |
| 826 | 1087 ) |
| 1088 { | |
| 1089 switch (fmt) | |
| 1090 { | |
| 1091 case FORMAT_DEFAULT: | |
| 867 | 1092 return (Raw_Ichar) itext_ichar (ptr); |
| 826 | 1093 case FORMAT_16_BIT_FIXED: |
| 1204 | 1094 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_16_BIT)); |
| 867 | 1095 return (Raw_Ichar) (* (UINT_16_BIT *) ptr); |
| 826 | 1096 case FORMAT_32_BIT_FIXED: |
| 1204 | 1097 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_32_BIT)); |
| 867 | 1098 return (Raw_Ichar) (* (UINT_32_BIT *) ptr); |
| 826 | 1099 default: |
| 1100 text_checking_assert (fmt == FORMAT_8_BIT_FIXED); | |
| 867 | 1101 return (Raw_Ichar) (*ptr); |
| 826 | 1102 } |
| 1103 } | |
| 1104 | |
| 867 | 1105 /* Store the character X (an Ichar) as internally-formatted text starting |
| 826 | 1106 at PTR. Return the number of bytes stored. |
| 1107 */ | |
| 1108 | |
| 1109 DECLARE_INLINE_HEADER ( | |
| 1110 Bytecount | |
| 867 | 1111 set_itext_ichar (Ibyte *ptr, Ichar x) |
| 826 | 1112 ) |
| 771 | 1113 { |
| 867 | 1114 return !ichar_multibyte_p (x) ? |
| 1115 simple_set_itext_ichar (ptr, x) : | |
| 1116 non_ascii_set_itext_ichar (ptr, x); | |
| 771 | 1117 } |
| 1118 | |
| 867 | 1119 /* Store the character X (an Ichar) as internally-formatted text of |
| 826 | 1120 format FMT starting at PTR, which comes from OBJECT. Return the number |
| 1121 of bytes stored. | |
| 1122 */ | |
| 1123 | |
| 1124 DECLARE_INLINE_HEADER ( | |
| 1125 Bytecount | |
| 867 | 1126 set_itext_ichar_fmt (Ibyte *ptr, Ichar x, Internal_Format fmt, |
| 2286 | 1127 Lisp_Object UNUSED (object)) |
| 826 | 1128 ) |
| 771 | 1129 { |
| 826 | 1130 switch (fmt) |
| 1131 { | |
| 1132 case FORMAT_DEFAULT: | |
| 867 | 1133 return set_itext_ichar (ptr, x); |
| 826 | 1134 case FORMAT_16_BIT_FIXED: |
| 867 | 1135 text_checking_assert (ichar_16_bit_fixed_p (x, object)); |
| 1204 | 1136 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_16_BIT)); |
| 867 | 1137 * (UINT_16_BIT *) ptr = ichar_to_raw_16_bit_fixed (x, object); |
| 826 | 1138 return 2; |
| 1139 case FORMAT_32_BIT_FIXED: | |
| 1204 | 1140 text_checking_assert ((void *) ptr == ALIGN_PTR (ptr, UINT_32_BIT)); |
| 867 | 1141 * (UINT_32_BIT *) ptr = ichar_to_raw_32_bit_fixed (x, object); |
| 826 | 1142 return 4; |
| 1143 default: | |
| 1144 text_checking_assert (fmt == FORMAT_8_BIT_FIXED); | |
| 867 | 1145 text_checking_assert (ichar_8_bit_fixed_p (x, object)); |
| 1146 *ptr = ichar_to_raw_8_bit_fixed (x, object); | |
| 826 | 1147 return 1; |
| 1148 } | |
| 1149 } | |
| 1150 | |
| 1151 /* Retrieve the character pointed to by SRC and store it as | |
| 1152 internally-formatted text in DST. | |
| 1153 */ | |
| 1154 | |
| 1155 DECLARE_INLINE_HEADER ( | |
| 1156 Bytecount | |
| 867 | 1157 itext_copy_ichar (const Ibyte *src, Ibyte *dst) |
| 826 | 1158 ) |
| 1159 { | |
| 1160 return byte_ascii_p (*src) ? | |
| 867 | 1161 simple_itext_copy_ichar (src, dst) : |
| 1162 non_ascii_itext_copy_ichar (src, dst); | |
| 771 | 1163 } |
| 1164 | |
| 1165 #else /* not MULE */ | |
| 1166 | |
| 867 | 1167 # define itext_ichar(ptr) simple_itext_ichar (ptr) |
| 1168 # define itext_ichar_fmt(ptr, fmt, object) itext_ichar (ptr) | |
| 1169 # define itext_ichar_ascii_fmt(ptr, fmt, object) itext_ichar (ptr) | |
| 1170 # define itext_ichar_raw_fmt(ptr, fmt) itext_ichar (ptr) | |
| 1171 # define set_itext_ichar(ptr, x) simple_set_itext_ichar (ptr, x) | |
| 1172 # define set_itext_ichar_fmt(ptr, x, fmt, obj) set_itext_ichar (ptr, x) | |
| 1173 # define itext_copy_ichar(src, dst) simple_itext_copy_ichar (src, dst) | |
| 771 | 1174 |
| 1175 #endif /* not MULE */ | |
| 1176 | |
| 826 | 1177 /* Retrieve the character at offset N (in characters) from PTR, as an |
| 867 | 1178 Ichar. |
| 826 | 1179 */ |
| 1180 | |
| 867 | 1181 #define itext_ichar_n(ptr, offset) \ |
| 1182 itext_ichar (itext_n_addr (ptr, offset)) | |
| 771 | 1183 |
| 1184 | |
| 1185 /************************************************************************/ | |
| 1186 /* */ | |
| 826 | 1187 /* working with Lisp strings */ |
| 1188 /* */ | |
| 1189 /************************************************************************/ | |
| 1190 | |
| 1191 #define string_char_length(s) \ | |
| 1192 string_index_byte_to_char (s, XSTRING_LENGTH (s)) | |
| 1193 #define string_byte(s, i) (XSTRING_DATA (s)[i] + 0) | |
| 1194 /* In case we ever allow strings to be in a different format ... */ | |
| 1195 #define set_string_byte(s, i, c) (XSTRING_DATA (s)[i] = (c)) | |
| 1196 | |
| 1197 #define ASSERT_VALID_CHAR_STRING_INDEX_UNSAFE(s, x) do { \ | |
| 1198 text_checking_assert ((x) >= 0 && (x) <= string_char_length (s)); \ | |
| 1199 } while (0) | |
| 1200 | |
| 1201 #define ASSERT_VALID_BYTE_STRING_INDEX_UNSAFE(s, x) do { \ | |
| 1202 text_checking_assert ((x) >= 0 && (x) <= XSTRING_LENGTH (s)); \ | |
| 867 | 1203 text_checking_assert (valid_ibyteptr_p (string_byte_addr (s, x))); \ |
| 826 | 1204 } while (0) |
| 1205 | |
| 1206 /* Convert offset I in string S to a pointer to text there. */ | |
| 1207 #define string_byte_addr(s, i) (&(XSTRING_DATA (s)[i])) | |
| 1208 /* Convert pointer to text in string S into the byte offset to that text. */ | |
| 1209 #define string_addr_to_byte(s, ptr) ((Bytecount) ((ptr) - XSTRING_DATA (s))) | |
| 867 | 1210 /* Return the Ichar at *CHARACTER* offset I. */ |
| 1211 #define string_ichar(s, i) itext_ichar (string_char_addr (s, i)) | |
| 826 | 1212 |
| 1213 #ifdef ERROR_CHECK_TEXT | |
| 1214 #define SLEDGEHAMMER_CHECK_ASCII_BEGIN | |
| 1215 #endif | |
| 1216 | |
| 1217 #ifdef SLEDGEHAMMER_CHECK_ASCII_BEGIN | |
| 1218 void sledgehammer_check_ascii_begin (Lisp_Object str); | |
| 1219 #else | |
| 1220 #define sledgehammer_check_ascii_begin(str) | |
| 1221 #endif | |
| 1222 | |
| 1223 /* Make an alloca'd copy of a Lisp string */ | |
| 1224 #define LISP_STRING_TO_ALLOCA(s, lval) \ | |
| 1225 do { \ | |
| 1315 | 1226 Ibyte **_lta_ = (Ibyte **) &(lval); \ |
| 826 | 1227 Lisp_Object _lta_2 = (s); \ |
| 2367 | 1228 *_lta_ = alloca_ibytes (1 + XSTRING_LENGTH (_lta_2)); \ |
| 826 | 1229 memcpy (*_lta_, XSTRING_DATA (_lta_2), 1 + XSTRING_LENGTH (_lta_2)); \ |
| 1230 } while (0) | |
| 1231 | |
| 1232 void resize_string (Lisp_Object s, Bytecount pos, Bytecount delta); | |
| 1233 | |
| 1234 /* Convert a byte index into a string into a char index. */ | |
| 1235 DECLARE_INLINE_HEADER ( | |
| 1236 Charcount | |
| 4853 | 1237 string_index_byte_to_char (Lisp_Object s, Bytecount idx) |
| 826 | 1238 ) |
| 1239 { | |
| 1240 Charcount retval; | |
| 1241 ASSERT_VALID_BYTE_STRING_INDEX_UNSAFE (s, idx); | |
| 1242 #ifdef MULE | |
| 1243 if (idx <= (Bytecount) XSTRING_ASCII_BEGIN (s)) | |
| 1244 retval = (Charcount) idx; | |
| 1245 else | |
| 1246 retval = (XSTRING_ASCII_BEGIN (s) + | |
| 1247 bytecount_to_charcount (XSTRING_DATA (s) + | |
| 1248 XSTRING_ASCII_BEGIN (s), | |
| 1249 idx - XSTRING_ASCII_BEGIN (s))); | |
| 1250 # ifdef SLEDGEHAMMER_CHECK_ASCII_BEGIN | |
| 1251 assert (retval == bytecount_to_charcount (XSTRING_DATA (s), idx)); | |
| 1252 # endif | |
| 1253 #else | |
| 1254 retval = (Charcount) idx; | |
| 1255 #endif | |
| 1256 /* Don't call ASSERT_VALID_CHAR_STRING_INDEX_UNSAFE() here because it will | |
| 1257 call string_index_byte_to_char(). */ | |
| 1258 return retval; | |
| 1259 } | |
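The ASCII-prefix shortcut used by string_index_byte_to_char() can be sketched on its own. This is a hypothetical sketch, not the real string representation: struct demo_string and its ascii_begin field only play the role of XSTRING_ASCII_BEGIN, and UTF-8 is assumed as a stand-in for the internal format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical string record.  ascii_begin plays the role of
   XSTRING_ASCII_BEGIN: the number of leading bytes known to be ASCII. */
struct demo_string
{
  const unsigned char *data;
  size_t length;       /* total length, in bytes */
  size_t ascii_begin;  /* length of the all-ASCII prefix, in bytes */
};

/* Count the characters in LEN bytes at PTR.  ASSUMPTION: UTF-8, where
   every byte that is not a continuation byte (10xxxxxx) starts a
   character. */
static size_t
demo_bytecount_to_charcount (const unsigned char *ptr, size_t len)
{
  size_t chars = 0, i;
  for (i = 0; i < len; i++)
    if ((ptr[i] & 0xC0) != 0x80)
      chars++;
  return chars;
}

/* Convert a byte index into a character index. */
static size_t
demo_index_byte_to_char (const struct demo_string *s, size_t idx)
{
  assert (idx <= s->length);
  if (idx <= s->ascii_begin)
    return idx;  /* inside the ASCII prefix: byte index == char index */
  /* Otherwise only the part after the prefix needs to be scanned. */
  return s->ascii_begin
    + demo_bytecount_to_charcount (s->data + s->ascii_begin,
                                   idx - s->ascii_begin);
}
```

For a string that is entirely ASCII the conversion is constant time, and for a string with a long ASCII prefix only the tail is scanned. The char-to-byte direction follows the same pattern with the two conversions swapped.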
| 1260 | |
| 1261 /* Convert a char index into a string into a byte index. */ | |
| 1262 DECLARE_INLINE_HEADER ( | |
| 1263 Bytecount | |
| 4853 | 1264 string_index_char_to_byte (Lisp_Object s, Charcount idx) |
| 826 | 1265 ) |
| 1266 { | |
| 1267 Bytecount retval; | |
| 1268 ASSERT_VALID_CHAR_STRING_INDEX_UNSAFE (s, idx); | |
| 1269 #ifdef MULE | |
| 1270 if (idx <= (Charcount) XSTRING_ASCII_BEGIN (s)) | |
| 1271 retval = (Bytecount) idx; | |
| 1272 else | |
| 1273 retval = (XSTRING_ASCII_BEGIN (s) + | |
| 1274 charcount_to_bytecount (XSTRING_DATA (s) + | |
| 1275 XSTRING_ASCII_BEGIN (s), | |
| 1276 idx - XSTRING_ASCII_BEGIN (s))); | |
| 1277 # ifdef SLEDGEHAMMER_CHECK_ASCII_BEGIN | |
| 1278 assert (retval == charcount_to_bytecount (XSTRING_DATA (s), idx)); | |
| 1279 # endif | |
| 1280 #else | |
| 1281 retval = (Bytecount) idx; | |
| 1282 #endif | |
| 1283 ASSERT_VALID_BYTE_STRING_INDEX_UNSAFE (s, retval); | |
| 1284 return retval; | |
| 1285 } | |
| 1286 | |
| 1287 /* Convert a substring length (starting at byte offset OFF) from bytes to | |
| 1288 chars. */ | |
| 1289 DECLARE_INLINE_HEADER ( | |
| 1290 Charcount | |
| 4853 | 1291 string_offset_byte_to_char_len (Lisp_Object s, Bytecount off, Bytecount len) |
| 826 | 1292 ) |
| 1293 { | |
| 1294 Charcount retval; | |
| 1295 ASSERT_VALID_BYTE_STRING_INDEX_UNSAFE (s, off); | |
| 1296 ASSERT_VALID_BYTE_STRING_INDEX_UNSAFE (s, off + len); | |
| 1297 #ifdef MULE | |
| 1298 if (off + len <= (Bytecount) XSTRING_ASCII_BEGIN (s)) | |
| 1299 retval = (Charcount) len; | |
| 1300 else if (off < (Bytecount) XSTRING_ASCII_BEGIN (s)) | |
| 1301 retval = | |
| 1302 XSTRING_ASCII_BEGIN (s) - (Charcount) off + | |
| 1303 bytecount_to_charcount (XSTRING_DATA (s) + XSTRING_ASCII_BEGIN (s), | |
| 1304 len - (XSTRING_ASCII_BEGIN (s) - off)); | |
| 1305 else | |
| 1306 retval = bytecount_to_charcount (XSTRING_DATA (s) + off, len); | |
| 1307 # ifdef SLEDGEHAMMER_CHECK_ASCII_BEGIN | |
| 1308 assert (retval == bytecount_to_charcount (XSTRING_DATA (s) + off, len)); | |
| 1309 # endif | |
| 1310 #else | |
| 1311 retval = (Charcount) len; | |
| 1312 #endif | |
| 1313 return retval; | |
| 1314 } | |
| 1315 | |
| 1316 /* Convert a substring length (starting at byte offset OFF) from chars to | |
| 1317 bytes. */ | |
| 1318 DECLARE_INLINE_HEADER ( | |
| 1319 Bytecount | |
| 4853 | 1320 string_offset_char_to_byte_len (Lisp_Object s, Bytecount off, Charcount len) |
| 826 | 1321 ) |
| 1322 { | |
| 1323 Bytecount retval; | |
| 1324 ASSERT_VALID_BYTE_STRING_INDEX_UNSAFE (s, off); | |
| 1325 #ifdef MULE | |
| 1326 /* casts to avoid errors from combining Bytecount/Charcount and warnings | |
| 1327 from signed/unsigned comparisons */ | |
| 1328 if (off + (Bytecount) len <= (Bytecount) XSTRING_ASCII_BEGIN (s)) | |
| 1329 retval = (Bytecount) len; | |
| 1330 else if (off < (Bytecount) XSTRING_ASCII_BEGIN (s)) | |
| 1331 retval = | |
| 1332 XSTRING_ASCII_BEGIN (s) - off + | |
| 1333 charcount_to_bytecount (XSTRING_DATA (s) + XSTRING_ASCII_BEGIN (s), | |
| 1334 len - (XSTRING_ASCII_BEGIN (s) - | |
| 1335 (Charcount) off)); | |
| 1336 else | |
| 1337 retval = charcount_to_bytecount (XSTRING_DATA (s) + off, len); | |
| 1338 # ifdef SLEDGEHAMMER_CHECK_ASCII_BEGIN | |
| 1339 assert (retval == charcount_to_bytecount (XSTRING_DATA (s) + off, len)); | |
| 1340 # endif | |
| 1341 #else | |
| 1342 retval = (Bytecount) len; | |
| 1343 #endif | |
| 1344 ASSERT_VALID_BYTE_STRING_INDEX_UNSAFE (s, off + retval); | |
| 1345 return retval; | |
| 1346 } | |
| 1347 | |
| 1348 DECLARE_INLINE_HEADER ( | |
| 867 | 1349 const Ibyte * |
| 826 | 1350 string_char_addr (Lisp_Object s, Charcount idx) |
| 1351 ) | |
| 1352 { | |
| 1353 return XSTRING_DATA (s) + string_index_char_to_byte (s, idx); | |
| 1354 } | |
| 1355 | |
| 1356 /* WARNING: If you modify an existing string, you must call | |
| 1357 bump_string_modiff() afterwards. */ | |
| 1358 #ifdef MULE | |
| 867 | 1359 void set_string_char (Lisp_Object s, Charcount i, Ichar c); |
| 826 | 1360 #else |
| 1361 #define set_string_char(s, i, c) set_string_byte (s, i, c) | |
| 1362 #endif /* not MULE */ | |
| 1363 | |
| 1364 /* Return index to character before the one at IDX. */ | |
| 1365 DECLARE_INLINE_HEADER ( | |
| 1366 Bytecount | |
| 1367 prev_string_index (Lisp_Object s, Bytecount idx) | |
| 1368 ) | |
| 1369 { | |
| 867 | 1370 const Ibyte *ptr = string_byte_addr (s, idx); |
| 1371 DEC_IBYTEPTR (ptr); | |
| 826 | 1372 return string_addr_to_byte (s, ptr); |
| 1373 } | |
| 1374 | |
| 1375 /* Return index to character after the one at IDX. */ | |
| 1376 DECLARE_INLINE_HEADER ( | |
| 1377 Bytecount | |
| 1378 next_string_index (Lisp_Object s, Bytecount idx) | |
| 1379 ) | |
| 1380 { | |
| 867 | 1381 const Ibyte *ptr = string_byte_addr (s, idx); |
| 1382 INC_IBYTEPTR (ptr); | |
| 826 | 1383 return string_addr_to_byte (s, ptr); |
| 1384 } | |
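Stepping a byte index forward or backward by one character, as prev_string_index() and next_string_index() do, can be sketched as follows. This is one plausible approach under stated assumptions, not the definition of INC_IBYTEPTR or DEC_IBYTEPTR (neither is shown in this part of the file): UTF-8 stands in for the internal format, and the demo_* names are invented.

```c
#include <assert.h>
#include <stddef.h>

/* Nonzero if BYTE begins a character.  ASSUMPTION: UTF-8, where
   continuation bytes have the form 10xxxxxx. */
static int
demo_leading_byte_p (unsigned char byte)
{
  return (byte & 0xC0) != 0x80;
}

/* Byte index of the character after the one starting at IDX. */
static size_t
demo_next_index (const unsigned char *data, size_t len, size_t idx)
{
  assert (idx < len);
  do
    idx++;
  while (idx < len && !demo_leading_byte_p (data[idx]));
  return idx;
}

/* Byte index of the character before the one starting at IDX. */
static size_t
demo_prev_index (const unsigned char *data, size_t idx)
{
  assert (idx > 0);
  do
    idx--;
  while (idx > 0 && !demo_leading_byte_p (data[idx]));
  return idx;
}
```

Moving backward cannot use a length lookup on the first byte, because the first byte is what is being searched for; it has to skip over non-leading bytes until it finds the start of a character.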
| 1385 | |
| 1386 | |
| 1387 /************************************************************************/ | |
| 1388 /* */ | |
| 771 | 1389 /* working with Eistrings */ |
| 1390 /* */ | |
| 1391 /************************************************************************/ | |
| 1392 | |
| 1393 /* | |
| 1394 #### NOTE: This is a work in progress. Neither the API nor especially | |
| 1395 the implementation is finished. | |
| 1396 | |
| 1397 NOTE: An Eistring is a structure that makes it easy to work with | |
| 1398 internally-formatted strings of data. It provides operations similar | |
| 1399 in feel to the standard strcpy(), strcat(), strlen(), etc., but | |
| 1400 | |
| 1401 (a) it is Mule-correct | |
| 1402 (b) it does dynamic allocation so you never have to worry about size | |
| 793 | 1403 restrictions |
| 851 | 1404 (c) it comes in an ALLOCA() variety (all allocation is stack-local, |
| 793 | 1405 so there is no need to explicitly clean up) as well as a malloc() |
| 1406 variety | |
| 1407 (d) it knows its own length, so it does not suffer from standard null | |
| 1408 byte brain-damage -- but it null-terminates the data anyway, so | |
| 1409 it can be passed to standard routines | |
| 1410 (e) it provides a much more powerful set of operations and knows about | |
| 771 | 1411 all the standard places where string data might reside: Lisp_Objects, |
| 867 | 1412 other Eistrings, Ibyte * data with or without an explicit length, |
| 1413 ASCII strings, Ichars, etc. | |
| 793 | 1414 (f) it provides easy operations to convert to/from externally-formatted |
| 1415 data, and is easier to use than the standard TO_INTERNAL_FORMAT | |
| 771 | 1416 and TO_EXTERNAL_FORMAT macros. (An Eistring can store both the internal |
| 1417 and external version of its data, but the external version is only | |
| 1418 initialized or changed when you call eito_external().) | |
| 1419 | |
| 793 | 1420 The idea is to make it as easy to write Mule-correct string manipulation |
| 1421 code as it is to write normal string manipulation code. We also make | |
| 1422 the API sufficiently general that it can handle multiple internal data | |
| 1423 formats (e.g. some fixed-width optimizing formats and a default variable | |
| 1424 width format) and allows for *ANY* data format we might choose in the | |
| 1425 future for the default format, including UCS2. (In other words, we can't | |
| 1426 assume that the internal format is ASCII-compatible and we can't assume | |
| 1427 it doesn't have embedded null bytes. We do assume, however, that any | |
| 1428 chosen format will have the concept of null-termination.) All of this is | |
| 1429 hidden from the user. | |
| 771 | 1430 |
| 1431 #### It is really too bad that we don't have a real object-oriented | |
| 1432 language, or at least a language with polymorphism! | |
| 1433 | |
| 1434 | |
| 1435 ********************************************** | |
| 1436 * Declaration * | |
| 1437 ********************************************** | |
| 1438 | |
| 1439 To declare an Eistring, either put one of the following in the local | |
| 1440 variable section: | |
| 1441 | |
| 1442 DECLARE_EISTRING (name); | |
| 2367 | 1443 Declare a new Eistring and initialize it to the empty string. This |
| 1444 is a standard local variable declaration and can go anywhere in the | |
| 1445 variable declaration section. NAME itself is declared as an | |
| 1446 Eistring *, and its storage declared on the stack. | |
| 771 | 1447 |
| 1448 DECLARE_EISTRING_MALLOC (name); | |
| 2367 | 1449 Declare and initialize a new Eistring, which uses malloc()ed |
| 1450 instead of ALLOCA()ed data. This is a standard local variable | |
| 1451 declaration and can go anywhere in the variable declaration | |
| 1452 section. Once you initialize the Eistring, you will have to free | |
| 1453 it using eifree() to avoid memory leaks. You will need to use this | |
| 1454 form if you are passing an Eistring to any function that modifies | |
| 1455 it (otherwise, the modified data may be in stack space and get | |
| 1456 overwritten when the function returns). | |
| 771 | 1457 |
| 1458 or use | |
| 1459 | |
| 793 | 1460 Eistring ei; |
| 1461 void eiinit (Eistring *ei); | |
| 1462 void eiinit_malloc (Eistring *einame); | |
| 771 | 1463 If you need to put an Eistring elsewhere than in a local variable |
| 1464 declaration (e.g. in a structure), declare it as shown and then | |
| 1465 call one of the init macros. | |
| 1466 | |
| 1467 Also note: | |
| 1468 | |
| 793 | 1469 void eifree (Eistring *ei); |
| 771 | 1470 If you declared an Eistring to use malloc() to hold its data, |
| 1471 or converted it to the heap using eito_malloc(), then this | |
| 1472 releases any data in it and afterwards resets the Eistring | |
| 1473 using eiinit_malloc(). Otherwise, it just resets the Eistring | |
| 1474 using eiinit(). | |
| 1475 | |
| 1476 | |
| 1477 ********************************************** | |
| 1478 * Conventions * | |
| 1479 ********************************************** | |
| 1480 | |
| 1481 - The names of the functions have been chosen, where possible, to | |
| 1482 match the names of str*() functions in the standard C API. | |
| 1483 - | |
| 1484 | |
| 1485 | |
| 1486 ********************************************** | |
| 1487 * Initialization * | |
| 1488 ********************************************** | |
| 1489 | |
| 1490 void eireset (Eistring *eistr); | |
| 1491 Initialize the Eistring to the empty string. | |
| 1492 | |
| 1493 void eicpy_* (Eistring *eistr, ...); | |
| 1494 Initialize the Eistring from somewhere: | |
| 1495 | |
| 1496 void eicpy_ei (Eistring *eistr, Eistring *eistr2); | |
| 1497 ... from another Eistring. | |
| 1498 void eicpy_lstr (Eistring *eistr, Lisp_Object lisp_string); | |
| 1499 ... from a Lisp_Object string. | |
| 867 | 1500 void eicpy_ch (Eistring *eistr, Ichar ch); |
| 1501 ... from an Ichar (this can be a conventional C character). | |
| 771 | 1502 |
| 1503 void eicpy_lstr_off (Eistring *eistr, Lisp_Object lisp_string, | |
| 1504 Bytecount off, Charcount charoff, | |
| 1505 Bytecount len, Charcount charlen); | |
| 1506 ... from a section of a Lisp_Object string. | |
| 1507 void eicpy_lbuf (Eistring *eistr, Lisp_Object lisp_buf, | |
| 1508 Bytecount off, Charcount charoff, | |
| 1509 Bytecount len, Charcount charlen); | |
| 1510 ... from a section of a Lisp_Object buffer. | |
| 867 | 1511 void eicpy_raw (Eistring *eistr, const Ibyte *data, Bytecount len); |
| 771 | 1512 ... from raw internal-format data in the default internal format. |
| 867 | 1513 void eicpy_rawz (Eistring *eistr, const Ibyte *data); |
| 771 | 1514 ... from raw internal-format data in the default internal format |
| 1515 that is "null-terminated" (the meaning of this depends on the nature | |
| 1516 of the default internal format). | |
| 867 | 1517 void eicpy_raw_fmt (Eistring *eistr, const Ibyte *data, Bytecount len, |
| 826 | 1518 Internal_Format intfmt, Lisp_Object object); |
| 771 | 1519 ... from raw internal-format data in the specified format. |
| 867 | 1520 void eicpy_rawz_fmt (Eistring *eistr, const Ibyte *data, |
| 826 | 1521 Internal_Format intfmt, Lisp_Object object); |
| 771 | 1522 ... from raw internal-format data in the specified format that is |
| 1523 "null-terminated" (the meaning of this depends on the nature of | |
| 1524 the specific format). | |
| 2421 | 1525 void eicpy_ascii (Eistring *eistr, const Ascbyte *ascstr); |
| 771 | 1526 ... from an ASCII null-terminated string. Non-ASCII characters in |
| 2500 | 1527 the string are *ILLEGAL* (read ABORT() with error-checking defined). |
| 2421 | 1528 void eicpy_ascii_len (Eistring *eistr, const Ascbyte *ascstr, len); |
| 771 | 1529 ... from an ASCII string, with length specified. Non-ASCII characters |
| 2500 | 1530 in the string are *ILLEGAL* (read ABORT() with error-checking defined). |
| 771 | 1531 void eicpy_ext (Eistring *eistr, const Extbyte *extdata, |
| 1318 | 1532 Lisp_Object codesys); |
| 771 | 1533 ... from external null-terminated data, with coding system specified. |
| 1534 void eicpy_ext_len (Eistring *eistr, const Extbyte *extdata, | |
| 1318 | 1535 Bytecount extlen, Lisp_Object codesys); |
| 771 | 1536 ... from external data, with length and coding system specified. |
| 1537 void eicpy_lstream (Eistring *eistr, Lisp_Object lstream); | |
| 1538 ... from an lstream; reads data till eof. Data must be in default | |
| 1539 internal format; otherwise, interpose a decoding lstream. | |
| 1540 | |
| 1541 | |
| 1542 ********************************************** | |
| 1543 * Getting the data out of the Eistring * | |
| 1544 ********************************************** | |
| 1545 | |
| 867 | 1546 Ibyte *eidata (Eistring *eistr); |
| 771 | 1547 Return a pointer to the raw data in an Eistring. This is NOT |
| 1548 a copy. | |
| 1549 | |
| 1550 Lisp_Object eimake_string (Eistring *eistr); | |
| 1551 Make a Lisp string out of the Eistring. | |
| 1552 | |
| 1553 Lisp_Object eimake_string_off (Eistring *eistr, | |
| 1554 Bytecount off, Charcount charoff, | |
| 1555 Bytecount len, Charcount charlen); | |
| 1556 Make a Lisp string out of a section of the Eistring. | |
| 1557 | |
| 867 | 1558 void eicpyout_alloca (Eistring *eistr, LVALUE: Ibyte *ptr_out, |
| 771 | 1559 LVALUE: Bytecount len_out); |
| 851 | 1560 Make an ALLOCA() copy of the data in the Eistring, using the |
| 1561 default internal format. Due to the nature of ALLOCA(), this | |
| 771 | 1562 must be a macro, with all lvalues passed in as parameters. |
| 793 | 1563 (More specifically, not all compilers correctly handle using |
| 851 | 1564 ALLOCA() as the argument to a function call -- GCC on x86 |
| 1565 didn't use to, for example.) A pointer to the ALLOCA()ed data | |
| 793 | 1566 is stored in PTR_OUT, and the length of the data (not including |
| 1567 the terminating zero) is stored in LEN_OUT. | |
| 771 | 1568 |
| 867 | 1569 void eicpyout_alloca_fmt (Eistring *eistr, LVALUE: Ibyte *ptr_out, |
| 771 | 1570 LVALUE: Bytecount len_out, |
| 826 | 1571 Internal_Format intfmt, Lisp_Object object); |
| 771 | 1572 Like eicpyout_alloca(), but converts to the specified internal |
| 1573 format. (No formats other than FORMAT_DEFAULT are currently | |
| 1574 implemented, and you get an assertion failure if you try.) | |
| 1575 | |
| 867 | 1576 Ibyte *eicpyout_malloc (Eistring *eistr, Bytecount *intlen_out); |
| 771 | 1577 Make a malloc() copy of the data in the Eistring, using the |
| 1578 default internal format. This is a real function. No lvalues | |
| 1579 passed in. Returns the new data, and stores the length (not | |
| 1580 including the terminating zero) using INTLEN_OUT, unless it's | |
| 1581 a NULL pointer. | |
| 1582 | |
| 867 | 1583 Ibyte *eicpyout_malloc_fmt (Eistring *eistr, Internal_Format intfmt, |
| 826 | 1584 Bytecount *intlen_out, Lisp_Object object); |
| 771 | 1585 Like eicpyout_malloc(), but converts to the specified internal |
| 1586 format. (No formats other than FORMAT_DEFAULT are currently | |
| 1587 implemented, and you get an assertion failure if you try.) | |
| 1588 | |
| 1589 | |
| 1590 ********************************************** | |
| 1591 * Moving to the heap * | |
| 1592 ********************************************** | |
| 1593 | |
| 1594 void eito_malloc (Eistring *eistr); | |
| 1595 Move this Eistring to the heap. Its data will be stored in a | |
| 1596 malloc()ed block rather than the stack. Subsequent changes to | |
| 1597 this Eistring will realloc() the block as necessary. Use this | |
| 1598 when you want the Eistring to remain in scope past the end of | |
| 1599 this function call. You will have to manually free the data | |
| 1600 in the Eistring using eifree(). | |
| 1601 | |
| 1602 void eito_alloca (Eistring *eistr); | |
| 1603 Move this Eistring back to the stack, if it was moved to the | |
| 1604 heap with eito_malloc(). This will automatically free any | |
| 1605 heap-allocated data. | |
| 1606 | |
| 1607 | |
| 1608 | |
| 1609 ********************************************** | |
| 1610 * Retrieving the length * | |
| 1611 ********************************************** | |
| 1612 | |
| 1613 Bytecount eilen (Eistring *eistr); | |
| 1614 Return the length of the internal data, in bytes. See also | |
| 1615 eiextlen(), below. | |
| 1616 Charcount eicharlen (Eistring *eistr); | |
| 1617 Return the length of the internal data, in characters. | |
| 1618 | |
| 1619 | |
| 1620 ********************************************** | |
| 1621 * Working with positions * | |
| 1622 ********************************************** | |
| 1623 | |
| 1624 Bytecount eicharpos_to_bytepos (Eistring *eistr, Charcount charpos); | |
| 1625 Convert a char offset to a byte offset. | |
| 1626 Charcount eibytepos_to_charpos (Eistring *eistr, Bytecount bytepos); | |
| 1627 Convert a byte offset to a char offset. | |
| 1628 Bytecount eiincpos (Eistring *eistr, Bytecount bytepos); | |
| 1629 Increment the given position by one character. | |
| 1630 Bytecount eiincpos_n (Eistring *eistr, Bytecount bytepos, Charcount n); | |
| 1631 Increment the given position by N characters. | |
| 1632 Bytecount eidecpos (Eistring *eistr, Bytecount bytepos); | |
| 1633 Decrement the given position by one character. | |
| 1634 Bytecount eidecpos_n (Eistring *eistr, Bytecount bytepos, Charcount n); | |
| 1635 Decrement the given position by N characters. | |
| 1636 | |
| 1637 | |
| 1638 ********************************************** | |
| 1639 * Getting the character at a position * | |
| 1640 ********************************************** | |
| 1641 | |
| 867 | 1642 Ichar eigetch (Eistring *eistr, Bytecount bytepos); |
| 771 | 1643 Return the character at a particular byte offset. |
| 867 | 1644 Ichar eigetch_char (Eistring *eistr, Charcount charpos); |
| 771 | 1645 Return the character at a particular character offset. |
| 1646 | |
| 1647 | |
| 1648 ********************************************** | |
| 1649 * Setting the character at a position * | |
| 1650 ********************************************** | |
| 1651 | |
| 867 | 1652 Ichar eisetch (Eistring *eistr, Bytecount bytepos, Ichar chr); |
| 771 | 1653 Set the character at a particular byte offset. |
| 867 | 1654 Ichar eisetch_char (Eistring *eistr, Charcount charpos, Ichar chr); |
| 771 | 1655 Set the character at a particular character offset. |
| 1656 | |
| 1657 | |
| 1658 ********************************************** | |
| 1659 * Concatenation * | |
| 1660 ********************************************** | |
| 1661 | |
| 1662 void eicat_* (Eistring *eistr, ...); | |
| 1663 Concatenate onto the end of the Eistring, with data coming from the | |
| 1664 same places as above: | |
| 1665 | |
| 1666 void eicat_ei (Eistring *eistr, Eistring *eistr2); | |
| 1667 ... from another Eistring. | |
| 2421 | 1668 void eicat_ascii (Eistring *eistr, Ascbyte *ascstr); |
| 771 | 1669 ... from an ASCII null-terminated string. Non-ASCII characters in |
| 2500 | 1670 the string are *ILLEGAL* (read ABORT() with error-checking defined). |
| 867 | 1671 void eicat_raw (Eistring *eistr, const Ibyte *data, Bytecount len); |
| 771 | 1672 ... from raw internal-format data in the default internal format. |
| 867 | 1673 void eicat_rawz (Eistring *eistr, const Ibyte *data); |
| 771 | 1674 ... from raw internal-format data in the default internal format |
| 1675 that is "null-terminated" (the meaning of this depends on the nature | |
| 1676 of the default internal format). | |
| 1677 void eicat_lstr (Eistring *eistr, Lisp_Object lisp_string); | |
| 1678 ... from a Lisp_Object string. | |
| 867 | 1679 void eicat_ch (Eistring *eistr, Ichar ch); |
| 1680 ... from an Ichar. | |
| 771 | 1681 |
| 1682 (All except the first variety are convenience functions. | |
| 1683 In the general case, create another Eistring from the source.) | |
| 1684 | |
| 1685 | |
| 1686 ********************************************** | |
| 1687 * Replacement * | |
| 1688 ********************************************** | |
| 1689 | |
| 1690 void eisub_* (Eistring *eistr, Bytecount off, Charcount charoff, | |
| 1691 Bytecount len, Charcount charlen, ...); | |
| 1692 Replace a section of the Eistring, specifically: | |
| 1693 | |
| 1694 void eisub_ei (Eistring *eistr, Bytecount off, Charcount charoff, | |
| 1695 Bytecount len, Charcount charlen, Eistring *eistr2); | |
| 1696 ... with another Eistring. | |
| 2421 | 1697 void eisub_ascii (Eistring *eistr, Bytecount off, Charcount charoff, |
| 1698 Bytecount len, Charcount charlen, Ascbyte *ascstr); | |
| 771 | 1699 ... with an ASCII null-terminated string. Non-ASCII characters in |
| 2500 | 1700 the string are *ILLEGAL* (read ABORT() with error-checking defined). |
| 771 | 1701 void eisub_ch (Eistring *eistr, Bytecount off, Charcount charoff, |
| 867 | 1702 Bytecount len, Charcount charlen, Ichar ch); |
| 1703 ... with an Ichar. | |
| 771 | 1704 |
| 1705 void eidel (Eistring *eistr, Bytecount off, Charcount charoff, | |
| 1706 Bytecount len, Charcount charlen); | |
| 1707 Delete a section of the Eistring. | |
| 1708 | |
| 1709 | |
| 1710 ********************************************** | |
| 1711 * Converting to an external format * | |
| 1712 ********************************************** | |
| 1713 | |
| 1318 | 1714 void eito_external (Eistring *eistr, Lisp_Object codesys); |
| 771 | 1715 Convert the Eistring to an external format and store the result |
| 1716 in the string. NOTE: Further changes to the Eistring will *NOT* | |
| 1717 change the external data stored in the string. You will have to | |
| 1718 call eito_external() again in such a case if you want the external | |
| 1719 data. | |
| 1720 | |
| 1721 Extbyte *eiextdata (Eistring *eistr); | |
| 1722 Return a pointer to the external data stored in the Eistring as | |
| 1723 a result of a prior call to eito_external(). | |
| 1724 | |
| 1725 Bytecount eiextlen (Eistring *eistr); | |
| 1726 Return the length in bytes of the external data stored in the | |
| 1727 Eistring as a result of a prior call to eito_external(). | |
| 1728 | |
| 1729 | |
| 1730 ********************************************** | |
| 1731 * Searching in the Eistring for a character * | |
| 1732 ********************************************** | |
| 1733 | |
| 867 | 1734 Bytecount eichr (Eistring *eistr, Ichar chr); |
| 1735 Charcount eichr_char (Eistring *eistr, Ichar chr); | |
| 1736 Bytecount eichr_off (Eistring *eistr, Ichar chr, Bytecount off, | |
| 771 | 1737 Charcount charoff); |
| 867 | 1738 Charcount eichr_off_char (Eistring *eistr, Ichar chr, Bytecount off, |
| 771 | 1739 Charcount charoff); |
| 867 | 1740 Bytecount eirchr (Eistring *eistr, Ichar chr); |
| 1741 Charcount eirchr_char (Eistring *eistr, Ichar chr); | |
| 1742 Bytecount eirchr_off (Eistring *eistr, Ichar chr, Bytecount off, | |
| 771 | 1743 Charcount charoff); |
| 867 | 1744 Charcount eirchr_off_char (Eistring *eistr, Ichar chr, Bytecount off, |
| 771 | 1745 Charcount charoff); |
| 1746 | |
| 1747 | |
| 1748 ********************************************** | |
| 1749 * Searching in the Eistring for a string * | |
| 1750 ********************************************** | |
| 1751 | |
| 1752 Bytecount eistr_ei (Eistring *eistr, Eistring *eistr2); | |
| 1753 Charcount eistr_ei_char (Eistring *eistr, Eistring *eistr2); | |
| 1754 Bytecount eistr_ei_off (Eistring *eistr, Eistring *eistr2, Bytecount off, | |
| 1755 Charcount charoff); | |
| 1756 Charcount eistr_ei_off_char (Eistring *eistr, Eistring *eistr2, | |
| 1757 Bytecount off, Charcount charoff); | |
| 1758 Bytecount eirstr_ei (Eistring *eistr, Eistring *eistr2); | |
| 1759 Charcount eirstr_ei_char (Eistring *eistr, Eistring *eistr2); | |
| 1760 Bytecount eirstr_ei_off (Eistring *eistr, Eistring *eistr2, Bytecount off, | |
| 1761 Charcount charoff); | |
| 1762 Charcount eirstr_ei_off_char (Eistring *eistr, Eistring *eistr2, | |
| 1763 Bytecount off, Charcount charoff); | |
| 1764 | |
| 2421 | 1765 Bytecount eistr_ascii (Eistring *eistr, Ascbyte *ascstr); |
| 1766 Charcount eistr_ascii_char (Eistring *eistr, Ascbyte *ascstr); | |
| 1767 Bytecount eistr_ascii_off (Eistring *eistr, Ascbyte *ascstr, Bytecount off, | |
| 771 | 1768 Charcount charoff); |
| 2421 | 1769 Charcount eistr_ascii_off_char (Eistring *eistr, Ascbyte *ascstr, |
| 771 | 1770 Bytecount off, Charcount charoff); |
| 2421 | 1771 Bytecount eirstr_ascii (Eistring *eistr, Ascbyte *ascstr); |
| 1772 Charcount eirstr_ascii_char (Eistring *eistr, Ascbyte *ascstr); | |
| 1773 Bytecount eirstr_ascii_off (Eistring *eistr, Ascbyte *ascstr, | |
| 771 | 1774 Bytecount off, Charcount charoff); |
| 2421 | 1775 Charcount eirstr_ascii_off_char (Eistring *eistr, Ascbyte *ascstr, |
| 771 | 1776 Bytecount off, Charcount charoff); |
| 1777 | |
| 1778 | |
| 1779 ********************************************** | |
| 1780 * Comparison * | |
| 1781 ********************************************** | |
| 1782 | |
| 1783 int eicmp_* (Eistring *eistr, ...); | |
| 1784 int eicmp_off_* (Eistring *eistr, Bytecount off, Charcount charoff, | |
| 1785 Bytecount len, Charcount charlen, ...); | |
| 1786 int eicasecmp_* (Eistring *eistr, ...); | |
| 1787 int eicasecmp_off_* (Eistring *eistr, Bytecount off, Charcount charoff, | |
| 1788 Bytecount len, Charcount charlen, ...); | |
| 1789 int eicasecmp_i18n_* (Eistring *eistr, ...); | |
| 1790 int eicasecmp_i18n_off_* (Eistring *eistr, Bytecount off, Charcount charoff, | |
| 1791 Bytecount len, Charcount charlen, ...); | |
| 1792 | |
| 1793 Compare the Eistring with the other data. Return value same as | |
| 1794 from strcmp. The `*' is either `ei' for another Eistring (in | |
| 1795 which case `...' is an Eistring), or `ascii' for a pure-ASCII string | |
| 1796 (in which case `...' is a pointer to that string). For anything | |
| 1797 more complex, first create an Eistring out of the source. | |
| 1798 Comparison is either simple (`eicmp_...'), ASCII case-folding | |
| 1799 (`eicasecmp_...'), or multilingual case-folding | |
| 1800 (`eicasecmp_i18n_...'). | |
| 1801 | |
| 1802 | |
| 1803 More specifically, the prototypes are: | |
| 1804 | |
| 1805 int eicmp_ei (Eistring *eistr, Eistring *eistr2); | |
| 1806 int eicmp_off_ei (Eistring *eistr, Bytecount off, Charcount charoff, | |
| 1807 Bytecount len, Charcount charlen, Eistring *eistr2); | |
| 1808 int eicasecmp_ei (Eistring *eistr, Eistring *eistr2); | |
| 1809 int eicasecmp_off_ei (Eistring *eistr, Bytecount off, Charcount charoff, | |
| 1810 Bytecount len, Charcount charlen, Eistring *eistr2); | |
| 1811 int eicasecmp_i18n_ei (Eistring *eistr, Eistring *eistr2); | |
| 1812 int eicasecmp_i18n_off_ei (Eistring *eistr, Bytecount off, | |
| 1813 Charcount charoff, Bytecount len, | |
| 1814 Charcount charlen, Eistring *eistr2); | |
| 1815 | |
| 2421 | 1816 int eicmp_ascii (Eistring *eistr, Ascbyte *ascstr); |
| 1817 int eicmp_off_ascii (Eistring *eistr, Bytecount off, Charcount charoff, | |
| 1818 Bytecount len, Charcount charlen, Ascbyte *ascstr); | |
| 1819 int eicasecmp_ascii (Eistring *eistr, Ascbyte *ascstr); | |
| 1820 int eicasecmp_off_ascii (Eistring *eistr, Bytecount off, Charcount charoff, | |
| 771 | 1821 Bytecount len, Charcount charlen, |
| 2421 | 1822 Ascbyte *ascstr); |
| 1823 int eicasecmp_i18n_ascii (Eistring *eistr, Ascbyte *ascstr); | |
| 1824 int eicasecmp_i18n_off_ascii (Eistring *eistr, Bytecount off, Charcount charoff, | |
| 771 | 1825 Bytecount len, Charcount charlen, |
| 2421 | 1826 Ascbyte *ascstr); |
| 771 | 1827 |
| 1828 | |
| 1829 ********************************************** | |
| 1830 * Case-changing the Eistring * | |
| 1831 ********************************************** | |
| 1832 | |
| 1833 void eilwr (Eistring *eistr); | |
| 1834 Convert all characters in the Eistring to lowercase. | |
| 1835 void eiupr (Eistring *eistr); | |
| 1836 Convert all characters in the Eistring to uppercase. | |
| 1837 */ | |
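The position functions documented above (eiincpos(), eidecpos()) step a byte offset by whole characters. The following is a minimal standalone sketch of that idea, not the XEmacs implementation: it assumes UTF-8 text purely for illustration (this part of the header does not say what the default internal format is), and `sketch_incpos` and `sketch_decpos` are hypothetical names.

```c
#include <assert.h>

/* Assumption: P holds valid, NUL-terminated UTF-8.  A byte is a
   continuation byte when its top two bits are 10. */
#define SKETCH_IS_CONTINUATION(b) (((b) & 0xC0) == 0x80)

/* Return the byte offset of the character following the one that
   starts at BYTEPOS. */
static long
sketch_incpos (const unsigned char *p, long bytepos)
{
  bytepos++;                                   /* skip the lead byte */
  while (SKETCH_IS_CONTINUATION (p[bytepos]))  /* skip trailing bytes */
    bytepos++;
  return bytepos;
}

/* Return the byte offset of the character preceding BYTEPOS. */
static long
sketch_decpos (const unsigned char *p, long bytepos)
{
  assert (bytepos > 0);
  bytepos--;
  while (bytepos > 0 && SKETCH_IS_CONTINUATION (p[bytepos]))
    bytepos--;
  return bytepos;
}
```

Both helpers take and return byte offsets, which mirrors how the documented API keeps byte positions as the primary currency and converts to character counts only on request.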
| 1838 | |
| 1839 | |
| 1840 /* Principles for writing Eistring functions: | |
| 1841 | |
| 1842 (1) Unfortunately, we have to write most of the Eistring functions | |
| 851 | 1843 as macros, because of the use of ALLOCA(). The principle used |
| 771 | 1844 below to assure no conflict in local variables is to prefix all |
| 1845 local variables with "ei" plus a number, which should be unique | |
| 1846 among macros. In practice, when finding a new number, find the | |
| 1847 highest so far used, and add 1. | |
| 1848 | |
| 1849 (2) We also suffix the Eistring fields with an _ to avoid problems | |
| 1850 with macro parameters of the same name. (And as the standard | |
| 1851 signal not to access these fields directly.) | |
| 1852 | |
| 1853 (3) We maintain both the length in bytes and chars of the data in | |
| 1854 the Eistring at all times, for convenient retrieval by outside | |
| 1855 functions. That means when writing functions that manipulate | |
| 1856 Eistrings, you too need to keep both lengths up to date for all | |
| 1857 data that you work with. | |
| 1858 | |
| 1859 (4) When writing a new type of operation (e.g. substitution), you | |
| 1860 will often find yourself working with outside data, and thus | |
| 1861 have a series of related API's, for different forms that the | |
| 1862 outside data is in. Generally, you will want to choose a | |
| 1863 subset of the forms supported by eicpy_*, which has to be | |
| 1864 totally general because that's the fundamental way to get data | |
| 1865 into an Eistring, and once the data is in the string, it | |
| 1866 would be possible to create a whole series of Ei operations that work on | |
| 1867 nothing but Eistrings. Although theoretically nice, in | |
| 1868 practice it's a hassle, so we suggest that you provide | |
| 1869 convenience functions. In particular, there are two paths you | |
| 1870 can take. One is minimalist -- it only allows other Eistrings | |
| 867 | 1871 and ASCII data, and Ichars if the particular operation makes |
| 771 | 1872 sense with a character. The other provides interfaces for the |
| 1873 most commonly-used forms -- Eistring, ASCII data, Lisp string, | |
| 1874 raw internal-format string with length, raw internal-format | |
| 867 | 1875 string without, and possibly Ichar. (In the function names, |
| 771 | 1876 these are designated `ei', `ascii', `lstr', `raw', `rawz', and |
| 1877 `ch', respectively.) | |
| 1878 | |
| 1879 (5) When coding a new type of operation, such as was discussed in | |
| 1880 the previous point, the correct approach is to declare a worker | |
| 1881 function that does the work of everything, and is called by the | |
| 1882 other "container" macros that handle the different outside data | |
| 1883 forms. The data coming into the worker function, which | |
| 1884 typically ends in `_1', is in the form of three parameters: | |
| 1885 DATA, LEN, CHARLEN. (See point [3] about having two lengths and | |
| 1886 keeping them in sync.) | |
| 1887 | |
| 1888 (6) Handling argument evaluation in macros: We take great care | |
| 1889 never to evaluate any argument more than once in any macro, | |
| 1890 except the initial Eistring parameter. This can and will be | |
| 1891 evaluated multiple times, but it should pretty much always just | |
| 1892 be a simple variable. This means, for example, that if an | |
| 1893 Eistring is the second (not first) argument of a macro, it | |
| 1894 doesn't fall under the "initial Eistring" exemption, so it | |
| 1895 needs protection against multi-evaluation. (Take the address of | |
| 1896 the Eistring structure, store it in a temporary variable, and use | |
| 1897 the temporary variable for all access to the Eistring.) | |
| 1898 Essentially, we want it to appear as if these Eistring macros | |
| 1899 are functions -- we would like to declare them as functions but | |
| 851 | 1900 they use ALLOCA(), so we can't (and we can't make them inline |
| 1901 functions either -- ALLOCA() is explicitly disallowed in inline | |
| 771 | 1902 functions). |
| 1903 | |
| 1904 (7) Note that our rules regarding multiple evaluation are *more* | |
| 1905 strict than the rules listed above under the heading "working | |
| 1906 with raw internal-format data". | |
| 1907 */ | |
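Principles (1), (2) and (6) above can be illustrated with a small standalone macro. This is a sketch under stated assumptions, not code from text.h: `Sketchbuf`, `SKETCH_APPEND`, the `sk1` prefix and `sketch_next_word` are all hypothetical, and a fixed-size array stands in for the ALLOCA()-managed storage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity buffer; not the real Eistring layout.
   Field names end in `_' to signal "do not access directly" and to
   avoid clashing with macro parameter names. */
typedef struct
{
  char data_[64];
  size_t len_;
} Sketchbuf;

/* Append LEN bytes from SRC and keep the buffer NUL-terminated.
   SRC and LEN are evaluated exactly once: each is copied into a local
   whose name carries a macro-unique prefix ("sk1").  BUF is the
   "initial" parameter and is evaluated several times, so callers
   should pass a plain variable. */
#define SKETCH_APPEND(buf, src, len)                        \
do {                                                        \
  const char *sk1src = (src);                               \
  size_t sk1len = (len);                                    \
  assert ((buf)->len_ + sk1len < sizeof ((buf)->data_));    \
  memcpy ((buf)->data_ + (buf)->len_, sk1src, sk1len);      \
  (buf)->len_ += sk1len;                                    \
  (buf)->data_[(buf)->len_] = '\0';                         \
} while (0)

/* Helper with a visible side effect, used to count evaluations. */
static int sketch_calls = 0;

static const char *
sketch_next_word (void)
{
  sketch_calls++;
  return "abc";
}
```

Passing `sketch_next_word ()` as SRC bumps `sketch_calls` once per macro use, which is the behaviour a caller would expect from a real function.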
| 1908 | |
| 1909 | |
| 1910 /* ----- Declaration ----- */ | |
| 1911 | |
| 1912 typedef struct | |
| 1913 { | |
| 1914 /* Data for the Eistring, stored in the default internal format. | |
| 1915 Always includes terminating null. */ | |
| 867 | 1916 Ibyte *data_; |
| 771 | 1917 /* Total number of bytes allocated in DATA (including null). */ |
| 1918 Bytecount max_size_allocated_; | |
| 1919 Bytecount bytelen_; | |
| 1920 Charcount charlen_; | |
| 1921 int mallocp_; | |
| 1922 | |
| 1923 Extbyte *extdata_; | |
| 1924 Bytecount extlen_; | |
| 1925 } Eistring; | |
| 1926 | |
| 1927 extern Eistring the_eistring_zero_init, the_eistring_malloc_zero_init; | |
| 1928 | |
| 1929 #define DECLARE_EISTRING(name) \ | |
| 1930 Eistring __ ## name ## __storage__ = the_eistring_zero_init; \ | |
| 1931 Eistring *name = & __ ## name ## __storage__ | |
| 1932 #define DECLARE_EISTRING_MALLOC(name) \ | |
| 1933 Eistring __ ## name ## __storage__ = the_eistring_malloc_zero_init; \ | |
| 1934 Eistring *name = & __ ## name ## __storage__ | |
| 1935 | |
| 1936 #define eiinit(ei) \ | |
| 1937 do { \ | |
| 793 | 1938 *(ei) = the_eistring_zero_init; \ |
| 771 | 1939 } while (0) |
| 1940 | |
| 1941 #define eiinit_malloc(ei) \ | |
| 1942 do { \ | |
| 793 | 1943 *(ei) = the_eistring_malloc_zero_init; \ |
| 771 | 1944 } while (0) |
| 1945 | |
| 1946 | |
| 1947 /* ----- Utility ----- */ | |
| 1948 | |
| 1949 /* Make sure both LEN and CHARLEN are specified, in case one is given | |
| 1950 as -1. PTR evaluated at most once, others multiply. */ | |
| 1951 #define eifixup_bytechar(ptr, len, charlen) \ | |
| 1952 do { \ | |
| 1953 if ((len) == -1) \ | |
| 1954 (len) = charcount_to_bytecount (ptr, charlen); \ | |
| 1955 else if ((charlen) == -1) \ | |
| 1956 (charlen) = bytecount_to_charcount (ptr, len); \ | |
| 1957 } while (0) | |
| 1958 | |
| 1959 /* Make sure LEN is specified, in case it is given as -1. PTR | |
| 1960 evaluated at most once, others multiply. */ | |
| 1961 #define eifixup_byte(ptr, len, charlen) \ | |
| 1962 do { \ | |
| 1963 if ((len) == -1) \ | |
| 1964 (len) = charcount_to_bytecount (ptr, charlen); \ | |
| 1965 } while (0) | |
| 1966 | |
| 1967 /* Make sure CHARLEN is specified, in case it is given as -1. PTR | |
| 1968 evaluated at most once, others multiply. */ | |
| 1969 #define eifixup_char(ptr, len, charlen) \ | |
| 1970 do { \ | |
| 1971 if ((charlen) == -1) \ | |
| 1972 (charlen) = bytecount_to_charcount (ptr, len); \ | |
| 1973 } while (0) | |
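The fix-up macros above let a caller supply either a byte length or a character length and pass -1 for the other. The sketch below shows the same sentinel convention in isolation. It assumes UTF-8 text for the two conversion helpers, which is an illustration only; `sketch_chars_to_bytes`, `sketch_bytes_to_chars` and `SKETCH_FIXUP_BYTECHAR` are hypothetical stand-ins for charcount_to_bytecount(), bytecount_to_charcount() and eifixup_bytechar().

```c
#include <assert.h>

/* Bytes occupied by the first N characters of valid UTF-8 text at P.
   Assumes P contains at least N characters. */
static long
sketch_chars_to_bytes (const unsigned char *p, long n)
{
  long bytes = 0;
  while (n-- > 0)
    {
      bytes++;                              /* lead byte */
      while ((p[bytes] & 0xC0) == 0x80)     /* continuation bytes */
        bytes++;
    }
  return bytes;
}

/* Characters contained in the first LEN bytes of UTF-8 text at P. */
static long
sketch_bytes_to_chars (const unsigned char *p, long len)
{
  long chars = 0, i;
  for (i = 0; i < len; i++)
    if ((p[i] & 0xC0) != 0x80)              /* count lead bytes only */
      chars++;
  return chars;
}

/* Fill in whichever of LEN / CHARLEN was given as -1.  As in the
   original, LEN and CHARLEN must be lvalues and are evaluated more
   than once; PTR is evaluated at most once. */
#define SKETCH_FIXUP_BYTECHAR(ptr, len, charlen)        \
do {                                                    \
  if ((len) == -1)                                      \
    (len) = sketch_chars_to_bytes (ptr, charlen);       \
  else if ((charlen) == -1)                             \
    (charlen) = sketch_bytes_to_chars (ptr, len);       \
} while (0)
```

After the fix-up both values are known, so a worker function that takes DATA, LEN and CHARLEN can rely on them being in sync.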
| 1974 | |
| 1975 | |
| 1976 | |
| 1977 /* Make sure we can hold NEWBYTELEN bytes (which is NEWCHARLEN chars) | |
| 1978 plus a zero terminator. Preserve existing data as much as possible, | |
| 1979 including existing zero terminator. Put a new zero terminator where it | |
| 1980 should go if NEWZ is non-zero. All args but EI are evaluated only once. | |
| 1981 | |
| 5026 | 1982 #define EI_ALLOC(ei, newbytelen, newcharlen, newz) \ |
| 1983 do { \ | |
| 1984 int ei1oldeibytelen = (ei)->bytelen_; \ | |
| 1985 \ | |
| 1986 (ei)->charlen_ = (newcharlen); \ | |
| 1987 (ei)->bytelen_ = (newbytelen); \ | |
| 1988 \ | |
| 1989 if (ei1oldeibytelen != (ei)->bytelen_) \ | |
| 1990 { \ | |
| 1991 int ei1newsize = (ei)->max_size_allocated_; \ | |
| 1992 while (ei1newsize < (ei)->bytelen_ + 1) \ | |
| 1993 { \ | |
| 1994 ei1newsize = (int) (ei1newsize * 1.5); \ | |
| 1995 if (ei1newsize < 32) \ | |
| 1996 ei1newsize = 32; \ | |
| 1997 } \ | |
| 1998 if (ei1newsize != (ei)->max_size_allocated_) \ | |
| 1999 { \ | |
| 2000 if ((ei)->mallocp_) \ | |
| 771 | 2001 /* xrealloc always preserves existing data as much as possible */ \ |
| 5026 | 2002 (ei)->data_ = (Ibyte *) xrealloc ((ei)->data_, ei1newsize); \ |
| 2003 else \ | |
| 2004 { \ | |
| 2005 /* We don't have realloc, so ALLOCA() more space and copy the \ | |
| 2006 data into it. */ \ | |
| 2007 Ibyte *ei1oldeidata = (ei)->data_; \ | |
| 2008 (ei)->data_ = alloca_ibytes (ei1newsize); \ | |
| 2009 if (ei1oldeidata) \ | |
| 2010 memcpy ((ei)->data_, ei1oldeidata, ei1oldeibytelen + 1); \ | |
| 2011 } \ | |
| 2012 (ei)->max_size_allocated_ = ei1newsize; \ | |
| 2013 } \ | |
| 2014 if (newz) \ | |
| 2015 (ei)->data_[(ei)->bytelen_] = '\0'; \ | |
| 2016 } \ | |
| 771 | 2017 } while (0) |
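The sizing loop inside EI_ALLOC can be read on its own as a growth policy: multiply the current capacity by 1.5 until it covers the requested size, never settling below 32 bytes once growth is needed. Here is that loop extracted into a plain function as a sketch; `sketch_grow_size` is a hypothetical name, and NEEDED is assumed to already include the byte for the zero terminator.

```c
#include <assert.h>

/* Return a capacity >= NEEDED, starting from CURRENT and growing by a
   factor of 1.5 per step, with a floor of 32 whenever any growth
   happens.  If CURRENT is already large enough it is returned
   unchanged, so the caller can skip reallocation. */
static int
sketch_grow_size (int current, int needed)
{
  int newsize = current;

  assert (current >= 0 && needed >= 0);
  while (newsize < needed)
    {
      newsize = (int) (newsize * 1.5);
      if (newsize < 32)
        newsize = 32;
    }
  return newsize;
}
```

Starting from an empty buffer the first allocation is 32 bytes, and repeated growth from there goes 48, 72, 108 and so on. The floor also handles a current size of 0, where multiplying alone would never make progress.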
| 2018 | |
| 2019 #define EI_ALLOC_AND_COPY(ei, data, bytelen, charlen) \ | |
| 2020 do { \ | |
| 2021 EI_ALLOC (ei, bytelen, charlen, 1); \ | |
| 2022 memcpy ((ei)->data_, data, (ei)->bytelen_); \ | |
| 2023 } while (0) | |
| 2024 | |
| 2025 /* ----- Initialization ----- */ | |
| 2026 | |
| 2027 #define eicpy_ei(ei, eicpy) \ | |
| 2028 do { \ | |
| 2029 const Eistring *ei2 = (eicpy); \ | |
| 2030 EI_ALLOC_AND_COPY (ei, ei2->data_, ei2->bytelen_, ei2->charlen_); \ | |
| 2031 } while (0) | |
| 2032 | |
| 2033 #define eicpy_lstr(ei, lisp_string) \ | |
| 2034 do { \ | |
| 2035 Lisp_Object ei3 = (lisp_string); \ | |
| 2036 EI_ALLOC_AND_COPY (ei, XSTRING_DATA (ei3), XSTRING_LENGTH (ei3), \ | |
| 1333 | 2037 string_char_length (ei3)); \ |
| 771 | 2038 } while (0) |
| 2039 | |
| 2040 #define eicpy_lstr_off(ei, lisp_string, off, charoff, len, charlen) \ | |
| 2041 do { \ | |
| 2042 Lisp_Object ei23lstr = (lisp_string); \ | |
| 2043 int ei23off = (off); \ | |
| 2044 int ei23charoff = (charoff); \ | |
| 2045 int ei23len = (len); \ | |
| 2046 int ei23charlen = (charlen); \ | |
| 867 | 2047 const Ibyte *ei23data = XSTRING_DATA (ei23lstr); \ |
| 771 | 2048 \ |
| 2049 int ei23oldbytelen = (ei)->bytelen_; \ | |
| 2050 \ | |
| 2051 eifixup_byte (ei23data, ei23off, ei23charoff); \ | |
| 2052 eifixup_bytechar (ei23data + ei23off, ei23len, ei23charlen); \ | |
| 2053 \ | |
| 2054 EI_ALLOC_AND_COPY (ei, ei23data + ei23off, ei23len, ei23charlen); \ | |
| 2055 } while (0) | |
| 2056 | |
| 826 | 2057 #define eicpy_raw_fmt(ei, ptr, len, fmt, object) \ |
| 771 | 2058 do { \ |
| 1333 | 2059 const Ibyte *ei12ptr = (ptr); \ |
| 771 | 2060 Internal_Format ei12fmt = (fmt); \ |
| 2061 int ei12len = (len); \ | |
| 2062 assert (ei12fmt == FORMAT_DEFAULT); \ | |
| 2063 EI_ALLOC_AND_COPY (ei, ei12ptr, ei12len, \ | |
| 2064 bytecount_to_charcount (ei12ptr, ei12len)); \ | |
| 2065 } while (0) | |
| 2066 | |
| 826 | 2067 #define eicpy_raw(ei, ptr, len) \ |
| 2068 eicpy_raw_fmt (ei, ptr, len, FORMAT_DEFAULT, Qnil) | |
| 2069 | |
| 2070 #define eicpy_rawz_fmt(ei, ptr, fmt, object) \ | |
| 2071 do { \ | |
| 867 | 2072 const Ibyte *ei12p1ptr = (ptr); \ |
| 826 | 2073 Internal_Format ei12p1fmt = (fmt); \ |
| 2074 assert (ei12p1fmt == FORMAT_DEFAULT); \ | |
| 2075 eicpy_raw_fmt (ei, ei12p1ptr, qxestrlen (ei12p1ptr), fmt, object); \ | |
| 771 | 2076 } while (0) |
| 2077 | |
| 826 | 2078 #define eicpy_rawz(ei, ptr) eicpy_rawz_fmt (ei, ptr, FORMAT_DEFAULT, Qnil) |
| 771 | 2079 |
| 1333 | 2080 #define eicpy_ch(ei, ch) \ |
| 2081 do { \ | |
| 867 | 2082 Ibyte ei12p2[MAX_ICHAR_LEN]; \ |
| 2083 Bytecount ei12p2len = set_itext_ichar (ei12p2, ch); \ | |
| 1333 | 2084 EI_ALLOC_AND_COPY (ei, ei12p2, ei12p2len, 1); \ |
| 771 | 2085 } while (0) |
| 2086 | |
| 2421 | 2087 #define eicpy_ascii(ei, ascstr) \ |
| 771 | 2088 do { \ |
| 2421 | 2089 const Ascbyte *ei4 = (ascstr); \ |
| 771 | 2090 \ |
| 2367 | 2091 ASSERT_ASCTEXT_ASCII (ei4); \ |
| 771 | 2092 eicpy_ext (ei, ei4, Qbinary); \ |
| 2093 } while (0) | |
| 2094 | |
| 2421 | 2095 #define eicpy_ascii_len(ei, ascstr, c_len) \ |
| 771 | 2096 do { \ |
| 2421 | 2097 const Ascbyte *ei6 = (ascstr); \ |
| 771 | 2098 int ei6len = (c_len); \ |
| 2099 \ | |
| 2367 | 2100 ASSERT_ASCTEXT_ASCII_LEN (ei6, ei6len); \ |
| 771 | 2101 eicpy_ext_len (ei, ei6, ei6len, Qbinary); \ |
| 2102 } while (0) | |
| 2103 | |
| 4981 | 2104 #define eicpy_ext_len(ei, extdata, extlen, codesys) \ |
| 2105 do { \ | |
| 2106 const Extbyte *ei7 = (extdata); \ | |
| 2107 int ei7len = (extlen); \ | |
| 2108 \ | |
| 2109 TO_INTERNAL_FORMAT (DATA, (ei7, ei7len), \ | |
| 2110 ALLOCA, ((ei)->data_, (ei)->bytelen_), \ | |
| 2111 codesys); \ | |
| 2112 (ei)->max_size_allocated_ = (ei)->bytelen_ + 1; \ | |
| 771 | 2113 (ei)->charlen_ = bytecount_to_charcount ((ei)->data_, (ei)->bytelen_); \ |
| 2114 } while (0) | |
| 2115 | |
| 1318 | 2116 #define eicpy_ext(ei, extdata, codesys) \ |
| 2117 do { \ | |
| 2118 const Extbyte *ei8 = (extdata); \ | |
| 2119 \ | |
| 2120 eicpy_ext_len (ei, ei8, dfc_external_data_len (ei8, codesys), \ | |
| 2121 codesys); \ | |
| 771 | 2122 } while (0) |
| 2123 | |
| 2124 #define eicpy_lbuf(eistr, lisp_buf, off, charoff, len, charlen) \ | |
| 2125 NOT YET IMPLEMENTED | |
| 2126 | |
| 2127 #define eicpy_lstream(eistr, lstream) \ | |
| 2128 NOT YET IMPLEMENTED | |
| 2129 | |
| 867 | 2130 #define eireset(eistr) eicpy_rawz (eistr, (Ibyte *) "") |
| 771 | 2131 |
| 2132 /* ----- Getting the data out of the Eistring ----- */ | |
| 2133 | |
| 2134 #define eidata(ei) ((ei)->data_) | |
| 2135 | |
| 2136 #define eimake_string(ei) make_string (eidata (ei), eilen (ei)) | |
| 2137 | |
| 2138 #define eimake_string_off(eistr, off, charoff, len, charlen) \ | |
| 2139 do { \ | |
| 2140 Lisp_Object ei24lstr; \ | |
| 2141 int ei24off = (off); \ | |
| 2142 int ei24charoff = (charoff); \ | |
| 2143 int ei24len = (len); \ | |
| 2144 int ei24charlen = (charlen); \ | |
| 2145 \ | |
| 2146 eifixup_byte ((eistr)->data_, ei24off, ei24charoff); \ | |
| 2147 eifixup_byte ((eistr)->data_ + ei24off, ei24len, ei24charlen); \ | |
| 2148 \ | |
| 2149 return make_string ((eistr)->data_ + ei24off, ei24len); \ | |
| 2150 } while (0) | |
| 2151 | |
| 2152 #define eicpyout_alloca(eistr, ptrout, lenout) \ | |
| 826 | 2153 eicpyout_alloca_fmt (eistr, ptrout, lenout, FORMAT_DEFAULT, Qnil) |
| 771 | 2154 #define eicpyout_malloc(eistr, lenout) \ |
| 826 | 2155 eicpyout_malloc_fmt (eistr, lenout, FORMAT_DEFAULT, Qnil) |
| 867 | 2156 Ibyte *eicpyout_malloc_fmt (Eistring *eistr, Bytecount *len_out, |
| 826 | 2157 Internal_Format fmt, Lisp_Object object); |
| 2158 #define eicpyout_alloca_fmt(eistr, ptrout, lenout, fmt, object) \ | |
| 771 | 2159 do { \ |
| 2160 Internal_Format ei23fmt = (fmt); \ | |
| 867 | 2161 Ibyte **ei23ptrout = &(ptrout); \ |
| 771 | 2162 Bytecount *ei23lenout = &(lenout); \ |
| 2163 \ | |
| 2164 assert (ei23fmt == FORMAT_DEFAULT); \ | |
| 2165 \ | |
| 2166 *ei23lenout = (eistr)->bytelen_; \ | |
| 5026 | 2167 *ei23ptrout = alloca_ibytes ((eistr)->bytelen_ + 1); \ |
| 771 | 2168 memcpy (*ei23ptrout, (eistr)->data_, (eistr)->bytelen_ + 1); \ |
| 2169 } while (0) | |
| 2170 | |
| 2171 /* ----- Moving to the heap ----- */ | |
| 2172 | |
| 2173 #define eifree(ei) \ | |
| 2174 do { \ | |
| 2175 if ((ei)->mallocp_) \ | |
| 2176 { \ | |
| 2177 if ((ei)->data_) \ | |
| 5169 | 2178 { \ |
| 2179 xfree ((ei)->data_); \ | |
| 2180 (ei)->data_ = 0; \ | |
| 2181 } \ | |
| 771 | 2182 if ((ei)->extdata_) \ |
| 5169 | 2183 { \ |
| 2184 xfree ((ei)->extdata_); \ | |
| 2185 (ei)->extdata_ = 0; \ | |
| 2186 } \ | |
| 771 | 2187 eiinit_malloc (ei); \ |
| 2188 } \ | |
| 2189 else \ | |
| 2190 eiinit (ei); \ | |
| 2191 } while (0) | |
| 2192 | |
| 2193 int eifind_large_enough_buffer (int oldbufsize, int needed_size); | |
| 2194 void eito_malloc_1 (Eistring *ei); | |
| 2195 | |
| 2196 #define eito_malloc(ei) eito_malloc_1 (ei) | |
| 2197 | |
| 2198 #define eito_alloca(ei) \ | |
| 2199 do { \ | |
| 2200 if (!(ei)->mallocp_) \ | |
| 2201 break; \ | |
| 2202 (ei)->mallocp_ = 0; \ | |
| 2203 if ((ei)->data_) \ | |
| 2204 { \ | |
| 867 | 2205 Ibyte *ei13newdata; \ |
| 771 | 2206 \ |
| 2207 (ei)->max_size_allocated_ = \ | |
| 2208 eifind_large_enough_buffer (0, (ei)->bytelen_ + 1); \ | |
| 2367 | 2209 ei13newdata = alloca_ibytes ((ei)->max_size_allocated_); \ |
| 771 | 2210 memcpy (ei13newdata, (ei)->data_, (ei)->bytelen_ + 1); \ |
| 5026 | 2211 xfree ((ei)->data_); \ |
| 771 | 2212 (ei)->data_ = ei13newdata; \ |
| 2213 } \ | |
| 2214 \ | |
| 2215 if ((ei)->extdata_) \ | |
| 2216 { \ | |
| 2367 | 2217 Extbyte *ei13newdata = alloca_extbytes ((ei)->extlen_ + 2); \ |
| 771 | 2218 \ |
| 2219 memcpy (ei13newdata, (ei)->extdata_, (ei)->extlen_); \ | |
| 2220 /* Double null-terminate in case of Unicode data */ \ | |
| 2221 ei13newdata[(ei)->extlen_] = '\0'; \ | |
| 2222 ei13newdata[(ei)->extlen_ + 1] = '\0'; \ | |
| 5026 | 2223 xfree ((ei)->extdata_); \ |
| 771 | 2224 (ei)->extdata_ = ei13newdata; \ |
| 2225 } \ | |
| 2226 } while (0) | |
| 2227 | |
| 2228 | |
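The external-data branch of eito_alloca copies `extlen_` bytes and then writes two zero bytes, so the buffer is terminated whether it is later read as 8-bit or 16-bit text. The sketch below shows that pattern in isolation. It is an illustration only: `copy_double_nul` is a hypothetical name that does not exist in text.h, and it uses malloc() instead of alloca() so that the result can be returned from a function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of text.h: copy LEN bytes of external
   data into a fresh heap buffer and append two zero bytes.  The two
   terminators are allocated but are not counted in LEN.  Returns NULL
   if allocation fails; the caller frees the result. */
char *
copy_double_nul (const char *src, size_t len)
{
  char *dst;

  assert (src != NULL || len == 0);
  dst = malloc (len + 2);
  if (dst == NULL)
    return NULL;
  if (len > 0)
    memcpy (dst, src, len);
  /* Double null-terminate in case the data is 16-bit Unicode. */
  dst[len] = '\0';
  dst[len + 1] = '\0';
  return dst;
}
```

A caller that passes four bytes of UTF-16 data gets back a six-byte allocation whose logical length is still four.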
| 2229 /* ----- Retrieving the length ----- */ | |
| 2230 | |
| 2231 #define eilen(ei) ((ei)->bytelen_) | |
| 2232 #define eicharlen(ei) ((ei)->charlen_) | |
| 2233 | |
| 2234 | |
| 2235 /* ----- Working with positions ----- */ | |
| 2236 | |
| 2237 #define eicharpos_to_bytepos(ei, charpos) \ | |
| 2238 charcount_to_bytecount ((ei)->data_, charpos) | |
| 2239 #define eibytepos_to_charpos(ei, bytepos) \ | |
| 2240 bytecount_to_charcount ((ei)->data_, bytepos) | |
| 2241 | |
| 2242 DECLARE_INLINE_HEADER (Bytecount eiincpos_1 (Eistring *eistr, | |
| 2243 Bytecount bytepos, | |
| 2244 Charcount n)) | |
| 2245 { | |
| 867 | 2246 Ibyte *pos = eistr->data_ + bytepos; |
| 814 | 2247 Charcount i; |
| 771 | 2248 |
| 800 | 2249 text_checking_assert (bytepos >= 0 && bytepos <= eistr->bytelen_); |
| 2250 text_checking_assert (n >= 0 && n <= eistr->charlen_); | |
| 771 | 2251 /* We could check N more correctly now, but that would require a |
| 2252 call to bytecount_to_charcount(), which would be needlessly | |
| 2253 expensive (it would convert O(N) algorithms into O(N^2) algorithms | |
| 800 | 2254 with ERROR_CHECK_TEXT, which would be bad). If N is bad, we are |
| 867 | 2255 guaranteed to catch it either inside INC_IBYTEPTR() or in the check |
| 771 | 2256 below. */ |
| 2257 for (i = 0; i < n; i++) | |
| 867 | 2258 INC_IBYTEPTR (pos); |
| 800 | 2259 text_checking_assert (pos - eistr->data_ <= eistr->bytelen_); |
| 771 | 2260 return pos - eistr->data_; |
| 2261 } | |
| 2262 | |
| 2263 #define eiincpos(ei, bytepos) eiincpos_1 (ei, bytepos, 1) | |
| 2264 #define eiincpos_n(ei, bytepos, n) eiincpos_1 (ei, bytepos, n) | |
| 2265 | |
| 2266 DECLARE_INLINE_HEADER (Bytecount eidecpos_1 (Eistring *eistr, | |
| 2267 Bytecount bytepos, | |
| 2268 Charcount n)) | |
| 2269 { | |
| 867 | 2270 Ibyte *pos = eistr->data_ + bytepos; |
| 771 | 2271 int i; |
| 2272 | |
| 800 | 2273 text_checking_assert (bytepos >= 0 && bytepos <= eistr->bytelen_); |
| 2274 text_checking_assert (n >= 0 && n <= eistr->charlen_); | |
| 771 | 2275 /* We could check N more correctly now, but ... see above. */ |
| 2276 for (i = 0; i < n; i++) | |
| 867 | 2277 DEC_IBYTEPTR (pos); |
| 800 | 2278 text_checking_assert (pos - eistr->data_ >= 0); |
| 771 | 2279 return pos - eistr->data_; |
| 2280 } | |
| 2281 | |
| 2282 #define eidecpos(ei, bytepos) eidecpos_1 (ei, bytepos, 1) | |
| 2283 #define eidecpos_n(ei, bytepos, n) eidecpos_1 (ei, bytepos, n) | |
| 2284 | |
| 2285 | |
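The position helpers take a byte position and a character count and return a new byte position, walking one character at a time. The sketch below shows the same idea outside XEmacs. It makes one loud assumption: UTF-8 stands in for the internal text format, so a character boundary is any byte that is not of the form 10xxxxxx. The names `inc_pos` and `dec_pos` are hypothetical and are not the text.h functions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: UTF-8 stands in for the internal format.  In UTF-8 every
   continuation byte has the bit pattern 10xxxxxx. */
static int
is_continuation (unsigned char c)
{
  return (c & 0xC0) == 0x80;
}

/* Hypothetical analogue of eiincpos_1: return the byte position reached
   by moving N characters forward from BYTEPOS. */
ptrdiff_t
inc_pos (const unsigned char *data, ptrdiff_t bytelen,
         ptrdiff_t bytepos, ptrdiff_t n)
{
  ptrdiff_t i;

  assert (bytepos >= 0 && bytepos <= bytelen);
  for (i = 0; i < n; i++)
    {
      assert (bytepos < bytelen);
      bytepos++;                        /* step over the lead byte */
      while (bytepos < bytelen && is_continuation (data[bytepos]))
        bytepos++;                      /* and its continuation bytes */
    }
  return bytepos;
}

/* Hypothetical analogue of eidecpos_1: move N characters backward. */
ptrdiff_t
dec_pos (const unsigned char *data, ptrdiff_t bytelen,
         ptrdiff_t bytepos, ptrdiff_t n)
{
  ptrdiff_t i;

  assert (bytepos >= 0 && bytepos <= bytelen);
  for (i = 0; i < n; i++)
    {
      assert (bytepos > 0);
      bytepos--;                        /* last byte of previous char */
      while (bytepos > 0 && is_continuation (data[bytepos]))
        bytepos--;                      /* back up to its lead byte */
    }
  return bytepos;
}
```

Both functions cost time proportional to N, which is why the original comment avoids an extra byte-to-character conversion just to validate N.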
| 2286 /* ----- Getting the character at a position ----- */ | |
| 2287 | |
| 2288 #define eigetch(ei, bytepos) \ | |
| 867 | 2289 itext_ichar ((ei)->data_ + (bytepos)) |
| 2290 #define eigetch_char(ei, charpos) itext_ichar_n ((ei)->data_, charpos) | |
| 771 | 2291 |
| 2292 | |
| 2293 /* ----- Setting the character at a position ----- */ | |
| 2294 | |
| 2295 #define eisetch(ei, bytepos, chr) \ | |
| 2296 eisub_ch (ei, bytepos, -1, -1, 1, chr) | |
| 2297 #define eisetch_char(ei, charpos, chr) \ | |
| 2298 eisub_ch (ei, -1, charpos, -1, 1, chr) | |
| 2299 | |
| 2300 | |
| 2301 /* ----- Concatenation ----- */ | |
| 2302 | |
| 2303 #define eicat_1(ei, data, bytelen, charlen) \ | |
| 2304 do { \ | |
| 2305 int ei14oldeibytelen = (ei)->bytelen_; \ | |
| 2306 int ei14bytelen = (bytelen); \ | |
| 2307 EI_ALLOC (ei, (ei)->bytelen_ + ei14bytelen, \ | |
| 2308 (ei)->charlen_ + (charlen), 1); \ | |
| 2309 memcpy ((ei)->data_ + ei14oldeibytelen, (data), \ | |
| 2310 ei14bytelen); \ | |
| 2311 } while (0) | |
| 2312 | |
| 2313 #define eicat_ei(ei, ei2) \ | |
| 2314 do { \ | |
| 2315 const Eistring *ei9 = (ei2); \ | |
| 2316 eicat_1 (ei, ei9->data_, ei9->bytelen_, ei9->charlen_); \ | |
| 2317 } while (0) | |
| 2318 | |
| 2421 | 2319 #define eicat_ascii(ei, ascstr) \ |
| 771 | 2320 do { \ |
| 2421 | 2321 const Ascbyte *ei15 = (ascstr); \ |
| 771 | 2322 int ei15len = strlen (ei15); \ |
| 2323 \ | |
| 2367 | 2324 ASSERT_ASCTEXT_ASCII_LEN (ei15, ei15len); \ |
| 771 | 2325 eicat_1 (ei, ei15, ei15len, \ |
| 867 | 2326 bytecount_to_charcount ((Ibyte *) ei15, ei15len)); \ |
| 771 | 2327 } while (0) |
| 2328 | |
| 2329 #define eicat_raw(ei, data, len) \ | |
| 2330 do { \ | |
| 2331 int ei16len = (len); \ | |
| 867 | 2332 const Ibyte *ei16data = (data); \ |
| 771 | 2333 eicat_1 (ei, ei16data, ei16len, \ |
| 2334 bytecount_to_charcount (ei16data, ei16len)); \ | |
| 2335 } while (0) | |
| 2336 | |
| 2337 #define eicat_rawz(ei, ptr) \ | |
| 2338 do { \ | |
| 867 | 2339 const Ibyte *ei16p5ptr = (ptr); \ |
| 771 | 2340 eicat_raw (ei, ei16p5ptr, qxestrlen (ei16p5ptr)); \ |
| 2341 } while (0) | |
| 2342 | |
| 2343 #define eicat_lstr(ei, lisp_string) \ | |
| 2344 do { \ | |
| 2345 Lisp_Object ei17 = (lisp_string); \ | |
| 2346 eicat_1 (ei, XSTRING_DATA (ei17), XSTRING_LENGTH (ei17), \ | |
| 826 | 2347 string_char_length (ei17)); \ |
| 771 | 2348 } while (0) |
| 2349 | |
| 2350 #define eicat_ch(ei, ch) \ | |
| 2351 do { \ | |
| 1333 | 2352 Ibyte ei22ch[MAX_ICHAR_LEN]; \ |
| 867 | 2353 Bytecount ei22len = set_itext_ichar (ei22ch, ch); \ |
| 771 | 2354 eicat_1 (ei, ei22ch, ei22len, 1); \ |
| 2355 } while (0) | |
| 2356 | |
| 2357 | |
| 2358 /* ----- Replacement ----- */ | |
| 2359 | |
| 2360 /* Replace the section of an Eistring at (OFF, LEN) with the data at | |
| 2361 SRC of length SRCLEN. Each byte value has a corresponding character | |
| 2362 value, and either of a pair can be -1 -- it will be computed from the other. */ | |
| 2363 | |
| 2364 #define eisub_1(ei, off, charoff, len, charlen, src, srclen, srccharlen) \ | |
| 2365 do { \ | |
| 2366 int ei18off = (off); \ | |
| 2367 int ei18charoff = (charoff); \ | |
| 2368 int ei18len = (len); \ | |
| 2369 int ei18charlen = (charlen); \ | |
| 867 | 2370 Ibyte *ei18src = (Ibyte *) (src); \ |
| 771 | 2371 int ei18srclen = (srclen); \ |
| 2372 int ei18srccharlen = (srccharlen); \ | |
| 2373 \ | |
| 2374 int ei18oldeibytelen = (ei)->bytelen_; \ | |
| 2375 \ | |
| 2376 eifixup_bytechar ((ei)->data_, ei18off, ei18charoff); \ | |
| 2377 eifixup_bytechar ((ei)->data_ + ei18off, ei18len, ei18charlen); \ | |
| 2378 eifixup_bytechar (ei18src, ei18srclen, ei18srccharlen); \ | |
| 2379 \ | |
| 2380 EI_ALLOC (ei, (ei)->bytelen_ + ei18srclen - ei18len, \ | |
| 2381 (ei)->charlen_ + ei18srccharlen - ei18charlen, 0); \ | |
| 2382 if (ei18len != ei18srclen) \ | |
| 2383 memmove ((ei)->data_ + ei18off + ei18srclen, \ | |
| 2384 (ei)->data_ + ei18off + ei18len, \ | |
| 2385 /* include zero terminator. */ \ | |
| 2386 ei18oldeibytelen - (ei18off + ei18len) + 1); \ | |
| 2387 if (ei18srclen > 0) \ | |
| 2388 memcpy ((ei)->data_ + ei18off, ei18src, ei18srclen); \ | |
| 2389 } while (0) | |
| 2390 | |
| 2391 #define eisub_ei(ei, off, charoff, len, charlen, ei2) \ | |
| 2392 do { \ | |
| 1333 | 2393 const Eistring *ei19 = (ei2); \ |
| 771 | 2394 eisub_1 (ei, off, charoff, len, charlen, ei19->data_, ei19->bytelen_, \ |
| 2395 ei19->charlen_); \ | |
| 2396 } while (0) | |
| 2397 | |
| 2421 | 2398 #define eisub_ascii(ei, off, charoff, len, charlen, ascstr) \ |
| 771 | 2399 do { \ |
| 2421 | 2400 const Ascbyte *ei20 = (ascstr); \ |
| 771 | 2401 int ei20len = strlen (ei20); \ |
| 2367 | 2402 ASSERT_ASCTEXT_ASCII_LEN (ei20, ei20len); \ |
| 771 | 2403 eisub_1 (ei, off, charoff, len, charlen, ei20, ei20len, -1); \ |
| 2404 } while (0) | |
| 2405 | |
| 2406 #define eisub_ch(ei, off, charoff, len, charlen, ch) \ | |
| 2407 do { \ | |
| 1333 | 2408 Ibyte ei21ch[MAX_ICHAR_LEN]; \ |
| 867 | 2409 Bytecount ei21len = set_itext_ichar (ei21ch, ch); \ |
| 771 | 2410 eisub_1 (ei, off, charoff, len, charlen, ei21ch, ei21len, 1); \ |
| 2411 } while (0) | |
| 2412 | |
| 2413 #define eidel(ei, off, charoff, len, charlen) \ | |
| 2414 eisub_1(ei, off, charoff, len, charlen, NULL, 0, 0) | |
| 2415 | |
| 2416 | |
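The replacement macros all reduce to the same two steps: shift the tail of the string (including its zero terminator) when the old and new lengths differ, then copy the new bytes into the gap. The sketch below shows those steps on a caller-supplied fixed buffer. It is a simplified illustration: `splice_bytes` is a hypothetical name, it tracks byte lengths only (no character counts), and it reports an error instead of growing the buffer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical analogue of eisub_1, not part of text.h.  Replace DELLEN
   bytes at offset OFF in the zero-terminated string BUF (current length
   *LEN, capacity CAP including the terminator) with SRCLEN bytes from
   SRC.  Returns 0 on success, -1 if the result would not fit. */
int
splice_bytes (char *buf, size_t cap, size_t *len,
              size_t off, size_t dellen,
              const char *src, size_t srclen)
{
  size_t oldlen = *len;
  size_t newlen;

  assert (off <= oldlen && dellen <= oldlen - off);
  newlen = oldlen - dellen + srclen;
  if (newlen + 1 > cap)
    return -1;

  /* Shift the tail only when the sizes differ.  The + 1 carries the
     zero terminator along.  memmove is needed because the source and
     destination regions may overlap. */
  if (dellen != srclen)
    memmove (buf + off + srclen, buf + off + dellen,
             oldlen - (off + dellen) + 1);
  if (srclen > 0)
    memcpy (buf + off, src, srclen);
  *len = newlen;
  return 0;
}
```

Deletion is the special case with `src` NULL and `srclen` 0, which matches how eidel is expressed in terms of eisub_1.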
| 2417 /* ----- Converting to an external format ----- */ | |
| 2418 | |
| 1333 | 2419 #define eito_external(ei, codesys) \ |
| 771 | 2420 do { \ |
| 2421 if ((ei)->mallocp_) \ | |
| 2422 { \ | |
| 2423 if ((ei)->extdata_) \ | |
| 2424 { \ | |
| 5026 | 2425 xfree ((ei)->extdata_); \ |
| 771 | 2426 (ei)->extdata_ = 0; \ |
| 2427 } \ | |
| 2428 TO_EXTERNAL_FORMAT (DATA, ((ei)->data_, (ei)->bytelen_), \ | |
| 2429 MALLOC, ((ei)->extdata_, (ei)->extlen_), \ | |
| 1333 | 2430 codesys); \ |
| 771 | 2431 } \ |
| 2432 else \ | |
| 2433 TO_EXTERNAL_FORMAT (DATA, ((ei)->data_, (ei)->bytelen_), \ | |
| 2434 ALLOCA, ((ei)->extdata_, (ei)->extlen_), \ | |
| 1318 | 2435 codesys); \ |
| 771 | 2436 } while (0) |
| 2437 | |
| 2438 #define eiextdata(ei) ((ei)->extdata_) | |
| 2439 #define eiextlen(ei) ((ei)->extlen_) | |
| 2440 | |
| 2441 | |
| 2442 /* ----- Searching in the Eistring for a character ----- */ | |
| 2443 | |
| 2444 #define eichr(eistr, chr) \ | |
| 2445 NOT YET IMPLEMENTED | |
| 2446 #define eichr_char(eistr, chr) \ | |
| 2447 NOT YET IMPLEMENTED | |
| 2448 #define eichr_off(eistr, chr, off, charoff) \ | |
| 2449 NOT YET IMPLEMENTED | |
| 2450 #define eichr_off_char(eistr, chr, off, charoff) \ | |
| 2451 NOT YET IMPLEMENTED | |
| 2452 #define eirchr(eistr, chr) \ | |
| 2453 NOT YET IMPLEMENTED | |
| 2454 #define eirchr_char(eistr, chr) \ | |
| 2455 NOT YET IMPLEMENTED | |
| 2456 #define eirchr_off(eistr, chr, off, charoff) \ | |
| 2457 NOT YET IMPLEMENTED | |
| 2458 #define eirchr_off_char(eistr, chr, off, charoff) \ | |
| 2459 NOT YET IMPLEMENTED | |
| 2460 | |
| 2461 | |
| 2462 /* ----- Searching in the Eistring for a string ----- */ | |
| 2463 | |
| 2464 #define eistr_ei(eistr, eistr2) \ | |
| 2465 NOT YET IMPLEMENTED | |
| 2466 #define eistr_ei_char(eistr, eistr2) \ | |
| 2467 NOT YET IMPLEMENTED | |
| 2468 #define eistr_ei_off(eistr, eistr2, off, charoff) \ | |
| 2469 NOT YET IMPLEMENTED | |
| 2470 #define eistr_ei_off_char(eistr, eistr2, off, charoff) \ | |
| 2471 NOT YET IMPLEMENTED | |
| 2472 #define eirstr_ei(eistr, eistr2) \ | |
| 2473 NOT YET IMPLEMENTED | |
| 2474 #define eirstr_ei_char(eistr, eistr2) \ | |
| 2475 NOT YET IMPLEMENTED | |
| 2476 #define eirstr_ei_off(eistr, eistr2, off, charoff) \ | |
| 2477 NOT YET IMPLEMENTED | |
| 2478 #define eirstr_ei_off_char(eistr, eistr2, off, charoff) \ | |
| 2479 NOT YET IMPLEMENTED | |
| 2480 | |
| 2421 | 2481 #define eistr_ascii(eistr, ascstr) \ |
| 771 | 2482 NOT YET IMPLEMENTED |
| 2421 | 2483 #define eistr_ascii_char(eistr, ascstr) \ |
| 771 | 2484 NOT YET IMPLEMENTED |
| 2421 | 2485 #define eistr_ascii_off(eistr, ascstr, off, charoff) \ |
| 771 | 2486 NOT YET IMPLEMENTED |
| 2421 | 2487 #define eistr_ascii_off_char(eistr, ascstr, off, charoff) \ |
| 771 | 2488 NOT YET IMPLEMENTED |
| 2421 | 2489 #define eirstr_ascii(eistr, ascstr) \ |
| 771 | 2490 NOT YET IMPLEMENTED |
| 2421 | 2491 #define eirstr_ascii_char(eistr, ascstr) \ |
| 771 | 2492 NOT YET IMPLEMENTED |
| 2421 | 2493 #define eirstr_ascii_off(eistr, ascstr, off, charoff) \ |
| 771 | 2494 NOT YET IMPLEMENTED |
| 2421 | 2495 #define eirstr_ascii_off_char(eistr, ascstr, off, charoff) \ |
| 771 | 2496 NOT YET IMPLEMENTED |
| 2497 | |
| 2498 | |
| 2499 /* ----- Comparison ----- */ | |
| 2500 | |
| 2501 int eicmp_1 (Eistring *ei, Bytecount off, Charcount charoff, | |
| 867 | 2502 Bytecount len, Charcount charlen, const Ibyte *data, |
| 2526 | 2503 const Eistring *ei2, int is_ascii, int fold_case); |
| 771 | 2504 |
| 2505 #define eicmp_ei(eistr, eistr2) \ | |
| 2506 eicmp_1 (eistr, 0, -1, -1, -1, 0, eistr2, 0, 0) | |
| 2507 #define eicmp_off_ei(eistr, off, charoff, len, charlen, eistr2) \ | |
| 2508 eicmp_1 (eistr, off, charoff, len, charlen, 0, eistr2, 0, 0) | |
| 2509 #define eicasecmp_ei(eistr, eistr2) \ | |
| 2510 eicmp_1 (eistr, 0, -1, -1, -1, 0, eistr2, 0, 1) | |
| 2511 #define eicasecmp_off_ei(eistr, off, charoff, len, charlen, eistr2) \ | |
| 2512 eicmp_1 (eistr, off, charoff, len, charlen, 0, eistr2, 0, 1) | |
| 2513 #define eicasecmp_i18n_ei(eistr, eistr2) \ | |
| 2514 eicmp_1 (eistr, 0, -1, -1, -1, 0, eistr2, 0, 2) | |
| 2515 #define eicasecmp_i18n_off_ei(eistr, off, charoff, len, charlen, eistr2) \ | |
| 2516 eicmp_1 (eistr, off, charoff, len, charlen, 0, eistr2, 0, 2) | |
| 2517 | |
| 2421 | 2518 #define eicmp_ascii(eistr, ascstr) \ |
| 2519 eicmp_1 (eistr, 0, -1, -1, -1, (const Ibyte *) ascstr, 0, 1, 0) | |
| 2520 #define eicmp_off_ascii(eistr, off, charoff, len, charlen, ascstr) \ | |
| 2521 eicmp_1 (eistr, off, charoff, len, charlen, (const Ibyte *) ascstr, 0, 1, 0) | |
| 2522 #define eicasecmp_ascii(eistr, ascstr) \ | |
| 2523 eicmp_1 (eistr, 0, -1, -1, -1, (const Ibyte *) ascstr, 0, 1, 1) | |
| 2524 #define eicasecmp_off_ascii(eistr, off, charoff, len, charlen, ascstr) \ | |
| 2525 eicmp_1 (eistr, off, charoff, len, charlen, (const Ibyte *) ascstr, 0, 1, 1) | |
| 2526 #define eicasecmp_i18n_ascii(eistr, ascstr) \ | |
| 2527 eicmp_1 (eistr, 0, -1, -1, -1, (const Ibyte *) ascstr, 0, 1, 2) | |
| 2528 #define eicasecmp_i18n_off_ascii(eistr, off, charoff, len, charlen, ascstr) \ | |
| 2529 eicmp_1 (eistr, off, charoff, len, charlen, (const Ibyte *) ascstr, 0, 1, 2) | |
| 771 | 2530 |
| 2531 | |
| 2532 /* ----- Case-changing the Eistring ----- */ | |
| 2533 | |
| 867 | 2534 int eistr_casefiddle_1 (Ibyte *olddata, Bytecount len, Ibyte *newdata, |
| 771 | 2535 int downp); |
| 2536 | |
| 2537 #define EI_CASECHANGE(ei, downp) \ | |
| 2538 do { \ | |
| 867 | 2539 int ei11new_allocmax = (ei)->charlen_ * MAX_ICHAR_LEN + 1; \ |
| 1333 | 2540 Ibyte *ei11storage = \ |
| 2367 | 2541 (Ibyte *) alloca_ibytes (ei11new_allocmax); \ |
| 771 | 2542 int ei11newlen = eistr_casefiddle_1 ((ei)->data_, (ei)->bytelen_, \ |
| 2543 ei11storage, downp); \ | |
| 2544 \ | |
| 2545 if (ei11newlen) \ | |
| 2546 { \ | |
| 2547 (ei)->max_size_allocated_ = ei11new_allocmax; \ | |
| 1333 | 2548 (ei)->data_ = ei11storage; \ |
| 771 | 2549 (ei)->bytelen_ = ei11newlen; \ |
| 2550 /* charlen is the same. */ \ | |
| 2551 } \ | |
| 2552 } while (0) | |
| 2553 | |
| 2554 #define eilwr(ei) EI_CASECHANGE (ei, 1) | |
| 2555 #define eiupr(ei) EI_CASECHANGE (ei, 0) | |
| 2556 | |
| 1743 | 2557 END_C_DECLS |
| 1650 | 2558 |
| 771 | 2559 |
| 2560 /************************************************************************/ | |
| 2561 /* */ | |
| 2562 /* Converting between internal and external format */ | |
| 2563 /* */ | |
| 2564 /************************************************************************/ | |
| 2565 /* | |
| 1318 | 2566 The macros below are used for converting data between different formats. |
| 2567 Generally, the data is textual, and the formats are related to | |
| 2568 internationalization (e.g. converting between internal-format text and | |
| 2569 UTF-8) -- but the mechanism is general, and could be used for anything, | |
| 2570 e.g. decoding gzipped data. | |
| 2571 | |
| 2572 In general, conversion involves a source of data, a sink, the existing | |
| 2573 format of the source data, and the desired format of the sink. The | |
| 2574 macros below, however, always require that either the source or sink is | |
| 2575 internal-format text. Therefore, in practice the conversions below | |
| 2576 involve source, sink, an external format (specified by a coding system), | |
| 2577 and the direction of conversion (internal->external or vice-versa). | |
| 2578 | |
| 2579 Sources and sinks can be raw data (sized or unsized -- when unsized, | |
| 2580 input data is assumed to be null-terminated [double null-terminated for | |
| 2581 Unicode-format data], and on output the length is not stored anywhere), | |
| 2582 Lisp strings, Lisp buffers, lstreams, and opaque data objects. When the | |
| 2583 output is raw data, the result can be allocated either with alloca() or | |
| 2584 malloc(). (There is currently no provision for writing into a fixed | |
| 2585 buffer. If you want this, use alloca() output and then copy the data -- | |
| 2586 but be careful with the size! Unless you are very sure of the encoding | |
| 2587 being used, upper bounds for the size are not in general computable.) | |
| 2588 The obvious restrictions on source and sink types apply (e.g. Lisp | |
| 2589 strings are a source and sink only for internal data). | |
| 2590 | |
| 2591 All raw data outputted will contain an extra null byte (two bytes for | |
| 2592 Unicode -- currently, in fact, all output data, whether internal or | |
| 2593 external, is double-null-terminated, but you can't count on this; see | |
| 2594 below). This means that enough space is allocated to contain the extra | |
| 2595 nulls; however, these nulls are not reflected in the returned output | |
| 2596 size. | |
| 2597 | |
| 2598 The most basic macros are TO_EXTERNAL_FORMAT and TO_INTERNAL_FORMAT. | |
| 2599 These can be used to convert between any kinds of sources or sinks. | |
| 2600 However, 99% of conversions involve raw data or Lisp strings as both | |
| 2601 source and sink, and usually data is output as alloca() rather than | |
| 2602 malloc(). For this reason, convenience macros are defined for many types | |
| 5026 | 2603 of conversions involving raw data and/or Lisp strings, when the output is |
| 2604 an alloca()ed or malloc()ed string. (When the destination is a | |
| 1318 | 2605 Lisp_String, there are other functions that should be used instead -- |
| 5026 | 2606 build_extstring() and make_extstring(), for example.) In general, the |
| 2607 convenience macros return their result as a return value, even if the | |
| 2608 result is an alloca()ed string -- some trickery is required to do this, | |
| 2609 but it's definitely possible. However, for macros whose result is a | |
| 2610 "sized string" (i.e. a string plus a length), there are two values to | |
| 2611 return, and both are returned through parameters. | |
| 2612 | |
| 2613 The convenience macros have the form: | |
| 2614 | |
| 2615 (a) (SIZED_)?EXTERNAL_TO_ITEXT(_MALLOC)? | |
| 2616 (b) (ITEXT|LISP_STRING)_TO_(SIZED_)?EXTERNAL(_MALLOC)? | |
| 2617 | |
| 2618 Note also that there are some additional, more specific macros defined | |
| 2619 elsewhere, for example macros like EXTERNAL_TO_TSTR() in syswindows.h for | |
| 2620 conversions that specifically involve the `mswindows-tstr' coding system | |
| 2621 (which is normally an alias of `mswindows-unicode', a variation of | |
| 2622 UTF-16). | |
| 2623 | |
| 2624 Convenience macros of type (a) are for conversion from external to | |
| 2625 internal, while type (b) macros convert internal to external. A few | |
| 2626 notes: | |
| 2627 | |
| 2628 -- The output is an alloca()ed string unless `_MALLOC' is appended, | |
| 2629 in which case it's a malloc()ed string. | |
| 2630 -- When the destination says ITEXT, it means internally-formatted text of | |
| 2631 type `Ibyte *' (which boils down to `unsigned char *'). | |
| 2632 -- When the destination says EXTERNAL, it means externally-formatted | |
| 2633 text of type `Extbyte *' (which boils down to `char *'). Because | |
| 2634 `Ibyte *' and `Extbyte *' are different underlying types, accidentally | |
| 2635 mixing them will generally lead to a warning under gcc, and an error | |
| 2636 under g++. | |
| 2637 -- When SIZED_EXTERNAL is involved, there are two parameters, one for | |
| 2638 the string and one for its length. When SIZED_EXTERNAL is the | |
| 2639 destination, these two parameters should be lvalues and will have the | |
| 2640 result stored into them. | |
| 2641 -- There is no LISP_STRING destination; use `build_extstring' instead of | |
| 2642 `EXTERNAL_TO_LISP_STRING' and `make_extstring' instead of | |
| 2643 `SIZED_EXTERNAL_TO_LISP_STRING'. | |
| 2644 -- There is no SIZED_ITEXT type. If you need this: First, if your data | |
| 2645 is coming from a Lisp string, it would be better to use the | |
| 2646 LISP_STRING_TO_* macros. If this doesn't apply or work, call the | |
| 2647 TO_EXTERNAL_FORMAT() or TO_INTERNAL_FORMAT() macros directly. | |
| 2648 | |
| 2649 Note that previously the convenience macros, like the raw TO_*_FORMAT | |
| 2650 macros, were always written to store their arguments into a passed-in | |
| 2651 lvalue rather than return them, due to major bugs in calling alloca() | |
| 2652 inside of a function call on x86 gcc circa version 2.6. This has | |
| 2653 apparently long since been fixed, but just to make sure we have a | |
| 2654 `configure' test for broken alloca() in function calls, and in such case | |
| 2655 the portable xemacs_c_alloca() implementation is substituted instead. | |
| 2656 Note that this implementation actually uses malloc() but notes the stack | |
| 2657 pointer at the time of allocation, and at next call any allocations | |
| 2658 belonging to inner stack frames are freed. This isn't perfect but | |
| 2659 more-or-less gets the job done as an emergency backup, and in most | |
| 2660 circumstances it prevents arbitrary memory leakage -- at most you should | |
| 2661 get a fixed amount of leakage. | |
| 2662 | |
| 2663 NOTE: All convenience macros are ultimately defined in terms of | |
| 2664 TO_EXTERNAL_FORMAT and TO_INTERNAL_FORMAT. Thus, any comments below | |
| 2665 about the workings of these macros also apply to all convenience macros. | |
| 1318 | 2666 |
| 2667 TO_EXTERNAL_FORMAT (source_type, source, sink_type, sink, codesys) | |
| 2668 TO_INTERNAL_FORMAT (source_type, source, sink_type, sink, codesys) | |
| 771 | 2669 |
| 2670 Typical use is | |
| 2671 | |
| 2367 | 2672 TO_EXTERNAL_FORMAT (LISP_STRING, str, C_STRING_MALLOC, ptr, Qfile_name); |
| 2673 | |
| 2674 which means that the contents of the lisp string `str' are written | |
| 2675 to a malloc'ed memory area which will be pointed to by `ptr' once the | |
| 2676 macro completes. The conversion will be done using the `file-name' | |
| 2677 coding system (which will be controlled by the user indirectly by | |
| 2678 setting or binding the variable `file-name-coding-system'). | |
| 2679 | |
| 2680 Some sources and sinks require two C variables to specify. We use | |
| 2681 some preprocessor magic to allow different source and sink types, and | |
| 2682 even different numbers of arguments to specify different types of | |
| 2683 sources and sinks. | |
| 2684 | |
| 2685 So we can have a call that looks like | |
| 2686 | |
| 2687 TO_INTERNAL_FORMAT (DATA, (ptr, len), | |
| 2688 MALLOC, (ptr, len), | |
| 2689 coding_system); | |
| 2690 | |
| 2691 The parenthesized argument pairs are required to make the | |
| 2692 preprocessor magic work. | |
| 771 | 2693 |
| 2694 NOTE: GC is inhibited during the entire operation of these macros. This | |
| 2695 is because frequently the data to be converted comes from strings but | |
| 2696 gets passed in as just DATA, and GC may move around the string data. If | |
| 2697 we didn't inhibit GC, there'd have to be a lot of messy recoding, | |
| 2698 alloca-copying of strings and other annoying stuff. | |
| 2699 | |
| 2700 The source or sink can be specified in one of these ways: | |
| 2701 | |
| 2702 DATA, (ptr, len), // input data is a fixed buffer of size len | |
| 851 | 2703 ALLOCA, (ptr, len), // output data is in an ALLOCA()ed buffer of size len |
| 771 | 2704 MALLOC, (ptr, len), // output data is in a malloc()ed buffer of size len |
| 2705 C_STRING_ALLOCA, ptr, // equivalent to ALLOCA (ptr, len_ignored) on output | |
| 2706 C_STRING_MALLOC, ptr, // equivalent to MALLOC (ptr, len_ignored) on output | |
| 2707 C_STRING, ptr, // equivalent to DATA, (ptr, strlen/wcslen (ptr)) | |
| 2708 // on input (the Unicode version is used when correct) | |
| 2709 LISP_STRING, string, // input or output is a Lisp_Object of type string | |
| 2710 LISP_BUFFER, buffer, // output is written to (point) in lisp buffer | |
| 2711 LISP_LSTREAM, lstream, // input or output is a Lisp_Object of type lstream | |
| 2712 LISP_OPAQUE, object, // input or output is a Lisp_Object of type opaque | |
| 2713 | |
| 2714 When specifying the sink, use lvalues, since the macro will assign to them, | |
| 2715 except when the sink is an lstream or a lisp buffer. | |
| 2716 | |
| 2367 | 2717 For the sink types `ALLOCA' and `C_STRING_ALLOCA', the resulting text is |
| 2718 stored in a stack-allocated buffer, which is automatically freed on | |
| 2719 returning from the function. However, the sink types `MALLOC' and | |
| 2720 `C_STRING_MALLOC' return `xmalloc()'ed memory. The caller is responsible | |
| 2721 for freeing this memory using `xfree()'. | |
| 2722 | |
| 771 | 2723 The macros accept the kinds of sources and sinks appropriate for |
| 2724 internal and external data representation. See the type_checking_assert | |
| 2725 macros below for the actual allowed types. | |
| 2726 | |
| 2727 Since some sources and sinks use one argument (a Lisp_Object) to | |
| 2728 specify them, while others take a (pointer, length) pair, we use | |
| 2729 some C preprocessor trickery to allow pair arguments to be specified | |
| 2730 by parenthesizing them, as in the examples above. | |
| 2731 | |
| 2732 Anything prefixed by dfc_ (`data format conversion') is private. | |
| 2733 They are only used to implement these macros. | |
| 2734 | |
| 5026 | 2735 Using C_STRING* is appropriate for data that comes from or is going to |
| 2736 an external API that takes null-terminated strings, or when the string is | |
| 2737 always intended to contain text and never binary data, e.g. file names. | |
| 2738 Any time we are dealing with binary or general data, we must be '\0'-clean, | |
| 2739 i.e. allow arbitrary data which might contain embedded '\0', by tracking | |
| 2740 both pointer and length. | |
| 771 | 2741 |
| 2742 There is no problem using the same lvalue for source and sink. | |
| 2743 | |
| 2744 Also, when pointers are required, the code (currently at least) is | |
| 2745 lax and allows any pointer types, either in the source or the sink. | |
| 2746 This makes it possible, e.g., to deal with internal format data held | |
| 2747 in char *'s or external format data held in WCHAR * (i.e. Unicode). | |
| 2748 | |
| 2749 Finally, whenever storage allocation is called for, extra space is | |
| 2750 allocated for a terminating zero, and such a zero is stored in the | |
| 2751 appropriate place, regardless of whether the source data was | |
| 2752 specified using a length or was specified as zero-terminated. This | |
| 2753 allows you to freely pass the resulting data, no matter how | |
| 2754 obtained, to a routine that expects zero termination (modulo, of | |
| 2755 course, that any embedded zeros in the resulting text will cause | |
| 2756 truncation). In fact, currently two terminating zeros are allocated | |
| 2757 and stored after the data result. This is to allow for the | |
| 2758 possibility of storing a Unicode value on output, which needs the | |
| 2759 two zeros. Currently, however, the two zeros are stored regardless | |
| 2760 of whether the conversion is internal or external and regardless of | |
| 2761 whether the external coding system is in fact Unicode. This | |
| 2762 behavior may change in the future, and you cannot rely on this -- | |
| 2763 the most you can rely on is that sink data in Unicode format will | |
| 2764 have two terminating nulls, which combine to form one Unicode null | |
| 2367 | 2765 character. |
| 5026 | 2766 */ |
| 771 | 2767 |
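The distinction the comment draws between C_STRING-style data and (ptr, len) pairs can be illustrated with a minimal standalone sketch. The `sketch_` names are hypothetical and are not part of text.h; the only point is that a zero-terminated reader stops at the first embedded '\0', while a pointer-plus-length reader sees every byte.

```c
#include <stddef.h>
#include <string.h>

/* Length as a zero-terminated reader sees it: scanning stops at the
   first '\0', so anything after an embedded zero is invisible.  */
size_t
sketch_c_string_length (const char *ptr)
{
  return strlen (ptr);
}

/* Count the zero bytes inside a (ptr, len) pair.  Because the length
   travels with the pointer, embedded zeros are ordinary data.  */
size_t
sketch_count_zero_bytes (const char *ptr, size_t len)
{
  size_t i, zeros = 0;

  for (i = 0; i < len; i++)
    if (ptr[i] == '\0')
      zeros++;
  return zeros;
}
```

For the five bytes `a b \0 c d`, the first function reports a length of 2, while the second, given the length 5, finds the embedded zero and still covers the trailing `c d`.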
| 2768 #define TO_EXTERNAL_FORMAT(source_type, source, sink_type, sink, codesys) \ | |
| 2769 do { \ | |
| 2770 dfc_conversion_type dfc_simplified_source_type; \ | |
| 2771 dfc_conversion_type dfc_simplified_sink_type; \ | |
| 2772 dfc_conversion_data dfc_source; \ | |
| 2773 dfc_conversion_data dfc_sink; \ | |
| 2774 Lisp_Object dfc_codesys = (codesys); \ | |
| 2775 \ | |
| 2776 type_checking_assert \ | |
| 2777 ((DFC_TYPE_##source_type == DFC_TYPE_DATA || \ | |
| 2778 DFC_TYPE_##source_type == DFC_TYPE_C_STRING || \ | |
| 2779 DFC_TYPE_##source_type == DFC_TYPE_LISP_STRING || \ | |
| 2780 DFC_TYPE_##source_type == DFC_TYPE_LISP_OPAQUE || \ | |
| 2781 DFC_TYPE_##source_type == DFC_TYPE_LISP_LSTREAM) \ | |
| 2782 && \ | |
| 2783 (DFC_TYPE_##sink_type == DFC_TYPE_ALLOCA || \ | |
| 2784 DFC_TYPE_##sink_type == DFC_TYPE_MALLOC || \ | |
| 2785 DFC_TYPE_##sink_type == DFC_TYPE_C_STRING_ALLOCA || \ | |
| 2786 DFC_TYPE_##sink_type == DFC_TYPE_C_STRING_MALLOC || \ | |
| 2787 DFC_TYPE_##sink_type == DFC_TYPE_LISP_LSTREAM || \ | |
| 2788 DFC_TYPE_##sink_type == DFC_TYPE_LISP_OPAQUE)); \ | |
| 2789 \ | |
| 2790 DFC_EXT_SOURCE_##source_type##_TO_ARGS (source, dfc_codesys); \ | |
| 2791 DFC_SINK_##sink_type##_TO_ARGS (sink); \ | |
| 2792 \ | |
| 2793 dfc_convert_to_external_format (dfc_simplified_source_type, &dfc_source, \ | |
| 2794 dfc_codesys, \ | |
| 2795 dfc_simplified_sink_type, &dfc_sink); \ | |
| 2796 \ | |
| 2797 DFC_##sink_type##_USE_CONVERTED_DATA (sink); \ | |
| 2798 } while (0) | |
| 2799 | |
| 2800 #define TO_INTERNAL_FORMAT(source_type, source, sink_type, sink, codesys) \ | |
| 2801 do { \ | |
| 2802 dfc_conversion_type dfc_simplified_source_type; \ | |
| 2803 dfc_conversion_type dfc_simplified_sink_type; \ | |
| 2804 dfc_conversion_data dfc_source; \ | |
| 2805 dfc_conversion_data dfc_sink; \ | |
| 2806 Lisp_Object dfc_codesys = (codesys); \ | |
| 2807 \ | |
| 2808 type_checking_assert \ | |
| 2809 ((DFC_TYPE_##source_type == DFC_TYPE_DATA || \ | |
| 2810 DFC_TYPE_##source_type == DFC_TYPE_C_STRING || \ | |
| 2811 DFC_TYPE_##source_type == DFC_TYPE_LISP_OPAQUE || \ | |
| 2812 DFC_TYPE_##source_type == DFC_TYPE_LISP_LSTREAM) \ | |
| 2813 && \ | |
| 2814 (DFC_TYPE_##sink_type == DFC_TYPE_ALLOCA || \ | |
| 2815 DFC_TYPE_##sink_type == DFC_TYPE_MALLOC || \ | |
| 2816 DFC_TYPE_##sink_type == DFC_TYPE_C_STRING_ALLOCA || \ | |
| 2817 DFC_TYPE_##sink_type == DFC_TYPE_C_STRING_MALLOC || \ | |
| 2818 DFC_TYPE_##sink_type == DFC_TYPE_LISP_STRING || \ | |
| 2819 DFC_TYPE_##sink_type == DFC_TYPE_LISP_LSTREAM || \ | |
| 2820 DFC_TYPE_##sink_type == DFC_TYPE_LISP_BUFFER)); \ | |
| 2821 \ | |
| 2822 DFC_INT_SOURCE_##source_type##_TO_ARGS (source, dfc_codesys); \ | |
| 2823 DFC_SINK_##sink_type##_TO_ARGS (sink); \ | |
| 2824 \ | |
| 2825 dfc_convert_to_internal_format (dfc_simplified_source_type, &dfc_source, \ | |
| 2826 dfc_codesys, \ | |
| 2827 dfc_simplified_sink_type, &dfc_sink); \ | |
| 2828 \ | |
| 2829 DFC_##sink_type##_USE_CONVERTED_DATA (sink); \ | |
| 2830 } while (0) | |
| 2831 | |
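The "preprocessor magic" used by both macros combines two tricks: token pasting on the type keyword selects a per-type helper macro, and a parenthesized pair is unpacked by placing a two-argument macro name directly in front of it. The following is a minimal sketch of that idea only; every `SKETCH_` and `sketch_` name is hypothetical, and plain `assert` stands in for `type_checking_assert`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum sketch_type { SKETCH_TYPE_DATA, SKETCH_TYPE_C_STRING, SKETCH_TYPE_MALLOC };

/* Writing `SKETCH_CDR val' where val is `(p, n)' yields the macro
   call SKETCH_CDR (p, n), which picks the second element.  */
#define SKETCH_CAR(x, y) (x)
#define SKETCH_CDR(x, y) (y)

/* One helper per source type; the name is assembled with ##.  */
#define SKETCH_SOURCE_DATA_LEN(val) ((size_t) (SKETCH_CDR val))
#define SKETCH_SOURCE_C_STRING_LEN(val) (strlen (val))

#define SKETCH_SOURCE_LEN(source_type, source, out)			\
  do {									\
    /* Reject keywords that are not valid source types.  */		\
    assert (SKETCH_TYPE_##source_type == SKETCH_TYPE_DATA		\
	    || SKETCH_TYPE_##source_type == SKETCH_TYPE_C_STRING);	\
    (out) = SKETCH_SOURCE_##source_type##_LEN (source);			\
  } while (0)

size_t
sketch_len_of_pair (const char *p, size_t n)
{
  size_t out;
  SKETCH_SOURCE_LEN (DATA, (p, n), out);	/* pair must be parenthesized */
  return out;
}

size_t
sketch_len_of_c_string (const char *s)
{
  size_t out;
  SKETCH_SOURCE_LEN (C_STRING, s, out);
  return out;
}
```

The parentheses around `(p, n)` are what keep the pair together as a single macro argument until the helper unpacks it.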
| 814 | 2832 #ifdef __cplusplus |
| 771 | 2833 |
| 814 | 2834 /* Error if you try to use a union here: "member `struct {anonymous |
| 2835 union}::{anonymous} {anonymous union}::data' with constructor not allowed | |
| 2836 in union" (Bytecount is a class) */ | |
| 2837 | |
| 2838 typedef struct | |
| 2839 #else | |
| 771 | 2840 typedef union |
| 814 | 2841 #endif |
| 771 | 2842 { |
| 2843 struct { const void *ptr; Bytecount len; } data; | |
| 2844 Lisp_Object lisp_object; | |
| 2845 } dfc_conversion_data; | |
| 2846 | |
| 2847 enum dfc_conversion_type | |
| 2848 { | |
| 2849 DFC_TYPE_DATA, | |
| 2850 DFC_TYPE_ALLOCA, | |
| 2851 DFC_TYPE_MALLOC, | |
| 2852 DFC_TYPE_C_STRING, | |
| 2853 DFC_TYPE_C_STRING_ALLOCA, | |
| 2854 DFC_TYPE_C_STRING_MALLOC, | |
| 2855 DFC_TYPE_LISP_STRING, | |
| 2856 DFC_TYPE_LISP_LSTREAM, | |
| 2857 DFC_TYPE_LISP_OPAQUE, | |
| 2858 DFC_TYPE_LISP_BUFFER | |
| 2859 }; | |
| 2860 typedef enum dfc_conversion_type dfc_conversion_type; | |
| 2861 | |
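The conversion data type above is read through one member or the other depending on the accompanying type code. A minimal sketch of that tagged-access pattern follows, assuming a hypothetical `struct sketch_object` in place of `Lisp_Object` and `size_t` in place of `Bytecount`; none of these names come from text.h.

```c
#include <stddef.h>
#include <string.h>

enum sketch_kind { SKETCH_KIND_DATA, SKETCH_KIND_OBJECT };

/* Hypothetical stand-in for a Lisp string object.  */
struct sketch_object { const char *text; };

typedef union
{
  struct { const void *ptr; size_t len; } data;
  const struct sketch_object *object;
} sketch_conversion_data;

/* The kind tells us which union member is currently meaningful.  */
size_t
sketch_source_length (enum sketch_kind kind, const sketch_conversion_data *src)
{
  switch (kind)
    {
    case SKETCH_KIND_DATA:
      return src->data.len;
    case SKETCH_KIND_OBJECT:
      return strlen (src->object->text);
    }
  return 0;
}
```

Keeping the tag outside the union, as the real code does with `dfc_conversion_type`, means the caller and callee must agree on it; reading the wrong member is not detected.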
| 1743 | 2862 BEGIN_C_DECLS |
| 1650 | 2863 |
| 771 | 2864 /* WARNING: These use a static buffer. This can lead to disaster if |
| 2865 these functions are not used *very* carefully. Another reason to only use | |
| 2866 TO_EXTERNAL_FORMAT() and TO_INTERNAL_FORMAT(). */ | |
| 1632 | 2867 MODULE_API void |
| 771 | 2868 dfc_convert_to_external_format (dfc_conversion_type source_type, |
| 2869 dfc_conversion_data *source, | |
| 1318 | 2870 Lisp_Object codesys, |
| 771 | 2871 dfc_conversion_type sink_type, |
| 2872 dfc_conversion_data *sink); | |
| 1632 | 2873 MODULE_API void |
| 771 | 2874 dfc_convert_to_internal_format (dfc_conversion_type source_type, |
| 2875 dfc_conversion_data *source, | |
| 1318 | 2876 Lisp_Object codesys, |
| 771 | 2877 dfc_conversion_type sink_type, |
| 2878 dfc_conversion_data *sink); | |
| 2879 /* CPP Trickery */ | |
| 2880 #define DFC_CPP_CAR(x,y) (x) | |
| 2881 #define DFC_CPP_CDR(x,y) (y) | |
| 2882 | |
| 2883 /* Convert `source' to args for dfc_convert_to_external_format() */ | |
| 2884 #define DFC_EXT_SOURCE_DATA_TO_ARGS(val, codesys) do { \ | |
| 2885 dfc_source.data.ptr = DFC_CPP_CAR val; \ | |
| 2886 dfc_source.data.len = DFC_CPP_CDR val; \ | |
| 2887 dfc_simplified_source_type = DFC_TYPE_DATA; \ | |
| 2888 } while (0) | |
| 2889 #define DFC_EXT_SOURCE_C_STRING_TO_ARGS(val, codesys) do { \ | |
| 2890 dfc_source.data.len = \ | |
| 2891 strlen ((char *) (dfc_source.data.ptr = (val))); \ | |
| 2892 dfc_simplified_source_type = DFC_TYPE_DATA; \ | |
| 2893 } while (0) | |
| 2894 #define DFC_EXT_SOURCE_LISP_STRING_TO_ARGS(val, codesys) do { \ | |
| 2895 Lisp_Object dfc_slsta = (val); \ | |
| 2896 type_checking_assert (STRINGP (dfc_slsta)); \ | |
| 2897 dfc_source.lisp_object = dfc_slsta; \ | |
| 2898 dfc_simplified_source_type = DFC_TYPE_LISP_STRING; \ | |
| 2899 } while (0) | |
| 2900 #define DFC_EXT_SOURCE_LISP_LSTREAM_TO_ARGS(val, codesys) do { \ | |
| 2901 Lisp_Object dfc_sllta = (val); \ | |
| 2902 type_checking_assert (LSTREAMP (dfc_sllta)); \ | |
| 2903 dfc_source.lisp_object = dfc_sllta; \ | |
| 2904 dfc_simplified_source_type = DFC_TYPE_LISP_LSTREAM; \ | |
| 2905 } while (0) | |
| 2906 #define DFC_EXT_SOURCE_LISP_OPAQUE_TO_ARGS(val, codesys) do { \ | |
| 2907 Lisp_Opaque *dfc_slota = XOPAQUE (val); \ | |
| 2908 dfc_source.data.ptr = OPAQUE_DATA (dfc_slota); \ | |
| 2909 dfc_source.data.len = OPAQUE_SIZE (dfc_slota); \ | |
| 2910 dfc_simplified_source_type = DFC_TYPE_DATA; \ | |
| 2911 } while (0) | |
| 2912 | |
| 2913 /* Convert `source' to args for dfc_convert_to_internal_format() */ | |
| 2914 #define DFC_INT_SOURCE_DATA_TO_ARGS(val, codesys) \ | |
| 2915 DFC_EXT_SOURCE_DATA_TO_ARGS (val, codesys) | |
| 2916 #define DFC_INT_SOURCE_C_STRING_TO_ARGS(val, codesys) do { \ | |
| 2917 dfc_source.data.len = dfc_external_data_len (dfc_source.data.ptr = (val), \ | |
| 2918 codesys); \ | |
| 2919 dfc_simplified_source_type = DFC_TYPE_DATA; \ | |
| 2920 } while (0) | |
| 2921 #define DFC_INT_SOURCE_LISP_STRING_TO_ARGS(val, codesys) \ | |
| 2922 DFC_EXT_SOURCE_LISP_STRING_TO_ARGS (val, codesys) | |
| 2923 #define DFC_INT_SOURCE_LISP_LSTREAM_TO_ARGS(val, codesys) \ | |
| 2924 DFC_EXT_SOURCE_LISP_LSTREAM_TO_ARGS (val, codesys) | |
| 2925 #define DFC_INT_SOURCE_LISP_OPAQUE_TO_ARGS(val, codesys) \ | |
| 2926 DFC_EXT_SOURCE_LISP_OPAQUE_TO_ARGS (val, codesys) | |
| 2927 | |
| 2928 /* Convert `sink' to args for dfc_convert_to_*_format() */ | |
| 2929 #define DFC_SINK_ALLOCA_TO_ARGS(val) \ | |
| 2930 dfc_simplified_sink_type = DFC_TYPE_DATA | |
| 2931 #define DFC_SINK_C_STRING_ALLOCA_TO_ARGS(val) \ | |
| 2932 dfc_simplified_sink_type = DFC_TYPE_DATA | |
| 2933 #define DFC_SINK_MALLOC_TO_ARGS(val) \ | |
| 2934 dfc_simplified_sink_type = DFC_TYPE_DATA | |
| 2935 #define DFC_SINK_C_STRING_MALLOC_TO_ARGS(val) \ | |
| 2936 dfc_simplified_sink_type = DFC_TYPE_DATA | |
| 2937 #define DFC_SINK_LISP_STRING_TO_ARGS(val) \ | |
| 2938 dfc_simplified_sink_type = DFC_TYPE_DATA | |
| 2939 #define DFC_SINK_LISP_OPAQUE_TO_ARGS(val) \ | |
| 2940 dfc_simplified_sink_type = DFC_TYPE_DATA | |
| 2941 #define DFC_SINK_LISP_LSTREAM_TO_ARGS(val) do { \ | |
| 2942 Lisp_Object dfc_sllta = (val); \ | |
| 2943 type_checking_assert (LSTREAMP (dfc_sllta)); \ | |
| 2944 dfc_sink.lisp_object = dfc_sllta; \ | |
| 2945 dfc_simplified_sink_type = DFC_TYPE_LISP_LSTREAM; \ | |
| 2946 } while (0) | |
| 2947 #define DFC_SINK_LISP_BUFFER_TO_ARGS(val) do { \ | |
| 2948 struct buffer *dfc_slbta = XBUFFER (val); \ | |
| 2949 dfc_sink.lisp_object = \ | |
| 2950 make_lisp_buffer_output_stream \ | |
| 2951 (dfc_slbta, BUF_PT (dfc_slbta), 0); \ | |
| 2952 dfc_simplified_sink_type = DFC_TYPE_LISP_LSTREAM; \ | |
| 2953 } while (0) | |
| 2954 | |
| 2955 /* Assign to the `sink' lvalue(s) using the converted data. */ | |
| 2956 /* + 2 because we double zero-extended to account for Unicode conversion */ | |
| 2957 typedef union { char c; void *p; } *dfc_aliasing_voidpp; | |
| 2958 #define DFC_ALLOCA_USE_CONVERTED_DATA(sink) do { \ | |
| 851 | 2959 void * dfc_sink_ret = ALLOCA (dfc_sink.data.len + 2); \ |
| 771 | 2960 memcpy (dfc_sink_ret, dfc_sink.data.ptr, dfc_sink.data.len + 2); \ |
| 2367 | 2961 VOIDP_CAST (DFC_CPP_CAR sink) = dfc_sink_ret; \ |
| 771 | 2962 (DFC_CPP_CDR sink) = dfc_sink.data.len; \ |
| 2963 } while (0) | |
| 2964 #define DFC_MALLOC_USE_CONVERTED_DATA(sink) do { \ | |
| 2965 void * dfc_sink_ret = xmalloc (dfc_sink.data.len + 2); \ | |
| 2966 memcpy (dfc_sink_ret, dfc_sink.data.ptr, dfc_sink.data.len + 2); \ | |
| 2367 | 2967 VOIDP_CAST (DFC_CPP_CAR sink) = dfc_sink_ret; \ |
| 771 | 2968 (DFC_CPP_CDR sink) = dfc_sink.data.len; \ |
| 2969 } while (0) | |
| 2970 #define DFC_C_STRING_ALLOCA_USE_CONVERTED_DATA(sink) do { \ | |
| 851 | 2971 void * dfc_sink_ret = ALLOCA (dfc_sink.data.len + 2); \ |
| 771 | 2972 memcpy (dfc_sink_ret, dfc_sink.data.ptr, dfc_sink.data.len + 2); \ |
| 2367 | 2973 VOIDP_CAST (sink) = dfc_sink_ret; \ |
| 771 | 2974 } while (0) |
| 2975 #define DFC_C_STRING_MALLOC_USE_CONVERTED_DATA(sink) do { \ | |
| 2976 void * dfc_sink_ret = xmalloc (dfc_sink.data.len + 2); \ | |
| 2977 memcpy (dfc_sink_ret, dfc_sink.data.ptr, dfc_sink.data.len + 2); \ | |
| 2367 | 2978 VOIDP_CAST (sink) = dfc_sink_ret; \ |
| 771 | 2979 } while (0) |
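The `+ 2` in the macros above reserves room for two terminating zero bytes after the converted data. A minimal sketch of the same copy-out idea in plain C is given below. It is an illustration under assumptions, not the XEmacs code: `sketch_copy_out` is a hypothetical name, it uses `malloc` rather than `xmalloc`, and it writes the two zeros itself instead of copying them from an already-terminated conversion buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Copy LEN bytes into fresh storage and append two zero bytes, so the
   result is terminated whether it is read as 8-bit or 16-bit text.
   Returns NULL on allocation failure; the caller frees the result.  */
void *
sketch_copy_out (const void *data, size_t len)
{
  unsigned char *ret = malloc (len + 2);

  if (ret == NULL)
    return NULL;
  memcpy (ret, data, len);
  ret[len] = 0;
  ret[len + 1] = 0;
  return ret;
}
```

As with the `MALLOC` and `C_STRING_MALLOC` sinks, ownership passes to the caller, who must release the block.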
| 5026 | 2980 #define DFC_LISP_STRING_USE_CONVERTED_DATA(sink) \ |
| 867 | 2981 sink = make_string ((Ibyte *) dfc_sink.data.ptr, dfc_sink.data.len) |
| 5026 | 2982 #define DFC_LISP_OPAQUE_USE_CONVERTED_DATA(sink) \ |
| 771 | 2983 sink = make_opaque (dfc_sink.data.ptr, dfc_sink.data.len) |
| 2984 #define DFC_LISP_LSTREAM_USE_CONVERTED_DATA(sink) /* data already used */ | |
| 5026 | 2985 #define DFC_LISP_BUFFER_USE_CONVERTED_DATA(sink) \ |
| 771 | 2986 Lstream_delete (XLSTREAM (dfc_sink.lisp_object)) |
| 2987 | |
| 1318 | 2988 enum new_dfc_src_type |
| 2989 { | |
| 2990 DFC_EXTERNAL, | |
| 2991 DFC_SIZED_EXTERNAL, | |
| 2992 DFC_INTERNAL, | |
| 2993 DFC_SIZED_INTERNAL, | |
| 2994 DFC_LISP_STRING | |
| 2995 }; | |
| 2996 | |
| 1632 | 2997 MODULE_API void *new_dfc_convert_malloc (const void *src, Bytecount src_size, |
| 2998 enum new_dfc_src_type type, | |
| 2999 Lisp_Object codesys); | |
| 2367 | 3000 MODULE_API Bytecount new_dfc_convert_size (const char *srctext, |
| 3001 const void *src, | |
| 1632 | 3002 Bytecount src_size, |
| 3003 enum new_dfc_src_type type, | |
| 3004 Lisp_Object codesys); | |
| 2367 | 3005 MODULE_API void *new_dfc_convert_copy_data (const char *srctext, |
| 3006 void *alloca_data); | |
| 1318 | 3007 |
| 1743 | 3008 END_C_DECLS |
| 1650 | 3009 |
| 4981 (changeset 4aebb0131297, parent 4953: Cleanups/renaming of EXTERNAL_TO_C_STRING and friends, Ben Wing <ben@xemacs.org>) | 3010 /* Version of EXTERNAL_TO_ITEXT that *RETURNS* the translated string, |
| 1318 | 3011 still in alloca() space. Requires some trickiness to do this, but gets |
| 3012 it done! */ | |
| 3013 | |
| 3014 /* NOTE: If you make two invocations of the dfc functions below in the same | |
| 3015 subexpression and use the exact same expression for the source in both | |
| 3016 cases, you will lose. In this unlikely case, you will get an abort, and | |
| 3017 need to rewrite the code. | |
| 3018 */ | |
| 3019 | |
| 3020 /* We need to use ALLOCA_FUNCALL_OK here. Some compilers have been known | |
| 3021 to choke when alloca() occurs as a funcall argument, and so we check | |
| 3022 this in configure. Rewriting the expressions below to use a temporary | |
| 3023 variable, so that the call to alloca() is outside of | |
| 2382 | 3024 new_dfc_convert_copy_data(), won't help because the entire NEW_DFC call |
| 1318 | 3025 could be inside of a function call. */ |
| 3026 | |
| 3027 #define NEW_DFC_CONVERT_1_ALLOCA(src, src_size, type, codesys) \ | |
| 2367 | 3028 new_dfc_convert_copy_data \ |
| 1318 | 3029 (#src, ALLOCA_FUNCALL_OK (new_dfc_convert_size (#src, src, src_size, \ |
| 3030 type, codesys))) | |
| 3031 | |
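The macro above splits conversion into a size query and a copy step so that the storage can be allocated in the caller's frame between the two. The sketch below shows that two-phase shape in portable C. It is an assumption-laden illustration: the `sketch_` functions are hypothetical, upper-casing stands in for a real coding-system conversion, the caller supplies an ordinary buffer instead of using alloca(), and the single static slot is only a guess at why reusing the same source expression twice in one subexpression is dangerous.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

static char *sketch_pending;		/* converted text awaiting copy-out */
static size_t sketch_pending_size;	/* its size, including the final '\0' */

/* Phase 1: convert SRC, remember the result, and report how many bytes
   the caller must provide.  Returns 0 if memory runs out.  */
size_t
sketch_convert_size (const char *src)
{
  size_t len = strlen (src);
  size_t i;

  free (sketch_pending);
  sketch_pending = malloc (len + 1);
  sketch_pending_size = 0;
  if (sketch_pending == NULL)
    return 0;
  for (i = 0; i < len; i++)
    sketch_pending[i] = (char) toupper ((unsigned char) src[i]);
  sketch_pending[len] = '\0';
  sketch_pending_size = len + 1;
  return sketch_pending_size;
}

/* Phase 2: copy the remembered result into caller-owned storage and
   release the temporary.  */
char *
sketch_copy_data (char *caller_buf)
{
  if (sketch_pending != NULL)
    memcpy (caller_buf, sketch_pending, sketch_pending_size);
  free (sketch_pending);
  sketch_pending = NULL;
  sketch_pending_size = 0;
  return caller_buf;
}
```

A caller would size a local buffer from `sketch_convert_size (src)` and then pass it to `sketch_copy_data`; because the result lives in the caller's own storage, nothing needs to be freed afterwards.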
| 5026 | 3032 #define EXTERNAL_TO_ITEXT(src, codesys) \ |
| 4981 | 3033 ((Ibyte *) NEW_DFC_CONVERT_1_ALLOCA (src, -1, DFC_EXTERNAL, codesys)) |
| 5026 | 3034 #define EXTERNAL_TO_ITEXT_MALLOC(src, codesys) \ |
| 4981 | 3035 ((Ibyte *) new_dfc_convert_malloc (src, -1, DFC_EXTERNAL, codesys)) |
| 5026 | 3036 #define SIZED_EXTERNAL_TO_ITEXT(src, len, codesys) \ |
| 4981 | 3037 ((Ibyte *) NEW_DFC_CONVERT_1_ALLOCA (src, len, DFC_SIZED_EXTERNAL, codesys)) |
| 5026 | 3038 #define SIZED_EXTERNAL_TO_ITEXT_MALLOC(src, len, codesys) \ |
| 4981 | 3039 ((Ibyte *) new_dfc_convert_malloc (src, len, DFC_SIZED_EXTERNAL, codesys)) |
| 5026 | 3040 #define ITEXT_TO_EXTERNAL(src, codesys) \ |
| 4981 | 3041 ((Extbyte *) NEW_DFC_CONVERT_1_ALLOCA (src, -1, DFC_INTERNAL, codesys)) |
| 5026 | 3042 #define ITEXT_TO_EXTERNAL_MALLOC(src, codesys) \ |
| 4981 | 3043 ((Extbyte *) new_dfc_convert_malloc (src, -1, DFC_INTERNAL, codesys)) |
| 5026 | 3044 #define LISP_STRING_TO_EXTERNAL(src, codesys) \ |
| 5013 | 3045 ((Extbyte *) NEW_DFC_CONVERT_1_ALLOCA (STORE_LISP_IN_VOID (src), -1, \ |
| 4981 | 3046 DFC_LISP_STRING, codesys)) |
| 5026 | 3047 #define LISP_STRING_TO_EXTERNAL_MALLOC(src, codesys) \ |
| 5013 | 3048 ((Extbyte *) new_dfc_convert_malloc (STORE_LISP_IN_VOID (src), -1, \ |
| 4981 | 3049 DFC_LISP_STRING, codesys)) |
| 3050 /* In place of EXTERNAL_TO_LISP_STRING(), use build_extstring() and/or | |
| 3051 make_extstring(). */ | |
| 3052 | |
| 3053 /* The next four have two outputs, so we make both of them be parameters */ | |
| 3054 #define ITEXT_TO_SIZED_EXTERNAL(in, out, outlen, codesys) \ | |
| 3055 TO_EXTERNAL_FORMAT (C_STRING, in, ALLOCA, (out, outlen), codesys) | |
| 3056 #define LISP_STRING_TO_SIZED_EXTERNAL(in, out, outlen, codesys) \ | |
| 3057 TO_EXTERNAL_FORMAT (LISP_STRING, in, ALLOCA, (out, outlen), codesys) | |
| 3058 #define ITEXT_TO_SIZED_EXTERNAL_MALLOC(in, out, outlen, codesys) \ | |
| 3059 TO_EXTERNAL_FORMAT (C_STRING, in, MALLOC, (out, outlen), codesys) | |
| 3060 #define LISP_STRING_TO_SIZED_EXTERNAL_MALLOC(in, out, outlen, codesys) \ | |
| 3061 TO_EXTERNAL_FORMAT (LISP_STRING, in, MALLOC, (out, outlen), codesys) | |
| 771 | 3062 |
| 2367 | 3063 /* Wexttext functions. The type of Wexttext is selected at compile time |
| 3064 and will sometimes be wchar_t, sometimes char. */ | |
| 3065 | |
| 3066 int wcscmp_ascii (const wchar_t *s1, const Ascbyte *s2); | |
| 3067 int wcsncmp_ascii (const wchar_t *s1, const Ascbyte *s2, Charcount len); | |
| 3068 | |
| 3069 #ifdef WEXTTEXT_IS_WIDE /* defined under MS Windows i.e. WIN32_NATIVE */ | |
| 3070 #define WEXTTEXT_ZTERM_SIZE sizeof (wchar_t) | |
| 3071 /* Extra indirection needed in case of manifest constant as arg */ | |
| 3072 #define WEXTSTRING_1(arg) L##arg | |
| 3073 #define WEXTSTRING(arg) WEXTSTRING_1(arg) | |
| 3074 #define wext_strlen wcslen | |
| 3075 #define wext_strcmp wcscmp | |
| 3076 #define wext_strncmp wcsncmp | |
| 3077 #define wext_strcmp_ascii wcscmp_ascii | |
| 3078 #define wext_strncmp_ascii wcsncmp_ascii | |
| 3079 #define wext_strcpy wcscpy | |
| 3080 #define wext_strncpy wcsncpy | |
| 3081 #define wext_strchr wcschr | |
| 3082 #define wext_strrchr wcsrchr | |
| 3083 #define wext_strdup wcsdup | |
| 3084 #define wext_atol(str) wcstol (str, 0, 10) | |
| 3085 #define wext_sprintf wsprintfW /* Huh? both wsprintfA and wsprintfW? */ | |
| 3086 #define wext_getenv _wgetenv | |
| 4953 (changeset 304aebb79cd3, parent 4952: function renamings to track names of char typedefs, Ben Wing <ben@xemacs.org>) | 3087 #define build_wext_string(str, cs) build_extstring ((Extbyte *) str, cs) |
| 2367 | 3088 #define WEXTTEXT_TO_8_BIT(arg) WEXTTEXT_TO_MULTIBYTE(arg) |
| 3089 #ifdef WIN32_NATIVE | |
| 3090 int XCDECL wext_retry_open (const Wexttext *path, int oflag, ...); | |
| 3091 #else | |
| 3092 #error Cannot handle Wexttext yet on this system | |
| 3093 #endif | |
| 3094 #define wext_access _waccess | |
| 3095 #define wext_stat _wstat | |
| 3096 #else | |
| 3097 #define WEXTTEXT_ZTERM_SIZE sizeof (char) | |
| 3098 #define WEXTSTRING(arg) arg | |
| 3099 #define wext_strlen strlen | |
| 3100 #define wext_strcmp strcmp | |
| 3101 #define wext_strncmp strncmp | |
| 3102 #define wext_strcmp_ascii strcmp | |
| 3103 #define wext_strncmp_ascii strncmp | |
| 3104 #define wext_strcpy strcpy | |
| 3105 #define wext_strncpy strncpy | |
| 3106 #define wext_strchr strchr | |
| 3107 #define wext_strrchr strrchr | |
| 3108 #define wext_strdup xstrdup | |
| 3109 #define wext_atol(str) atol (str) | |
| 3110 #define wext_sprintf sprintf | |
| 3111 #define wext_getenv getenv | |
| 4953 | 3112 #define build_wext_string build_extstring |
| 2367 | 3113 #define wext_retry_open retry_open |
| 3114 #define wext_access access | |
| 3115 #define wext_stat stat | |
| 3116 #define WEXTTEXT_TO_8_BIT(arg) ((Extbyte *) arg) | |
| 3117 #endif | |
| 3118 | |
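The Wexttext block selects a character type and a matching set of string functions at compile time, and uses an extra macro level so that a manifest constant is expanded before the `L` prefix is pasted on. A minimal sketch of the pattern follows; `SKETCH_TEXT_IS_WIDE`, `SKETCH_GREETING` and the other `sketch_` names are hypothetical stand-ins, not identifiers from text.h.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

#ifdef SKETCH_TEXT_IS_WIDE
typedef wchar_t sketch_char;
/* Two levels: the outer macro expands its argument first, so a
   constant such as SKETCH_GREETING becomes "hello" before L## is
   applied.  A single level would paste L onto the macro name.  */
#define SKETCH_STRING_1(arg) L##arg
#define SKETCH_STRING(arg) SKETCH_STRING_1 (arg)
#define sketch_strlen wcslen
#else
typedef char sketch_char;
#define SKETCH_STRING(arg) arg
#define sketch_strlen strlen
#endif

#define SKETCH_GREETING "hello"

/* The same source compiles against either character width.  */
size_t
sketch_greeting_length (void)
{
  const sketch_char *text = SKETCH_STRING (SKETCH_GREETING);
  return sketch_strlen (text);
}
```

Compiling with or without `-DSKETCH_TEXT_IS_WIDE` gives the same length, which is the property the `wext_` wrappers are meant to provide to their callers.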
| 4952 (changeset 19a72041c5ed, parent 4853: Mule-izing, various fixes related to char * arguments, Ben Wing <ben@xemacs.org>) | 3119 /* Standins for various encodings. |
| 1318 | 3120 |
| 3121 About encodings in X: | |
| 3122 | |
| 3123 X works with 5 different encodings: | |
| 3124 | |
| 3125 -- "Host Portable Character Encoding" == printable ASCII + space, tab, | |
| 3126 newline | |
| 3127 | |
| 3128 -- STRING encoding == ASCII + Latin-1 + tab, newline | |
| 3129 | |
| 3130 -- Locale-specific encoding | |
| 3131 | |
| 3132 -- Compound text == STRING encoding + ISO-2022 escape sequences to | |
| 3133 switch between different locale-specific encodings. | |
| 3134 | |
| 3135 -- ANSI C wide-character encoding | |
| 3136 | |
| 3137 The Host Portable Character Encoding (HPCE) is used for atom names, font | |
| 3138 names, color names, keysyms, geometry strings, resource manager quarks, | |
| 3139 display names, locale names, and various other things. When describing | |
| 3140 such strings, the X manual typically says "If the ... is not in the Host | |
| 3141 Portable Character Encoding, the result is implementation dependent." | |
| 3142 | |
| 3143 The wide-character encoding is used only in the Xwc* functions, which | |
| 3144 are provided as equivalents to Xmb* functions. | |
| 3145 | |
| 3146 STRING and compound text are used in the value of string properties and | |
| 3147 selection data, both of which are values with an associated type atom, | |
| 3148 which can be STRING or COMPOUND_TEXT. It can also be a locale name, as | |
| 3149 specified in setlocale() (#### as usual, there is no normalization | |
| 3150 whatsoever of these names). | |
| 3151 | |
| 3152 X also defines a type called "TEXT", which is used only as a requested | |
| 3153 type, and produces data in a type "convenient to the owner". However, | |
| 3154 there is some indication that X expects this to be the locale-specific | |
| 3155 encoding. | |
| 3156 | |
| 3157 According to the glossary, the locale is used in | |
| 3158 | |
| 3159 -- Encoding and processing of input method text | |
| 3160 -- Encoding of resource files and values | |
| 3161 -- Encoding and imaging of text strings | |
| 3162 -- Encoding and decoding for inter-client text communication | |
| 3163 | |
| 3164 The functions XmbTextListToTextProperty and XmbTextPropertyToTextList | |
| 3165 (and Xwc* equivalents) can be used to convert between the | |
| 3166 locale-specific encoding (XTextStyle), STRING (XStringStyle), and | |
| 3167 compound text (XCompoundTextStyle), as well as XStdICCTextStyle, which | |
| 3168 converts to STRING if possible, and if not, COMPOUND_TEXT. This is | |
| 3169 used, for example, in XmbSetWMProperties, in the window_name and | |
| 3170 icon_name properties (WM_NAME and WM_ICON_NAME), which are in the | |
| 3171 locale-specific encoding on input, and are stored as STRING if possible, | |
| 3172 COMPOUND_TEXT otherwise. | |
| 3173 */ | |
| 771 | 3174 |
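Since the comment notes that results are implementation dependent for strings outside the Host Portable Character Encoding, a caller may want to test a string before handing it to X as an atom or font name. The sketch below does that using the comment's own description of the encoding (printable ASCII plus space, tab and newline). `sketch_is_host_portable` is a hypothetical helper, and the X specification's exact character list should be checked before relying on it.

```c
#include <stddef.h>

/* Return 1 if every byte of TEXT is printable ASCII (0x20..0x7E,
   which includes space), a tab, or a newline; return 0 otherwise.  */
int
sketch_is_host_portable (const char *text)
{
  const unsigned char *p = (const unsigned char *) text;

  for (; *p != '\0'; p++)
    {
      int printable_ascii = (*p >= 0x20 && *p <= 0x7E);

      if (!printable_ascii && *p != '\t' && *p != '\n')
	return 0;
    }
  return 1;
}
```

A string that fails this test would need one of the other encodings described above, for example STRING or compound text, depending on where it is used.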
| 4952 | 3175 #ifdef WEXTTEXT_IS_WIDE |
| 3176 #define Qcommand_argument_encoding Qmswindows_unicode | |
| 3177 #define Qenvironment_variable_encoding Qmswindows_unicode | |
| 3178 #else | |
| 3179 #define Qcommand_argument_encoding Qnative | |
|
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4853
diff
changeset
|
3180 #define Qenvironment_variable_encoding Qnative |
|
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4853
diff
changeset
|
3181 #endif |
|
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4853
diff
changeset
|
3182 #define Qunix_host_name_encoding Qnative |
|
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4853
diff
changeset
|
3183 #define Qunix_service_name_encoding Qnative |
| 5254 | 3184 #define Qtime_function_encoding Qbinary |
| 4952 | 3185 #define Qtime_zone_encoding Qtime_function_encoding |
| 3186 #define Qmswindows_host_name_encoding Qmswindows_multibyte | |
| 3187 #define Qmswindows_service_name_encoding Qmswindows_multibyte | |
| 3188 #define Quser_name_encoding Qnative | |
| 3189 #define Qerror_message_encoding Qnative | |
| 3190 #define Qjpeg_error_message_encoding Qerror_message_encoding | |
| 3191 #define Qtooltalk_encoding Qnative | |
| 3192 #define Qgtk_encoding Qnative | |
| 3193 | |
| 3194 #define Qdll_symbol_encoding Qnative | |
| 3195 #define Qdll_function_name_encoding Qdll_symbol_encoding | |
| 3196 #define Qdll_variable_name_encoding Qdll_symbol_encoding | |
| 3197 #define Qdll_filename_encoding Qfile_name | |
| 3198 #define Qemodule_string_encoding Qnative | |
| 3199 | |
| 771 | 3200 /* !!#### Need to verify the encoding used in lwlib -- Qnative or Qctext? |
| 3201 Almost certainly the former. Use a stand-in for now. */ | |
| 3202 #define Qlwlib_encoding Qnative | |
| 3203 | |
| 1318 | 3204 /* The Host Portable Character Encoding. */ |
| 3205 #define Qx_hpc_encoding Qnative | |
| 3206 | |
| 3207 #define Qx_atom_name_encoding Qx_hpc_encoding | |
| 3208 #define Qx_font_name_encoding Qx_hpc_encoding | |
| 3209 #define Qx_color_name_encoding Qx_hpc_encoding | |
| 3210 #define Qx_keysym_encoding Qx_hpc_encoding | |
| 3211 #define Qx_geometry_encoding Qx_hpc_encoding | |
| 3212 #define Qx_resource_name_encoding Qx_hpc_encoding | |
| 3213 #define Qx_application_class_encoding Qx_hpc_encoding | |
| 771 | 3214 /* the following probably must agree with Qcommand_argument_encoding and |
| 3215 Qenvironment_variable_encoding */ | |
| 1318 | 3216 #define Qx_display_name_encoding Qx_hpc_encoding |
| 3217 #define Qx_xpm_data_encoding Qx_hpc_encoding | |
| 4834 | 3218 #define Qx_error_message_encoding Qx_hpc_encoding |
| 1318 | 3219 |
| 2367 | 3220 /* !!#### Verify these! */ |
| 3221 #define Qxt_widget_arg_encoding Qnative | |
| 3222 #define Qdt_dnd_encoding Qnative | |
| 3223 | |
| 1318 | 3224 /* RedHat 6.2 contains a locale called "Francais" with the C-cedilla |
| 3225 encoded in ISO2022! */ | |
| 3226 #define Qlocale_name_encoding Qctext | |
| 771 | 3227 |
| 3228 #define Qstrerror_encoding Qnative | |
| 3229 | |
| 1318 | 3230 /* !!#### This exists to remind us that our hexify routine is totally |
| 3231 un-Muleized. */ | |
| 3232 #define Qdnd_hexify_encoding Qascii | |
| 3233 | |
| 771 | 3234 #define GET_STRERROR(var, num) \ |
| 3235 do { \ | |
| 3236 int __gsnum__ = (num); \ | |
| 3237 Extbyte * __gserr__ = strerror (__gsnum__); \ | |
| 3238 \ | |
| 3239 if (!__gserr__) \ | |
| 3240 { \ | |
| 4981 | 3241 var = alloca_ibytes (99); \ |
| 771 | 3242 qxesprintf (var, "Unknown error %d", __gsnum__); \ |
| 3243 } \ | |
| 3244 else \ | |
| 4981 | 3245 var = EXTERNAL_TO_ITEXT (__gserr__, Qstrerror_encoding); \ |
| 771 | 3246 } while (0) |
| 3247 | |
| 3248 #endif /* INCLUDED_text_h_ */ |
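
The fallback logic in GET_STRERROR can be tried outside XEmacs with a plain C function. The sketch below is only an illustration under stated assumptions: `describe_error` is a hypothetical name that does not appear in text.h, it writes into a caller-supplied buffer instead of using `alloca_ibytes`, and it omits the `EXTERNAL_TO_ITEXT` conversion from `Qstrerror_encoding` to internal text, which needs the rest of the XEmacs text machinery.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of text.h.  Writes a description of the
   error number NUM into BUF (LEN bytes, always NUL-terminated when
   LEN > 0) and returns BUF.  Like GET_STRERROR, it falls back to
   "Unknown error N" when strerror() returns NULL.  Unlike the macro,
   it leaves the message in the C library's external encoding. */
static const char *
describe_error (int num, char *buf, size_t len)
{
  const char *msg = strerror (num);

  if (!msg)
    snprintf (buf, len, "Unknown error %d", num);
  else
    snprintf (buf, len, "%s", msg);   /* truncates if BUF is too small */
  return buf;
}
```

A caller might declare `char buf[99];`, matching the 99 bytes the macro reserves for its fallback message, and pass `describe_error (errno, buf, sizeof buf)` to a logging routine. Taking the buffer as a parameter avoids `alloca`, at the cost of truncating long messages.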
