Mercurial > hg > xemacs-beta
annotate src/file-coding.c @ 5518:3cc7470ea71c
gnuclient: if TMPDIR was set and connect failed, try again with /tmp
2011-06-03 Aidan Kehoe <kehoea@parhasard.net>
* gnuslib.c (connect_to_unix_server):
Retry with /tmp as a directory in which to search for Unix sockets
if an attempt to connect with some other directory failed (which
may be because gnuclient and gnuserv don't share an environment
value for TMPDIR, or because gnuserv was compiled with USE_TMPDIR
turned off).
author | Aidan Kehoe <kehoea@parhasard.net> |
---|---|
date | Fri, 03 Jun 2011 18:40:57 +0100 (2011-06-03) |
parents | 4dee0387b9de |
children | 56144c8593a8 |
rev | line source |
---|---|
771 | 1 /* Text encoding conversion functions; coding-system object. |
2 #### rename me to coding-system.c or coding.c | |
428 | 3 Copyright (C) 1991, 1995 Free Software Foundation, Inc. |
4 Copyright (C) 1995 Sun Microsystems, Inc. | |
5127
a9c41067dd88
more cleanups, terminology clarification, lots of doc work
Ben Wing <ben@xemacs.org>
parents:
5126
diff
changeset
|
5 Copyright (C) 2000, 2001, 2002, 2003, 2005, 2010 Ben Wing. |
428 | 6 |
7 This file is part of XEmacs. | |
8 | |
5402
308d34e9f07d
Changed bulk of GPLv2 or later files identified by script
Mats Lidell <matsl@xemacs.org>
parents:
5191
diff
changeset
|
9 XEmacs is free software: you can redistribute it and/or modify it |
428 | 10 under the terms of the GNU General Public License as published by the |
5402
308d34e9f07d
Changed bulk of GPLv2 or later files identified by script
Mats Lidell <matsl@xemacs.org>
parents:
5191
diff
changeset
|
11 Free Software Foundation, either version 3 of the License, or (at your |
308d34e9f07d
Changed bulk of GPLv2 or later files identified by script
Mats Lidell <matsl@xemacs.org>
parents:
5191
diff
changeset
|
12 option) any later version. |
428 | 13 |
14 XEmacs is distributed in the hope that it will be useful, but WITHOUT | |
15 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or | |
16 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License | |
17 for more details. | |
18 | |
19 You should have received a copy of the GNU General Public License | |
5402
308d34e9f07d
Changed bulk of GPLv2 or later files identified by script
Mats Lidell <matsl@xemacs.org>
parents:
5191
diff
changeset
|
20 along with XEmacs. If not, see <http://www.gnu.org/licenses/>. */ |
428 | 21 |
771 | 22 /* Synched up with: Not in FSF. */ |
23 | |
24 /* Authorship: | |
25 | |
26 Current primary author: Ben Wing <ben@xemacs.org> | |
27 | |
28 Rewritten by Ben Wing <ben@xemacs.org>, based originally on coding.c | |
29 from Mule 2.? but probably does not share one line of code with that | |
30 original source. Rewriting work started around Dec. 1994. or Jan. 1995. | |
31 Proceeded in earnest till Nov. 1995. | |
32 | |
33 Around Feb. 17, 1998, Andy Piper renamed what was then mule-coding.c to | |
34 file-coding.c, with the intention of using it to do end-of-line conversion | |
35 on non-MULE machines (specifically, on Windows machines). He separated | |
36 out the MULE stuff from non-MULE using ifdef's, and searched throughout | |
37 the rest of the source tree looking for coding-system-related code that | |
38 was ifdef MULE but should be ifdef HAVE_CODING_SYSTEMS. | |
39 | |
40 Sept. 4 - 8, 1998, Tomohiko Morioka added the UCS_4 and UTF_8 coding system | |
41 types, providing a primitive means of decoding and encoding externally- | |
42 formatted Unicode/UCS_4 and Unicode/UTF_8 data. | |
43 | |
44 January 25, 2000, Martin Buchholz redid and fleshed out the coding | |
45 system alias handling that was first added in prototype form by | |
46 Hrjove Niksic, April 15, 1999. | |
47 | |
48 April to May 2000, Ben Wing: More major reorganization. Adding features | |
49 needed for MS Windows (multibyte, unicode, unicode-to-multibyte), the | |
50 "chain" coding system for chaining two together, and doing a lot of | |
51 reorganization in preparation for properly abstracting out the different | |
52 coding system types. | |
53 | |
54 June 2001, Ben Wing: Added Unicode support. Eliminated previous | |
55 junky Unicode translation support. | |
56 | |
57 August 2001, Ben Wing: Moved Unicode support to unicode.c. Finished | |
58 abstracting everything except detection, which is hard to abstract (see | |
59 just below). | |
60 | |
61 September 2001, Ben Wing: Moved Mule code to mule-coding.c, Windows code | |
62 to intl-win32.c. Lots more rewriting; very little code is untouched | |
63 from before April 2000. Abstracted the detection code, added multiple | |
64 levels of likelihood to increase the reliability of the algorithm. | |
65 | |
66 October 2001, Ben Wing: HAVE_CODING_SYSTEMS is always now defined. | |
67 Removed the conditionals. | |
68 */ | |
69 | |
428 | 70 #include <config.h> |
71 #include "lisp.h" | |
72 | |
73 #include "buffer.h" | |
74 #include "elhash.h" | |
75 #include "insdel.h" | |
76 #include "lstream.h" | |
440 | 77 #include "opaque.h" |
771 | 78 #include "file-coding.h" |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
79 #include "extents.h" |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
80 #include "rangetab.h" |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
81 #include "chartab.h" |
771 | 82 |
83 #ifdef HAVE_ZLIB | |
84 #include "zlib.h" | |
428 | 85 #endif |
86 | |
87 Lisp_Object Vkeyboard_coding_system; | |
88 Lisp_Object Vterminal_coding_system; | |
89 Lisp_Object Vcoding_system_for_read; | |
90 Lisp_Object Vcoding_system_for_write; | |
91 Lisp_Object Vfile_name_coding_system; | |
92 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
93 Lisp_Object Qaliases, Qcharset_skip_chars_string; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
94 |
771 | 95 #ifdef DEBUG_XEMACS |
96 Lisp_Object Vdebug_coding_detection; | |
440 | 97 #endif |
771 | 98 |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
99 #ifdef MULE |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
100 extern Lisp_Object Vcharset_ascii, Vcharset_control_1, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
101 Vcharset_latin_iso8859_1; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
102 #endif |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
103 |
771 | 104 typedef struct coding_system_type_entry |
105 { | |
106 struct coding_system_methods *meths; | |
107 } coding_system_type_entry; | |
108 | |
109 typedef struct | |
110 { | |
111 Dynarr_declare (coding_system_type_entry); | |
112 } coding_system_type_entry_dynarr; | |
113 | |
114 static coding_system_type_entry_dynarr *the_coding_system_type_entry_dynarr; | |
115 | |
1204 | 116 static const struct memory_description cste_description_1[] = { |
2551 | 117 { XD_BLOCK_PTR, offsetof (coding_system_type_entry, meths), 1, |
118 { &coding_system_methods_description } }, | |
771 | 119 { XD_END } |
120 }; | |
121 | |
1204 | 122 static const struct sized_memory_description cste_description = { |
771 | 123 sizeof (coding_system_type_entry), |
124 cste_description_1 | |
125 }; | |
126 | |
1204 | 127 static const struct memory_description csted_description_1[] = { |
771 | 128 XD_DYNARR_DESC (coding_system_type_entry_dynarr, &cste_description), |
428 | 129 { XD_END } |
130 }; | |
131 | |
1204 | 132 static const struct sized_memory_description csted_description = { |
771 | 133 sizeof (coding_system_type_entry_dynarr), |
134 csted_description_1 | |
135 }; | |
136 | |
137 static Lisp_Object Vcoding_system_type_list; | |
138 | |
139 /* Coding system currently associated with each coding category. */ | |
140 Lisp_Object coding_category_system[MAX_DETECTOR_CATEGORIES]; | |
141 | |
142 /* Table of all coding categories in decreasing order of priority. | |
143 This describes a permutation of the possible coding categories. */ | |
144 int coding_category_by_priority[MAX_DETECTOR_CATEGORIES]; | |
145 | |
146 /* Value used with to give a unique name to nameless coding systems */ | |
147 int coding_system_tick; | |
148 | |
149 int coding_detector_count; | |
150 int coding_detector_category_count; | |
151 | |
152 detector_dynarr *all_coding_detectors; | |
153 | |
1204 | 154 static const struct memory_description struct_detector_category_description_1[] |
771 | 155 = |
156 { | |
157 { XD_LISP_OBJECT, offsetof (struct detector_category, sym) }, | |
158 { XD_END } | |
159 }; | |
160 | |
1204 | 161 static const struct sized_memory_description struct_detector_category_description = |
771 | 162 { |
163 sizeof (struct detector_category), | |
164 struct_detector_category_description_1 | |
428 | 165 }; |
166 | |
1204 | 167 static const struct memory_description detector_category_dynarr_description_1[] = |
771 | 168 { |
169 XD_DYNARR_DESC (detector_category_dynarr, | |
170 &struct_detector_category_description), | |
171 { XD_END } | |
172 }; | |
173 | |
1204 | 174 static const struct sized_memory_description detector_category_dynarr_description = { |
771 | 175 sizeof (detector_category_dynarr), |
176 detector_category_dynarr_description_1 | |
177 }; | |
178 | |
1204 | 179 static const struct memory_description struct_detector_description_1[] |
771 | 180 = |
181 { | |
2367 | 182 { XD_BLOCK_PTR, offsetof (struct detector, cats), 1, |
2551 | 183 { &detector_category_dynarr_description } }, |
771 | 184 { XD_END } |
185 }; | |
186 | |
1204 | 187 static const struct sized_memory_description struct_detector_description = |
771 | 188 { |
189 sizeof (struct detector), | |
190 struct_detector_description_1 | |
191 }; | |
192 | |
1204 | 193 static const struct memory_description detector_dynarr_description_1[] = |
771 | 194 { |
195 XD_DYNARR_DESC (detector_dynarr, &struct_detector_description), | |
196 { XD_END } | |
197 }; | |
198 | |
1204 | 199 static const struct sized_memory_description detector_dynarr_description = { |
771 | 200 sizeof (detector_dynarr), |
201 detector_dynarr_description_1 | |
202 }; | |
428 | 203 |
204 Lisp_Object Qcoding_systemp; | |
205 | |
771 | 206 Lisp_Object Qraw_text; |
428 | 207 |
208 Lisp_Object Qmnemonic, Qeol_type; | |
209 Lisp_Object Qcr, Qcrlf, Qlf; | |
210 Lisp_Object Qeol_cr, Qeol_crlf, Qeol_lf; | |
211 Lisp_Object Qpost_read_conversion; | |
212 Lisp_Object Qpre_write_conversion; | |
213 | |
771 | 214 Lisp_Object Qtranslation_table_for_decode; |
215 Lisp_Object Qtranslation_table_for_encode; | |
216 Lisp_Object Qsafe_chars; | |
217 Lisp_Object Qsafe_charsets; | |
218 Lisp_Object Qmime_charset; | |
219 Lisp_Object Qvalid_codes; | |
220 | |
221 Lisp_Object Qno_conversion; | |
222 Lisp_Object Qconvert_eol; | |
440 | 223 Lisp_Object Qescape_quoted; |
771 | 224 Lisp_Object Qencode, Qdecode; |
225 | |
226 Lisp_Object Qconvert_eol_lf, Qconvert_eol_cr, Qconvert_eol_crlf; | |
227 Lisp_Object Qconvert_eol_autodetect; | |
228 | |
229 Lisp_Object Qnear_certainty, Qquite_probable, Qsomewhat_likely; | |
1494 | 230 Lisp_Object Qslightly_likely; |
771 | 231 Lisp_Object Qas_likely_as_unlikely, Qsomewhat_unlikely, Qquite_improbable; |
232 Lisp_Object Qnearly_impossible; | |
233 | |
234 Lisp_Object Qdo_eol, Qdo_coding; | |
235 | |
236 Lisp_Object Qcanonicalize_after_coding; | |
237 | |
1347 | 238 Lisp_Object QScoding_system_cookie; |
239 | |
4303 | 240 Lisp_Object Qposix_charset_to_coding_system_hash; |
241 | |
771 | 242 /* This is used to convert autodetected coding systems into existing |
243 systems. For example, the chain undecided->convert-eol-autodetect may | |
244 have its separate parts detected as mswindows-multibyte and | |
245 convert-eol-crlf, and the result needs to be mapped to | |
246 mswindows-multibyte-dos. */ | |
247 /* #### It's not clear we need this whole chain-canonicalize mechanism | |
248 any more. */ | |
249 static Lisp_Object Vchain_canonicalize_hash_table; | |
250 | |
251 #ifdef HAVE_ZLIB | |
252 Lisp_Object Qgzip; | |
428 | 253 #endif |
771 | 254 |
2297 | 255 /* Maps symbols (coding system names) to either coding system objects or |
256 (for aliases) other names. */ | |
771 | 257 static Lisp_Object Vcoding_system_hash_table; |
428 | 258 |
259 int enable_multibyte_characters; | |
260 | |
261 EXFUN (Fcopy_coding_system, 2); | |
262 | |
263 | |
264 /************************************************************************/ | |
771 | 265 /* Coding system object methods */ |
428 | 266 /************************************************************************/ |
267 | |
268 static Lisp_Object | |
269 mark_coding_system (Lisp_Object obj) | |
270 { | |
271 Lisp_Coding_System *codesys = XCODING_SYSTEM (obj); | |
272 | |
1204 | 273 #define MARKED_SLOT(x) mark_object (codesys->x); |
274 #include "coding-system-slots.h" | |
771 | 275 |
276 MAYBE_CODESYSMETH (codesys, mark, (obj)); | |
428 | 277 |
1204 | 278 return Qnil; |
428 | 279 } |
280 | |
281 static void | |
771 | 282 print_coding_system_properties (Lisp_Object obj, Lisp_Object printcharfun) |
283 { | |
284 Lisp_Coding_System *c = XCODING_SYSTEM (obj); | |
285 print_internal (c->methods->type, printcharfun, 1); | |
286 MAYBE_CODESYSMETH (c, print, (obj, printcharfun, 1)); | |
287 if (CODING_SYSTEM_EOL_TYPE (c) != EOL_AUTODETECT) | |
288 write_fmt_string_lisp (printcharfun, " eol-type=%s", | |
289 1, Fcoding_system_property (obj, Qeol_type)); | |
290 } | |
291 | |
292 static void | |
428 | 293 print_coding_system (Lisp_Object obj, Lisp_Object printcharfun, |
2286 | 294 int UNUSED (escapeflag)) |
428 | 295 { |
296 Lisp_Coding_System *c = XCODING_SYSTEM (obj); | |
297 if (print_readably) | |
5142
f965e31a35f0
reduce lcrecord headers to 2 words, rename printing_unreadable_object
Ben Wing <ben@xemacs.org>
parents:
5128
diff
changeset
|
298 printing_unreadable_lisp_object (obj, 0); |
771 | 299 |
300 write_fmt_string_lisp (printcharfun, "#<coding-system %s ", 1, c->name); | |
301 print_coding_system_properties (obj, printcharfun); | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
302 write_ascstring (printcharfun, ">"); |
428 | 303 } |
304 | |
771 | 305 /* Print an abbreviated version of a coding system (but still containing |
306 all the information), for use within a coding system print method. */ | |
307 | |
308 static void | |
309 print_coding_system_in_print_method (Lisp_Object cs, Lisp_Object printcharfun, | |
2286 | 310 int UNUSED (escapeflag)) |
771 | 311 { |
800 | 312 write_fmt_string_lisp (printcharfun, "%s[", 1, XCODING_SYSTEM_NAME (cs)); |
771 | 313 print_coding_system_properties (cs, printcharfun); |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
314 write_ascstring (printcharfun, "]"); |
771 | 315 } |
316 | |
3263 | 317 #ifndef NEW_GC |
428 | 318 static void |
5127
a9c41067dd88
more cleanups, terminology clarification, lots of doc work
Ben Wing <ben@xemacs.org>
parents:
5126
diff
changeset
|
319 finalize_coding_system (Lisp_Object obj) |
428 | 320 { |
321 /* Since coding systems never go away, this function is not | |
322 necessary. But it would be necessary if we changed things | |
323 so that coding systems could go away. */ | |
5127
a9c41067dd88
more cleanups, terminology clarification, lots of doc work
Ben Wing <ben@xemacs.org>
parents:
5126
diff
changeset
|
324 MAYBE_XCODESYSMETH (obj, finalize, (obj)); |
771 | 325 } |
3263 | 326 #endif /* not NEW_GC */ |
771 | 327 |
328 static Bytecount | |
5127
a9c41067dd88
more cleanups, terminology clarification, lots of doc work
Ben Wing <ben@xemacs.org>
parents:
5126
diff
changeset
|
329 sizeof_coding_system (Lisp_Object obj) |
771 | 330 { |
5127
a9c41067dd88
more cleanups, terminology clarification, lots of doc work
Ben Wing <ben@xemacs.org>
parents:
5126
diff
changeset
|
331 const Lisp_Coding_System *p = XCODING_SYSTEM (obj); |
771 | 332 return offsetof (Lisp_Coding_System, data) + p->methods->extra_data_size; |
428 | 333 } |
334 | |
1204 | 335 static const struct memory_description coding_system_methods_description_1[] |
771 | 336 = { |
337 { XD_LISP_OBJECT, | |
338 offsetof (struct coding_system_methods, type) }, | |
339 { XD_LISP_OBJECT, | |
340 offsetof (struct coding_system_methods, predicate_symbol) }, | |
341 { XD_END } | |
342 }; | |
343 | |
1204 | 344 const struct sized_memory_description coding_system_methods_description = { |
771 | 345 sizeof (struct coding_system_methods), |
346 coding_system_methods_description_1 | |
347 }; | |
348 | |
1204 | 349 static const struct sized_memory_description coding_system_extra_description_map[] = |
350 { | |
351 { offsetof (Lisp_Coding_System, methods) }, | |
352 { offsetof (struct coding_system_methods, extra_description) }, | |
353 { -1 }, | |
771 | 354 }; |
355 | |
1204 | 356 static const struct memory_description coding_system_description[] = |
428 | 357 { |
2367 | 358 { XD_BLOCK_PTR, offsetof (Lisp_Coding_System, methods), 1, |
2551 | 359 { &coding_system_methods_description } }, |
1204 | 360 #define MARKED_SLOT(x) { XD_LISP_OBJECT, offsetof (Lisp_Coding_System, x) }, |
361 #define MARKED_SLOT_ARRAY(slot, size) \ | |
362 { XD_LISP_OBJECT_ARRAY, offsetof (Lisp_Coding_System, slot), size }, | |
363 #include "coding-system-slots.h" | |
2367 | 364 { XD_BLOCK_ARRAY, offsetof (Lisp_Coding_System, data), 1, |
2551 | 365 { coding_system_extra_description_map } }, |
1204 | 366 { XD_END } |
771 | 367 }; |
368 | |
1204 | 369 static const struct memory_description coding_system_empty_extra_description_1[] = |
370 { | |
371 { XD_END } | |
372 }; | |
373 | |
374 const struct sized_memory_description coding_system_empty_extra_description = { | |
375 0, coding_system_empty_extra_description_1 | |
376 }; | |
377 | |
5118
e0db3c197671
merge up to latest default branch, doesn't compile yet
Ben Wing <ben@xemacs.org>
diff
changeset
|
378 DEFINE_DUMPABLE_SIZABLE_LISP_OBJECT ("coding-system", coding_system, |
5124
623d57b7fbe8
separate regular and disksave finalization, print method fixes.
Ben Wing <ben@xemacs.org>
parents:
5120
diff
changeset
|
379 mark_coding_system, |
623d57b7fbe8
separate regular and disksave finalization, print method fixes.
Ben Wing <ben@xemacs.org>
parents:
5120
diff
changeset
|
380 print_coding_system, |
5169
6c6d78781d59
cleanup of code related to xfree(), better KKCC backtrace capabilities, document XD_INLINE_LISP_OBJECT_BLOCK_PTR, fix some memory leaks, other code cleanup
Ben Wing <ben@xemacs.org>
parents:
5142
diff
changeset
|
381 IF_OLD_GC (finalize_coding_system), |
5124
623d57b7fbe8
separate regular and disksave finalization, print method fixes.
Ben Wing <ben@xemacs.org>
parents:
5120
diff
changeset
|
382 0, 0, coding_system_description, |
623d57b7fbe8
separate regular and disksave finalization, print method fixes.
Ben Wing <ben@xemacs.org>
parents:
5120
diff
changeset
|
383 sizeof_coding_system, |
623d57b7fbe8
separate regular and disksave finalization, print method fixes.
Ben Wing <ben@xemacs.org>
parents:
5120
diff
changeset
|
384 Lisp_Coding_System); |
771 | 385 |
386 /************************************************************************/ | |
387 /* Creating coding systems */ | |
388 /************************************************************************/ | |
389 | |
390 static struct coding_system_methods * | |
391 decode_coding_system_type (Lisp_Object type, Error_Behavior errb) | |
428 | 392 { |
771 | 393 int i; |
394 | |
395 for (i = 0; i < Dynarr_length (the_coding_system_type_entry_dynarr); i++) | |
428 | 396 { |
771 | 397 if (EQ (type, |
398 Dynarr_at (the_coding_system_type_entry_dynarr, i).meths->type)) | |
399 return Dynarr_at (the_coding_system_type_entry_dynarr, i).meths; | |
428 | 400 } |
771 | 401 |
402 maybe_invalid_constant ("Invalid coding system type", type, | |
403 Qcoding_system, errb); | |
404 | |
405 return 0; | |
428 | 406 } |
407 | |
771 | 408 static int |
409 valid_coding_system_type_p (Lisp_Object type) | |
428 | 410 { |
771 | 411 return decode_coding_system_type (type, ERROR_ME_NOT) != 0; |
412 } | |
413 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
414 #ifdef MULE |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
415 static Lisp_Object Vdefault_query_coding_region_chartab_cache; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
416 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
417 /* Non-static because it's used in INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA. */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
418 Lisp_Object |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
419 default_query_method (Lisp_Object codesys, struct buffer *buf, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
420 Charbpos end, int flags) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
421 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
422 Charbpos pos = BUF_PT (buf), fail_range_start, fail_range_end; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
423 Charbpos pos_byte = BYTE_BUF_PT (buf); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
424 Lisp_Object safe_charsets = XCODING_SYSTEM_SAFE_CHARSETS (codesys); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
425 Lisp_Object safe_chars = XCODING_SYSTEM_SAFE_CHARS (codesys), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
426 result = Qnil; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
427 enum query_coding_failure_reasons failed_reason, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
428 previous_failed_reason = query_coding_succeeded; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
429 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
430 /* safe-charsets of t means the coding system can encode everything. */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
431 if (EQ (Qnil, safe_chars)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
432 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
433 if (EQ (Qt, safe_charsets)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
434 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
435 return Qnil; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
436 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
437 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
438 /* If we've no information on what characters the coding system can |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
439 encode, give up. */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
440 if (EQ (Qnil, safe_charsets) && EQ (Qnil, safe_chars)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
441 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
442 return Qunbound; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
443 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
444 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
445 safe_chars = Fgethash (safe_charsets, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
446 Vdefault_query_coding_region_chartab_cache, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
447 Qnil); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
448 if (NILP (safe_chars)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
449 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
450 safe_chars = Fmake_char_table (Qgeneric); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
451 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
452 EXTERNAL_LIST_LOOP_2 (safe_charset, safe_charsets) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
453 Fput_char_table (safe_charset, Qt, safe_chars); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
454 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
455 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
456 Fputhash (safe_charsets, safe_chars, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
457 Vdefault_query_coding_region_chartab_cache); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
458 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
459 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
460 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
461 if (flags & QUERY_METHOD_HIGHLIGHT && |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
462 /* If we're being called really early, live without highlights getting |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
463 cleared properly: */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
464 !(UNBOUNDP (XSYMBOL (Qquery_coding_clear_highlights)->function))) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
465 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
466 /* It's okay to call Lisp here, the only non-stack object we may have |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
467 allocated up to this point is safe_chars, and that's |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
468 reachable from its entry in |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
469 Vdefault_query_coding_region_chartab_cache */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
470 call3 (Qquery_coding_clear_highlights, make_int (pos), make_int (end), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
471 wrap_buffer (buf)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
472 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
473 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
474 while (pos < end) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
475 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
476 Ichar ch = BYTE_BUF_FETCH_CHAR (buf, pos_byte); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
477 if (!EQ (Qnil, get_char_table (ch, safe_chars))) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
478 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
479 pos++; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
480 INC_BYTEBPOS (buf, pos_byte); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
481 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
482 else |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
483 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
484 fail_range_start = pos; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
485 while ((pos < end) && |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
486 (EQ (Qnil, get_char_table (ch, safe_chars)) |
4839
5f1c6ca4d05e
fix bad change to default_query_method, rewrite conditional for more correctness
Ben Wing <ben@xemacs.org>
parents:
4830
diff
changeset
|
487 && (failed_reason = query_coding_unencodable, |
5f1c6ca4d05e
fix bad change to default_query_method, rewrite conditional for more correctness
Ben Wing <ben@xemacs.org>
parents:
4830
diff
changeset
|
488 (previous_failed_reason == query_coding_succeeded |
5f1c6ca4d05e
fix bad change to default_query_method, rewrite conditional for more correctness
Ben Wing <ben@xemacs.org>
parents:
4830
diff
changeset
|
489 || previous_failed_reason == failed_reason)))) |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
490 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
491 pos++; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
492 INC_BYTEBPOS (buf, pos_byte); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
493 ch = BYTE_BUF_FETCH_CHAR (buf, pos_byte); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
494 previous_failed_reason = failed_reason; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
495 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
496 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
497 if (fail_range_start == pos) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
498 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
499 /* The character can actually be encoded; move on. */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
500 pos++; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
501 INC_BYTEBPOS (buf, pos_byte); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
502 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
503 else |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
504 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
505 assert (previous_failed_reason == query_coding_unencodable); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
506 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
507 if (flags & QUERY_METHOD_ERRORP) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
508 { |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
509 signal_error_2 |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
510 (Qtext_conversion_error, |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
511 "Cannot encode using coding system", |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
512 make_string_from_buffer (buf, fail_range_start, |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
513 pos - fail_range_start), |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
514 XCODING_SYSTEM_NAME (codesys)); |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
515 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
516 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
517 if (NILP (result)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
518 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
519 result = Fmake_range_table (Qstart_closed_end_open); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
520 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
521 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
522 fail_range_end = pos; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
523 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
524 Fput_range_table (make_int (fail_range_start), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
525 make_int (fail_range_end), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
526 Qunencodable, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
527 result); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
528 previous_failed_reason = query_coding_succeeded; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
529 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
530 if (flags & QUERY_METHOD_HIGHLIGHT) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
531 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
532 Lisp_Object extent |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
533 = Fmake_extent (make_int (fail_range_start), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
534 make_int (fail_range_end), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
535 wrap_buffer (buf)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
536 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
537 Fset_extent_priority |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
538 (extent, make_int (2 + mouse_highlight_priority)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
539 Fset_extent_face (extent, Qquery_coding_warning_face); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
540 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
541 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
542 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
543 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
544 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
545 return result; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
546 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
547 #else |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
548 Lisp_Object |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
549 default_query_method (Lisp_Object UNUSED (codesys), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
550 struct buffer * UNUSED (buf), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
551 Charbpos UNUSED (end), int UNUSED (flags)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
552 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
553 return Qnil; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
554 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
555 #endif /* defined MULE */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
556 |
771 | 557 DEFUN ("valid-coding-system-type-p", Fvalid_coding_system_type_p, 1, 1, 0, /* |
558 Given a CODING-SYSTEM-TYPE, return non-nil if it is valid. | |
559 Valid types depend on how XEmacs was compiled but may include | |
3025 | 560 `undecided', `chain', `integer', `ccl', `iso2022', `big5', `shift-jis', |
561 `utf-16', `ucs-4', `utf-8', etc. | |
771 | 562 */ |
563 (coding_system_type)) | |
564 { | |
565 return valid_coding_system_type_p (coding_system_type) ? Qt : Qnil; | |
566 } | |
567 | |
568 DEFUN ("coding-system-type-list", Fcoding_system_type_list, 0, 0, 0, /* | |
569 Return a list of valid coding system types. | |
570 */ | |
571 ()) | |
572 { | |
573 return Fcopy_sequence (Vcoding_system_type_list); | |
574 } | |
575 | |
576 void | |
577 add_entry_to_coding_system_type_list (struct coding_system_methods *meths) | |
578 { | |
579 struct coding_system_type_entry entry; | |
580 | |
581 entry.meths = meths; | |
582 Dynarr_add (the_coding_system_type_entry_dynarr, entry); | |
583 Vcoding_system_type_list = Fcons (meths->type, Vcoding_system_type_list); | |
428 | 584 } |
585 | |
586 DEFUN ("coding-system-p", Fcoding_system_p, 1, 1, 0, /* | |
587 Return t if OBJECT is a coding system. | |
588 A coding system is an object that defines how text containing multiple | |
589 character sets is encoded into a stream of (typically 8-bit) bytes. | |
590 The coding system is used to decode the stream into a series of | |
591 characters (which may be from multiple charsets) when the text is read | |
592 from a file or process, and is used to encode the text back into the | |
593 same format when it is written out to a file or process. | |
594 | |
595 For example, many ISO2022-compliant coding systems (such as Compound | |
596 Text, which is used for inter-client data under the X Window System) | |
597 use escape sequences to switch between different charsets -- Japanese | |
598 Kanji, for example, is invoked with "ESC $ ( B"; ASCII is invoked | |
599 with "ESC ( B"; and Cyrillic is invoked with "ESC - L". See | |
600 `make-coding-system' for more information. | |
601 | |
602 Coding systems are normally identified using a symbol, and the | |
603 symbol is accepted in place of the actual coding system object whenever | |
604 a coding system is called for. (This is similar to how faces work.) | |
605 */ | |
606 (object)) | |
607 { | |
608 return CODING_SYSTEMP (object) ? Qt : Qnil; | |
609 } | |
610 | |
4303 | 611 static Lisp_Object |
612 find_coding_system (Lisp_Object coding_system_or_name, | |
613 int do_autoloads) | |
614 { | |
615 Lisp_Object lookup; | |
616 | |
617 if (NILP (coding_system_or_name)) | |
618 coding_system_or_name = Qbinary; | |
619 else if (CODING_SYSTEMP (coding_system_or_name)) | |
620 return coding_system_or_name; | |
621 else | |
622 CHECK_SYMBOL (coding_system_or_name); | |
623 | |
624 while (1) | |
625 { | |
626 lookup = | |
627 Fgethash (coding_system_or_name, Vcoding_system_hash_table, Qnil); | |
628 | |
629 if (CONSP (lookup) && do_autoloads) | |
630 { | |
631 struct gcpro gcpro1; | |
632 int length; | |
633 DECLARE_EISTRING (desired_base); | |
634 DECLARE_EISTRING (warning_info); | |
635 | |
636 eicpy_lstr (desired_base, XSYMBOL_NAME (coding_system_or_name)); | |
637 | |
638 /* Work out the name of the base coding system. */ | |
639 length = eilen (desired_base); | |
640 if (length > (int)(sizeof ("-unix") - 1)) | |
641 { | |
642 if (0 == qxestrcmp ((UAscbyte *)"-unix", (eidata (desired_base)) | |
643 + (length - (sizeof ("-unix") - 1)))) | |
644 { | |
645 eidel (desired_base, length - (sizeof ("-unix") - 1), | |
646 -1, 5, 5); | |
647 } | |
648 } | |
649 else if (length > (int)(sizeof ("-dos") - 1)) | |
650 { | |
651 if ((0 == qxestrcmp ((UAscbyte *)"-dos", (eidata (desired_base)) | |
652 + (length - (sizeof ("-dos") - 1)))) || | |
653 (0 == qxestrcmp ((UAscbyte *)"-mac", (eidata (desired_base)) | |
654 + (length - (sizeof ("-mac") - 1))))) | |
655 { | |
656 eidel (desired_base, length - (sizeof ("-dos") - 1), -1, | |
657 4, 4); | |
658 } | |
659 } | |
660 | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
661 coding_system_or_name = intern_istring (eidata (desired_base)); |
4303 | 662 |
663 /* Remove this coding system and its subsidiary coding | |
664 systems from the hash, to avoid calling this code recursively. */ | |
665 Fremhash (coding_system_or_name, Vcoding_system_hash_table); | |
666 Fremhash (add_suffix_to_symbol(coding_system_or_name, "-unix"), | |
667 Vcoding_system_hash_table); | |
668 Fremhash (add_suffix_to_symbol(coding_system_or_name, "-dos"), | |
669 Vcoding_system_hash_table); | |
670 Fremhash (add_suffix_to_symbol(coding_system_or_name, "-mac"), | |
671 Vcoding_system_hash_table); | |
672 | |
673 eicpy_ascii (warning_info, "Error autoloading coding system "); | |
674 eicat_lstr (warning_info, XSYMBOL_NAME (coding_system_or_name)); | |
675 | |
676 /* Keep around the form so it doesn't disappear from under | |
677 #'eval's feet. */ | |
678 GCPRO1 (lookup); | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
679 call1_trapping_problems ((const CIbyte *) eidata (warning_info), |
4303 | 680 Qeval, lookup, 0); |
681 UNGCPRO; | |
682 | |
683 lookup = | |
684 Fgethash (coding_system_or_name, Vcoding_system_hash_table, Qnil); | |
685 } | |
686 | |
687 if (CODING_SYSTEMP (lookup) || NILP (lookup)) | |
688 return lookup; | |
689 | |
690 coding_system_or_name = lookup; | |
691 } | |
692 } | |
693 | |
428 | 694 DEFUN ("find-coding-system", Ffind_coding_system, 1, 1, 0, /* |
695 Retrieve the coding system of the given name. | |
696 | |
697 If CODING-SYSTEM-OR-NAME is a coding-system object, it is simply | |
698 returned. Otherwise, CODING-SYSTEM-OR-NAME should be a symbol. | |
699 If there is no such coding system, nil is returned. Otherwise the | |
700 associated coding system object is returned. | |
701 */ | |
702 (coding_system_or_name)) | |
703 { | |
4303 | 704 return find_coding_system(coding_system_or_name, 1); |
705 } | |
706 | |
707 DEFUN ("autoload-coding-system", Fautoload_coding_system, 2, 2, 0, /* | |
708 Define SYMBOL as a coding-system that is loaded on demand. | |
709 | |
710 FORM is a form to evaluate to define the coding-system. | |
711 */ | |
712 (symbol, form)) | |
713 { | |
714 Lisp_Object lookup; | |
715 | |
716 CHECK_SYMBOL (symbol); | |
717 CHECK_CONS (form); | |
718 | |
719 lookup = find_coding_system (symbol, 0); | |
720 | |
721 if (!NILP (lookup) && | |
722 /* Allow autoloads to be redefined. */ | |
723 !CONSP (lookup)) | |
440 | 724 { |
4303 | 725 invalid_operation ("Cannot redefine existing coding system", |
726 symbol); | |
440 | 727 } |
4303 | 728 |
729 Fputhash (symbol, form, Vcoding_system_hash_table); | |
730 Fputhash (add_suffix_to_symbol(symbol, "-unix"), form, | |
731 Vcoding_system_hash_table); | |
732 Fputhash (add_suffix_to_symbol(symbol, "-dos"), form, | |
733 Vcoding_system_hash_table); | |
734 Fputhash (add_suffix_to_symbol(symbol, "-mac"), form, | |
735 Vcoding_system_hash_table); | |
736 | |
737 /* Tell the POSIX locale infrastructure about this coding system (though | |
738 unfortunately it'll be too late for the startup locale sniffing. */ | |
739 if (!UNBOUNDP (Qposix_charset_to_coding_system_hash)) | |
740 { | |
741 Lisp_Object val = Fsymbol_value (Qposix_charset_to_coding_system_hash); | |
742 DECLARE_EISTRING (minimal_name); | |
743 Ibyte *full_name; | |
744 int len = XSTRING_LENGTH (XSYMBOL_NAME (symbol)), i; | |
745 | |
746 if (!NILP (val)) | |
747 { | |
748 full_name = XSTRING_DATA (XSYMBOL_NAME (symbol)); | |
749 for (i = 0; i < len; ++i) | |
750 { | |
751 if (full_name[i] >= '0' && full_name[i] <= '9') | |
752 { | |
753 eicat_ch (minimal_name, full_name[i]); | |
754 } | |
755 else if (full_name[i] >= 'a' && full_name[i] <= 'z') | |
756 { | |
757 eicat_ch (minimal_name, full_name[i]); | |
758 } | |
759 else if (full_name[i] >= 'A' && full_name[i] <= 'Z') | |
760 { | |
761 eicat_ch (minimal_name, full_name[i] + | |
762 ('a' - 'A')); | |
763 } | |
764 } | |
765 | |
766 if (eilen (minimal_name)) | |
767 { | |
768 CHECK_HASH_TABLE (val); | |
769 Fputhash (eimake_string(minimal_name), symbol, val); | |
770 } | |
771 } | |
772 } | |
773 | |
774 return Qt; | |
428 | 775 } |
776 | |
777 DEFUN ("get-coding-system", Fget_coding_system, 1, 1, 0, /* | |
778 Retrieve the coding system of the given name. | |
779 Same as `find-coding-system' except that if there is no such | |
780 coding system, an error is signaled instead of returning nil. | |
781 */ | |
782 (name)) | |
783 { | |
784 Lisp_Object coding_system = Ffind_coding_system (name); | |
785 | |
786 if (NILP (coding_system)) | |
563 | 787 invalid_argument ("No such coding system", name); |
428 | 788 return coding_system; |
789 } | |
790 | |
771 | 791 int |
792 coding_system_is_binary (Lisp_Object coding_system) | |
793 { | |
794 Lisp_Coding_System *cs = XCODING_SYSTEM (coding_system); | |
795 return | |
796 (EQ (CODING_SYSTEM_TYPE (cs), Qno_conversion) && | |
797 CODING_SYSTEM_EOL_TYPE (cs) == EOL_LF && | |
798 EQ (CODING_SYSTEM_POST_READ_CONVERSION (cs), Qnil) && | |
799 EQ (CODING_SYSTEM_PRE_WRITE_CONVERSION (cs), Qnil)); | |
800 } | |
801 | |
802 static Lisp_Object | |
803 coding_system_real_canonical (Lisp_Object cs) | |
804 { | |
805 if (!NILP (XCODING_SYSTEM_CANONICAL (cs))) | |
806 return XCODING_SYSTEM_CANONICAL (cs); | |
807 return cs; | |
808 } | |
809 | |
810 /* Return true if coding system is of the "standard" type that decodes | |
811 bytes into characters (suitable for decoding a text file). */ | |
812 int | |
813 coding_system_is_for_text_file (Lisp_Object coding_system) | |
814 { | |
815 return (XCODESYSMETH_OR_GIVEN | |
816 (coding_system, conversion_end_type, | |
817 (coding_system_real_canonical (coding_system)), | |
818 DECODES_BYTE_TO_CHARACTER) == | |
819 DECODES_BYTE_TO_CHARACTER); | |
820 } | |
821 | |
822 static int | |
823 decoding_source_sink_type_is_char (Lisp_Object cs, enum source_or_sink sex) | |
824 { | |
825 enum source_sink_type type = | |
826 XCODESYSMETH_OR_GIVEN (cs, conversion_end_type, | |
827 (coding_system_real_canonical (cs)), | |
828 DECODES_BYTE_TO_CHARACTER); | |
829 if (sex == CODING_SOURCE) | |
830 return (type == DECODES_CHARACTER_TO_CHARACTER || | |
831 type == DECODES_CHARACTER_TO_BYTE); | |
832 else | |
833 return (type == DECODES_CHARACTER_TO_CHARACTER || | |
834 type == DECODES_BYTE_TO_CHARACTER); | |
835 } | |
836 | |
837 static int | |
838 encoding_source_sink_type_is_char (Lisp_Object cs, enum source_or_sink sex) | |
839 { | |
840 return decoding_source_sink_type_is_char (cs, | |
841 /* Sex change */ | |
842 sex == CODING_SOURCE ? | |
843 CODING_SINK : CODING_SOURCE); | |
844 } | |
845 | |
846 /* Like Ffind_coding_system() but check that the coding system is of the | |
847 "standard" type that decodes bytes into characters (suitable for | |
848 decoding a text file), and if not, returns an appropriate wrapper that | |
849 does. Also, if EOL_WRAP is non-zero, check whether this coding system | |
850 wants EOL auto-detection, and if so, wrap with a convert-eol coding | |
851 system to do this. */ | |
852 | |
853 Lisp_Object | |
854 find_coding_system_for_text_file (Lisp_Object name, int eol_wrap) | |
855 { | |
856 Lisp_Object coding_system = Ffind_coding_system (name); | |
857 Lisp_Object wrapper = coding_system; | |
858 | |
859 if (NILP (coding_system)) | |
860 return Qnil; | |
861 if (!coding_system_is_for_text_file (coding_system)) | |
862 { | |
863 wrapper = XCODING_SYSTEM_TEXT_FILE_WRAPPER (coding_system); | |
864 if (NILP (wrapper)) | |
865 { | |
866 Lisp_Object chain; | |
867 if (!decoding_source_sink_type_is_char (coding_system, CODING_SINK)) | |
868 chain = list2 (coding_system, Qbinary); | |
869 else | |
870 chain = list1 (coding_system); | |
871 if (decoding_source_sink_type_is_char (coding_system, CODING_SOURCE)) | |
872 chain = Fcons (Qbinary, chain); | |
873 wrapper = | |
874 make_internal_coding_system | |
875 (coding_system, | |
876 "internal-text-file-wrapper", | |
877 Qchain, | |
878 Qunbound, list4 (Qchain, chain, | |
879 Qcanonicalize_after_coding, coding_system)); | |
880 XCODING_SYSTEM_TEXT_FILE_WRAPPER (coding_system) = wrapper; | |
881 } | |
882 } | |
883 | |
884 if (!eol_wrap || XCODING_SYSTEM_EOL_TYPE (coding_system) != EOL_AUTODETECT) | |
885 return wrapper; | |
886 | |
887 coding_system = wrapper; | |
888 wrapper = XCODING_SYSTEM_AUTO_EOL_WRAPPER (coding_system); | |
889 if (!NILP (wrapper)) | |
890 return wrapper; | |
891 wrapper = | |
892 make_internal_coding_system | |
893 (coding_system, | |
894 "internal-auto-eol-wrapper", | |
895 Qundecided, Qunbound, | |
896 list4 (Qcoding_system, coding_system, | |
897 Qdo_eol, Qt)); | |
898 XCODING_SYSTEM_AUTO_EOL_WRAPPER (coding_system) = wrapper; | |
899 return wrapper; | |
900 } | |
901 | |
902 /* Like Fget_coding_system() but verify that the coding system is of the | |
903 "standard" type that decodes bytes into characters (suitable for | |
904 decoding a text file), and if not, returns an appropriate wrapper that | |
905 does. Also, if EOL_WRAP is non-zero, check whether this coding system | |
906 wants EOL auto-detection, and if so, wrap with a convert-eol coding | |
907 system to do this. */ | |
908 | |
909 Lisp_Object | |
910 get_coding_system_for_text_file (Lisp_Object name, int eol_wrap) | |
911 { | |
912 Lisp_Object coding_system = find_coding_system_for_text_file (name, | |
913 eol_wrap); | |
914 if (NILP (coding_system)) | |
915 invalid_argument ("No such coding system", name); | |
916 return coding_system; | |
917 } | |
918 | |
919 /* We store the coding systems in hash tables with the names as the | |
920 key and the actual coding system object as the value. Occasionally | |
921 we need to use them in a list format. These routines provide us | |
922 with that. */ | |
428 | 923 struct coding_system_list_closure |
924 { | |
925 Lisp_Object *coding_system_list; | |
771 | 926 int normal; |
927 int internal; | |
428 | 928 }; |
929 | |
930 static int | |
4303 | 931 add_coding_system_to_list_mapper (Lisp_Object key, Lisp_Object value, |
428 | 932 void *coding_system_list_closure) |
933 { | |
934 /* This function can GC */ | |
935 struct coding_system_list_closure *cscl = | |
936 (struct coding_system_list_closure *) coding_system_list_closure; | |
937 Lisp_Object *coding_system_list = cscl->coding_system_list; | |
938 | |
771 | 939 /* We can't just use VALUE because KEY might be an alias, and we need |
4303 | 940 the real coding system object. |
941 | |
942 Autoloaded coding systems have conses for their values, and can't be | |
943 internal coding systems, or coding system aliases. */ | |
944 if (CONSP (value) || | |
945 (XCODING_SYSTEM (Ffind_coding_system (key))->internal_p ? | |
946 cscl->internal : cscl->normal)) | |
771 | 947 *coding_system_list = Fcons (key, *coding_system_list); |
428 | 948 return 0; |
949 } | |
950 | |
2297 | 951 /* #### should we specify a conventional for "all coding systems"? */ |
771 | 952 DEFUN ("coding-system-list", Fcoding_system_list, 0, 1, 0, /* |
428 | 953 Return a list of the names of all defined coding systems. |
771 | 954 If INTERNAL is nil, only the normal (non-internal) coding systems are |
955 included. (Internal coding systems are created for various internal | |
956 purposes, such as implementing EOL types of CRLF and CR; generally, you do | |
957 not want to see these.) If it is t, only the internal coding systems are | |
958 included. If it is any other non-nil value both normal and internal are | |
959 included. | |
428 | 960 */ |
771 | 961 (internal)) |
428 | 962 { |
963 Lisp_Object coding_system_list = Qnil; | |
964 struct gcpro gcpro1; | |
965 struct coding_system_list_closure coding_system_list_closure; | |
966 | |
967 GCPRO1 (coding_system_list); | |
968 coding_system_list_closure.coding_system_list = &coding_system_list; | |
771 | 969 coding_system_list_closure.normal = !EQ (internal, Qt); |
970 coding_system_list_closure.internal = !NILP (internal); | |
428 | 971 elisp_maphash (add_coding_system_to_list_mapper, Vcoding_system_hash_table, |
972 &coding_system_list_closure); | |
973 UNGCPRO; | |
974 | |
975 return coding_system_list; | |
976 } | |
977 | |
978 DEFUN ("coding-system-name", Fcoding_system_name, 1, 1, 0, /* | |
979 Return the name of the given coding system. | |
980 */ | |
981 (coding_system)) | |
982 { | |
983 coding_system = Fget_coding_system (coding_system); | |
984 return XCODING_SYSTEM_NAME (coding_system); | |
985 } | |
986 | |
987 static Lisp_Coding_System * | |
771 | 988 allocate_coding_system (struct coding_system_methods *codesys_meths, |
989 Bytecount data_size, | |
990 Lisp_Object name) | |
428 | 991 { |
771 | 992 Bytecount total_size = offsetof (Lisp_Coding_System, data) + data_size; |
5117
3742ea8250b5
Checking in final CVS version of workspace 'ben-lisp-object'
Ben Wing <ben@xemacs.org>
parents:
3025
diff
changeset
|
993 Lisp_Object obj = ALLOC_SIZED_LISP_OBJECT (total_size, coding_system); |
3742ea8250b5
Checking in final CVS version of workspace 'ben-lisp-object'
Ben Wing <ben@xemacs.org>
parents:
3025
diff
changeset
|
994 Lisp_Coding_System *codesys = XCODING_SYSTEM (obj); |
1204 | 995 |
771 | 996 codesys->methods = codesys_meths; |
1204 | 997 #define MARKED_SLOT(x) codesys->x = Qnil; |
998 #include "coding-system-slots.h" | |
999 | |
771 | 1000 CODING_SYSTEM_EOL_TYPE (codesys) = EOL_LF; |
1001 CODING_SYSTEM_NAME (codesys) = name; | |
1002 | |
1003 MAYBE_CODESYSMETH (codesys, init, (wrap_coding_system (codesys))); | |
428 | 1004 |
1005 return codesys; | |
1006 } | |
1007 | |
771 | 1008 static enum eol_type |
1009 symbol_to_eol_type (Lisp_Object symbol) | |
1010 { | |
1011 CHECK_SYMBOL (symbol); | |
1012 if (NILP (symbol)) return EOL_AUTODETECT; | |
1013 if (EQ (symbol, Qlf)) return EOL_LF; | |
1014 if (EQ (symbol, Qcrlf)) return EOL_CRLF; | |
1015 if (EQ (symbol, Qcr)) return EOL_CR; | |
1016 | |
1017 invalid_constant ("Unrecognized eol type", symbol); | |
1204 | 1018 RETURN_NOT_REACHED (EOL_AUTODETECT); |
771 | 1019 } |
1020 | |
1021 static Lisp_Object | |
1022 eol_type_to_symbol (enum eol_type type) | |
1023 { | |
1024 switch (type) | |
1025 { | |
2500 | 1026 default: ABORT (); |
771 | 1027 case EOL_LF: return Qlf; |
1028 case EOL_CRLF: return Qcrlf; | |
1029 case EOL_CR: return Qcr; | |
1030 case EOL_AUTODETECT: return Qnil; | |
1031 } | |
1032 } | |
1033 | |
1034 struct subsidiary_type | |
1035 { | |
4528
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1036 const Ascbyte *extension; |
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1037 const Ascbyte *mnemonic_ext; |
771 | 1038 enum eol_type eol; |
1039 }; | |
1040 | |
1041 static struct subsidiary_type coding_subsidiary_list[] = | |
1042 { { "-unix", "", EOL_LF }, | |
1043 { "-dos", ":T", EOL_CRLF }, | |
1044 { "-mac", ":t", EOL_CR } }; | |
1045 | |
1046 /* kludge */ | |
428 | 1047 static void |
771 | 1048 setup_eol_coding_systems (Lisp_Object codesys) |
428 | 1049 { |
793 | 1050 int len = XSTRING_LENGTH (XSYMBOL (XCODING_SYSTEM_NAME (codesys))->name); |
2367 | 1051 Ibyte *codesys_name = alloca_ibytes (len + 7); |
771 | 1052 int mlen = -1; |
867 | 1053 Ibyte *codesys_mnemonic = 0; |
771 | 1054 Lisp_Object codesys_name_sym, sub_codesys; |
1055 int i; | |
1056 | |
1057 memcpy (codesys_name, | |
793 | 1058 XSTRING_DATA (XSYMBOL (XCODING_SYSTEM_NAME (codesys))->name), len); |
771 | 1059 |
1060 if (STRINGP (XCODING_SYSTEM_MNEMONIC (codesys))) | |
428 | 1061 { |
771 | 1062 mlen = XSTRING_LENGTH (XCODING_SYSTEM_MNEMONIC (codesys)); |
2367 | 1063 codesys_mnemonic = alloca_ibytes (mlen + 7); |
771 | 1064 memcpy (codesys_mnemonic, |
1065 XSTRING_DATA (XCODING_SYSTEM_MNEMONIC (codesys)), mlen); | |
1066 } | |
1067 | |
1068 /* Create three "subsidiary" coding systems, decoding data encoded using | |
1069 each of the three EOL types. We do this for each subsidiary by | |
1070 copying the original coding system, setting the EOL type | |
1071 appropriately, and setting the CANONICAL member of the new coding | |
1072 system to be a chain consisting of the original coding system followed | |
1073 by a convert-eol coding system to do the EOL decoding. For EOL type | |
1074 LF, however, we don't need any decoding, so we skip creating a | |
1075 CANONICAL. | |
1076 | |
1077 If the original coding system is not a text-type coding system | |
1078 (decodes byte->char), we need to coerce it to one by the appropriate | |
1079 wrapping in CANONICAL. */ | |
1080 | |
1081 for (i = 0; i < countof (coding_subsidiary_list); i++) | |
1082 { | |
4528
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1083 const Ascbyte *extension = coding_subsidiary_list[i].extension; |
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1084 const Ascbyte *mnemonic_ext = coding_subsidiary_list[i].mnemonic_ext; |
771 | 1085 enum eol_type eol = coding_subsidiary_list[i].eol; |
1086 | |
2367 | 1087 qxestrcpy_ascii (codesys_name + len, extension); |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1088 codesys_name_sym = intern_istring (codesys_name); |
771 | 1089 if (mlen != -1) |
2367 | 1090 qxestrcpy_ascii (codesys_mnemonic + mlen, mnemonic_ext); |
771 | 1091 |
1092 sub_codesys = Fcopy_coding_system (codesys, codesys_name_sym); | |
1093 if (mlen != -1) | |
1094 XCODING_SYSTEM_MNEMONIC (sub_codesys) = | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1095 build_istring (codesys_mnemonic); |
771 | 1096 |
1097 if (eol != EOL_LF) | |
1098 { | |
1099 Lisp_Object chain = list2 (get_coding_system_for_text_file | |
1100 (codesys, 0), | |
1101 eol == EOL_CR ? Qconvert_eol_cr : | |
1102 Qconvert_eol_crlf); | |
1103 Lisp_Object canon = | |
1104 make_internal_coding_system | |
1105 (sub_codesys, "internal-subsidiary-eol-wrapper", | |
1106 Qchain, Qunbound, | |
1107 mlen != -1 ? | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1108 list6 (Qmnemonic, build_istring (codesys_mnemonic), |
771 | 1109 Qchain, chain, |
1110 Qcanonicalize_after_coding, sub_codesys) : | |
1111 list4 (Qchain, chain, | |
1112 Qcanonicalize_after_coding, sub_codesys)); | |
1113 XCODING_SYSTEM_CANONICAL (sub_codesys) = canon; | |
1114 } | |
1115 XCODING_SYSTEM_EOL_TYPE (sub_codesys) = eol; | |
1116 XCODING_SYSTEM_SUBSIDIARY_PARENT (sub_codesys) = codesys; | |
1117 XCODING_SYSTEM (codesys)->eol[eol] = sub_codesys; | |
428 | 1118 } |
1119 } | |
1120 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1121 DEFUN ("coding-system-canonical-name-p", Fcoding_system_canonical_name_p, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1122 1, 1, 0, /* |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1123 Return t if OBJECT names a coding system, and is not a coding system alias. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1124 */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1125 (object)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1126 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1127 return CODING_SYSTEMP (Fgethash (object, Vcoding_system_hash_table, Qnil)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1128 ? Qt : Qnil; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1129 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1130 |
771 | 1131 /* Basic function to create new coding systems. For `make-coding-system', |
1132 NAME-OR-EXISTING is the NAME argument, PREFIX is null, and TYPE, | |
1133 DESCRIPTION, and PROPS are the same. All created coding systems are put | |
1134 in a hash table indexed by NAME. | |
1135 | |
1136 If PREFIX is a string, NAME-OR-EXISTING should specify an existing | |
1137 coding system (or nil), and an internal coding system will be created. | |
1138 The name of the coding system will be constructed by combining PREFIX | |
1139 with the name of the existing coding system (if given), and a number | |
1140 will be appended to insure uniqueness. In such a case, if Qunbound is | |
1141 given for DESCRIPTION, the description gets created based on the | |
1142 generated name. Also, if no mnemonic is given in the properties list, a | |
1143 mnemonic is created based on the generated name. | |
1144 | |
1145 For internal coding systems, the coding system is marked as internal | |
1146 (see `coding-system-list'), and no subsidiaries will be created or | |
1147 eol-wrapping will happen. Otherwise: | |
1148 | |
1149 -- if the eol-type property is `lf' or t, the coding system is merely | |
1150 created and returned. (For t, the coding system will be wrapped with | |
1151 an EOL autodetector when it's used to read a file.) | |
1152 | |
1153 -- if eol-type is `crlf' or `cr', after the coding system object is | |
1154 created, it will be wrapped in a chain with the appropriate | |
1155 convert-eol coding system (either `convert-eol-crlf' or | |
1156 `convert-eol-cr'), so that CRLF->LF or CR->LF conversion is done at | |
1157 decoding time, and the opposite at encoding time. The resulting | |
1158 chain becomes the CANONICAL field of the coding system object. | |
1159 | |
1160 -- if eol-type is nil or omitted, "subsidiaries" are generated: Three | |
1161 coding systems where the original coding system (before wrapping with | |
1162 convert-eol-autodetect) is either unwrapped or wrapped with | |
1163 convert-eol-crlf or convert-eol-cr, respectively, so that coding systems | |
1164 to handle LF, CRLF, and CR end-of-line indicators are created. (This | |
1165 crazy crap is based on existing behavior in other Mule versions, | |
1166 including FSF Emacs.) | |
1167 */ | |
428 | 1168 |
1169 static Lisp_Object | |
4528
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1170 make_coding_system_1 (Lisp_Object name_or_existing, const Ascbyte *prefix, |
771 | 1171 Lisp_Object type, Lisp_Object description, |
1172 Lisp_Object props) | |
428 | 1173 { |
771 | 1174 Lisp_Coding_System *cs; |
1175 int need_to_setup_eol_systems = 1; | |
1176 enum eol_type eol_wrapper = EOL_AUTODETECT; | |
1177 struct coding_system_methods *meths; | |
1178 Lisp_Object csobj; | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1179 Lisp_Object defmnem = Qnil, aliases = Qnil; |
771 | 1180 |
1181 if (NILP (type)) | |
1182 type = Qundecided; | |
1183 meths = decode_coding_system_type (type, ERROR_ME); | |
1184 | |
1185 if (prefix) | |
428 | 1186 { |
867 | 1187 Ibyte *newname = |
771 | 1188 emacs_sprintf_malloc (NULL, "%s-%s-%d", |
1189 prefix, | |
867 | 1190 NILP (name_or_existing) ? (Ibyte *) "nil" : |
771 | 1191 XSTRING_DATA (Fsymbol_name (XCODING_SYSTEM_NAME |
1192 (name_or_existing))), | |
1193 ++coding_system_tick); | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1194 name_or_existing = intern_istring (newname); |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
1195 xfree (newname); |
771 | 1196 |
1197 if (UNBOUNDP (description)) | |
1198 { | |
1199 newname = | |
1200 emacs_sprintf_malloc | |
1201 (NULL, "For Internal Use (%s)", | |
1202 XSTRING_DATA (Fsymbol_name (name_or_existing))); | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1203 description = build_istring (newname); |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
1204 xfree (newname); |
771 | 1205 } |
1206 | |
1207 newname = emacs_sprintf_malloc (NULL, "Int%d", coding_system_tick); | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1208 defmnem = build_istring (newname); |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
1209 xfree (newname); |
428 | 1210 } |
771 | 1211 else |
1212 CHECK_SYMBOL (name_or_existing); | |
1213 | |
4303 | 1214 /* See is there an entry for name_or_existing in the defined coding system |
1215 hash table. */ | |
1216 csobj = find_coding_system (name_or_existing, 0); | |
1217 /* Error if it's there and not an autoload form. */ | |
1218 if (!NILP (csobj) && !CONSP (csobj)) | |
771 | 1219 invalid_operation ("Cannot redefine existing coding system", |
4303 | 1220 name_or_existing); |
771 | 1221 |
1222 cs = allocate_coding_system (meths, meths->extra_data_size, | |
1223 name_or_existing); | |
793 | 1224 csobj = wrap_coding_system (cs); |
771 | 1225 |
1226 cs->internal_p = !!prefix; | |
1227 | |
1228 if (NILP (description)) | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
1229 description = build_ascstring (""); |
771 | 1230 else |
1231 CHECK_STRING (description); | |
1232 CODING_SYSTEM_DESCRIPTION (cs) = description; | |
1233 | |
1234 if (!NILP (defmnem)) | |
1235 CODING_SYSTEM_MNEMONIC (cs) = defmnem; | |
1236 | |
1237 { | |
1238 EXTERNAL_PROPERTY_LIST_LOOP_3 (key, value, props) | |
1239 { | |
1240 int recognized = 1; | |
1241 | |
1242 if (EQ (key, Qmnemonic)) | |
1243 { | |
1244 if (!NILP (value)) | |
1245 CHECK_STRING (value); | |
1246 CODING_SYSTEM_MNEMONIC (cs) = value; | |
1247 } | |
1248 | |
1249 else if (EQ (key, Qdocumentation)) | |
1250 { | |
1251 if (!NILP (value)) | |
1252 CHECK_STRING (value); | |
1253 CODING_SYSTEM_DOCUMENTATION (cs) = value; | |
1254 } | |
1255 | |
1256 else if (EQ (key, Qeol_type)) | |
1257 { | |
1258 need_to_setup_eol_systems = NILP (value); | |
1259 if (EQ (value, Qt)) | |
1260 value = Qnil; | |
1261 eol_wrapper = symbol_to_eol_type (value); | |
1262 } | |
1263 | |
1264 else if (EQ (key, Qpost_read_conversion)) | |
1265 CODING_SYSTEM_POST_READ_CONVERSION (cs) = value; | |
1266 else if (EQ (key, Qpre_write_conversion)) | |
1267 CODING_SYSTEM_PRE_WRITE_CONVERSION (cs) = value; | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1268 else if (EQ (key, Qaliases)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1269 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1270 EXTERNAL_LIST_LOOP_2 (alias, value) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1271 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1272 CHECK_SYMBOL (alias); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1273 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1274 if (!NILP (Fcoding_system_canonical_name_p (alias))) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1275 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1276 invalid_change ("Symbol is the canonical name of a " |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1277 "coding system and cannot be redefined", |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1278 alias); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1279 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1280 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1281 aliases = value; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1282 } |
771 | 1283 /* FSF compatibility */ |
1284 else if (EQ (key, Qtranslation_table_for_decode)) | |
1285 ; | |
1286 else if (EQ (key, Qtranslation_table_for_encode)) | |
1287 ; | |
1288 else if (EQ (key, Qsafe_chars)) | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1289 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1290 CHECK_CHAR_TABLE (value); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1291 CODING_SYSTEM_SAFE_CHARS (cs) = value; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1292 } |
771 | 1293 else if (EQ (key, Qsafe_charsets)) |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1294 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1295 if (!EQ (Qt, value) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1296 /* Would be nice to actually do this check, but there are |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1297 some order conflicts with japanese.el and |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1298 mule-coding.el */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1299 && 0) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1300 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1301 #ifdef MULE |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1302 EXTERNAL_LIST_LOOP_2 (safe_charset, value) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1303 CHECK_CHARSET (Ffind_charset (safe_charset)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1304 #endif |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1305 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1306 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1307 CODING_SYSTEM_SAFE_CHARSETS (cs) = value; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1308 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1309 else if (EQ (key, Qcategory)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1310 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1311 Fput (name_or_existing, intern ("coding-system-property"), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1312 Fplist_put (Fget (name_or_existing, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1313 intern ("coding-system-property"), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1314 Qnil), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1315 Qcategory, value)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1316 } |
771 | 1317 else if (EQ (key, Qmime_charset)) |
1318 ; | |
1319 else if (EQ (key, Qvalid_codes)) | |
1320 ; | |
1321 else | |
1322 recognized = CODESYSMETH_OR_GIVEN (cs, putprop, | |
1323 (csobj, key, value), 0); | |
1324 | |
1325 if (!recognized) | |
1326 invalid_constant ("Unrecognized property", key); | |
1327 } | |
1328 } | |
1329 | |
1330 { | |
1331 XCODING_SYSTEM_CANONICAL (csobj) = | |
1332 CODESYSMETH_OR_GIVEN (cs, canonicalize, (csobj), Qnil); | |
1333 XCODING_SYSTEM_EOL_TYPE (csobj) = EOL_AUTODETECT; /* for copy-coding-system | |
1334 below */ | |
1335 | |
4303 | 1336 Fputhash (name_or_existing, csobj, Vcoding_system_hash_table); |
1337 | |
771 | 1338 if (need_to_setup_eol_systems && !cs->internal_p) |
1339 setup_eol_coding_systems (csobj); | |
1340 else if (eol_wrapper == EOL_CR || eol_wrapper == EOL_CRLF) | |
1341 { | |
1342 /* If a specific eol-type (other than LF) was specified, we handle | |
1343 this by converting the coding system into a chain that wraps the | |
1344 coding system along with a convert-eol system after it, in | |
1345 exactly that same switcheroo fashion that the normal | |
1346 canonicalize method works -- BUT we will run into a problem if | |
1347 we do it the obvious way, because when `chain' creates its | |
1348 substreams, the substream containing the coding system we're | |
1349 creating will have canonicalization expansion done on it, | |
1350 leading to infinite recursion. So we have to generate a new, | |
1351 internal coding system with the previous value of CANONICAL. */ | |
867 | 1352 Ibyte *newname = |
771 | 1353 emacs_sprintf_malloc |
1354 (NULL, "internal-eol-copy-%s-%d", | |
1355 XSTRING_DATA (Fsymbol_name (name_or_existing)), | |
1356 ++coding_system_tick); | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1357 Lisp_Object newnamesym = intern_istring (newname); |
771 | 1358 Lisp_Object copied = Fcopy_coding_system (csobj, newnamesym); |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
1359 xfree (newname); |
771 | 1360 |
1361 XCODING_SYSTEM_CANONICAL (csobj) = | |
1362 make_internal_coding_system | |
1363 (csobj, | |
1364 "internal-eol-wrapper", | |
1365 Qchain, Qunbound, | |
1366 list4 (Qchain, | |
1367 list2 (copied, | |
1368 eol_wrapper == EOL_CR ? | |
1369 Qconvert_eol_cr : | |
1370 Qconvert_eol_crlf), | |
1371 Qcanonicalize_after_coding, | |
1372 csobj)); | |
1373 } | |
1374 XCODING_SYSTEM_EOL_TYPE (csobj) = eol_wrapper; | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1375 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1376 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1377 EXTERNAL_LIST_LOOP_2 (alias, aliases) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1378 Fdefine_coding_system_alias (alias, csobj); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1379 } |
771 | 1380 } |
1381 | |
1382 return csobj; | |
428 | 1383 } |
1384 | |
771 | 1385 Lisp_Object |
4528
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1386 make_internal_coding_system (Lisp_Object existing, const Ascbyte *prefix, |
771 | 1387 Lisp_Object type, Lisp_Object description, |
1388 Lisp_Object props) | |
1389 { | |
1390 return make_coding_system_1 (existing, prefix, type, description, props); | |
1391 } | |
428 | 1392 |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1393 DEFUN ("make-coding-system-internal", Fmake_coding_system_internal, 2, 4, 0, /* |
5083
88f955fa5a7f
Back out revision c673987f5f3d, undump mule/make-coding-system.el.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5038
diff
changeset
|
1394 Create a new coding system object, and register NAME as its name. |
88f955fa5a7f
Back out revision c673987f5f3d, undump mule/make-coding-system.el.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5038
diff
changeset
|
1395 |
88f955fa5a7f
Back out revision c673987f5f3d, undump mule/make-coding-system.el.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5038
diff
changeset
|
1396 With Mule support, this does much of the work of `make-coding-system'. |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1397 Without Mule support, it does all the work of that function, and an alias |
5083
88f955fa5a7f
Back out revision c673987f5f3d, undump mule/make-coding-system.el.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5038
diff
changeset
|
1398 exists, mapping `make-coding-system' to `make-coding-system-internal'. |
88f955fa5a7f
Back out revision c673987f5f3d, undump mule/make-coding-system.el.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5038
diff
changeset
|
1399 |
88f955fa5a7f
Back out revision c673987f5f3d, undump mule/make-coding-system.el.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5038
diff
changeset
|
1400 You'll need a Mule XEmacs to read the complete docstring. Or you can |
88f955fa5a7f
Back out revision c673987f5f3d, undump mule/make-coding-system.el.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5038
diff
changeset
|
1401 just read it in make-coding-system.el; something like the following |
88f955fa5a7f
Back out revision c673987f5f3d, undump mule/make-coding-system.el.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5038
diff
changeset
|
1402 should work: |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1403 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1404 \\[find-function-other-window] find-file RET \\[find-file] mule/make-coding-system.el RET |
771 | 1405 |
428 | 1406 */ |
771 | 1407 (name, type, description, props)) |
428 | 1408 { |
771 | 1409 return make_coding_system_1 (name, 0, type, description, props); |
428 | 1410 } |
1411 | |
1412 DEFUN ("copy-coding-system", Fcopy_coding_system, 2, 2, 0, /* | |
1413 Copy OLD-CODING-SYSTEM to NEW-NAME. | |
1414 If NEW-NAME does not name an existing coding system, a new one will | |
1415 be created. | |
771 | 1416 If you are using this function to create an alias, think again: |
1417 Use `define-coding-system-alias' instead. | |
428 | 1418 */ |
1419 (old_coding_system, new_name)) | |
1420 { | |
1421 Lisp_Object new_coding_system; | |
1422 old_coding_system = Fget_coding_system (old_coding_system); | |
771 | 1423 new_coding_system = |
4303 | 1424 UNBOUNDP (new_name) ? Qnil : find_coding_system (new_name, 0); |
428 | 1425 if (NILP (new_coding_system)) |
1426 { | |
793 | 1427 new_coding_system = |
1428 wrap_coding_system | |
1429 (allocate_coding_system | |
1430 (XCODING_SYSTEM (old_coding_system)->methods, | |
1431 XCODING_SYSTEM (old_coding_system)->methods->extra_data_size, | |
1432 new_name)); | |
771 | 1433 if (!UNBOUNDP (new_name)) |
1434 Fputhash (new_name, new_coding_system, Vcoding_system_hash_table); | |
428 | 1435 } |
771 | 1436 else if (XCODING_SYSTEM (old_coding_system)->methods != |
1437 XCODING_SYSTEM (new_coding_system)->methods) | |
1438 invalid_operation_2 ("Coding systems not same type", | |
1439 old_coding_system, new_coding_system); | |
428 | 1440 |
5127
a9c41067dd88
more cleanups, terminology clarification, lots of doc work
Ben Wing <ben@xemacs.org>
parents:
5126
diff
changeset
|
1441 copy_lisp_object (new_coding_system, old_coding_system); |
a9c41067dd88
more cleanups, terminology clarification, lots of doc work
Ben Wing <ben@xemacs.org>
parents:
5126
diff
changeset
|
1442 XCODING_SYSTEM (new_coding_system)->name = new_name; |
428 | 1443 return new_coding_system; |
1444 } | |
1445 | |
2297 | 1446 /* #### Shouldn't this really be a find/get pair? */ |
1447 | |
440 | 1448 DEFUN ("coding-system-alias-p", Fcoding_system_alias_p, 1, 1, 0, /* |
1449 Return t if OBJECT is a coding system alias. | |
1450 All coding system aliases are created by `define-coding-system-alias'. | |
1451 */ | |
1452 (object)) | |
428 | 1453 { |
440 | 1454 return SYMBOLP (Fgethash (object, Vcoding_system_hash_table, Qzero)) |
1455 ? Qt : Qnil; | |
1456 } | |
1457 | |
1458 DEFUN ("coding-system-aliasee", Fcoding_system_aliasee, 1, 1, 0, /* | |
1459 Return the coding-system symbol for which symbol ALIAS is an alias. | |
1460 */ | |
1461 (alias)) | |
1462 { | |
1463 Lisp_Object aliasee = Fgethash (alias, Vcoding_system_hash_table, Qnil); | |
1464 if (SYMBOLP (aliasee)) | |
1465 return aliasee; | |
1466 else | |
563 | 1467 invalid_argument ("Symbol is not a coding system alias", alias); |
1204 | 1468 RETURN_NOT_REACHED (Qnil); |
440 | 1469 } |
1470 | |
1471 /* A maphash function, for removing dangling coding system aliases. */ | |
1472 static int | |
2286 | 1473 dangling_coding_system_alias_p (Lisp_Object UNUSED (alias), |
440 | 1474 Lisp_Object aliasee, |
1475 void *dangling_aliases) | |
1476 { | |
1477 if (SYMBOLP (aliasee) | |
1478 && NILP (Fgethash (aliasee, Vcoding_system_hash_table, Qnil))) | |
428 | 1479 { |
440 | 1480 (*(int *) dangling_aliases)++; |
1481 return 1; | |
428 | 1482 } |
440 | 1483 else |
1484 return 0; | |
1485 } | |
1486 | |
1487 DEFUN ("define-coding-system-alias", Fdefine_coding_system_alias, 2, 2, 0, /* | |
1488 Define symbol ALIAS as an alias for coding system ALIASEE. | |
1489 | |
1490 You can use this function to redefine an alias that has already been defined, | |
1491 but you cannot redefine a name which is the canonical name for a coding system. | |
1492 \(a canonical name of a coding system is what is returned when you call | |
1493 `coding-system-name' on a coding system). | |
1494 | |
1495 ALIASEE itself can be an alias, which allows you to define nested aliases. | |
1496 | |
1497 You are forbidden, however, from creating alias loops or `dangling' aliases. | |
1498 These will be detected, and an error will be signaled if you attempt to do so. | |
1499 | |
1500 If ALIASEE is nil, then ALIAS will simply be undefined. | |
1501 | |
1502 See also `coding-system-alias-p', `coding-system-aliasee', | |
1503 and `coding-system-canonical-name-p'. | |
1504 */ | |
1505 (alias, aliasee)) | |
1506 { | |
2286 | 1507 Lisp_Object probe; |
440 | 1508 |
1509 CHECK_SYMBOL (alias); | |
1510 | |
1511 if (!NILP (Fcoding_system_canonical_name_p (alias))) | |
563 | 1512 invalid_change |
440 | 1513 ("Symbol is the canonical name of a coding system and cannot be redefined", |
1514 alias); | |
1515 | |
1516 if (NILP (aliasee)) | |
1517 { | |
771 | 1518 Lisp_Object subsidiary_unix = add_suffix_to_symbol (alias, "-unix"); |
1519 Lisp_Object subsidiary_dos = add_suffix_to_symbol (alias, "-dos"); | |
1520 Lisp_Object subsidiary_mac = add_suffix_to_symbol (alias, "-mac"); | |
440 | 1521 |
1522 Fremhash (alias, Vcoding_system_hash_table); | |
1523 | |
1524 /* Undefine subsidiary aliases, | |
1525 presumably created by a previous call to this function */ | |
1526 if (! NILP (Fcoding_system_alias_p (subsidiary_unix)) && | |
1527 ! NILP (Fcoding_system_alias_p (subsidiary_dos)) && | |
1528 ! NILP (Fcoding_system_alias_p (subsidiary_mac))) | |
1529 { | |
1530 Fdefine_coding_system_alias (subsidiary_unix, Qnil); | |
1531 Fdefine_coding_system_alias (subsidiary_dos, Qnil); | |
1532 Fdefine_coding_system_alias (subsidiary_mac, Qnil); | |
1533 } | |
1534 | |
1535 /* Undefine dangling coding system aliases. */ | |
1536 { | |
1537 int dangling_aliases; | |
1538 | |
1539 do { | |
1540 dangling_aliases = 0; | |
1541 elisp_map_remhash (dangling_coding_system_alias_p, | |
1542 Vcoding_system_hash_table, | |
1543 &dangling_aliases); | |
1544 } while (dangling_aliases > 0); | |
1545 } | |
1546 | |
1547 return Qnil; | |
1548 } | |
1549 | |
1550 if (CODING_SYSTEMP (aliasee)) | |
1551 aliasee = XCODING_SYSTEM_NAME (aliasee); | |
1552 | |
1553 /* Checks that aliasee names a coding-system */ | |
2286 | 1554 (void) Fget_coding_system (aliasee); |
440 | 1555 |
1556 /* Check for coding system alias loops */ | |
1557 if (EQ (alias, aliasee)) | |
563 | 1558 alias_loop: invalid_operation_2 |
440 | 1559 ("Attempt to create a coding system alias loop", alias, aliasee); |
1560 | |
1561 for (probe = aliasee; | |
1562 SYMBOLP (probe); | |
1563 probe = Fgethash (probe, Vcoding_system_hash_table, Qzero)) | |
1564 { | |
1565 if (EQ (probe, alias)) | |
1566 goto alias_loop; | |
1567 } | |
1568 | |
1569 Fputhash (alias, aliasee, Vcoding_system_hash_table); | |
1570 | |
1571 /* Set up aliases for subsidiaries. | |
2297 | 1572 #### There must be a better way to handle subsidiary coding systems. |
1573 Inquiring Minds Want To Know: shouldn't they always be chains? */ | |
440 | 1574 { |
1575 static const char *suffixes[] = { "-unix", "-dos", "-mac" }; | |
1576 int i; | |
1577 for (i = 0; i < countof (suffixes); i++) | |
1578 { | |
1579 Lisp_Object alias_subsidiary = | |
771 | 1580 add_suffix_to_symbol (alias, suffixes[i]); |
440 | 1581 Lisp_Object aliasee_subsidiary = |
771 | 1582 add_suffix_to_symbol (aliasee, suffixes[i]); |
440 | 1583 |
1584 if (! NILP (Ffind_coding_system (aliasee_subsidiary))) | |
1585 Fdefine_coding_system_alias (alias_subsidiary, aliasee_subsidiary); | |
1586 } | |
1587 } | |
428 | 1588 /* FSF return value is a vector of [ALIAS-unix ALIAS-dos ALIAS-mac], |
1589 but it doesn't look intentional, so I'd rather return something | |
1590 meaningful or nothing at all. */ | |
1591 return Qnil; | |
1592 } | |
1593 | |
1594 static Lisp_Object | |
771 | 1595 subsidiary_coding_system (Lisp_Object coding_system, enum eol_type type) |
428 | 1596 { |
1597 Lisp_Coding_System *cs = XCODING_SYSTEM (coding_system); | |
1598 Lisp_Object new_coding_system; | |
1599 | |
1600 switch (type) | |
1601 { | |
1602 case EOL_AUTODETECT: return coding_system; | |
1603 case EOL_LF: new_coding_system = CODING_SYSTEM_EOL_LF (cs); break; | |
1604 case EOL_CR: new_coding_system = CODING_SYSTEM_EOL_CR (cs); break; | |
1605 case EOL_CRLF: new_coding_system = CODING_SYSTEM_EOL_CRLF (cs); break; | |
2500 | 1606 default: ABORT (); return Qnil; |
428 | 1607 } |
1608 | |
1609 return NILP (new_coding_system) ? coding_system : new_coding_system; | |
1610 } | |
1611 | |
1612 DEFUN ("subsidiary-coding-system", Fsubsidiary_coding_system, 2, 2, 0, /* | |
1613 Return the subsidiary coding system of CODING-SYSTEM with eol type EOL-TYPE. | |
771 | 1614 The logically opposite operation is `coding-system-base'. |
428 | 1615 */ |
1616 (coding_system, eol_type)) | |
1617 { | |
771 | 1618 coding_system = get_coding_system_for_text_file (coding_system, 0); |
428 | 1619 |
1620 return subsidiary_coding_system (coding_system, | |
1621 symbol_to_eol_type (eol_type)); | |
1622 } | |
1623 | |
771 | 1624 DEFUN ("coding-system-base", Fcoding_system_base, |
1625 1, 1, 0, /* | |
1626 Return the base coding system of CODING-SYSTEM. | |
1627 If CODING-SYSTEM is a subsidiary, this returns its parent; otherwise, it | |
1628 returns CODING-SYSTEM. | |
1629 The logically opposite operation is `subsidiary-coding-system'. | |
1630 */ | |
1631 (coding_system)) | |
1632 { | |
1633 Lisp_Object base; | |
1634 | |
1635 coding_system = Fget_coding_system (coding_system); | |
1636 if (EQ (XCODING_SYSTEM_NAME (coding_system), Qbinary)) | |
1637 return Fget_coding_system (Qraw_text); /* hack! */ | |
1638 base = XCODING_SYSTEM_SUBSIDIARY_PARENT (coding_system); | |
1639 if (!NILP (base)) | |
1640 return base; | |
1641 return coding_system; | |
1642 } | |
1643 | |
1644 DEFUN ("coding-system-used-for-io", Fcoding_system_used_for_io, | |
1645 1, 1, 0, /* | |
1646 Return the coding system actually used for I/O. | |
1647 In some cases (e.g. when a particular EOL type is specified) this won't be | |
2297 | 1648 the coding system itself. This can be useful when trying to determine |
1649 precisely how data was decoded. | |
771 | 1650 */ |
1651 (coding_system)) | |
1652 { | |
1653 Lisp_Object canon; | |
1654 | |
1655 coding_system = Fget_coding_system (coding_system); | |
1656 canon = XCODING_SYSTEM_CANONICAL (coding_system); | |
1657 if (!NILP (canon)) | |
1658 return canon; | |
1659 return coding_system; | |
1660 } | |
1661 | |
428 | 1662 |
1663 /************************************************************************/ | |
1664 /* Coding system accessors */ | |
1665 /************************************************************************/ | |
1666 | |
771 | 1667 DEFUN ("coding-system-description", Fcoding_system_description, 1, 1, 0, /* |
1668 Return the description for CODING-SYSTEM. | |
1669 The `description' of a coding system is a short English phrase giving the | |
1670 name rendered according to English punctuation rules, plus possibly some | |
1671 explanatory text (typically in the form of a parenthetical phrase). The | |
1672 description is intended to be short enough that it can appear as a menu item, | |
1673 and clear enough to be recognizable even to someone who is assumed to have | |
1674 some basic familiarity with different encodings but may not know all the | |
1675 technical names; thus, for `cn-gb-2312' is described as "Chinese EUC" and | |
1676 `hz-gb-2312' is described as "Hz/ZW (Chinese)", where the actual name of | |
1677 the encoding is given, followed by a note that this is a Chinese encoding, | |
1678 because the great majority of people encountering this would have no idea | |
1679 what it is, and giving the language indicates whether the encoding should | |
1680 just be ignored or (conceivably) investigated more thoroughly. | |
428 | 1681 */ |
1682 (coding_system)) | |
1683 { | |
1684 coding_system = Fget_coding_system (coding_system); | |
771 | 1685 return XCODING_SYSTEM_DESCRIPTION (coding_system); |
428 | 1686 } |
1687 | |
1688 DEFUN ("coding-system-type", Fcoding_system_type, 1, 1, 0, /* | |
1689 Return the type of CODING-SYSTEM. | |
1690 */ | |
1691 (coding_system)) | |
1692 { | |
771 | 1693 coding_system = Fget_coding_system (coding_system); |
1694 return XCODING_SYSTEM_TYPE (coding_system); | |
428 | 1695 } |
1696 | |
1697 DEFUN ("coding-system-property", Fcoding_system_property, 2, 2, 0, /* | |
1698 Return the PROP property of CODING-SYSTEM. | |
1699 */ | |
1700 (coding_system, prop)) | |
1701 { | |
1702 coding_system = Fget_coding_system (coding_system); | |
1703 CHECK_SYMBOL (prop); | |
1704 | |
1705 if (EQ (prop, Qname)) | |
1706 return XCODING_SYSTEM_NAME (coding_system); | |
1707 else if (EQ (prop, Qtype)) | |
1708 return Fcoding_system_type (coding_system); | |
771 | 1709 else if (EQ (prop, Qdescription)) |
1710 return XCODING_SYSTEM_DESCRIPTION (coding_system); | |
428 | 1711 else if (EQ (prop, Qmnemonic)) |
1712 return XCODING_SYSTEM_MNEMONIC (coding_system); | |
771 | 1713 else if (EQ (prop, Qdocumentation)) |
1714 return XCODING_SYSTEM_DOCUMENTATION (coding_system); | |
428 | 1715 else if (EQ (prop, Qeol_type)) |
771 | 1716 return eol_type_to_symbol (XCODING_SYSTEM_EOL_TYPE |
1717 (coding_system)); | |
428 | 1718 else if (EQ (prop, Qeol_lf)) |
1719 return XCODING_SYSTEM_EOL_LF (coding_system); | |
1720 else if (EQ (prop, Qeol_crlf)) | |
1721 return XCODING_SYSTEM_EOL_CRLF (coding_system); | |
1722 else if (EQ (prop, Qeol_cr)) | |
1723 return XCODING_SYSTEM_EOL_CR (coding_system); | |
1724 else if (EQ (prop, Qpost_read_conversion)) | |
1725 return XCODING_SYSTEM_POST_READ_CONVERSION (coding_system); | |
1726 else if (EQ (prop, Qpre_write_conversion)) | |
1727 return XCODING_SYSTEM_PRE_WRITE_CONVERSION (coding_system); | |
4568
1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4303
diff
changeset
|
1728 else if (EQ (prop, Qsafe_charsets)) |
1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4303
diff
changeset
|
1729 return XCODING_SYSTEM_SAFE_CHARSETS (coding_system); |
1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4303
diff
changeset
|
1730 else if (EQ (prop, Qsafe_chars)) |
1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4303
diff
changeset
|
1731 return XCODING_SYSTEM_SAFE_CHARS (coding_system); |
771 | 1732 else |
1733 { | |
1734 Lisp_Object value = CODESYSMETH_OR_GIVEN (XCODING_SYSTEM (coding_system), | |
1735 getprop, | |
1736 (coding_system, prop), | |
1737 Qunbound); | |
1738 if (UNBOUNDP (value)) | |
1739 invalid_constant ("Unrecognized property", prop); | |
1740 return value; | |
1741 } | |
1742 } | |
1743 | |
1744 | |
1745 /************************************************************************/ | |
1746 /* Coding stream functions */ | |
1747 /************************************************************************/ | |
1748 | |
1749 /* A coding stream is a stream used for encoding or decoding text. The | |
1750 coding-stream object keeps track of the actual coding system, the stream | |
1751 that is at the other end, and data that needs to be persistent across | |
1752 the lifetime of the stream. */ | |
1753 | |
1204 | 1754 extern const struct sized_memory_description chain_coding_stream_description; |
1755 extern const struct sized_memory_description undecided_coding_stream_description; | |
1756 | |
1757 static const struct memory_description coding_stream_data_description_1 []= { | |
2551 | 1758 { XD_BLOCK_PTR, chain_coding_system, 1, |
1759 { &chain_coding_stream_description } }, | |
1760 { XD_BLOCK_PTR, undecided_coding_system, 1, | |
1761 { &undecided_coding_stream_description } }, | |
1204 | 1762 { XD_END } |
1763 }; | |
1764 | |
1765 static const struct sized_memory_description coding_stream_data_description = { | |
1766 sizeof (void *), coding_stream_data_description_1 | |
1767 }; | |
1768 | |
1769 static const struct memory_description coding_lstream_description[] = { | |
1770 { XD_INT, offsetof (struct coding_stream, type) }, | |
1771 { XD_LISP_OBJECT, offsetof (struct coding_stream, orig_codesys) }, | |
1772 { XD_LISP_OBJECT, offsetof (struct coding_stream, codesys) }, | |
1773 { XD_LISP_OBJECT, offsetof (struct coding_stream, other_end) }, | |
1774 { XD_UNION, offsetof (struct coding_stream, data), | |
2551 | 1775 XD_INDIRECT (0, 0), { &coding_stream_data_description } }, |
1204 | 1776 { XD_END } |
1777 }; | |
1778 | |
1779 DEFINE_LSTREAM_IMPLEMENTATION_WITH_DATA ("coding", coding); | |
771 | 1780 |
1781 /* Encoding and decoding are parallel operations, so we create just one | |
1782 stream for both. "Decoding" may involve the extra step of autodetection | |
1783 of the data format, but that's only because of the conventional | |
1784 definition of decoding as converting from external- to | |
1785 internal-formatted data. | |
1786 | |
2297 | 1787 [[ REWRITE ME! ]] |
1788 | |
771 | 1789 #### We really need to abstract out the concept of "data formats" and |
1790 define "converters" that convert from and to specified formats, | |
1791 eliminating the idea of decoding and encoding. When specifying a | |
1792 conversion process, we need to give the data formats themselves, not the | |
1793 conversion processes -- e.g. a coding system called "Unicode->multibyte" | |
1794 converts in both directions, and we could auto-detect the format of data | |
1795 at either end. */ | |
1796 | |
1797 static Bytecount | |
1798 coding_reader (Lstream *stream, unsigned char *data, Bytecount size) | |
1799 { | |
1800 unsigned char *orig_data = data; | |
1801 Bytecount read_size; | |
1802 int error_occurred = 0; | |
1803 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
1804 | |
1805 /* We need to interface to coding_{de,en}code_1(), which expects to take | |
1806 some amount of data and store the result into a Dynarr. We have | |
1807 coding_{de,en}code_1() store into c->runoff, and take data from there | |
1808 as necessary. */ | |
1809 | |
1810 /* We loop until we have enough data, reading chunks from the other | |
1811 end and converting it. */ | |
1812 while (1) | |
1813 { | |
1814 /* Take data from convert_to if we can. Make sure to take at | |
1815 most SIZE bytes, and delete the data from convert_to. */ | |
1816 if (Dynarr_length (str->convert_to) > 0) | |
1817 { | |
1818 Bytecount chunk = | |
1819 min (size, (Bytecount) Dynarr_length (str->convert_to)); | |
4967 | 1820 memcpy (data, Dynarr_begin (str->convert_to), chunk); |
771 | 1821 Dynarr_delete_many (str->convert_to, 0, chunk); |
1822 data += chunk; | |
1823 size -= chunk; | |
1824 } | |
1825 | |
1826 if (size == 0) | |
1827 break; /* No more room for data */ | |
1828 | |
1829 if (str->eof) | |
1830 break; | |
1831 | |
1832 { | |
1833 /* Exhausted convert_to, so get some more. Read into convert_from, | |
1834 after existing "rejected" data from the last conversion. */ | |
1835 Bytecount rejected = Dynarr_length (str->convert_from); | |
1836 /* #### 1024 is arbitrary; we really need to separate 0 from EOF, | |
1837 and when we get 0, keep taking more data until we don't get 0 -- | |
1838 we don't know how much data the conversion routine might need | |
2297 | 1839 before it can generate any data of its own (eg, bzip2). */ |
814 | 1840 Bytecount readmore = |
1841 str->one_byte_at_a_time ? (Bytecount) 1 : | |
1842 max (size, (Bytecount) 1024); | |
771 | 1843 |
1844 Dynarr_add_many (str->convert_from, 0, readmore); | |
1845 read_size = Lstream_read (str->other_end, | |
1846 Dynarr_atp (str->convert_from, rejected), | |
1847 readmore); | |
1848 /* Trim size down to how much we actually got */ | |
5038 | 1849 Dynarr_set_lengthr (str->convert_from, rejected + max (0, read_size)); |
771 | 1850 } |
1851 | |
1852 if (read_size < 0) /* LSTREAM_ERROR */ | |
1853 { | |
1854 error_occurred = 1; | |
1855 break; | |
1856 } | |
1857 if (read_size == 0) /* LSTREAM_EOF */ | |
1858 /* There might be some more end data produced in the translation, | |
1859 so we set a flag and call the conversion method once more to | |
1860 output any final stuff it may be holding, any "go back to a sane | |
1861 state" escape sequences, etc. The conversion method is free to | |
1862 look at this flag, and we use it above to stop looping. */ | |
1863 str->eof = 1; | |
1864 { | |
1865 Bytecount processed; | |
1866 Bytecount to_process = Dynarr_length (str->convert_from); | |
1867 | |
1868 /* Convert the data, and save any rejected data in convert_from */ | |
1869 processed = | |
1870 XCODESYSMETH (str->codesys, convert, | |
4967 | 1871 (str, Dynarr_begin (str->convert_from), |
771 | 1872 str->convert_to, to_process)); |
1873 if (processed < 0) | |
1874 { | |
1875 error_occurred = 1; | |
1876 break; | |
1877 } | |
1878 assert (processed <= to_process); | |
1879 if (processed < to_process) | |
4967 | 1880 memmove (Dynarr_begin (str->convert_from), |
771 | 1881 Dynarr_atp (str->convert_from, processed), |
1882 to_process - processed); | |
5038 | 1883 Dynarr_set_lengthr (str->convert_from, to_process - processed); |
771 | 1884 } |
1885 } | |
1886 | |
1887 if (data - orig_data == 0) | |
1888 return error_occurred ? -1 : 0; | |
1889 else | |
1890 return data - orig_data; | |
1891 } | |
1892 | |
1893 static Bytecount | |
1894 coding_writer (Lstream *stream, const unsigned char *data, Bytecount size) | |
1895 { | |
1896 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
1897 | |
1898 /* Convert all our data into convert_to, and then attempt to write | |
1899 it all out to the other end. */ | |
1900 Dynarr_reset (str->convert_to); | |
1901 size = XCODESYSMETH (str->codesys, convert, | |
1902 (str, data, str->convert_to, size)); | |
4967 | 1903 if (Lstream_write (str->other_end, Dynarr_begin (str->convert_to), |
771 | 1904 Dynarr_length (str->convert_to)) < 0) |
1905 return -1; | |
1906 else | |
1907 /* The return value indicates how much of the incoming data was | |
1908 processed, not how many bytes were written. */ | |
1909 return size; | |
1910 } | |
1911 | |
1912 static int | |
1913 encode_decode_source_sink_type_is_char (Lisp_Object cs, | |
1914 enum source_or_sink sex, | |
1915 enum encode_decode direction) | |
1916 { | |
1917 return (direction == CODING_DECODE ? | |
1918 decoding_source_sink_type_is_char (cs, sex) : | |
1919 encoding_source_sink_type_is_char (cs, sex)); | |
1920 } | |
1921 | |
1922 /* Ensure that the convert methods only get full characters sent to them to | |
1923 convert if the source of that conversion is characters; and that no such | |
1924 full-character checking happens when the source is bytes. Keep in mind | |
1925 that (1) the conversion_end_type return values take the perspective of | |
1926 encoding; (2) the source for decoding is the same as the sink for | |
1927 encoding; (3) when writing, the data is given to us, and we set our own | |
1928 stream to be character mode or not; (4) when reading, the data comes | |
1929 from the other_end stream, and we set that one to be character mode or | |
1930 not. This is consistent with the comment above the prototype for | |
1931 Lstream_set_character_mode(), which lays out rules for who is allowed to | |
1932 modify the character type mode on a stream. | |
1933 | |
814 | 1934 If we're a read stream, we're always setting character mode on the |
1935 source, but we also set it on ourselves consistent with the flag that | |
1936 can disable this (see again the comment above | |
1937 Lstream_set_character_mode()). | |
1938 */ | |
771 | 1939 |
1940 static void | |
1941 set_coding_character_mode (Lstream *stream) | |
1942 { | |
1943 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
1944 Lstream *stream_to_set = | |
1945 stream->flags & LSTREAM_FL_WRITE ? stream : str->other_end; | |
1946 if (encode_decode_source_sink_type_is_char | |
1947 (str->codesys, CODING_SOURCE, str->direction)) | |
1948 Lstream_set_character_mode (stream_to_set); | |
1949 else | |
1950 Lstream_unset_character_mode (stream_to_set); | |
814 | 1951 if (str->set_char_mode_on_us_when_reading && |
1952 (stream->flags & LSTREAM_FL_READ)) | |
1953 { | |
1954 if (encode_decode_source_sink_type_is_char | |
1955 (str->codesys, CODING_SINK, str->direction)) | |
1956 Lstream_set_character_mode (stream); | |
1957 else | |
1958 Lstream_unset_character_mode (stream); | |
1959 } | |
771 | 1960 } |
1961 | |
1962 static Lisp_Object | |
1963 coding_marker (Lisp_Object stream) | |
1964 { | |
1965 struct coding_stream *str = CODING_STREAM_DATA (XLSTREAM (stream)); | |
1966 | |
1967 mark_object (str->orig_codesys); | |
1968 mark_object (str->codesys); | |
1969 MAYBE_XCODESYSMETH (str->codesys, mark_coding_stream, (str)); | |
1970 return wrap_lstream (str->other_end); | |
1971 } | |
1972 | |
1973 static int | |
1974 coding_rewinder (Lstream *stream) | |
1975 { | |
1976 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
1977 MAYBE_XCODESYSMETH (str->codesys, rewind_coding_stream, (str)); | |
1978 | |
1979 str->ch = 0; | |
1980 Dynarr_reset (str->convert_to); | |
1981 Dynarr_reset (str->convert_from); | |
1982 return Lstream_rewind (str->other_end); | |
1983 } | |
1984 | |
1985 static int | |
1986 coding_seekable_p (Lstream *stream) | |
1987 { | |
1988 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
1989 return Lstream_seekable_p (str->other_end); | |
1990 } | |
1991 | |
1992 static int | |
1993 coding_flusher (Lstream *stream) | |
1994 { | |
1995 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
1996 return Lstream_flush (str->other_end); | |
1997 } | |
1998 | |
1999 static int | |
2000 coding_closer (Lstream *stream) | |
2001 { | |
2002 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
2003 if (stream->flags & LSTREAM_FL_WRITE) | |
2004 { | |
2005 str->eof = 1; | |
2006 coding_writer (stream, 0, 0); | |
2007 str->eof = 0; | |
2008 } | |
2009 /* It's safe to free the runoff dynarrs now because they are used only | |
2010 during conversion. We need to keep the type-specific data around, | |
2011 though, because of canonicalize_after_coding. */ | |
2012 if (str->convert_to) | |
2013 { | |
2014 Dynarr_free (str->convert_to); | |
2015 str->convert_to = 0; | |
2016 } | |
2017 if (str->convert_from) | |
428 | 2018 { |
771 | 2019 Dynarr_free (str->convert_from); |
2020 str->convert_from = 0; | |
2021 } | |
2022 | |
800 | 2023 if (str->no_close_other) |
2024 return Lstream_flush (str->other_end); | |
2025 else | |
2026 return Lstream_close (str->other_end); | |
771 | 2027 } |
2028 | |
2029 static void | |
2030 coding_finalizer (Lstream *stream) | |
2031 { | |
2032 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
2033 | |
2034 assert (!str->finalized); | |
2035 MAYBE_XCODESYSMETH (str->codesys, finalize_coding_stream, (str)); | |
2036 if (str->data) | |
2037 { | |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
2038 xfree (str->data); |
771 | 2039 str->data = 0; |
2040 } | |
2041 str->finalized = 1; | |
2042 } | |
2043 | |
2044 static Lisp_Object | |
2045 coding_stream_canonicalize_after_coding (Lstream *stream) | |
2046 { | |
2047 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
2048 | |
2049 return XCODESYSMETH_OR_GIVEN (str->codesys, canonicalize_after_coding, | |
2050 (str), str->codesys); | |
2051 } | |
2052 | |
2053 Lisp_Object | |
2054 coding_stream_detected_coding_system (Lstream *stream) | |
2055 { | |
2056 Lisp_Object codesys = | |
2057 coding_stream_canonicalize_after_coding (stream); | |
2058 if (NILP (codesys)) | |
2059 return Fget_coding_system (Qidentity); | |
2060 return codesys; | |
2061 } | |
2062 | |
2063 Lisp_Object | |
2064 coding_stream_coding_system (Lstream *stream) | |
2065 { | |
2066 return CODING_STREAM_DATA (stream)->codesys; | |
2067 } | |
2068 | |
2069 /* Change the coding system associated with a stream. */ | |
2070 | |
2071 void | |
2072 set_coding_stream_coding_system (Lstream *lstr, Lisp_Object codesys) | |
2073 { | |
2074 struct coding_stream *str = CODING_STREAM_DATA (lstr); | |
2075 if (EQ (str->orig_codesys, codesys)) | |
2076 return; | |
2077 /* We do the equivalent of closing the stream, destroying it, and | |
2078 reinitializing it. This includes flushing out the data and signalling | |
2079 EOF, if we're a writing stream; we also replace the type-specific data | |
2080 with the data appropriate for the new coding system. */ | |
2081 if (!NILP (str->codesys)) | |
2082 { | |
2083 if (lstr->flags & LSTREAM_FL_WRITE) | |
2084 { | |
2085 Lstream_flush (lstr); | |
2086 str->eof = 1; | |
2087 coding_writer (lstr, 0, 0); | |
2088 str->eof = 0; | |
2089 } | |
2090 MAYBE_XCODESYSMETH (str->codesys, finalize_coding_stream, (str)); | |
2091 } | |
2092 str->orig_codesys = codesys; | |
2093 str->codesys = coding_system_real_canonical (codesys); | |
2094 | |
2095 if (str->data) | |
2096 { | |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
2097 xfree (str->data); |
771 | 2098 str->data = 0; |
428 | 2099 } |
771 | 2100 if (XCODING_SYSTEM_METHODS (str->codesys)->coding_data_size) |
1204 | 2101 { |
2102 str->data = | |
2103 xmalloc_and_zero (XCODING_SYSTEM_METHODS (str->codesys)-> | |
2104 coding_data_size); | |
2105 str->type = XCODING_SYSTEM_METHODS (str->codesys)->enumtype; | |
2106 } | |
771 | 2107 MAYBE_XCODESYSMETH (str->codesys, init_coding_stream, (str)); |
2108 /* The new coding system may have different ideas regarding whether its | |
2109 ends are characters or bytes. */ | |
2110 set_coding_character_mode (lstr); | |
2111 } | |
2112 | |
2113 /* WARNING WARNING WARNING WARNING!!!!! If you open up a coding | |
2114 stream for writing, no automatic code detection will be performed. | |
2115 The reason for this is that automatic code detection requires a | |
2116 seekable input. Things will also fail if you open a coding | |
2117 stream for reading using a non-fully-specified coding system and | |
2118 a non-seekable input stream. */ | |
2119 | |
2120 static Lisp_Object | |
2121 make_coding_stream_1 (Lstream *stream, Lisp_Object codesys, | |
800 | 2122 const char *mode, enum encode_decode direction, |
802 | 2123 int flags) |
771 | 2124 { |
2125 Lstream *lstr = Lstream_new (lstream_coding, mode); | |
2126 struct coding_stream *str = CODING_STREAM_DATA (lstr); | |
2127 | |
2128 codesys = Fget_coding_system (codesys); | |
2129 xzero (*str); | |
2130 str->codesys = Qnil; | |
2131 str->orig_codesys = Qnil; | |
2132 str->us = lstr; | |
2133 str->other_end = stream; | |
2134 str->convert_to = Dynarr_new (unsigned_char); | |
2135 str->convert_from = Dynarr_new (unsigned_char); | |
2136 str->direction = direction; | |
814 | 2137 if (flags & LSTREAM_FL_NO_CLOSE_OTHER) |
802 | 2138 str->no_close_other = 1; |
814 | 2139 if (flags & LSTREAM_FL_READ_ONE_BYTE_AT_A_TIME) |
802 | 2140 str->one_byte_at_a_time = 1; |
814 | 2141 if (!(flags & LSTREAM_FL_NO_INIT_CHAR_MODE_WHEN_READING)) |
2142 str->set_char_mode_on_us_when_reading = 1; | |
802 | 2143 |
771 | 2144 set_coding_stream_coding_system (lstr, codesys); |
793 | 2145 return wrap_lstream (lstr); |
771 | 2146 } |
2147 | |
814 | 2148 /* FLAGS: |
2149 | |
2150 LSTREAM_FL_NO_CLOSE_OTHER | |
2151 Don't close STREAM (the stream at the other end) when this stream is | |
2152 closed. | |
2153 | |
2154 LSTREAM_FL_READ_ONE_BYTE_AT_A_TIME | |
2155 When reading from STREAM, read and process one byte at a time rather | |
2156 than in large chunks. This is for reading from TTY's, so we don't | |
2157 block. #### We should instead create a non-blocking filedesc stream | |
2158 that emulates the behavior as necessary using select(), when the | |
2159 fcntls don't work. (As seems to be the case on Cygwin.) | |
2160 | |
2161 LSTREAM_FL_NO_INIT_CHAR_MODE_WHEN_READING | |
2162 When reading from STREAM, read and process one byte at a time rather | |
2163 than in large chunks. This is for reading from TTY's, so we don't | |
2164 block. #### We should instead create a non-blocking filedesc stream | |
2165 that emulates the behavior as necessary using select(), when the | |
2166 fcntls don't work. (As seems to be the case on Cygwin.) | |
2167 */ | |
771 | 2168 Lisp_Object |
2169 make_coding_input_stream (Lstream *stream, Lisp_Object codesys, | |
802 | 2170 enum encode_decode direction, int flags) |
771 | 2171 { |
800 | 2172 return make_coding_stream_1 (stream, codesys, "r", direction, |
802 | 2173 flags); |
771 | 2174 } |
2175 | |
814 | 2176 /* FLAGS: |
2177 | |
2178 LSTREAM_FL_NO_CLOSE_OTHER | |
2179 Don't close STREAM (the stream at the other end) when this stream is | |
2180 closed. | |
2181 */ | |
771 | 2182 Lisp_Object |
2183 make_coding_output_stream (Lstream *stream, Lisp_Object codesys, | |
802 | 2184 enum encode_decode direction, int flags) |
771 | 2185 { |
800 | 2186 return make_coding_stream_1 (stream, codesys, "w", direction, |
802 | 2187 flags); |
771 | 2188 } |
2189 | |
2190 static Lisp_Object | |
2191 encode_decode_coding_region (Lisp_Object start, Lisp_Object end, | |
2192 Lisp_Object coding_system, Lisp_Object buffer, | |
2193 enum encode_decode direction) | |
2194 { | |
2195 Charbpos b, e; | |
2196 struct buffer *buf = decode_buffer (buffer, 0); | |
2197 Lisp_Object instream = Qnil, to_outstream = Qnil, outstream = Qnil; | |
2198 Lisp_Object from_outstream = Qnil, auto_outstream = Qnil; | |
2199 Lisp_Object lb_outstream = Qnil; | |
2200 Lisp_Object next; | |
2201 Lstream *istr, *ostr; | |
2202 struct gcpro gcpro1, gcpro2, gcpro3, gcpro4, gcpro5; | |
2203 struct gcpro ngcpro1; | |
2204 int source_char, sink_char; | |
2205 | |
2206 get_buffer_range_char (buf, start, end, &b, &e, 0); | |
2207 barf_if_buffer_read_only (buf, b, e); | |
2208 | |
2209 GCPRO5 (instream, to_outstream, outstream, from_outstream, lb_outstream); | |
2210 NGCPRO1 (auto_outstream); | |
2211 | |
2212 coding_system = Fget_coding_system (coding_system); | |
2213 source_char = encode_decode_source_sink_type_is_char (coding_system, | |
2214 CODING_SOURCE, | |
2215 direction); | |
2216 sink_char = encode_decode_source_sink_type_is_char (coding_system, | |
2217 CODING_SINK, | |
2218 direction); | |
2219 | |
2220 /* Order is IN <---> [TO] -> OUT -> [FROM] -> [AUTODETECT-EOL] -> LB */ | |
2221 instream = make_lisp_buffer_input_stream (buf, b, e, 0); | |
2222 next = lb_outstream = make_lisp_buffer_output_stream (buf, b, 0); | |
2223 | |
2224 if (direction == CODING_DECODE && | |
2225 XCODING_SYSTEM_EOL_TYPE (coding_system) == EOL_AUTODETECT) | |
2226 next = auto_outstream = | |
2227 make_coding_output_stream | |
800 | 2228 (XLSTREAM (next), Fget_coding_system (Qconvert_eol_autodetect), |
2229 CODING_DECODE, 0); | |
771 | 2230 |
2231 if (!sink_char) | |
2232 next = from_outstream = | |
800 | 2233 make_coding_output_stream (XLSTREAM (next), Qbinary, CODING_DECODE, 0); |
771 | 2234 outstream = make_coding_output_stream (XLSTREAM (next), coding_system, |
800 | 2235 direction, 0); |
771 | 2236 if (!source_char) |
428 | 2237 { |
771 | 2238 to_outstream = |
2239 make_coding_output_stream (XLSTREAM (outstream), | |
800 | 2240 Qbinary, CODING_ENCODE, 0); |
771 | 2241 ostr = XLSTREAM (to_outstream); |
2242 } | |
2243 else | |
2244 ostr = XLSTREAM (outstream); | |
2245 istr = XLSTREAM (instream); | |
2246 | |
2247 /* The chain of streams looks like this: | |
2248 | |
2297 | 2249 [BUFFER] <----- (( read from/send to loop )) |
771 | 2250 ------> [CHAR->BYTE i.e. ENCODE AS BINARY if source is |
2251 in bytes] | |
2252 ------> [ENCODE/DECODE AS SPECIFIED] | |
2253 ------> [BYTE->CHAR i.e. DECODE AS BINARY | |
2254 if sink is in bytes] | |
2255 ------> [AUTODETECT EOL if | |
2256 we're decoding and | |
2257 coding system calls | |
2258 for this] | |
2259 ------> [BUFFER] | |
2260 */ | |
2367 | 2261 |
2262 /* #### See comment | |
2263 | |
2264 EFFICIENCY OF CODING CONVERSION WITH MULTIPLE COPIES/CHAINS | |
2265 | |
2266 in text.c. | |
2267 */ | |
2268 | |
771 | 2269 while (1) |
2270 { | |
2271 char tempbuf[1024]; /* some random amount */ | |
2272 Charbpos newpos, even_newer_pos; | |
2273 Charbpos oldpos = lisp_buffer_stream_startpos (istr); | |
2274 Bytecount size_in_bytes = | |
2275 Lstream_read (istr, tempbuf, sizeof (tempbuf)); | |
2276 | |
2277 if (!size_in_bytes) | |
2278 break; | |
2279 newpos = lisp_buffer_stream_startpos (istr); | |
2280 Lstream_write (ostr, tempbuf, size_in_bytes); | |
2281 even_newer_pos = lisp_buffer_stream_startpos (istr); | |
2282 buffer_delete_range (buf, even_newer_pos - (newpos - oldpos), | |
2283 even_newer_pos, 0); | |
428 | 2284 } |
771 | 2285 |
2286 { | |
2287 Charcount retlen = | |
2288 lisp_buffer_stream_startpos (XLSTREAM (instream)) - b; | |
2289 Lstream_close (istr); | |
2290 Lstream_close (ostr); | |
2291 NUNGCPRO; | |
2292 UNGCPRO; | |
2293 Lstream_delete (istr); | |
2294 if (!NILP (from_outstream)) | |
2295 Lstream_delete (XLSTREAM (from_outstream)); | |
2296 Lstream_delete (XLSTREAM (outstream)); | |
2297 if (!NILP (to_outstream)) | |
2298 Lstream_delete (XLSTREAM (to_outstream)); | |
2299 if (!NILP (auto_outstream)) | |
2300 Lstream_delete (XLSTREAM (auto_outstream)); | |
2301 Lstream_delete (XLSTREAM (lb_outstream)); | |
2302 return make_int (retlen); | |
2303 } | |
2304 } | |
2305 | |
3302 | 2306 DEFUN ("decode-coding-region", Fdecode_coding_region, 3, 4, |
2307 "*r\nzDecode from coding system: \ni", /* | |
771 | 2308 Decode the text between START and END which is encoded in CODING-SYSTEM. |
2309 This is useful if you've read in encoded text from a file without decoding | |
2310 it (e.g. you read in a JIS-formatted file but used the `binary' or | |
2311 `no-conversion' coding system, so that it shows up as "^[$B!<!+^[(B"). | |
2312 Return length of decoded text. | |
3302 | 2313 BUFFER defaults to the current buffer if unspecified, and when interactive. |
771 | 2314 */ |
2315 (start, end, coding_system, buffer)) | |
2316 { | |
2317 return encode_decode_coding_region (start, end, coding_system, buffer, | |
2318 CODING_DECODE); | |
2319 } | |
2320 | |
3302 | 2321 DEFUN ("encode-coding-region", Fencode_coding_region, 3, 4, |
2322 "*r\nzEncode to coding system: \ni", /* | |
771 | 2323 Encode the text between START and END using CODING-SYSTEM. |
2324 This will, for example, convert Japanese characters into stuff such as | |
3302 | 2325 "^[$B!<!+^[(B" if you use the JIS encoding. Return length of encoded text. |
2326 BUFFER defaults to the current buffer if unspecified, and when interactive. | |
771 | 2327 */ |
2328 (start, end, coding_system, buffer)) | |
2329 { | |
2330 return encode_decode_coding_region (start, end, coding_system, buffer, | |
2331 CODING_ENCODE); | |
428 | 2332 } |
2333 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2334 DEFUN ("query-coding-region", Fquery_coding_region, 3, 7, 0, /* |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2335 Work out whether CODING-SYSTEM can losslessly encode a region. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2336 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2337 START and END are the beginning and end of the region to check. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2338 CODING-SYSTEM is the coding system to try. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2339 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2340 Optional argument BUFFER is the buffer to check, and defaults to the current |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2341 buffer. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2342 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2343 IGNORE-INVALID-SEQUENCESP, also an optional argument, says to treat XEmacs |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2344 characters which have an unambiguous encoded representation, despite being |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2345 undefined in what they represent, as encodable. These chiefly arise with |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2346 variable-length encodings like UTF-8 and UTF-16, where an invalid sequence |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2347 is passed through to XEmacs as a sequence of characters with a defined |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2348 correspondence to the octets on disk, but no non-error semantics; see the |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2349 `invalid-sequence-coding-system' argument to `set-language-info'. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2350 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2351 They can also arise with fixed-length encodings like ISO 8859-7, where |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2352 certain octets on disk have undefined values, and treating them as |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2353 corresponding to the ISO 8859-1 characters with the same numerical values |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2354 may lead to data that is not understood by other applications. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2355 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2356 Optional argument ERRORP says to signal a `text-conversion-error' if some |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2357 character in the region cannot be encoded, and defaults to nil. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2358 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2359 Optional argument HIGHLIGHT says to display unencodable characters in the |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2360 region using `query-coding-warning-face'. It defaults to nil. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2361 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2362 This function can return multiple values; the intention is that callers use |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2363 `multiple-value-bind' or the related CL multiple value functions to deal |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2364 with it. The first result is `t' if the region can be encoded using |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2365 CODING-SYSTEM, or `nil' if not. If the region cannot be encoded using |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2366 CODING-SYSTEM, the second result is a range table describing the positions |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2367 of the unencodable characters. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2368 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2369 Ranges that describe characters that would be ignored were |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2370 IGNORE-INVALID-SEQUENCESP non-nil map to the symbol `invalid-sequence'; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2371 other ranges map to the symbol `unencodable'. If IGNORE-INVALID-SEQUENCESP |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2372 is non-nil, all ranges will map to the symbol `unencodable'. See |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2373 `make-range-table' for more details of range tables. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2374 */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2375 (start, end, coding_system, buffer, ignore_invalid_sequencesp, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2376 errorp, highlight)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2377 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2378 Charbpos b, e; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2379 struct buffer *buf = decode_buffer (buffer, 1); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2380 Lisp_Object result; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2381 int flags = 0, speccount = specpdl_depth (); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2382 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2383 coding_system = Fget_coding_system (coding_system); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2384 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2385 get_buffer_range_char (buf, start, end, &b, &e, 0); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2386 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2387 if (buf != current_buffer) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2388 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2389 record_unwind_protect (save_current_buffer_restore, Fcurrent_buffer ()); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2390 set_buffer_internal (buf); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2391 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2392 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2393 record_unwind_protect (save_excursion_restore, save_excursion_save ()); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2394 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2395 BUF_SET_PT (buf, b); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2396 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2397 if (!NILP (ignore_invalid_sequencesp)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2398 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2399 flags |= QUERY_METHOD_IGNORE_INVALID_SEQUENCES; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2400 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2401 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2402 if (!NILP (errorp)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2403 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2404 flags |= QUERY_METHOD_ERRORP; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2405 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2406 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2407 if (!NILP (highlight)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2408 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2409 flags |= QUERY_METHOD_HIGHLIGHT; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2410 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2411 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2412 result = XCODESYSMETH_OR_GIVEN (coding_system, query, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2413 (coding_system, buf, e, flags), Qunbound); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2414 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2415 if (UNBOUNDP (result)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2416 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2417 signal_error (Qtext_conversion_error, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2418 "Coding system doesn't say what it can encode", |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2419 XCODING_SYSTEM_NAME (coding_system)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2420 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2421 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2422 result = (NILP (result)) ? Qt : values2 (Qnil, result); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2423 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2424 return unbind_to_1 (speccount, result); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2425 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2426 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2427 |
428 | 2428 |
2429 /************************************************************************/ | |
771 | 2430 /* Chain methods */ |
428 | 2431 /************************************************************************/ |
2432 | |
771 | 2433 /* #### Need a way to create "opposite-direction" coding systems. */ |
2434 | |
2435 /* Chain two or more coding systems together to make a combination coding | |
2436 system. */ | |
2437 | |
2438 struct chain_coding_system | |
2439 { | |
2440 /* List of coding systems, in decode order */ | |
2441 Lisp_Object *chain; | |
2442 /* Number of coding systems in list */ | |
2443 int count; | |
2444 /* Coding system to return as a result of canonicalize-after-coding */ | |
2445 Lisp_Object canonicalize_after_coding; | |
2446 }; | |
2447 | |
2448 struct chain_coding_stream | |
2449 { | |
2450 int initted; | |
2451 /* Lstreams for chain coding system */ | |
2452 Lisp_Object *lstreams; | |
2453 int lstream_count; | |
2454 }; | |
2455 | |
1204 | 2456 static const struct memory_description chain_coding_system_description[] = { |
2457 { XD_INT, offsetof (struct chain_coding_system, count) }, | |
2367 | 2458 { XD_BLOCK_PTR, offsetof (struct chain_coding_system, chain), |
2551 | 2459 XD_INDIRECT (0, 0), { &lisp_object_description } }, |
1204 | 2460 { XD_LISP_OBJECT, offsetof (struct chain_coding_system, |
2461 canonicalize_after_coding) }, | |
771 | 2462 { XD_END } |
2463 }; | |
2464 | |
1204 | 2465 static const struct memory_description chain_coding_stream_description_1 [] = { |
2466 { XD_INT, offsetof (struct chain_coding_stream, lstream_count) }, | |
2367 | 2467 { XD_BLOCK_PTR, offsetof (struct chain_coding_stream, lstreams), |
2551 | 2468 XD_INDIRECT (0, 0), { &lisp_object_description } }, |
771 | 2469 { XD_END } |
2470 }; | |
2471 | |
1204 | 2472 const struct sized_memory_description chain_coding_stream_description = { |
2473 sizeof (struct chain_coding_stream), chain_coding_stream_description_1 | |
2474 }; | |
2475 | |
2476 DEFINE_CODING_SYSTEM_TYPE_WITH_DATA (chain); | |
2477 | |
771 | 2478 static Lisp_Object |
2479 chain_canonicalize (Lisp_Object codesys) | |
2480 { | |
2481 /* We make use of the fact that this method is called at init time, after | |
2482 properties have been parsed. init_method is called too early. */ | |
2483 /* #### It's not clear we need this whole chain-canonicalize mechanism | |
2484 any more. */ | |
2485 Lisp_Object chain = Flist (XCODING_SYSTEM_CHAIN_COUNT (codesys), | |
2486 XCODING_SYSTEM_CHAIN_CHAIN (codesys)); | |
2487 chain = Fcons (XCODING_SYSTEM_PRE_WRITE_CONVERSION (codesys), | |
2488 Fcons (XCODING_SYSTEM_POST_READ_CONVERSION (codesys), | |
2489 chain)); | |
2490 Fputhash (chain, codesys, Vchain_canonicalize_hash_table); | |
2491 return codesys; | |
2492 } | |
2493 | |
2494 static Lisp_Object | |
2495 chain_canonicalize_after_coding (struct coding_stream *str) | |
2496 { | |
2497 Lisp_Object cac = | |
2498 XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (str->codesys); | |
2499 if (!NILP (cac)) | |
2500 return cac; | |
2501 return str->codesys; | |
2502 #if 0 | |
2503 struct chain_coding_stream *data = CODING_STREAM_TYPE_DATA (str, chain); | |
2504 Lisp_Object us = str->codesys, codesys; | |
2505 int i; | |
2506 Lisp_Object chain; | |
2507 Lisp_Object tail; | |
2508 int changed = 0; | |
2509 | |
2510 /* #### It's not clear we need this whole chain-canonicalize mechanism | |
2511 any more. */ | |
2512 if (str->direction == CODING_ENCODE || !data->initted) | |
2513 return us; | |
2514 | |
2515 chain = Flist (XCODING_SYSTEM_CHAIN_COUNT (us), | |
2516 XCODING_SYSTEM_CHAIN_CHAIN (us)); | |
2517 | |
2518 tail = chain; | |
2519 for (i = 0; i < XCODING_SYSTEM_CHAIN_COUNT (us); i++) | |
2520 { | |
2521 codesys = (coding_stream_canonicalize_after_coding | |
2522 (XLSTREAM (data->lstreams[i]))); | |
2523 if (!EQ (codesys, XCAR (tail))) | |
2524 changed = 1; | |
2525 XCAR (tail) = codesys; | |
2526 tail = XCDR (tail); | |
2527 } | |
2528 | |
2529 if (!changed) | |
2530 return us; | |
2531 | |
2532 chain = delq_no_quit (Qnil, chain); | |
2533 | |
2534 if (NILP (XCODING_SYSTEM_PRE_WRITE_CONVERSION (us)) && | |
2535 NILP (XCODING_SYSTEM_POST_READ_CONVERSION (us))) | |
2536 { | |
2537 if (NILP (chain)) | |
2538 return Qnil; | |
2539 if (NILP (XCDR (chain))) | |
2540 return XCAR (chain); | |
2541 } | |
2542 | |
2543 codesys = Fgethash (Fcons (XCODING_SYSTEM_PRE_WRITE_CONVERSION (us), | |
2544 Fcons (XCODING_SYSTEM_POST_READ_CONVERSION (us), | |
2545 chain)), Vchain_canonicalize_hash_table, | |
2546 Qnil); | |
2547 if (!NILP (codesys)) | |
2548 return codesys; | |
2549 return make_internal_coding_system | |
2550 (us, "internal-chain-canonicalizer-wrapper", | |
2551 Qchain, Qunbound, list2 (Qchain, chain)); | |
2552 #endif /* 0 */ | |
2553 } | |
2554 | |
2555 static void | |
2556 chain_init (Lisp_Object codesys) | |
2557 { | |
2558 XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (codesys) = Qnil; | |
2559 } | |
2560 | |
2561 static void | |
2562 chain_mark (Lisp_Object codesys) | |
2563 { | |
2564 int i; | |
2565 | |
2566 for (i = 0; i < XCODING_SYSTEM_CHAIN_COUNT (codesys); i++) | |
2567 mark_object (XCODING_SYSTEM_CHAIN_CHAIN (codesys)[i]); | |
2568 mark_object (XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (codesys)); | |
2569 } | |
2570 | |
2571 static void | |
2572 chain_mark_coding_stream_1 (struct chain_coding_stream *data) | |
2573 { | |
2574 int i; | |
2575 | |
2576 for (i = 0; i < data->lstream_count; i++) | |
2577 mark_object (data->lstreams[i]); | |
2578 } | |
2579 | |
2580 static void | |
2581 chain_mark_coding_stream (struct coding_stream *str) | |
2582 { | |
2583 chain_mark_coding_stream_1 (CODING_STREAM_TYPE_DATA (str, chain)); | |
2584 } | |
2585 | |
2586 static void | |
2587 chain_print (Lisp_Object cs, Lisp_Object printcharfun, int escapeflag) | |
2588 { | |
2589 int i; | |
2590 | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
2591 write_ascstring (printcharfun, "("); |
771 | 2592 for (i = 0; i < XCODING_SYSTEM_CHAIN_COUNT (cs); i++) |
2593 { | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
2594 write_ascstring (printcharfun, i == 0 ? "" : "->"); |
771 | 2595 print_coding_system_in_print_method (XCODING_SYSTEM_CHAIN_CHAIN (cs)[i], |
2596 printcharfun, escapeflag); | |
2597 } | |
2598 { | |
2599 Lisp_Object cac = XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (cs); | |
2600 if (!NILP (cac)) | |
2601 { | |
2602 if (i > 0) | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
2603 write_ascstring (printcharfun, " "); |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
2604 write_ascstring (printcharfun, "canonicalize-after-coding="); |
771 | 2605 print_coding_system_in_print_method (cac, printcharfun, escapeflag); |
2606 } | |
2607 } | |
2608 | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
2609 write_ascstring (printcharfun, ")"); |
771 | 2610 } |
2611 | |
2612 static void | |
2613 chain_rewind_coding_stream_1 (struct chain_coding_stream *data) | |
2614 { | |
2615 /* Each will rewind the next; there is always at least one stream (the | |
2616 dynarr stream at the end) if we're initted */ | |
2617 if (data->initted) | |
2618 Lstream_rewind (XLSTREAM (data->lstreams[0])); | |
2619 } | |
2620 | |
2621 static void | |
2622 chain_rewind_coding_stream (struct coding_stream *str) | |
2623 { | |
2624 chain_rewind_coding_stream_1 (CODING_STREAM_TYPE_DATA (str, chain)); | |
2625 } | |
2626 | |
2627 static void | |
2628 chain_init_coding_streams_1 (struct chain_coding_stream *data, | |
2629 unsigned_char_dynarr *dst, | |
2630 int ncodesys, Lisp_Object *codesys, | |
2631 enum encode_decode direction) | |
2632 { | |
2633 int i; | |
2634 Lisp_Object lstream_out; | |
2635 | |
2636 data->lstream_count = ncodesys + 1; | |
2637 data->lstreams = xnew_array (Lisp_Object, data->lstream_count); | |
2638 | |
2639 lstream_out = make_dynarr_output_stream (dst); | |
2640 Lstream_set_buffering (XLSTREAM (lstream_out), LSTREAM_UNBUFFERED, 0); | |
2641 data->lstreams[data->lstream_count - 1] = lstream_out; | |
2642 | |
2643 for (i = ncodesys - 1; i >= 0; i--) | |
2644 { | |
2645 data->lstreams[i] = | |
2646 make_coding_output_stream | |
2647 (XLSTREAM (lstream_out), | |
2648 codesys[direction == CODING_ENCODE ? ncodesys - (i + 1) : i], | |
800 | 2649 direction, 0); |
771 | 2650 lstream_out = data->lstreams[i]; |
2651 Lstream_set_buffering (XLSTREAM (lstream_out), LSTREAM_UNBUFFERED, | |
2652 0); | |
2653 } | |
2654 data->initted = 1; | |
2655 } | |
2656 | |
2657 static Bytecount | |
2658 chain_convert (struct coding_stream *str, const UExtbyte *src, | |
2659 unsigned_char_dynarr *dst, Bytecount n) | |
2660 { | |
2661 struct chain_coding_stream *data = CODING_STREAM_TYPE_DATA (str, chain); | |
2662 | |
2663 if (str->eof) | |
2664 { | |
2665 /* Each will close the next; there is always at least one stream (the | |
2666 dynarr stream at the end) if we're initted. We need to close now | |
2667 because more data may be generated. */ | |
2668 if (data->initted) | |
2669 Lstream_close (XLSTREAM (data->lstreams[0])); | |
2670 return n; | |
2671 } | |
2672 | |
2673 if (!data->initted) | |
2674 chain_init_coding_streams_1 | |
2675 (data, dst, XCODING_SYSTEM_CHAIN_COUNT (str->codesys), | |
2676 XCODING_SYSTEM_CHAIN_CHAIN (str->codesys), str->direction); | |
2677 | |
2678 if (Lstream_write (XLSTREAM (data->lstreams[0]), src, n) < 0) | |
2679 return -1; | |
2680 return n; | |
2681 } | |
2682 | |
2683 static void | |
2684 chain_finalize_coding_stream_1 (struct chain_coding_stream *data) | |
2685 { | |
2686 if (data->lstreams) | |
2687 { | |
2297 | 2688 /* During GC, these objects are unmarked, and are about to be freed. |
2689 We do NOT want them on the free list, and that will cause lots of | |
2690 nastiness including crashes. Just let them be freed normally. */ | |
771 | 2691 if (!gc_in_progress) |
2692 { | |
2693 int i; | |
2297 | 2694 /* Order of deletion is important here! Delete from the head of |
2695 the chain and work your way towards the tail. In general, | |
2696 when you delete an object, there should be *NO* pointers to it | |
2697 anywhere. Deleting back-to-front would be a problem because | |
2698 there are pointers going forward. If there were pointers in | |
2699 both directions, you'd have to disconnect the pointers to a | |
2700 particular object before deleting it. */ | |
771 | 2701 for (i = 0; i < data->lstream_count; i++) |
2702 Lstream_delete (XLSTREAM ((data->lstreams)[i])); | |
2703 } | |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
2704 xfree (data->lstreams); |
5169
6c6d78781d59
cleanup of code related to xfree(), better KKCC backtrace capabilities, document XD_INLINE_LISP_OBJECT_BLOCK_PTR, fix some memory leaks, other code cleanup
Ben Wing <ben@xemacs.org>
parents:
5142
diff
changeset
|
2705 data->lstreams = 0; |
771 | 2706 } |
2707 } | |
2708 | |
2709 static void | |
2710 chain_finalize_coding_stream (struct coding_stream *str) | |
2711 { | |
2712 chain_finalize_coding_stream_1 (CODING_STREAM_TYPE_DATA (str, chain)); | |
2713 } | |
2714 | |
2715 static void | |
2716 chain_finalize (Lisp_Object c) | |
2717 { | |
2718 if (XCODING_SYSTEM_CHAIN_CHAIN (c)) | |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
2719 xfree (XCODING_SYSTEM_CHAIN_CHAIN (c)); |
771 | 2720 } |
2721 | |
428 | 2722 static int |
771 | 2723 chain_putprop (Lisp_Object codesys, Lisp_Object key, Lisp_Object value) |
2724 { | |
2725 if (EQ (key, Qchain)) | |
2726 { | |
2727 Lisp_Object *cslist; | |
2728 int count = 0; | |
2729 int i; | |
2730 | |
2367 | 2731 { |
2732 EXTERNAL_LIST_LOOP_2 (elt, value) | |
2733 { | |
2734 Fget_coding_system (elt); | |
2735 count++; | |
2736 } | |
2737 } | |
771 | 2738 |
2739 cslist = xnew_array (Lisp_Object, count); | |
2740 XCODING_SYSTEM_CHAIN_CHAIN (codesys) = cslist; | |
2741 | |
2742 count = 0; | |
2367 | 2743 { |
2744 EXTERNAL_LIST_LOOP_2 (elt, value) | |
2745 { | |
2746 cslist[count] = Fget_coding_system (elt); | |
2747 count++; | |
2748 } | |
2749 } | |
771 | 2750 |
2751 XCODING_SYSTEM_CHAIN_COUNT (codesys) = count; | |
2752 | |
2753 for (i = 0; i < count - 1; i++) | |
2754 { | |
2755 if (decoding_source_sink_type_is_char (cslist[i], CODING_SINK) != | |
2756 decoding_source_sink_type_is_char (cslist[i + 1], CODING_SOURCE)) | |
2757 invalid_argument_2 ("Sink of first must match source of second", | |
2758 cslist[i], cslist[i + 1]); | |
2759 } | |
2760 } | |
2761 else if (EQ (key, Qcanonicalize_after_coding)) | |
2762 XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (codesys) = | |
2763 Fget_coding_system (value); | |
2764 else | |
2765 return 0; | |
2766 return 1; | |
2767 } | |
2768 | |
2769 static Lisp_Object | |
2770 chain_getprop (Lisp_Object coding_system, Lisp_Object prop) | |
2771 { | |
2772 if (EQ (prop, Qchain)) | |
2773 { | |
2774 Lisp_Object result = Qnil; | |
2775 int i; | |
2776 | |
2777 for (i = 0; i < XCODING_SYSTEM_CHAIN_COUNT (coding_system); i++) | |
2778 result = Fcons (XCODING_SYSTEM_CHAIN_CHAIN (coding_system)[i], | |
2779 result); | |
2780 | |
2781 return Fnreverse (result); | |
2782 } | |
2783 else if (EQ (prop, Qcanonicalize_after_coding)) | |
2784 return XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (coding_system); | |
2785 else | |
2786 return Qunbound; | |
2787 } | |
2788 | |
2789 static enum source_sink_type | |
2790 chain_conversion_end_type (Lisp_Object codesys) | |
2791 { | |
2792 Lisp_Object *cslist = XCODING_SYSTEM_CHAIN_CHAIN (codesys); | |
2793 int n = XCODING_SYSTEM_CHAIN_COUNT (codesys); | |
2794 int charp_source, charp_sink; | |
2795 | |
2796 if (n == 0) | |
2797 return DECODES_BYTE_TO_BYTE; /* arbitrary */ | |
2798 charp_source = decoding_source_sink_type_is_char (cslist[0], CODING_SOURCE); | |
2799 charp_sink = decoding_source_sink_type_is_char (cslist[n - 1], CODING_SINK); | |
2800 | |
2801 switch (charp_source * 2 + charp_sink) | |
2802 { | |
2803 case 0: return DECODES_BYTE_TO_BYTE; | |
2804 case 1: return DECODES_BYTE_TO_CHARACTER; | |
2805 case 2: return DECODES_CHARACTER_TO_BYTE; | |
2806 case 3: return DECODES_CHARACTER_TO_CHARACTER; | |
2807 } | |
2808 | |
2500 | 2809 ABORT (); |
771 | 2810 return DECODES_BYTE_TO_BYTE; |
2811 } | |
2812 | |
2813 | |
2814 /************************************************************************/ | |
2815 /* No-conversion methods */ | |
2816 /************************************************************************/ | |
2817 | |
2818 /* "No conversion"; used for binary files. We use quotes because there | |
2819 really is some conversion being applied (it does byte<->char | |
2820 conversion), but it appears to the user as if the text is read in | |
2297 | 2821 without conversion. |
2822 | |
2823 #### Shouldn't we _call_ it that, then? And while we're at it, | |
2824 separate it into "to_internal" and "to_external"? */ | |
771 | 2825 DEFINE_CODING_SYSTEM_TYPE (no_conversion); |
2826 | |
2827 /* This is used when reading in "binary" files -- i.e. files that may | |
2828 contain all 256 possible byte values and that are not to be | |
2829 interpreted as being in any particular encoding. */ | |
2830 static Bytecount | |
2831 no_conversion_convert (struct coding_stream *str, | |
2832 const UExtbyte *src, | |
2833 unsigned_char_dynarr *dst, Bytecount n) | |
2834 { | |
2835 UExtbyte c; | |
2836 unsigned int ch = str->ch; | |
2837 Bytecount orign = n; | |
2838 | |
2839 if (str->direction == CODING_DECODE) | |
2840 { | |
2841 while (n--) | |
2842 { | |
2843 c = *src++; | |
2844 | |
2845 DECODE_ADD_BINARY_CHAR (c, dst); | |
2846 } | |
2847 | |
2848 if (str->eof) | |
2849 DECODE_OUTPUT_PARTIAL_CHAR (ch, dst); | |
2850 } | |
2851 else | |
2852 { | |
2853 | |
2854 while (n--) | |
2855 { | |
2856 c = *src++; | |
826 | 2857 if (byte_ascii_p (c)) |
771 | 2858 { |
2859 assert (ch == 0); | |
2860 Dynarr_add (dst, c); | |
2861 } | |
2862 #ifdef MULE | |
867 | 2863 else if (ibyte_leading_byte_p (c)) |
771 | 2864 { |
2865 assert (ch == 0); | |
2866 if (c == LEADING_BYTE_LATIN_ISO8859_1 || | |
2867 c == LEADING_BYTE_CONTROL_1) | |
2868 ch = c; | |
2869 else | |
2297 | 2870 /* #### This is just plain unacceptable. */ |
771 | 2871 Dynarr_add (dst, '~'); /* untranslatable character */ |
2872 } | |
2873 else | |
2874 { | |
2875 if (ch == LEADING_BYTE_LATIN_ISO8859_1) | |
2876 Dynarr_add (dst, c); | |
2877 else if (ch == LEADING_BYTE_CONTROL_1) | |
2878 { | |
2879 assert (c < 0xC0); | |
2880 Dynarr_add (dst, c - 0x20); | |
2881 } | |
2882 /* else it should be the second or third byte of an | |
2883 untranslatable character, so ignore it */ | |
2884 ch = 0; | |
2885 } | |
2886 #endif /* MULE */ | |
2887 | |
2888 } | |
2889 } | |
2890 | |
2891 str->ch = ch; | |
2892 return orign; | |
2893 } | |
2894 | |
2895 DEFINE_DETECTOR (no_conversion); | |
2896 DEFINE_DETECTOR_CATEGORY (no_conversion, no_conversion); | |
2897 | |
2898 struct no_conversion_detector | |
2899 { | |
2900 int dummy; | |
2901 }; | |
2902 | |
2903 static void | |
2286 | 2904 no_conversion_detect (struct detection_state *st, const UExtbyte *UNUSED (src), |
2905 Bytecount UNUSED (n)) | |
771 | 2906 { |
2907 /* Hack until we get better handling of this stuff! */ | |
2908 DET_RESULT (st, no_conversion) = DET_SLIGHTLY_LIKELY; | |
2909 } | |
2910 | |
2911 | |
2912 /************************************************************************/ | |
2913 /* Convert-eol methods */ | |
2914 /************************************************************************/ | |
2915 | |
2916 /* This is used to handle end-of-line (EOL) differences. It is | |
2819 | 2917 character-to-character, and works (when encoding) *BEFORE* sending data to |
2918 the main encoding routine -- thus, that routine must handle different EOL | |
2919 types itself if it does line-oriented type processing. This is unavoidable | |
2920 because we don't know whether the output of the main encoding routine is | |
2921 ASCII compatible (UTF-16 is definitely not, for example). [[ sjt sez this | |
2922 is bogus. There should be _no_ EOL processing (or processing of any kind) | |
2923 after conversion to external. ]] | |
771 | 2924 |
793 | 2925 There is one parameter: `subtype', either `cr', `lf', `crlf', or nil. |
771 | 2926 */ |
2927 | |
2928 struct convert_eol_coding_system | |
2929 { | |
2930 enum eol_type subtype; | |
2132 | 2931 int dummy; /* On some architectures (eg ia64) the portable dumper can |
2932 produce unaligned access errors without this field. Probably | |
2933 because the combined structure of this structure and | |
2934 Lisp_Coding_System is not properly aligned. */ | |
771 | 2935 }; |
2936 | |
2937 #define CODING_SYSTEM_CONVERT_EOL_SUBTYPE(codesys) \ | |
2938 (CODING_SYSTEM_TYPE_DATA (codesys, convert_eol)->subtype) | |
2939 #define XCODING_SYSTEM_CONVERT_EOL_SUBTYPE(codesys) \ | |
2940 (XCODING_SYSTEM_TYPE_DATA (codesys, convert_eol)->subtype) | |
2941 | |
2942 struct convert_eol_coding_stream | |
2943 { | |
2944 enum eol_type actual; | |
2945 }; | |
2946 | |
1204 | 2947 static const struct memory_description |
771 | 2948 convert_eol_coding_system_description[] = { |
2949 { XD_END } | |
2950 }; | |
2951 | |
1204 | 2952 DEFINE_CODING_SYSTEM_TYPE_WITH_DATA (convert_eol); |
2953 | |
771 | 2954 static void |
2286 | 2955 convert_eol_print (Lisp_Object cs, Lisp_Object printcharfun, |
2956 int UNUSED (escapeflag)) | |
771 | 2957 { |
2958 struct convert_eol_coding_system *data = | |
2959 XCODING_SYSTEM_TYPE_DATA (cs, convert_eol); | |
2960 | |
2961 write_fmt_string (printcharfun, "(%s)", | |
2962 data->subtype == EOL_LF ? "lf" : | |
2963 data->subtype == EOL_CRLF ? "crlf" : | |
2964 data->subtype == EOL_CR ? "cr" : | |
793 | 2965 data->subtype == EOL_AUTODETECT ? "nil" : |
2500 | 2966 (ABORT(), "")); |
771 | 2967 } |
2968 | |
2969 static enum source_sink_type | |
2286 | 2970 convert_eol_conversion_end_type (Lisp_Object UNUSED (codesys)) |
771 | 2971 { |
2972 return DECODES_CHARACTER_TO_CHARACTER; | |
2973 } | |
2974 | |
2975 static int | |
2976 convert_eol_putprop (Lisp_Object codesys, | |
2977 Lisp_Object key, | |
2978 Lisp_Object value) | |
2979 { | |
2980 struct convert_eol_coding_system *data = | |
2981 XCODING_SYSTEM_TYPE_DATA (codesys, convert_eol); | |
2982 | |
2983 if (EQ (key, Qsubtype)) | |
2984 { | |
2985 if (EQ (value, Qlf) /* || EQ (value, Qunix) */) | |
2986 data->subtype = EOL_LF; | |
2987 else if (EQ (value, Qcrlf) /* || EQ (value, Qdos) */) | |
2988 data->subtype = EOL_CRLF; | |
2989 else if (EQ (value, Qcr) /* || EQ (value, Qmac) */) | |
2990 data->subtype = EOL_CR; | |
793 | 2991 else if (EQ (value, Qnil)) |
771 | 2992 data->subtype = EOL_AUTODETECT; |
2993 else invalid_constant ("Unrecognized eol type", value); | |
2994 } | |
2995 else | |
2996 return 0; | |
2997 return 1; | |
2998 } | |
2999 | |
3000 static Lisp_Object | |
3001 convert_eol_getprop (Lisp_Object coding_system, Lisp_Object prop) | |
3002 { | |
3003 struct convert_eol_coding_system *data = | |
3004 XCODING_SYSTEM_TYPE_DATA (coding_system, convert_eol); | |
3005 | |
3006 if (EQ (prop, Qsubtype)) | |
3007 { | |
3008 switch (data->subtype) | |
3009 { | |
3010 case EOL_LF: return Qlf; | |
3011 case EOL_CRLF: return Qcrlf; | |
3012 case EOL_CR: return Qcr; | |
793 | 3013 case EOL_AUTODETECT: return Qnil; |
2500 | 3014 default: ABORT (); |
771 | 3015 } |
3016 } | |
3017 | |
3018 return Qunbound; | |
3019 } | |
3020 | |
3021 static void | |
3022 convert_eol_init_coding_stream (struct coding_stream *str) | |
3023 { | |
3024 struct convert_eol_coding_stream *data = | |
3025 CODING_STREAM_TYPE_DATA (str, convert_eol); | |
3026 data->actual = XCODING_SYSTEM_CONVERT_EOL_SUBTYPE (str->codesys); | |
3027 } | |
3028 | |
3029 static Bytecount | |
867 | 3030 convert_eol_convert (struct coding_stream *str, const Ibyte *src, |
771 | 3031 unsigned_char_dynarr *dst, Bytecount n) |
3032 { | |
3033 if (str->direction == CODING_DECODE) | |
3034 { | |
3035 struct convert_eol_coding_stream *data = | |
3036 CODING_STREAM_TYPE_DATA (str, convert_eol); | |
3037 | |
3038 if (data->actual == EOL_AUTODETECT) | |
3039 { | |
3040 Bytecount n2 = n; | |
867 | 3041 const Ibyte *src2 = src; |
771 | 3042 |
3043 for (; n2; n2--) | |
3044 { | |
867 | 3045 Ibyte c = *src2++; |
771 | 3046 if (c == '\n') |
3047 { | |
3048 data->actual = EOL_LF; | |
3049 break; | |
3050 } | |
3051 else if (c == '\r') | |
3052 { | |
3053 if (n2 == 1) | |
3054 { | |
3055 /* If we're seeing a '\r' at the end of the data, then | |
3056 reject the '\r' right now so it doesn't become an | |
3057 issue in the code below -- unless we're at the end of | |
3058 the stream, in which case we can't do that (because | |
3059 then the '\r' will never get written out), and in any | |
3060 case we should be recognizing it at EOL_CR format. */ | |
3061 if (str->eof) | |
3062 data->actual = EOL_CR; | |
3063 else | |
3064 n--; | |
3065 break; | |
3066 } | |
3067 else if (*src2 == '\n') | |
3068 data->actual = EOL_CRLF; | |
3069 else | |
3070 data->actual = EOL_CR; | |
3071 break; | |
3072 } | |
3073 } | |
3074 } | |
3075 | |
3076 /* str->eof is set, the caller reached EOF on the other end and has | |
3077 no new data to give us. The only data we get is the data we | |
3078 rejected from last time. */ | |
3079 if (data->actual == EOL_LF || data->actual == EOL_AUTODETECT || | |
3080 (str->eof)) | |
3081 Dynarr_add_many (dst, src, n); | |
3082 else | |
3083 { | |
867 | 3084 const Ibyte *end = src + n; |
771 | 3085 while (1) |
3086 { | |
3087 /* Find the next section with no \r and add it. */ | |
867 | 3088 const Ibyte *runstart = src; |
3089 src = (Ibyte *) memchr (src, '\r', end - src); | |
771 | 3090 if (!src) |
3091 src = end; | |
3092 Dynarr_add_many (dst, runstart, src - runstart); | |
3093 /* Stop if at end ... */ | |
3094 if (src == end) | |
3095 break; | |
3096 /* ... else, translate as necessary. */ | |
3097 src++; | |
3098 if (data->actual == EOL_CR) | |
3099 Dynarr_add (dst, '\n'); | |
3100 /* We need to be careful here with CRLF. If we see a CR at the | |
3101 end of the data, we don't know if it's part of a CRLF, so we | |
3102 reject it. Otherwise: If it's part of a CRLF, eat it and | |
3103 loop; the following LF gets added next time around. If it's | |
3104 not part of a CRLF, add the CR and loop. The following | |
3105 character will be processed in the next loop iteration. This | |
3106 correctly handles a sequence like CR+CR+LF. */ | |
3107 else if (src == end) | |
3108 return n - 1; /* reject the CR at the end; we'll get it again | |
3109 next time the convert method is called */ | |
3110 else if (*src != '\n') | |
3111 Dynarr_add (dst, '\r'); | |
3112 } | |
3113 } | |
3114 | |
3115 return n; | |
3116 } | |
3117 else | |
3118 { | |
3119 enum eol_type subtype = | |
3120 XCODING_SYSTEM_CONVERT_EOL_SUBTYPE (str->codesys); | |
867 | 3121 const Ibyte *end = src + n; |
771 | 3122 |
3123 /* We try to be relatively efficient here. */ | |
3124 if (subtype == EOL_LF) | |
3125 Dynarr_add_many (dst, src, n); | |
3126 else | |
3127 { | |
3128 while (1) | |
3129 { | |
3130 /* Find the next section with no \n and add it. */ | |
867 | 3131 const Ibyte *runstart = src; |
3132 src = (Ibyte *) memchr (src, '\n', end - src); | |
771 | 3133 if (!src) |
3134 src = end; | |
3135 Dynarr_add_many (dst, runstart, src - runstart); | |
3136 /* Stop if at end ... */ | |
3137 if (src == end) | |
3138 break; | |
3139 /* ... else, skip over \n and add its translation. */ | |
3140 src++; | |
3141 Dynarr_add (dst, '\r'); | |
3142 if (subtype == EOL_CRLF) | |
3143 Dynarr_add (dst, '\n'); | |
3144 } | |
3145 } | |
3146 | |
3147 return n; | |
3148 } | |
3149 } | |
3150 | |
3151 static Lisp_Object | |
3152 convert_eol_canonicalize_after_coding (struct coding_stream *str) | |
3153 { | |
3154 struct convert_eol_coding_stream *data = | |
3155 CODING_STREAM_TYPE_DATA (str, convert_eol); | |
3156 | |
3157 if (str->direction == CODING_ENCODE) | |
3158 return str->codesys; | |
3159 | |
3160 switch (data->actual) | |
3161 { | |
3162 case EOL_LF: return Fget_coding_system (Qconvert_eol_lf); | |
3163 case EOL_CRLF: return Fget_coding_system (Qconvert_eol_crlf); | |
3164 case EOL_CR: return Fget_coding_system (Qconvert_eol_cr); | |
3165 case EOL_AUTODETECT: return str->codesys; | |
2500 | 3166 default: ABORT (); return Qnil; |
771 | 3167 } |
3168 } | |
3169 | |
3170 | |
3171 /************************************************************************/ | |
3172 /* Undecided methods */ | |
3173 /************************************************************************/ | |
3174 | |
3175 /* Do autodetection. We can autodetect the EOL type only, the coding | |
3176 system only, or both. We only do autodetection when decoding; when | |
3177 encoding, we just pass the data through. | |
3178 | |
3179 When doing just EOL detection, a coding system can be specified; if so, | |
3180 we will decode this data through the coding system before doing EOL | |
3181 detection. The reason for specifying this is so that | |
3182 canonicalize-after-coding works: We will canonicalize the specified | |
3183 coding system into the appropriate EOL type. When doing both coding and | |
3184 EOL detection, we do similar canonicalization, and also catch situations | |
3185 where the EOL type is overspecified, i.e. the detected coding system | |
3186 specifies an EOL type, and either switch to the equivalent | |
3187 non-EOL-processing coding system (if possible), or terminate EOL | |
3188 detection and use the specified EOL type. This prevents data from being | |
3189 EOL-processed twice. | |
3190 */ | |
3191 | |
3192 struct undecided_coding_system | |
3193 { | |
3194 int do_eol, do_coding; | |
3195 Lisp_Object cs; | |
3196 }; | |
3197 | |
3198 struct undecided_coding_stream | |
3199 { | |
3200 Lisp_Object actual; | |
3201 /* Either 2 or 3 lstreams here; see undecided_convert */ | |
3202 struct chain_coding_stream c; | |
3203 | |
3204 struct detection_state *st; | |
3205 }; | |
3206 | |
1204 | 3207 static const struct memory_description undecided_coding_system_description[] = { |
3208 { XD_LISP_OBJECT, offsetof (struct undecided_coding_system, cs) }, | |
771 | 3209 { XD_END } |
3210 }; | |
3211 | |
1204 | 3212 static const struct memory_description undecided_coding_stream_description_1 [] = { |
3213 { XD_LISP_OBJECT, offsetof (struct undecided_coding_stream, actual) }, | |
2367 | 3214 { XD_BLOCK_ARRAY, offsetof (struct undecided_coding_stream, c), |
2551 | 3215 1, { &chain_coding_stream_description } }, |
1204 | 3216 { XD_END } |
3217 }; | |
3218 | |
3219 const struct sized_memory_description undecided_coding_stream_description = { | |
3220 sizeof (struct undecided_coding_stream), undecided_coding_stream_description_1 | |
3221 }; | |
3222 | |
3223 DEFINE_CODING_SYSTEM_TYPE_WITH_DATA (undecided); | |
3224 | |
771 | 3225 static void |
3226 undecided_init (Lisp_Object codesys) | |
3227 { | |
3228 struct undecided_coding_system *data = | |
3229 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
3230 | |
3231 data->cs = Qnil; | |
3232 } | |
3233 | |
3234 static void | |
3235 undecided_mark (Lisp_Object codesys) | |
3236 { | |
3237 struct undecided_coding_system *data = | |
3238 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
3239 | |
3240 mark_object (data->cs); | |
3241 } | |
3242 | |
3243 static void | |
3244 undecided_print (Lisp_Object cs, Lisp_Object printcharfun, int escapeflag) | |
3245 { | |
3246 struct undecided_coding_system *data = | |
3247 XCODING_SYSTEM_TYPE_DATA (cs, undecided); | |
3248 int need_space = 0; | |
3249 | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3250 write_ascstring (printcharfun, "("); |
771 | 3251 if (data->do_eol) |
3252 { | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3253 write_ascstring (printcharfun, "do-eol"); |
771 | 3254 need_space = 1; |
3255 } | |
3256 if (data->do_coding) | |
3257 { | |
3258 if (need_space) | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3259 write_ascstring (printcharfun, " "); |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3260 write_ascstring (printcharfun, "do-coding"); |
771 | 3261 need_space = 1; |
3262 } | |
3263 if (!NILP (data->cs)) | |
3264 { | |
3265 if (need_space) | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3266 write_ascstring (printcharfun, " "); |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3267 write_ascstring (printcharfun, "coding-system="); |
771 | 3268 print_coding_system_in_print_method (data->cs, printcharfun, escapeflag); |
3269 } | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3270 write_ascstring (printcharfun, ")"); |
771 | 3271 } |
3272 | |
3273 static void | |
3274 undecided_mark_coding_stream (struct coding_stream *str) | |
3275 { | |
1204 | 3276 mark_object (CODING_STREAM_TYPE_DATA (str, undecided)->actual); |
771 | 3277 chain_mark_coding_stream_1 (&CODING_STREAM_TYPE_DATA (str, undecided)->c); |
3278 } | |
3279 | |
3280 static int | |
3281 undecided_putprop (Lisp_Object codesys, Lisp_Object key, Lisp_Object value) | |
3282 { | |
3283 struct undecided_coding_system *data = | |
3284 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
3285 | |
3286 if (EQ (key, Qdo_eol)) | |
3287 data->do_eol = 1; | |
3288 else if (EQ (key, Qdo_coding)) | |
3289 data->do_coding = 1; | |
3290 else if (EQ (key, Qcoding_system)) | |
3291 data->cs = get_coding_system_for_text_file (value, 0); | |
3292 else | |
3293 return 0; | |
3294 return 1; | |
3295 } | |
3296 | |
3297 static Lisp_Object | |
3298 undecided_getprop (Lisp_Object codesys, Lisp_Object prop) | |
3299 { | |
3300 struct undecided_coding_system *data = | |
3301 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
3302 | |
3303 if (EQ (prop, Qdo_eol)) | |
3304 return data->do_eol ? Qt : Qnil; | |
3305 if (EQ (prop, Qdo_coding)) | |
3306 return data->do_coding ? Qt : Qnil; | |
3307 if (EQ (prop, Qcoding_system)) | |
3308 return data->cs; | |
3309 return Qunbound; | |
3310 } | |
3311 | |
3312 static struct detection_state * | |
3313 allocate_detection_state (void) | |
3314 { | |
3315 int i; | |
3316 Bytecount size = MAX_ALIGN_SIZE (sizeof (struct detection_state)); | |
3317 struct detection_state *block; | |
3318 | |
3319 for (i = 0; i < coding_detector_count; i++) | |
3320 size += MAX_ALIGN_SIZE (Dynarr_at (all_coding_detectors, i).data_size); | |
3321 | |
3322 block = (struct detection_state *) xmalloc_and_zero (size); | |
3323 | |
3324 size = MAX_ALIGN_SIZE (sizeof (struct detection_state)); | |
3325 for (i = 0; i < coding_detector_count; i++) | |
3326 { | |
3327 block->data_offset[i] = size; | |
3328 size += MAX_ALIGN_SIZE (Dynarr_at (all_coding_detectors, i).data_size); | |
3329 } | |
3330 | |
3331 return block; | |
3332 } | |
3333 | |
3334 static void | |
3335 free_detection_state (struct detection_state *st) | |
3336 { | |
3337 int i; | |
3338 | |
3339 for (i = 0; i < coding_detector_count; i++) | |
3340 { | |
3341 if (Dynarr_at (all_coding_detectors, i).finalize_detection_state_method) | |
3342 Dynarr_at (all_coding_detectors, i).finalize_detection_state_method | |
3343 (st); | |
3344 } | |
3345 | |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
3346 xfree (st); |
771 | 3347 } |
3348 | |
3349 static int | |
3350 coding_category_symbol_to_id (Lisp_Object symbol) | |
428 | 3351 { |
3352 int i; | |
3353 | |
3354 CHECK_SYMBOL (symbol); | |
771 | 3355 for (i = 0; i < coding_detector_count; i++) |
3356 { | |
3357 detector_category_dynarr *cats = | |
3358 Dynarr_at (all_coding_detectors, i).cats; | |
3359 int j; | |
3360 | |
3361 for (j = 0; j < Dynarr_length (cats); j++) | |
3362 if (EQ (Dynarr_at (cats, j).sym, symbol)) | |
3363 return Dynarr_at (cats, j).id; | |
3364 } | |
3365 | |
563 | 3366 invalid_constant ("Unrecognized coding category", symbol); |
1204 | 3367 RETURN_NOT_REACHED (0); |
428 | 3368 } |
3369 | |
771 | 3370 static Lisp_Object |
3371 coding_category_id_to_symbol (int id) | |
428 | 3372 { |
3373 int i; | |
771 | 3374 |
3375 for (i = 0; i < coding_detector_count; i++) | |
3376 { | |
3377 detector_category_dynarr *cats = | |
3378 Dynarr_at (all_coding_detectors, i).cats; | |
3379 int j; | |
3380 | |
3381 for (j = 0; j < Dynarr_length (cats); j++) | |
3382 if (id == Dynarr_at (cats, j).id) | |
3383 return Dynarr_at (cats, j).sym; | |
3384 } | |
3385 | |
2500 | 3386 ABORT (); |
771 | 3387 return Qnil; /* (usually) not reached */ |
428 | 3388 } |
3389 | |
771 | 3390 static Lisp_Object |
3391 detection_result_number_to_symbol (enum detection_result result) | |
428 | 3392 { |
1494 | 3393 /* let compiler warn if not all enumerators are handled */ |
3394 switch (result) { | |
3395 #define FROB(sym, num) case num: return (sym) | |
771 | 3396 FROB (Qnear_certainty, DET_NEAR_CERTAINTY); |
3397 FROB (Qquite_probable, DET_QUITE_PROBABLE); | |
3398 FROB (Qsomewhat_likely, DET_SOMEWHAT_LIKELY); | |
1494 | 3399 FROB (Qslightly_likely, DET_SLIGHTLY_LIKELY); |
771 | 3400 FROB (Qas_likely_as_unlikely, DET_AS_LIKELY_AS_UNLIKELY); |
3401 FROB (Qsomewhat_unlikely, DET_SOMEWHAT_UNLIKELY); | |
3402 FROB (Qquite_improbable, DET_QUITE_IMPROBABLE); | |
3403 FROB (Qnearly_impossible, DET_NEARLY_IMPOSSIBLE); | |
3404 #undef FROB | |
1494 | 3405 } |
771 | 3406 |
2500 | 3407 ABORT (); |
771 | 3408 return Qnil; /* (usually) not reached */ |
3409 } | |
3410 | |
778 | 3411 #if 0 /* not used */ |
771 | 3412 static enum detection_result |
3413 detection_result_symbol_to_number (Lisp_Object symbol) | |
3414 { | |
1494 | 3415 /* using switch here would be bad style, and doesn't help */ |
771 | 3416 #define FROB(sym, num) if (EQ (symbol, sym)) return (num) |
3417 FROB (Qnear_certainty, DET_NEAR_CERTAINTY); | |
3418 FROB (Qquite_probable, DET_QUITE_PROBABLE); | |
3419 FROB (Qsomewhat_likely, DET_SOMEWHAT_LIKELY); | |
1494 | 3420 FROB (Qslightly_likely, DET_SLIGHTLY_LIKELY); |
771 | 3421 FROB (Qas_likely_as_unlikely, DET_AS_LIKELY_AS_UNLIKELY); |
3422 FROB (Qsomewhat_unlikely, DET_SOMEWHAT_UNLIKELY); | |
3423 FROB (Qquite_improbable, DET_QUITE_IMPROBABLE); | |
3424 FROB (Qnearly_impossible, DET_NEARLY_IMPOSSIBLE); | |
3425 #undef FROB | |
3426 | |
3427 invalid_constant ("Unrecognized detection result", symbol); | |
3428 return ((enum detection_result) 0); /* not reached */ | |
3429 } | |
778 | 3430 #endif /* 0 */ |
771 | 3431 |
3432 /* Set all detection results for a given detector to a specified value. */ | |
3433 void | |
3434 set_detection_results (struct detection_state *st, int detector, int given) | |
3435 { | |
3436 detector_category_dynarr *cats = | |
3437 Dynarr_at (all_coding_detectors, detector).cats; | |
3438 int i; | |
3439 | |
3440 for (i = 0; i < Dynarr_length (cats); i++) | |
3441 st->categories[Dynarr_at (cats, i).id] = given; | |
3442 } | |
428 | 3443 |
3444 static int | |
3445 acceptable_control_char_p (int c) | |
3446 { | |
3447 switch (c) | |
3448 { | |
3449 /* Allow and ignore control characters that you might | |
3450 reasonably see in a text file */ | |
3451 case '\r': | |
3452 case '\n': | |
3453 case '\t': | |
3454 case 7: /* bell */ | |
3455 case 8: /* backspace */ | |
3456 case 11: /* vertical tab */ | |
3457 case 12: /* form feed */ | |
3458 case 26: /* MS-DOS C-z junk */ | |
3459 case 31: /* '^_' -- for info */ | |
3460 return 1; | |
3461 default: | |
3462 return 0; | |
3463 } | |
3464 } | |
3465 | |
771 | 3466 #ifdef DEBUG_XEMACS |
3467 | |
3468 static UExtbyte | |
3469 hex_digit_to_char (int digit) | |
428 | 3470 { |
771 | 3471 if (digit < 10) |
3472 return digit + '0'; | |
3473 else | |
3474 return digit - 10 + 'A'; | |
428 | 3475 } |
3476 | |
771 | 3477 static void |
3478 output_bytes_in_ascii_and_hex (const UExtbyte *src, Bytecount n) | |
428 | 3479 { |
3425 | 3480 Extbyte *ascii = alloca_array (Extbyte, n + 1); |
3481 Extbyte *hex = alloca_array (Extbyte, 3 * n + 1); | |
771 | 3482 int i; |
3413 | 3483 DECLARE_EISTRING (eistr_ascii); |
3484 DECLARE_EISTRING (eistr_hex); | |
771 | 3485 |
3486 for (i = 0; i < n; i++) | |
428 | 3487 { |
3425 | 3488 Extbyte c = src[i]; |
771 | 3489 if (c < 0x20) |
3490 ascii[i] = '.'; | |
428 | 3491 else |
771 | 3492 ascii[i] = c; |
3493 hex[3 * i] = hex_digit_to_char (c >> 4); | |
3494 hex[3 * i + 1] = hex_digit_to_char (c & 0xF); | |
3495 hex[3 * i + 2] = ' '; | |
428 | 3496 } |
771 | 3497 ascii[i] = '\0'; |
3498 hex[3 * i - 1] = '\0'; | |
3413 | 3499 |
3500 eicpy_ext(eistr_hex, hex, Qbinary); | |
3501 eicpy_ext(eistr_ascii, ascii, Qbinary); | |
3502 | |
3425 | 3503 stderr_out ("%s %s", eidata(eistr_ascii), eidata(eistr_hex)); |
428 | 3504 } |
3505 | |
771 | 3506 #endif /* DEBUG_XEMACS */ |
3507 | |
3508 /* Attempt to determine the encoding of the given text. Before calling | |
3509 this function for the first time, you must zero out the detection state. | |
428 | 3510 |
3511 Returns: | |
3512 | |
771 | 3513 0 == keep going |
3514 1 == stop | |
428 | 3515 */ |
3516 | |
3517 static int | |
771 | 3518 detect_coding_type (struct detection_state *st, const UExtbyte *src, |
3519 Bytecount n) | |
428 | 3520 { |
771 | 3521 Bytecount n2 = n; |
3522 const UExtbyte *src2 = src; | |
3523 int i; | |
3524 | |
3525 #ifdef DEBUG_XEMACS | |
3526 if (!NILP (Vdebug_coding_detection)) | |
3527 { | |
3528 int bytes = min (16, n); | |
3529 stderr_out ("detect_coding_type: processing %ld bytes\n", n); | |
3530 stderr_out ("First %d: ", bytes); | |
3531 output_bytes_in_ascii_and_hex (src, bytes); | |
3532 stderr_out ("\nLast %d: ", bytes); | |
3533 output_bytes_in_ascii_and_hex (src + n - bytes, bytes); | |
3534 stderr_out ("\n"); | |
3535 } | |
3536 #endif /* DEBUG_XEMACS */ | |
428 | 3537 if (!st->seen_non_ascii) |
3538 { | |
771 | 3539 for (; n2; n2--, src2++) |
428 | 3540 { |
771 | 3541 UExtbyte c = *src2; |
428 | 3542 if ((c < 0x20 && !acceptable_control_char_p (c)) || c >= 0x80) |
3543 { | |
3544 st->seen_non_ascii = 1; | |
3545 break; | |
3546 } | |
3547 } | |
3548 } | |
3549 | |
771 | 3550 for (i = 0; i < coding_detector_count; i++) |
3551 Dynarr_at (all_coding_detectors, i).detect_method (st, src, n); | |
3552 | |
3553 st->bytes_seen += n; | |
3554 | |
3555 #ifdef DEBUG_XEMACS | |
3556 if (!NILP (Vdebug_coding_detection)) | |
3557 { | |
3558 stderr_out ("seen_non_ascii: %d\n", st->seen_non_ascii); | |
1494 | 3559 if (coding_detector_category_count <= 0) |
3560 stderr_out ("found %d detector categories\n", | |
3561 coding_detector_category_count); | |
771 | 3562 for (i = 0; i < coding_detector_category_count; i++) |
3563 stderr_out_lisp | |
3564 ("%s: %s\n", | |
3565 2, | |
3566 coding_category_id_to_symbol (i), | |
3567 detection_result_number_to_symbol ((enum detection_result) | |
3568 st->categories[i])); | |
3569 } | |
3570 #endif /* DEBUG_XEMACS */ | |
3571 | |
3572 { | |
3573 int not_unlikely = 0; | |
3574 int retval; | |
3575 | |
3576 for (i = 0; i < coding_detector_category_count; i++) | |
3577 if (st->categories[i] >= 0) | |
3578 not_unlikely++; | |
3579 | |
3580 retval = (not_unlikely <= 1 | |
3581 #if 0 /* this is bogus */ | |
3582 || st->bytes_seen >= MAX_BYTES_PROCESSED_FOR_DETECTION | |
428 | 3583 #endif |
771 | 3584 ); |
3585 | |
3586 #ifdef DEBUG_XEMACS | |
3587 if (!NILP (Vdebug_coding_detection)) | |
3588 stderr_out ("detect_coding_type: returning %d (%s)\n", | |
3589 retval, retval ? "stop" : "keep going"); | |
3590 #endif /* DEBUG_XEMACS */ | |
3591 | |
3592 return retval; | |
428 | 3593 } |
3594 } | |
3595 | |
3596 static Lisp_Object | |
771 | 3597 detected_coding_system (struct detection_state *st) |
428 | 3598 { |
771 | 3599 int i; |
3600 int even = 1; | |
3601 | |
3602 if (st->seen_non_ascii) | |
3603 { | |
3604 for (i = 0; i < coding_detector_category_count; i++) | |
3605 if (st->categories[i] != DET_AS_LIKELY_AS_UNLIKELY) | |
3606 { | |
3607 even = 0; | |
3608 break; | |
3609 } | |
3610 } | |
3611 | |
3612 /* #### Here we are ignoring the results of detection when it's all | |
3613 ASCII. This is obviously a bad thing. But we need to fix up the | |
3614 existing detection methods somewhat before we can switch. */ | |
3615 if (even) | |
428 | 3616 { |
3617 /* If the file was entirely or basically ASCII, use the | |
3618 default value of `buffer-file-coding-system'. */ | |
3619 Lisp_Object retval = | |
3620 XBUFFER (Vbuffer_defaults)->buffer_file_coding_system; | |
3621 if (!NILP (retval)) | |
3622 { | |
771 | 3623 retval = find_coding_system_for_text_file (retval, 0); |
428 | 3624 if (NILP (retval)) |
3625 { | |
3626 warn_when_safe | |
3627 (Qbad_variable, Qwarning, | |
3628 "Invalid `default-buffer-file-coding-system', set to nil"); | |
3629 XBUFFER (Vbuffer_defaults)->buffer_file_coding_system = Qnil; | |
3630 } | |
3631 } | |
3632 if (NILP (retval)) | |
4100 | 3633 retval = Fget_coding_system (Qbinary); |
428 | 3634 return retval; |
3635 } | |
3636 else | |
3637 { | |
771 | 3638 int likelihood; |
3639 Lisp_Object retval = Qnil; | |
3640 | |
3641 /* Look through the coding categories first by likelihood and then by | |
3642 priority and find the first one that is allowed. */ | |
3643 | |
3644 for (likelihood = DET_HIGHEST; likelihood >= DET_LOWEST; likelihood--) | |
428 | 3645 { |
771 | 3646 for (i = 0; i < coding_detector_category_count; i++) |
3647 { | |
3648 int cat = coding_category_by_priority[i]; | |
3649 if (st->categories[cat] == likelihood && | |
3650 !NILP (coding_category_system[cat])) | |
3651 { | |
3652 retval = (get_coding_system_for_text_file | |
3653 (coding_category_system[cat], 0)); | |
3654 if (likelihood < DET_AS_LIKELY_AS_UNLIKELY) | |
3655 warn_when_safe_lispobj | |
3656 (intern ("detection"), | |
793 | 3657 Qwarning, |
771 | 3658 emacs_sprintf_string_lisp |
3659 ( | |
3660 "Detected coding %s is unlikely to be correct (likelihood == `%s')", | |
3661 Qnil, 2, XCODING_SYSTEM_NAME (retval), | |
3662 detection_result_number_to_symbol | |
3663 ((enum detection_result) likelihood))); | |
3664 return retval; | |
3665 } | |
3666 } | |
428 | 3667 } |
771 | 3668 |
3669 return Fget_coding_system (Qraw_text); | |
428 | 3670 } |
3671 } | |
3672 | |
1347 | 3673 /* Look for a coding system in the string (skipping over leading |
3674 blanks). If found, return it, otherwise nil. */ | |
3675 | |
3676 static Lisp_Object | |
2531 | 3677 snarf_coding_system (const UExtbyte *p, Bytecount len) |
1347 | 3678 { |
3679 Bytecount n; | |
2531 | 3680 UExtbyte *name; |
1347 | 3681 |
3682 while (*p == ' ' || *p == '\t') p++, len--; | |
3683 len = min (len, 1000); | |
3684 name = alloca_ibytes (len + 1); | |
3685 memcpy (name, p, len); | |
3686 name[len] = '\0'; | |
3687 | |
3688 /* Get coding system name */ | |
3689 /* Characters valid in a MIME charset name (rfc 1521), | |
3690 and in a Lisp symbol name. */ | |
3691 n = qxestrspn (name, | |
3692 "ABCDEFGHIJKLMNOPQRSTUVWXYZ" | |
3693 "abcdefghijklmnopqrstuvwxyz" | |
3694 "0123456789" | |
3695 "!$%&*+-.^_{|}~"); | |
3696 if (n > 0) | |
3697 { | |
3698 name[n] = '\0'; | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
3699 /* This call to intern_istring() is OK because we already verified that |
2531 | 3700 there are only ASCII characters in the string */ |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
3701 return find_coding_system_for_text_file (intern_istring ((Ibyte *) name), 0); |
1347 | 3702 } |
3703 | |
3704 return Qnil; | |
3705 } | |
3706 | |
428 | 3707 /* Given a seekable read stream and potential coding system and EOL type |
3708 as specified, do any autodetection that is called for. If the | |
3709 coding system and/or EOL type are not `autodetect', they will be left | |
3710 alone; but this function will never return an autodetect coding system | |
3711 or EOL type. | |
3712 | |
3713 This function does not automatically fetch subsidiary coding systems; | |
3714 that should be unnecessary with the explicit eol-type argument. */ | |
3715 | |
3716 #define LENGTH(string_constant) (sizeof (string_constant) - 1) | |
3717 | |
771 | 3718 static Lisp_Object |
3719 unwind_free_detection_state (Lisp_Object opaque) | |
3720 { | |
3721 struct detection_state *st = | |
3722 (struct detection_state *) get_opaque_ptr (opaque); | |
3723 free_detection_state (st); | |
3724 free_opaque_ptr (opaque); | |
3725 return Qnil; | |
3726 } | |
3727 | |
1347 | 3728 /* #### This duplicates code in `find-coding-system-magic-cookie-in-file' |
3729 in files.el. Look into combining them. */ | |
3730 | |
771 | 3731 static Lisp_Object |
3732 look_for_coding_system_magic_cookie (const UExtbyte *data, Bytecount len) | |
428 | 3733 { |
771 | 3734 const UExtbyte *p; |
3735 const UExtbyte *scan_end; | |
2531 | 3736 Bytecount cookie_len; |
771 | 3737 |
3738 /* Look for initial "-*-"; mode line prefix */ | |
3739 for (p = data, | |
3740 scan_end = data + len - LENGTH ("-*-coding:?-*-"); | |
3741 p <= scan_end | |
3742 && *p != '\n' | |
3743 && *p != '\r'; | |
3744 p++) | |
3745 if (*p == '-' && *(p+1) == '*' && *(p+2) == '-') | |
3746 { | |
3747 const UExtbyte *local_vars_beg = p + 3; | |
3748 /* Look for final "-*-"; mode line suffix */ | |
3749 for (p = local_vars_beg, | |
3750 scan_end = data + len - LENGTH ("-*-"); | |
3751 p <= scan_end | |
428 | 3752 && *p != '\n' |
3753 && *p != '\r'; | |
771 | 3754 p++) |
3755 if (*p == '-' && *(p+1) == '*' && *(p+2) == '-') | |
3756 { | |
3757 const UExtbyte *suffix = p; | |
3758 /* Look for "coding:" */ | |
3759 for (p = local_vars_beg, | |
3760 scan_end = suffix - LENGTH ("coding:?"); | |
3761 p <= scan_end; | |
3762 p++) | |
3763 if (memcmp ("coding:", p, LENGTH ("coding:")) == 0 | |
3764 && (p == local_vars_beg | |
3765 || (*(p-1) == ' ' || | |
3766 *(p-1) == '\t' || | |
3767 *(p-1) == ';'))) | |
3768 { | |
3769 p += LENGTH ("coding:"); | |
1347 | 3770 return snarf_coding_system (p, suffix - p); |
771 | 3771 break; |
3772 } | |
3773 break; | |
3774 } | |
3775 break; | |
3776 } | |
3777 | |
2531 | 3778 /* Look for ;;;###coding system */ |
3779 | |
3780 cookie_len = LENGTH (";;;###coding system: "); | |
3781 | |
3782 for (p = data, | |
3783 scan_end = data + len - cookie_len; | |
3784 p <= scan_end; | |
3785 p++) | |
1347 | 3786 { |
2531 | 3787 if (*p == ';' && !memcmp (p, ";;;###coding system: ", cookie_len)) |
3788 { | |
3789 const UExtbyte *suffix; | |
3790 | |
3791 p += cookie_len; | |
3792 suffix = p; | |
3793 while (suffix < scan_end && !isspace (*suffix)) | |
3794 suffix++; | |
3795 return snarf_coding_system (p, suffix - p); | |
3796 } | |
1347 | 3797 } |
3798 | |
3799 return Qnil; | |
771 | 3800 } |
3801 | |
3802 static Lisp_Object | |
3803 determine_real_coding_system (Lstream *stream) | |
3804 { | |
3805 struct detection_state *st = allocate_detection_state (); | |
3806 int depth = record_unwind_protect (unwind_free_detection_state, | |
3807 make_opaque_ptr (st)); | |
3808 UExtbyte buf[4096]; | |
3809 Bytecount nread = Lstream_read (stream, buf, sizeof (buf)); | |
3810 Lisp_Object coding_system = look_for_coding_system_magic_cookie (buf, nread); | |
3811 | |
3812 if (NILP (coding_system)) | |
3813 { | |
3814 while (1) | |
3815 { | |
3816 if (detect_coding_type (st, buf, nread)) | |
428 | 3817 break; |
771 | 3818 nread = Lstream_read (stream, buf, sizeof (buf)); |
3819 if (nread == 0) | |
3820 break; | |
428 | 3821 } |
771 | 3822 |
3823 coding_system = detected_coding_system (st); | |
428 | 3824 } |
3825 | |
3826 Lstream_rewind (stream); | |
771 | 3827 |
3828 unbind_to (depth); | |
3829 return coding_system; | |
3830 } | |
3831 | |
3832 static void | |
3833 undecided_init_coding_stream (struct coding_stream *str) | |
3834 { | |
3835 struct undecided_coding_stream *data = | |
3836 CODING_STREAM_TYPE_DATA (str, undecided); | |
3837 struct undecided_coding_system *csdata = | |
3838 XCODING_SYSTEM_TYPE_DATA (str->codesys, undecided); | |
3839 | |
3840 data->actual = Qnil; | |
3841 | |
3842 if (str->direction == CODING_DECODE) | |
3843 { | |
3844 Lstream *lst = str->other_end; | |
3845 | |
3846 if ((lst->flags & LSTREAM_FL_READ) && | |
3847 Lstream_seekable_p (lst) && | |
3848 csdata->do_coding) | |
3849 /* We can determine the coding system now. */ | |
3850 data->actual = determine_real_coding_system (lst); | |
3851 } | |
1494 | 3852 |
3853 #ifdef DEBUG_XEMACS | |
3854 if (!NILP (Vdebug_coding_detection)) | |
3855 stderr_out_lisp ("detected coding system: %s\n", 1, data->actual); | |
3856 #endif /* DEBUG_XEMACS */ | |
771 | 3857 } |
3858 | |
3859 static void | |
3860 undecided_rewind_coding_stream (struct coding_stream *str) | |
3861 { | |
3862 chain_rewind_coding_stream_1 (&CODING_STREAM_TYPE_DATA (str, undecided)->c); | |
3863 } | |
3864 | |
3865 static void | |
3866 undecided_finalize_coding_stream (struct coding_stream *str) | |
3867 { | |
3868 struct undecided_coding_stream *data = | |
3869 CODING_STREAM_TYPE_DATA (str, undecided); | |
3870 | |
3871 chain_finalize_coding_stream_1 | |
3872 (&CODING_STREAM_TYPE_DATA (str, undecided)->c); | |
3873 if (data->st) | |
3874 free_detection_state (data->st); | |
3875 } | |
3876 | |
3877 static Lisp_Object | |
3878 undecided_canonicalize (Lisp_Object codesys) | |
3879 { | |
3880 struct undecided_coding_system *csdata = | |
3881 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
3882 if (!csdata->do_eol && !csdata->do_coding) | |
3883 return NILP (csdata->cs) ? Fget_coding_system (Qbinary) : csdata->cs; | |
3884 if (csdata->do_eol && !csdata->do_coding && NILP (csdata->cs)) | |
3885 return Fget_coding_system (Qconvert_eol_autodetect); | |
3886 return codesys; | |
3887 } | |
3888 | |
3889 static Bytecount | |
3890 undecided_convert (struct coding_stream *str, const UExtbyte *src, | |
3891 unsigned_char_dynarr *dst, Bytecount n) | |
3892 { | |
3893 int first_time = 0; | |
3894 | |
3895 if (str->direction == CODING_DECODE) | |
3896 { | |
3897 /* At this point, we have only the following possibilities: | |
3898 | |
3899 do_eol && do_coding | |
3900 do_coding only | |
3901 do_eol only and a coding system was specified | |
3902 | |
3903 Other possibilities are removed during undecided_canonicalize. | |
3904 | |
3905 Therefore, our substreams are either | |
3906 | |
3907 lstream_coding -> lstream_dynarr, or | |
3908 lstream_coding -> lstream_eol -> lstream_dynarr. | |
3909 */ | |
3910 struct undecided_coding_system *csdata = | |
3911 XCODING_SYSTEM_TYPE_DATA (str->codesys, undecided); | |
3912 struct undecided_coding_stream *data = | |
3913 CODING_STREAM_TYPE_DATA (str, undecided); | |
3914 | |
3915 if (str->eof) | |
3916 { | |
3917 /* Each will close the next. We need to close now because more | |
3918 data may be generated. */ | |
3919 if (data->c.initted) | |
3920 Lstream_close (XLSTREAM (data->c.lstreams[0])); | |
3921 return n; | |
3922 } | |
3923 | |
3924 if (!data->c.initted) | |
3925 { | |
3926 data->c.lstream_count = csdata->do_eol ? 3 : 2; | |
3927 data->c.lstreams = xnew_array (Lisp_Object, data->c.lstream_count); | |
3928 | |
3929 data->c.lstreams[data->c.lstream_count - 1] = | |
3930 make_dynarr_output_stream (dst); | |
3931 Lstream_set_buffering | |
3932 (XLSTREAM (data->c.lstreams[data->c.lstream_count - 1]), | |
3933 LSTREAM_UNBUFFERED, 0); | |
3934 if (csdata->do_eol) | |
3935 { | |
3936 data->c.lstreams[1] = | |
3937 make_coding_output_stream | |
3938 (XLSTREAM (data->c.lstreams[data->c.lstream_count - 1]), | |
3939 Fget_coding_system (Qconvert_eol_autodetect), | |
800 | 3940 CODING_DECODE, 0); |
771 | 3941 Lstream_set_buffering |
3942 (XLSTREAM (data->c.lstreams[1]), | |
3943 LSTREAM_UNBUFFERED, 0); | |
3944 } | |
3945 | |
3946 data->c.lstreams[0] = | |
3947 make_coding_output_stream | |
3948 (XLSTREAM (data->c.lstreams[1]), | |
3949 /* Substitute binary if we need to detect the encoding */ | |
3950 csdata->do_coding ? Qbinary : csdata->cs, | |
800 | 3951 CODING_DECODE, 0); |
771 | 3952 Lstream_set_buffering (XLSTREAM (data->c.lstreams[0]), |
3953 LSTREAM_UNBUFFERED, 0); | |
3954 | |
3955 first_time = 1; | |
3956 data->c.initted = 1; | |
3957 } | |
3958 | |
3959 /* If necessary, do encoding-detection now. We do this when we're a | |
3960 writing stream or a non-seekable reading stream, meaning that we | |
3961 can't just process the whole input, rewind, and start over. */ | |
3962 | |
3963 if (csdata->do_coding) | |
3964 { | |
3965 int actual_was_nil = NILP (data->actual); | |
3966 if (NILP (data->actual)) | |
3967 { | |
3968 if (!data->st) | |
3969 data->st = allocate_detection_state (); | |
3970 if (first_time) | |
3971 /* #### This is cheesy. What we really ought to do is buffer | |
3972 up a certain minimum amount of data to get a better result. | |
3973 */ | |
3974 data->actual = look_for_coding_system_magic_cookie (src, n); | |
3975 if (NILP (data->actual)) | |
3976 { | |
3977 /* #### This is cheesy. What we really ought to do is buffer | |
3978 up a certain minimum amount of data so as to get a less | |
3979 random result when doing subprocess detection. */ | |
3980 detect_coding_type (data->st, src, n); | |
3981 data->actual = detected_coding_system (data->st); | |
4100 | 3982 /* kludge to prevent infinite recursion */ |
3983 if (XCODING_SYSTEM(data->actual)->methods->enumtype == undecided_coding_system) | |
3984 data->actual = Fget_coding_system (Qbinary); | |
771 | 3985 } |
3986 } | |
3987 /* We need to set the detected coding system if we actually have | |
3988 such a coding system but didn't before. That is the case | |
3989 either when we just detected it in the previous code or when | |
3990 it was detected during undecided_init_coding_stream(). We | |
3991 can check for that using first_time. */ | |
3992 if (!NILP (data->actual) && (actual_was_nil || first_time)) | |
3993 { | |
3994 /* If the detected coding system doesn't allow for EOL | |
3995 autodetection, try to get the equivalent that does; | |
3996 otherwise, disable EOL detection (overriding whatever | |
3997 may already have been detected). */ | |
3998 if (XCODING_SYSTEM_EOL_TYPE (data->actual) != EOL_AUTODETECT) | |
3999 { | |
4000 if (!NILP (XCODING_SYSTEM_SUBSIDIARY_PARENT (data->actual))) | |
4001 data->actual = | |
4002 XCODING_SYSTEM_SUBSIDIARY_PARENT (data->actual); | |
4003 else if (data->c.lstream_count == 3) | |
4004 set_coding_stream_coding_system | |
4005 (XLSTREAM (data->c.lstreams[1]), | |
4006 Fget_coding_system (Qidentity)); | |
4007 } | |
4008 set_coding_stream_coding_system | |
4009 (XLSTREAM (data->c.lstreams[0]), data->actual); | |
4010 } | |
4011 } | |
4012 | |
4013 if (Lstream_write (XLSTREAM (data->c.lstreams[0]), src, n) < 0) | |
4014 return -1; | |
4015 return n; | |
4016 } | |
4017 else | |
4018 return no_conversion_convert (str, src, dst, n); | |
4019 } | |
4020 | |
4021 static Lisp_Object | |
4022 undecided_canonicalize_after_coding (struct coding_stream *str) | |
4023 { | |
4024 struct undecided_coding_stream *data = | |
4025 CODING_STREAM_TYPE_DATA (str, undecided); | |
4026 Lisp_Object ret, eolret; | |
4027 | |
4028 if (str->direction == CODING_ENCODE) | |
4029 return str->codesys; | |
4030 | |
4031 if (!data->c.initted) | |
4647
e4ed58cb0e5b
Fix bugs with #'find-file, 0-length files, & coding-system-for-read specified.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
4032 return str->codesys; |
771 | 4033 |
4034 ret = coding_stream_canonicalize_after_coding | |
4035 (XLSTREAM (data->c.lstreams[0])); | |
4036 if (NILP (ret)) | |
4647
e4ed58cb0e5b
Fix bugs with #'find-file, 0-length files, & coding-system-for-read specified.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
4037 ret = str->codesys; |
771 | 4038 if (XCODING_SYSTEM_EOL_TYPE (ret) != EOL_AUTODETECT) |
4039 return ret; | |
4040 eolret = coding_stream_canonicalize_after_coding | |
4041 (XLSTREAM (data->c.lstreams[1])); | |
4042 if (!EQ (XCODING_SYSTEM_TYPE (eolret), Qconvert_eol)) | |
4043 return ret; | |
4044 return | |
4045 Fsubsidiary_coding_system (ret, Fcoding_system_property (eolret, | |
4046 Qsubtype)); | |
4047 } | |
4048 | |
4049 | |
4050 /************************************************************************/ | |
4051 /* Lisp interface: Coding category functions and detection */ | |
4052 /************************************************************************/ | |
4053 | |
4054 DEFUN ("coding-category-list", Fcoding_category_list, 0, 0, 0, /* | |
4055 Return a list of all recognized coding categories. | |
4056 */ | |
4057 ()) | |
4058 { | |
4059 int i; | |
4060 Lisp_Object list = Qnil; | |
4061 | |
4062 for (i = 0; i < coding_detector_count; i++) | |
4063 { | |
4064 detector_category_dynarr *cats = | |
4065 Dynarr_at (all_coding_detectors, i).cats; | |
4066 int j; | |
4067 | |
4068 for (j = 0; j < Dynarr_length (cats); j++) | |
4069 list = Fcons (Dynarr_at (cats, j).sym, list); | |
4070 } | |
4071 | |
4072 return Fnreverse (list); | |
4073 } | |
4074 | |
4075 DEFUN ("set-coding-priority-list", Fset_coding_priority_list, 1, 1, 0, /* | |
4076 Change the priority order of the coding categories. | |
4077 LIST should be list of coding categories, in descending order of | |
4078 priority. Unspecified coding categories will be lower in priority | |
4079 than all specified ones, in the same relative order they were in | |
4080 previously. | |
4081 */ | |
4082 (list)) | |
4083 { | |
4084 int *category_to_priority = | |
4085 alloca_array (int, coding_detector_category_count); | |
4086 int i, j; | |
4087 | |
4088 /* First generate a list that maps coding categories to priorities. */ | |
4089 | |
4090 for (i = 0; i < coding_detector_category_count; i++) | |
4091 category_to_priority[i] = -1; | |
4092 | |
4093 /* Highest priority comes from the specified list. */ | |
4094 i = 0; | |
2367 | 4095 { |
4096 EXTERNAL_LIST_LOOP_2 (elt, list) | |
4097 { | |
4098 int cat = coding_category_symbol_to_id (elt); | |
4099 | |
4100 if (category_to_priority[cat] >= 0) | |
4101 sferror ("Duplicate coding category in list", elt); | |
4102 category_to_priority[cat] = i++; | |
4103 } | |
4104 } | |
771 | 4105 |
4106 /* Now go through the existing categories by priority to retrieve | |
4107 the categories not yet specified and preserve their priority | |
4108 order. */ | |
4109 for (j = 0; j < coding_detector_category_count; j++) | |
4110 { | |
4111 int cat = coding_category_by_priority[j]; | |
4112 if (category_to_priority[cat] < 0) | |
4113 category_to_priority[cat] = i++; | |
4114 } | |
4115 | |
4116 /* Now we need to construct the inverse of the mapping we just | |
4117 constructed. */ | |
4118 | |
4119 for (i = 0; i < coding_detector_category_count; i++) | |
4120 coding_category_by_priority[category_to_priority[i]] = i; | |
4121 | |
4122 /* Phew! That was confusing. */ | |
4123 return Qnil; | |
4124 } | |
4125 | |
4126 DEFUN ("coding-priority-list", Fcoding_priority_list, 0, 0, 0, /* | |
4127 Return a list of coding categories in descending order of priority. | |
4128 */ | |
4129 ()) | |
4130 { | |
4131 int i; | |
4132 Lisp_Object list = Qnil; | |
4133 | |
4134 for (i = 0; i < coding_detector_category_count; i++) | |
4135 list = | |
4136 Fcons (coding_category_id_to_symbol (coding_category_by_priority[i]), | |
4137 list); | |
4138 return Fnreverse (list); | |
4139 } | |
4140 | |
4141 DEFUN ("set-coding-category-system", Fset_coding_category_system, 2, 2, 0, /* | |
4142 Change the coding system associated with a coding category. | |
4143 */ | |
4144 (coding_category, coding_system)) | |
4145 { | |
4146 coding_category_system[coding_category_symbol_to_id (coding_category)] = | |
4147 Fget_coding_system (coding_system); | |
4148 return Qnil; | |
4149 } | |
4150 | |
4151 DEFUN ("coding-category-system", Fcoding_category_system, 1, 1, 0, /* | |
4152 Return the coding system associated with a coding category. | |
4153 */ | |
4154 (coding_category)) | |
4155 { | |
4156 Lisp_Object sys = | |
4157 coding_category_system[coding_category_symbol_to_id (coding_category)]; | |
4158 | |
4159 if (!NILP (sys)) | |
4160 return XCODING_SYSTEM_NAME (sys); | |
4161 return Qnil; | |
4162 } | |
4163 | |
800 | 4164 /* Detect the encoding of STREAM. Assumes stream is at the begnning and will |
4165 read through to the end of STREAM, leaving it there but open. */ | |
4166 | |
771 | 4167 Lisp_Object |
4168 detect_coding_stream (Lisp_Object stream) | |
4169 { | |
4170 Lisp_Object val = Qnil; | |
4171 struct gcpro gcpro1, gcpro2, gcpro3; | |
4172 UExtbyte random_buffer[65536]; | |
4173 Lisp_Object binary_instream = | |
4174 make_coding_input_stream | |
4175 (XLSTREAM (stream), Qbinary, | |
814 | 4176 CODING_ENCODE, LSTREAM_FL_NO_CLOSE_OTHER); |
771 | 4177 Lisp_Object decstream = |
4178 make_coding_input_stream | |
4179 (XLSTREAM (binary_instream), | |
800 | 4180 Qundecided, CODING_DECODE, 0); |
771 | 4181 Lstream *decstr = XLSTREAM (decstream); |
4182 | |
4183 GCPRO3 (decstream, stream, binary_instream); | |
4184 /* Read and discard all data; detection happens as a side effect of this, | |
4185 and we examine what was detected afterwards. */ | |
4186 while (Lstream_read (decstr, random_buffer, sizeof (random_buffer)) > 0) | |
4187 ; | |
4188 | |
4189 val = coding_stream_detected_coding_system (decstr); | |
4190 Lstream_close (decstr); | |
4191 Lstream_delete (decstr); | |
4192 Lstream_delete (XLSTREAM (binary_instream)); | |
4193 UNGCPRO; | |
4194 return val; | |
428 | 4195 } |
4196 | |
4197 DEFUN ("detect-coding-region", Fdetect_coding_region, 2, 3, 0, /* | |
4198 Detect coding system of the text in the region between START and END. | |
444 | 4199 Return a list of possible coding systems ordered by priority. |
3025 | 4200 If only ASCII characters are found, return `undecided' or one of |
428 | 4201 its subsidiary coding systems according to a detected end-of-line |
4202 type. Optional arg BUFFER defaults to the current buffer. | |
4203 */ | |
4204 (start, end, buffer)) | |
4205 { | |
4206 Lisp_Object val = Qnil; | |
4207 struct buffer *buf = decode_buffer (buffer, 0); | |
665 | 4208 Charbpos b, e; |
771 | 4209 Lisp_Object lb_instream; |
428 | 4210 |
4211 get_buffer_range_char (buf, start, end, &b, &e, 0); | |
4212 lb_instream = make_lisp_buffer_input_stream (buf, b, e, 0); | |
771 | 4213 |
4214 val = detect_coding_stream (lb_instream); | |
4215 Lstream_delete (XLSTREAM (lb_instream)); | |
428 | 4216 return val; |
4217 } | |
4218 | |
4219 | |
771 | 4220 |
4221 #ifdef DEBUG_XEMACS | |
4222 | |
428 | 4223 /************************************************************************/ |
771 | 4224 /* Internal methods */ |
4225 /************************************************************************/ | |
4226 | |
4227 /* Raw (internally-formatted) data. */ | |
4228 DEFINE_CODING_SYSTEM_TYPE (internal); | |
428 | 4229 |
665 | 4230 static Bytecount |
2286 | 4231 internal_convert (struct coding_stream *UNUSED (str), const UExtbyte *src, |
771 | 4232 unsigned_char_dynarr *dst, Bytecount n) |
4233 { | |
4234 Bytecount orign = n; | |
4235 Dynarr_add_many (dst, src, n); | |
4236 return orign; | |
4237 } | |
4238 | |
4239 #endif /* DEBUG_XEMACS */ | |
4240 | |
4241 | |
4242 | |
4243 #ifdef HAVE_ZLIB | |
4244 | |
4245 /************************************************************************/ | |
4246 /* Gzip methods */ | |
4247 /************************************************************************/ | |
4248 | |
4249 struct gzip_coding_system | |
428 | 4250 { |
771 | 4251 int level; /* 0 through 9, or -1 for default */ |
4252 }; | |
4253 | |
4254 #define CODING_SYSTEM_GZIP_LEVEL(codesys) \ | |
4255 (CODING_SYSTEM_TYPE_DATA (codesys, gzip)->level) | |
4256 #define XCODING_SYSTEM_GZIP_LEVEL(codesys) \ | |
4257 (XCODING_SYSTEM_TYPE_DATA (codesys, gzip)->level) | |
4258 | |
4259 struct gzip_coding_stream | |
428 | 4260 { |
771 | 4261 z_stream stream; |
4262 int stream_initted; | |
4263 int reached_eof; /* #### this should be handled by the caller, once we | |
4264 return LSTREAM_EOF */ | |
4265 }; | |
4266 | |
1204 | 4267 static const struct memory_description |
771 | 4268 gzip_coding_system_description[] = { |
4269 { XD_END } | |
4270 }; | |
4271 | |
1204 | 4272 DEFINE_CODING_SYSTEM_TYPE_WITH_DATA (gzip); |
4273 | |
771 | 4274 enum source_sink_type |
4275 gzip_conversion_end_type (Lisp_Object codesys) | |
4276 { | |
4277 return DECODES_BYTE_TO_BYTE; | |
428 | 4278 } |
4279 | |
4280 static void | |
771 | 4281 gzip_init (Lisp_Object codesys) |
4282 { | |
4283 struct gzip_coding_system *data = XCODING_SYSTEM_TYPE_DATA (codesys, gzip); | |
4284 data->level = -1; | |
4285 } | |
4286 | |
4287 static void | |
4288 gzip_print (Lisp_Object cs, Lisp_Object printcharfun, int escapeflag) | |
428 | 4289 { |
771 | 4290 struct gzip_coding_system *data = XCODING_SYSTEM_TYPE_DATA (cs, gzip); |
4291 | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4292 write_ascstring (printcharfun, "("); |
771 | 4293 if (data->level == -1) |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4294 write_ascstring (printcharfun, "default"); |
771 | 4295 else |
4296 print_internal (make_int (data->level), printcharfun, 0); | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4297 write_ascstring (printcharfun, ")"); |
428 | 4298 } |
4299 | |
4300 static int | |
771 | 4301 gzip_putprop (Lisp_Object codesys, Lisp_Object key, Lisp_Object value) |
428 | 4302 { |
771 | 4303 struct gzip_coding_system *data = XCODING_SYSTEM_TYPE_DATA (codesys, gzip); |
4304 | |
4305 if (EQ (key, Qlevel)) | |
428 | 4306 { |
771 | 4307 if (EQ (value, Qdefault)) |
4308 data->level = -1; | |
4309 else | |
428 | 4310 { |
5307
c096d8051f89
Have NATNUMP give t for positive bignums; check limits appropriately.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5191
diff
changeset
|
4311 check_integer_range (value, Qzero, make_int (9)); |
771 | 4312 data->level = XINT (value); |
428 | 4313 } |
4314 } | |
4315 else | |
771 | 4316 return 0; |
4317 return 1; | |
428 | 4318 } |
4319 | |
4320 static Lisp_Object | |
771 | 4321 gzip_getprop (Lisp_Object coding_system, Lisp_Object prop) |
428 | 4322 { |
771 | 4323 struct gzip_coding_system *data = |
4324 XCODING_SYSTEM_TYPE_DATA (coding_system, gzip); | |
4325 | |
4326 if (EQ (prop, Qlevel)) | |
428 | 4327 { |
771 | 4328 if (data->level == -1) |
4329 return Qdefault; | |
4330 return make_int (data->level); | |
428 | 4331 } |
771 | 4332 |
4333 return Qunbound; | |
428 | 4334 } |
4335 | |
4336 static void | |
771 | 4337 gzip_init_coding_stream (struct coding_stream *str) |
428 | 4338 { |
771 | 4339 struct gzip_coding_stream *data = CODING_STREAM_TYPE_DATA (str, gzip); |
4340 if (data->stream_initted) | |
428 | 4341 { |
771 | 4342 if (str->direction == CODING_DECODE) |
4343 inflateEnd (&data->stream); | |
4344 else | |
4345 deflateEnd (&data->stream); | |
4346 data->stream_initted = 0; | |
428 | 4347 } |
771 | 4348 data->reached_eof = 0; |
428 | 4349 } |
4350 | |
4351 static void | |
771 | 4352 gzip_rewind_coding_stream (struct coding_stream *str) |
428 | 4353 { |
771 | 4354 gzip_init_coding_stream (str); |
428 | 4355 } |
4356 | |
771 | 4357 static Bytecount |
4358 gzip_convert (struct coding_stream *str, | |
4359 const UExtbyte *src, | |
4360 unsigned_char_dynarr *dst, Bytecount n) | |
428 | 4361 { |
771 | 4362 struct gzip_coding_stream *data = CODING_STREAM_TYPE_DATA (str, gzip); |
4363 int zerr; | |
4364 if (str->direction == CODING_DECODE) | |
428 | 4365 { |
771 | 4366 if (data->reached_eof) |
4367 return n; /* eat the data */ | |
4368 | |
4369 if (!data->stream_initted) | |
428 | 4370 { |
771 | 4371 xzero (data->stream); |
4372 if (inflateInit (&data->stream) != Z_OK) | |
4373 return LSTREAM_ERROR; | |
4374 data->stream_initted = 1; | |
428 | 4375 } |
771 | 4376 |
4377 data->stream.next_in = (Bytef *) src; | |
4378 data->stream.avail_in = n; | |
4379 | |
4380 /* Normally we stop when we've fed all data to the decompressor; but | |
4381 if we're at the end of the input, and the decompressor hasn't | |
4382 reported EOF, we need to keep going, as there might be more output | |
4383 to generate. Z_OK from the decompressor means input was processed | |
4384 or output was generated; if neither, we break out of the loop. | |
4385 Other return values are: | |
4386 | |
4387 Z_STREAM_END EOF from decompressor | |
4388 Z_DATA_ERROR Corrupted data | |
4389 Z_BUF_ERROR No progress possible (this should happen if | |
4390 we try to feed it an incomplete file) | |
4391 Z_MEM_ERROR Out of memory | |
4392 Z_STREAM_ERROR (should never happen) | |
4393 Z_NEED_DICT (#### when will this happen?) | |
4394 */ | |
4395 while (data->stream.avail_in > 0 || str->eof) | |
4396 { | |
4397 /* Reserve an output buffer of the same size as the input buffer; | |
4398 if that's not enough, we keep reserving the same size. */ | |
4399 Bytecount reserved = n; | |
4400 Dynarr_add_many (dst, 0, reserved); | |
4401 /* Careful here! Don't retrieve the pointer until after | |
4402 reserving the space, or it might be bogus */ | |
4403 data->stream.next_out = | |
4404 Dynarr_atp (dst, Dynarr_length (dst) - reserved); | |
4405 data->stream.avail_out = reserved; | |
4406 zerr = inflate (&data->stream, Z_NO_FLUSH); | |
4407 /* Lop off the unused portion */ | |
5038 | 4408 Dynarr_set_lengthr (dst, Dynarr_length (dst) - data->stream.avail_out); |
771 | 4409 if (zerr != Z_OK) |
4410 break; | |
4411 } | |
4412 | |
4413 if (zerr == Z_STREAM_END) | |
4414 data->reached_eof = 1; | |
4415 | |
4416 if ((Bytecount) data->stream.avail_in < n) | |
4417 return n - data->stream.avail_in; | |
4418 | |
4419 if (zerr == Z_OK || zerr == Z_STREAM_END) | |
4420 return 0; | |
4421 | |
4422 return LSTREAM_ERROR; | |
428 | 4423 } |
4424 else | |
4425 { | |
771 | 4426 if (!data->stream_initted) |
4427 { | |
4428 int level = XCODING_SYSTEM_GZIP_LEVEL (str->codesys); | |
4429 xzero (data->stream); | |
4430 if (deflateInit (&data->stream, | |
4431 level == -1 ? Z_DEFAULT_COMPRESSION : level) != | |
4432 Z_OK) | |
4433 return LSTREAM_ERROR; | |
4434 data->stream_initted = 1; | |
428 | 4435 } |
771 | 4436 |
4437 data->stream.next_in = (Bytef *) src; | |
4438 data->stream.avail_in = n; | |
4439 | |
4440 /* Normally we stop when we've fed all data to the compressor; but if | |
4441 we're at the end of the input, and the compressor hasn't reported | |
4442 EOF, we need to keep going, as there might be more output to | |
4443 generate. (To signal EOF on our end, we set the FLUSH parameter | |
4444 to Z_FINISH; when all data is output, Z_STREAM_END will be | |
4445 returned.) Z_OK from the compressor means input was processed or | |
4446 output was generated; if neither, we break out of the loop. Other | |
4447 return values are: | |
4448 | |
4449 Z_STREAM_END EOF from compressor | |
4450 Z_BUF_ERROR No progress possible (should never happen) | |
4451 Z_STREAM_ERROR (should never happen) | |
4452 */ | |
4453 while (data->stream.avail_in > 0 || str->eof) | |
4454 { | |
4455 /* Reserve an output buffer of the same size as the input buffer; | |
4456 if that's not enough, we keep reserving the same size. */ | |
4457 Bytecount reserved = n; | |
4458 Dynarr_add_many (dst, 0, reserved); | |
4459 /* Careful here! Don't retrieve the pointer until after | |
4460 reserving the space, or it might be bogus */ | |
4461 data->stream.next_out = | |
4462 Dynarr_atp (dst, Dynarr_length (dst) - reserved); | |
4463 data->stream.avail_out = reserved; | |
4464 zerr = | |
4465 deflate (&data->stream, | |
4466 str->eof ? Z_FINISH : Z_NO_FLUSH); | |
4467 /* Lop off the unused portion */ | |
5038 | 4468 Dynarr_set_lengthr (dst, Dynarr_length (dst) - data->stream.avail_out); |
771 | 4469 if (zerr != Z_OK) |
4470 break; | |
4471 } | |
4472 | |
4473 if ((Bytecount) data->stream.avail_in < n) | |
4474 return n - data->stream.avail_in; | |
4475 | |
4476 if (zerr == Z_OK || zerr == Z_STREAM_END) | |
4477 return 0; | |
4478 | |
4479 return LSTREAM_ERROR; | |
428 | 4480 } |
4481 } | |
4482 | |
771 | 4483 #endif /* HAVE_ZLIB */ |
428 | 4484 |
4485 | |
4486 /************************************************************************/ | |
4487 /* Initialization */ | |
4488 /************************************************************************/ | |
4489 | |
4490 void | |
4491 syms_of_file_coding (void) | |
4492 { | |
5117
3742ea8250b5
Checking in final CVS version of workspace 'ben-lisp-object'
Ben Wing <ben@xemacs.org>
parents:
3025
diff
changeset
|
4493 INIT_LISP_OBJECT (coding_system); |
442 | 4494 |
771 | 4495 DEFSUBR (Fvalid_coding_system_type_p); |
4496 DEFSUBR (Fcoding_system_type_list); | |
428 | 4497 DEFSUBR (Fcoding_system_p); |
4303 | 4498 DEFSUBR (Fautoload_coding_system); |
428 | 4499 DEFSUBR (Ffind_coding_system); |
4500 DEFSUBR (Fget_coding_system); | |
4501 DEFSUBR (Fcoding_system_list); | |
4502 DEFSUBR (Fcoding_system_name); | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4503 DEFSUBR (Fmake_coding_system_internal); |
428 | 4504 DEFSUBR (Fcopy_coding_system); |
440 | 4505 DEFSUBR (Fcoding_system_canonical_name_p); |
4506 DEFSUBR (Fcoding_system_alias_p); | |
4507 DEFSUBR (Fcoding_system_aliasee); | |
428 | 4508 DEFSUBR (Fdefine_coding_system_alias); |
4509 DEFSUBR (Fsubsidiary_coding_system); | |
771 | 4510 DEFSUBR (Fcoding_system_base); |
4511 DEFSUBR (Fcoding_system_used_for_io); | |
428 | 4512 |
4513 DEFSUBR (Fcoding_system_type); | |
771 | 4514 DEFSUBR (Fcoding_system_description); |
428 | 4515 DEFSUBR (Fcoding_system_property); |
4516 | |
4517 DEFSUBR (Fcoding_category_list); | |
4518 DEFSUBR (Fset_coding_priority_list); | |
4519 DEFSUBR (Fcoding_priority_list); | |
4520 DEFSUBR (Fset_coding_category_system); | |
4521 DEFSUBR (Fcoding_category_system); | |
4522 | |
4523 DEFSUBR (Fdetect_coding_region); | |
4524 DEFSUBR (Fdecode_coding_region); | |
4525 DEFSUBR (Fencode_coding_region); | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4526 DEFSUBR (Fquery_coding_region); |
563 | 4527 DEFSYMBOL_MULTIWORD_PREDICATE (Qcoding_systemp); |
4528 DEFSYMBOL (Qno_conversion); | |
771 | 4529 DEFSYMBOL (Qconvert_eol); |
4530 DEFSYMBOL (Qconvert_eol_autodetect); | |
4531 DEFSYMBOL (Qconvert_eol_lf); | |
4532 DEFSYMBOL (Qconvert_eol_cr); | |
4533 DEFSYMBOL (Qconvert_eol_crlf); | |
563 | 4534 DEFSYMBOL (Qraw_text); |
771 | 4535 |
563 | 4536 DEFSYMBOL (Qmnemonic); |
4537 DEFSYMBOL (Qeol_type); | |
4538 DEFSYMBOL (Qpost_read_conversion); | |
4539 DEFSYMBOL (Qpre_write_conversion); | |
4540 | |
771 | 4541 DEFSYMBOL (Qtranslation_table_for_decode); |
4542 DEFSYMBOL (Qtranslation_table_for_encode); | |
4543 DEFSYMBOL (Qsafe_chars); | |
4544 DEFSYMBOL (Qsafe_charsets); | |
4545 DEFSYMBOL (Qmime_charset); | |
4546 DEFSYMBOL (Qvalid_codes); | |
4547 | |
563 | 4548 DEFSYMBOL (Qcr); |
4549 DEFSYMBOL (Qlf); | |
4550 DEFSYMBOL (Qcrlf); | |
4551 DEFSYMBOL (Qeol_cr); | |
4552 DEFSYMBOL (Qeol_lf); | |
4553 DEFSYMBOL (Qeol_crlf); | |
4554 DEFSYMBOL (Qencode); | |
4555 DEFSYMBOL (Qdecode); | |
428 | 4556 |
771 | 4557 DEFSYMBOL (Qnear_certainty); |
4558 DEFSYMBOL (Qquite_probable); | |
4559 DEFSYMBOL (Qsomewhat_likely); | |
1494 | 4560 DEFSYMBOL (Qslightly_likely); |
771 | 4561 DEFSYMBOL (Qas_likely_as_unlikely); |
4562 DEFSYMBOL (Qsomewhat_unlikely); | |
4563 DEFSYMBOL (Qquite_improbable); | |
4564 DEFSYMBOL (Qnearly_impossible); | |
4565 | |
4566 DEFSYMBOL (Qdo_eol); | |
4567 DEFSYMBOL (Qdo_coding); | |
4568 | |
4569 DEFSYMBOL (Qcanonicalize_after_coding); | |
4570 | |
4303 | 4571 DEFSYMBOL (Qposix_charset_to_coding_system_hash); |
4572 | |
771 | 4573 DEFSYMBOL (Qescape_quoted); |
4574 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4575 DEFSYMBOL (Qquery_coding_warning_face); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4576 DEFSYMBOL (Qaliases); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4577 DEFSYMBOL (Qcharset_skip_chars_string); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4578 |
771 | 4579 #ifdef HAVE_ZLIB |
4580 DEFSYMBOL (Qgzip); | |
4581 #endif | |
4582 | |
428 | 4583 } |
4584 | |
4585 void | |
4586 lstream_type_create_file_coding (void) | |
4587 { | |
771 | 4588 LSTREAM_HAS_METHOD (coding, reader); |
4589 LSTREAM_HAS_METHOD (coding, writer); | |
4590 LSTREAM_HAS_METHOD (coding, rewinder); | |
4591 LSTREAM_HAS_METHOD (coding, seekable_p); | |
4592 LSTREAM_HAS_METHOD (coding, marker); | |
4593 LSTREAM_HAS_METHOD (coding, flusher); | |
4594 LSTREAM_HAS_METHOD (coding, closer); | |
4595 LSTREAM_HAS_METHOD (coding, finalizer); | |
4596 } | |
4597 | |
4598 void | |
4599 coding_system_type_create (void) | |
4600 { | |
4601 int i; | |
4602 | |
4603 staticpro (&Vcoding_system_hash_table); | |
4604 Vcoding_system_hash_table = | |
5191
71ee43b8a74d
Add #'equalp as a hash test by default; add #'define-hash-table-test, GNU API
Aidan Kehoe <kehoea@parhasard.net>
parents:
5169
diff
changeset
|
4605 make_lisp_hash_table (50, HASH_TABLE_NON_WEAK, Qeq); |
771 | 4606 |
4607 the_coding_system_type_entry_dynarr = Dynarr_new (coding_system_type_entry); | |
2367 | 4608 dump_add_root_block_ptr (&the_coding_system_type_entry_dynarr, |
771 | 4609 &csted_description); |
4610 | |
4611 Vcoding_system_type_list = Qnil; | |
4612 staticpro (&Vcoding_system_type_list); | |
4613 | |
4614 /* Initialize to something reasonable ... */ | |
4615 for (i = 0; i < MAX_DETECTOR_CATEGORIES; i++) | |
4616 { | |
4617 coding_category_system[i] = Qnil; | |
1204 | 4618 dump_add_root_lisp_object (&coding_category_system[i]); |
771 | 4619 coding_category_by_priority[i] = i; |
4620 } | |
4621 | |
4622 dump_add_opaque (coding_category_by_priority, | |
4623 sizeof (coding_category_by_priority)); | |
4624 | |
4625 all_coding_detectors = Dynarr_new2 (detector_dynarr, struct detector); | |
2367 | 4626 dump_add_root_block_ptr (&all_coding_detectors, |
771 | 4627 &detector_dynarr_description); |
4628 | |
4629 dump_add_opaque_int (&coding_system_tick); | |
4630 dump_add_opaque_int (&coding_detector_count); | |
4631 dump_add_opaque_int (&coding_detector_category_count); | |
4632 | |
4633 INITIALIZE_CODING_SYSTEM_TYPE (no_conversion, | |
4634 "no-conversion-coding-system-p"); | |
4635 CODING_SYSTEM_HAS_METHOD (no_conversion, convert); | |
4636 | |
4637 INITIALIZE_DETECTOR (no_conversion); | |
4638 DETECTOR_HAS_METHOD (no_conversion, detect); | |
4639 INITIALIZE_DETECTOR_CATEGORY (no_conversion, no_conversion); | |
4640 | |
4641 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA (convert_eol, | |
4642 "convert-eol-coding-system-p"); | |
4643 CODING_SYSTEM_HAS_METHOD (convert_eol, print); | |
4644 CODING_SYSTEM_HAS_METHOD (convert_eol, convert); | |
4645 CODING_SYSTEM_HAS_METHOD (convert_eol, getprop); | |
4646 CODING_SYSTEM_HAS_METHOD (convert_eol, putprop); | |
4647 CODING_SYSTEM_HAS_METHOD (convert_eol, conversion_end_type); | |
4648 CODING_SYSTEM_HAS_METHOD (convert_eol, canonicalize_after_coding); | |
4649 CODING_SYSTEM_HAS_METHOD (convert_eol, init_coding_stream); | |
4650 | |
4651 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA (undecided, | |
4652 "undecided-coding-system-p"); | |
4653 CODING_SYSTEM_HAS_METHOD (undecided, init); | |
4654 CODING_SYSTEM_HAS_METHOD (undecided, mark); | |
4655 CODING_SYSTEM_HAS_METHOD (undecided, print); | |
4656 CODING_SYSTEM_HAS_METHOD (undecided, convert); | |
4657 CODING_SYSTEM_HAS_METHOD (undecided, putprop); | |
4658 CODING_SYSTEM_HAS_METHOD (undecided, getprop); | |
4659 CODING_SYSTEM_HAS_METHOD (undecided, init_coding_stream); | |
4660 CODING_SYSTEM_HAS_METHOD (undecided, rewind_coding_stream); | |
4661 CODING_SYSTEM_HAS_METHOD (undecided, finalize_coding_stream); | |
4662 CODING_SYSTEM_HAS_METHOD (undecided, mark_coding_stream); | |
4663 CODING_SYSTEM_HAS_METHOD (undecided, canonicalize); | |
4664 CODING_SYSTEM_HAS_METHOD (undecided, canonicalize_after_coding); | |
4665 | |
4666 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA (chain, "chain-coding-system-p"); | |
4667 | |
4668 CODING_SYSTEM_HAS_METHOD (chain, print); | |
4669 CODING_SYSTEM_HAS_METHOD (chain, canonicalize); | |
4670 CODING_SYSTEM_HAS_METHOD (chain, init); | |
4671 CODING_SYSTEM_HAS_METHOD (chain, mark); | |
4672 CODING_SYSTEM_HAS_METHOD (chain, mark_coding_stream); | |
4673 CODING_SYSTEM_HAS_METHOD (chain, convert); | |
4674 CODING_SYSTEM_HAS_METHOD (chain, rewind_coding_stream); | |
4675 CODING_SYSTEM_HAS_METHOD (chain, finalize_coding_stream); | |
4676 CODING_SYSTEM_HAS_METHOD (chain, finalize); | |
4677 CODING_SYSTEM_HAS_METHOD (chain, putprop); | |
4678 CODING_SYSTEM_HAS_METHOD (chain, getprop); | |
4679 CODING_SYSTEM_HAS_METHOD (chain, conversion_end_type); | |
4680 CODING_SYSTEM_HAS_METHOD (chain, canonicalize_after_coding); | |
4681 | |
4682 #ifdef DEBUG_XEMACS | |
4683 INITIALIZE_CODING_SYSTEM_TYPE (internal, "internal-coding-system-p"); | |
4684 CODING_SYSTEM_HAS_METHOD (internal, convert); | |
4685 #endif | |
4686 | |
4687 #ifdef HAVE_ZLIB | |
4688 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA (gzip, "gzip-coding-system-p"); | |
4689 CODING_SYSTEM_HAS_METHOD (gzip, conversion_end_type); | |
4690 CODING_SYSTEM_HAS_METHOD (gzip, convert); | |
4691 CODING_SYSTEM_HAS_METHOD (gzip, init); | |
4692 CODING_SYSTEM_HAS_METHOD (gzip, print); | |
4693 CODING_SYSTEM_HAS_METHOD (gzip, init_coding_stream); | |
4694 CODING_SYSTEM_HAS_METHOD (gzip, rewind_coding_stream); | |
4695 CODING_SYSTEM_HAS_METHOD (gzip, putprop); | |
4696 CODING_SYSTEM_HAS_METHOD (gzip, getprop); | |
4697 #endif | |
4698 } | |
4699 | |
4700 void | |
4701 reinit_coding_system_type_create (void) | |
4702 { | |
4703 REINITIALIZE_CODING_SYSTEM_TYPE (no_conversion); | |
4704 REINITIALIZE_CODING_SYSTEM_TYPE (convert_eol); | |
4705 REINITIALIZE_CODING_SYSTEM_TYPE (undecided); | |
4706 REINITIALIZE_CODING_SYSTEM_TYPE (chain); | |
4707 #if 0 | |
4708 REINITIALIZE_CODING_SYSTEM_TYPE (text_file_wrapper); | |
4709 #endif /* 0 */ | |
4710 #ifdef DEBUG_XEMACS | |
4711 REINITIALIZE_CODING_SYSTEM_TYPE (internal); | |
4712 #endif | |
4713 #ifdef HAVE_ZLIB | |
4714 REINITIALIZE_CODING_SYSTEM_TYPE (gzip); | |
4715 #endif | |
4716 } | |
4717 | |
4718 void | |
4719 reinit_vars_of_file_coding (void) | |
4720 { | |
428 | 4721 } |
4722 | |
4723 void | |
4724 vars_of_file_coding (void) | |
4725 { | |
771 | 4726 /* We always have file-coding support */ |
428 | 4727 Fprovide (intern ("file-coding")); |
4728 | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4729 QScoding_system_cookie = build_ascstring (";;;###coding system: "); |
1347 | 4730 staticpro (&QScoding_system_cookie); |
4731 | |
1242 | 4732 #ifdef HAVE_DEFAULT_EOL_DETECTION |
2297 | 4733 /* #### Find a more appropriate place for this comment. |
4734 WARNING: The existing categories are intimately tied to the function | |
1242 | 4735 `coding-system-category' in coding.el. If you change a category, or |
4736 change the layout of any coding system associated with a category, you | |
4737 need to check that function and make sure it's written properly. */ | |
4738 | |
4739 Fprovide (intern ("unix-default-eol-detection")); | |
4740 #endif | |
4741 | |
428 | 4742 DEFVAR_LISP ("keyboard-coding-system", &Vkeyboard_coding_system /* |
3142 | 4743 Default coding system used for TTY and X11 keyboard input. |
5384
3889ef128488
Fix misspelled words, and some grammar, across the entire source tree.
Jerry James <james@xemacs.org>
parents:
5345
diff
changeset
|
4744 Under X11, used only to interpret the character for a key event when that |
3142 | 4745 event has a KeySym of NoSymbol but does have an associated string keysym, |
4746 something that's seen with input methods. | |
4747 | |
4748 If you need to set these things to different coding systems, call the | |
4749 function `set-console-tty-coding-system' for the TTY and use this variable | |
4750 for X11. | |
428 | 4751 */ ); |
4752 Vkeyboard_coding_system = Qnil; | |
4753 | |
4754 DEFVAR_LISP ("terminal-coding-system", &Vterminal_coding_system /* | |
4755 Coding system used for TTY display output. | |
4756 Not used under a windowing system. | |
4757 */ ); | |
4758 Vterminal_coding_system = Qnil; | |
4759 | |
4760 DEFVAR_LISP ("coding-system-for-read", &Vcoding_system_for_read /* | |
440 | 4761 Overriding coding system used when reading from a file or process. |
4762 You should bind this variable with `let', but do not set it globally. | |
4763 If this is non-nil, it specifies the coding system that will be used | |
4764 to decode input on read operations, such as from a file or process. | |
4765 It overrides `buffer-file-coding-system-for-read', | |
428 | 4766 `insert-file-contents-pre-hook', etc. Use those variables instead of |
440 | 4767 this one for permanent changes to the environment. */ ); |
428 | 4768 Vcoding_system_for_read = Qnil; |
4769 | |
4770 DEFVAR_LISP ("coding-system-for-write", | |
4771 &Vcoding_system_for_write /* | |
440 | 4772 Overriding coding system used when writing to a file or process. |
4773 You should bind this variable with `let', but do not set it globally. | |
4774 If this is non-nil, it specifies the coding system that will be used | |
4775 to encode output for write operations, such as to a file or process. | |
4776 It overrides `buffer-file-coding-system', `write-region-pre-hook', etc. | |
4777 Use those variables instead of this one for permanent changes to the | |
4778 environment. */ ); | |
428 | 4779 Vcoding_system_for_write = Qnil; |
4780 | |
4781 DEFVAR_LISP ("file-name-coding-system", &Vfile_name_coding_system /* | |
4782 Coding system used to convert pathnames when accessing files. | |
4783 */ ); | |
4784 Vfile_name_coding_system = Qnil; | |
4785 | |
4786 DEFVAR_BOOL ("enable-multibyte-characters", &enable_multibyte_characters /* | |
771 | 4787 Setting this has no effect. It is purely for FSF compatibility. |
428 | 4788 */ ); |
4789 enable_multibyte_characters = 1; | |
771 | 4790 |
4791 Vchain_canonicalize_hash_table = | |
5191
71ee43b8a74d
Add #'equalp as a hash test by default; add #'define-hash-table-test, GNU API
Aidan Kehoe <kehoea@parhasard.net>
parents:
5169
diff
changeset
|
4792 make_lisp_hash_table (50, HASH_TABLE_NON_WEAK, Qequal); |
771 | 4793 staticpro (&Vchain_canonicalize_hash_table); |
4794 | |
4795 #ifdef DEBUG_XEMACS | |
4796 DEFVAR_LISP ("debug-coding-detection", &Vdebug_coding_detection /* | |
4797 If non-nil, display debug information about detection operations in progress. | |
4798 Information is displayed on stderr. | |
4799 */ ); | |
4800 Vdebug_coding_detection = Qnil; | |
4801 #endif | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4802 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4803 #ifdef MULE |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4804 Vdefault_query_coding_region_chartab_cache |
5191
71ee43b8a74d
Add #'equalp as a hash test by default; add #'define-hash-table-test, GNU API
Aidan Kehoe <kehoea@parhasard.net>
parents:
5169
diff
changeset
|
4805 = make_lisp_hash_table (25, HASH_TABLE_NON_WEAK, Qequal); |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4806 staticpro (&Vdefault_query_coding_region_chartab_cache); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4807 #endif |
428 | 4808 } |
4809 | |
2297 | 4810 /* #### reformat this for consistent appearance? */ |
4811 | |
428 | 4812 void |
4813 complex_vars_of_file_coding (void) | |
4814 { | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4815 Fmake_coding_system_internal |
771 | 4816 (Qconvert_eol_cr, Qconvert_eol, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4817 build_defer_string ("Convert CR to LF"), |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4818 listu (Qdocumentation, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4819 build_defer_string ( |
771 | 4820 "Converts CR (used to mark the end of a line on Macintosh systems) to LF\n" |
4821 "(used internally and under Unix to mark the end of a line)."), | |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4822 Qmnemonic, build_ascstring ("CR->LF"), |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4823 Qsubtype, Qcr, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4824 /* VERY IMPORTANT! Tell make-coding-system not to generate |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4825 subsidiaries -- it needs the coding systems we're creating |
771 | 4826 to do so! */ |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4827 Qeol_type, Qlf, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4828 Qsafe_charsets, Qt, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4829 Qunbound)); |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4830 Fmake_coding_system_internal |
771 | 4831 (Qconvert_eol_lf, Qconvert_eol, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4832 build_defer_string ("Convert LF to LF (do nothing)"), |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4833 listu (Qdocumentation, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4834 build_defer_string ("Do nothing."), |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4835 Qmnemonic, build_ascstring ("LF->LF"), |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4836 Qsubtype, Qlf, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4837 /* VERY IMPORTANT! Tell make-coding-system not to generate |
771 | 4838 subsidiaries -- it needs the coding systems we're creating |
4839 to do so! */ | |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4840 Qeol_type, Qlf, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4841 Qsafe_charsets, Qt, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4842 Qunbound)); |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4843 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4844 Fmake_coding_system_internal |
771 | 4845 (Qconvert_eol_crlf, Qconvert_eol, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4846 build_defer_string ("Convert CRLF to LF"), |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4847 listu (Qdocumentation, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4848 build_defer_string ( |
771 | 4849 "Converts CR+LF (used to mark the end of a line on Macintosh systems) to LF\n" |
4850 "(used internally and under Unix to mark the end of a line)."), | |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4851 Qmnemonic, build_ascstring ("CRLF->LF"), |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4852 Qsubtype, Qcrlf, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4853 /* VERY IMPORTANT! Tell make-coding-system not to generate |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4854 subsidiaries -- it needs the coding systems we're creating |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4855 to do so! */ |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4856 Qeol_type, Qlf, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4857 Qsafe_charsets, Qt, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4858 Qunbound)); |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4859 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4860 Fmake_coding_system_internal |
771 | 4861 (Qconvert_eol_autodetect, Qconvert_eol, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4862 build_defer_string ("Autodetect EOL type"), |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4863 listu (Qdocumentation, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4864 build_defer_string ("Autodetect the end-of-line type."), |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4865 Qmnemonic, build_ascstring ("Auto-EOL"), |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4866 Qsubtype, Qnil, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4867 /* VERY IMPORTANT! Tell make-coding-system not to generate |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4868 subsidiaries -- it needs the coding systems we're creating |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4869 to do so! */ |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4870 Qeol_type, Qlf, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4871 Qsafe_charsets, Qt, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4872 Qunbound)); |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4873 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4874 Fmake_coding_system_internal |
771 | 4875 (Qundecided, Qundecided, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4876 build_defer_string ("Undecided (auto-detect)"), |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4877 listu (Qdocumentation, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4878 build_defer_string ("Automatically detects the correct encoding."), |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4879 Qmnemonic, build_ascstring ("Auto"), |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4880 Qdo_eol, Qt, Qdo_coding, Qt, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4881 /* We do EOL detection ourselves so we don't need to be |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4882 wrapped in an EOL detector. (It doesn't actually hurt, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4883 though, I don't think.) */ |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4884 Qeol_type, Qlf, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4885 Qunbound)); |
771 | 4886 |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4887 Fmake_coding_system_internal |
771 | 4888 (intern ("undecided-dos"), Qundecided, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4889 build_defer_string ("Undecided (auto-detect) (CRLF)"), |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4890 listu (Qdocumentation, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4891 build_defer_string |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4892 ("Automatically detects the correct encoding; EOL type of CRLF forced."), |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4893 Qmnemonic, build_ascstring ("Auto"), |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4894 Qdo_coding, Qt, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4895 Qeol_type, Qcrlf, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4896 Qunbound)); |
771 | 4897 |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4898 Fmake_coding_system_internal |
771 | 4899 (intern ("undecided-unix"), Qundecided, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4900 build_defer_string ("Undecided (auto-detect) (LF)"), |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4901 listu (Qdocumentation, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4902 build_defer_string |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4903 ("Automatically detects the correct encoding; EOL type of LF forced."), |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4904 Qmnemonic, build_ascstring ("Auto"), |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4905 Qdo_coding, Qt, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4906 Qeol_type, Qlf, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4907 Qunbound));; |
771 | 4908 |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4909 Fmake_coding_system_internal |
771 | 4910 (intern ("undecided-mac"), Qundecided, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4911 build_defer_string ("Undecided (auto-detect) (CR)"), |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4912 listu (Qdocumentation, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4913 build_defer_string |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4914 ("Automatically detects the correct encoding; EOL type of CR forced."), |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4915 Qmnemonic, build_ascstring ("Auto"), |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4916 Qdo_coding, Qt, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4917 Qeol_type, Qcr, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4918 Qunbound)); |
771 | 4919 |
428 | 4920 /* Need to create this here or we're really screwed. */ |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4921 Fmake_coding_system_internal |
428 | 4922 (Qraw_text, Qno_conversion, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4923 build_defer_string ("Raw Text"), |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4924 listu (Qdocumentation, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4925 build_defer_string ("Raw text converts only line-break " |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4926 "codes, and acts otherwise like " |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4927 "`binary'."), |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4928 Qmnemonic, build_ascstring ("Raw"), |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4929 #ifdef MULE |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4930 Qsafe_charsets, list3 (Vcharset_ascii, Vcharset_control_1, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4931 Vcharset_latin_iso8859_1), |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4932 |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4933 #endif |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4934 Qunbound)); |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4935 |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4936 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4937 Fmake_coding_system_internal |
428 | 4938 (Qbinary, Qno_conversion, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4939 build_defer_string ("Binary"), |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4940 listu (Qdocumentation, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4941 build_defer_string ( |
771 | 4942 "This coding system is as close as it comes to doing no conversion.\n" |
4943 "On input, each byte is converted directly into the character\n" | |
4944 "with the corresponding code -- i.e. from the `ascii', `control-1',\n" | |
4945 "or `latin-1' character sets. On output, these characters are\n" | |
4946 "converted back to the corresponding bytes, and other characters\n" | |
4947 "are converted to the default character, i.e. `~'."), | |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4948 Qeol_type, Qlf, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4949 Qmnemonic, build_ascstring ("Binary"), |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4950 #ifdef MULE |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4951 Qsafe_charsets, list3 (Vcharset_ascii, Vcharset_control_1, |
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4952 Vcharset_latin_iso8859_1), |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4953 #endif |
5345
db326b8fe982
Use Ben's recently-introduced listu (), where appropriate.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5307
diff
changeset
|
4954 Qunbound)); |
428 | 4955 |
771 | 4956 /* Formerly aliased to raw-text! Completely bogus and not even the same |
4957 as FSF Emacs. */ | |
4958 Fdefine_coding_system_alias (Qno_conversion, Qbinary); | |
4959 Fdefine_coding_system_alias (intern ("no-conversion-unix"), | |
4960 intern ("raw-text-unix")); | |
4961 Fdefine_coding_system_alias (intern ("no-conversion-dos"), | |
4962 intern ("raw-text-dos")); | |
4963 Fdefine_coding_system_alias (intern ("no-conversion-mac"), | |
4964 intern ("raw-text-mac")); | |
4965 | |
1318 | 4966 /* These three below will get their defaults set correctly |
4967 in code-init.el. We init them now so we can handle stuff at dump | |
771 | 4968 time before we get to code-init.el. */ |
1318 | 4969 Fdefine_coding_system_alias (Qnative, Qbinary); |
440 | 4970 Fdefine_coding_system_alias (Qterminal, Qbinary); |
4971 Fdefine_coding_system_alias (Qkeyboard, Qbinary); | |
4972 | |
1318 | 4973 Fdefine_coding_system_alias (Qfile_name, Qnative); |
771 | 4974 Fdefine_coding_system_alias (Qidentity, Qconvert_eol_lf); |
4975 | |
428 | 4976 /* Need this for bootstrapping */ |
771 | 4977 coding_category_system[detector_category_no_conversion] = |
428 | 4978 Fget_coding_system (Qraw_text); |
4979 } |