Mercurial > hg > xemacs-beta
annotate src/file-coding.c @ 5167:e374ea766cc1
clean up, rearrange allocation statistics code
-------------------- ChangeLog entries follow: --------------------
src/ChangeLog addition:
2010-03-21 Ben Wing <ben@xemacs.org>
* alloc.c:
* alloc.c (assert_proper_sizing):
* alloc.c (c_readonly):
* alloc.c (malloced_storage_size):
* alloc.c (fixed_type_block_overhead):
* alloc.c (lisp_object_storage_size):
* alloc.c (inc_lrecord_stats):
* alloc.c (dec_lrecord_stats):
* alloc.c (pluralize_word):
* alloc.c (object_memory_usage_stats):
* alloc.c (Fobject_memory_usage):
* alloc.c (compute_memusage_stats_length):
* alloc.c (disksave_object_finalization_1):
* alloc.c (Fgarbage_collect):
* mc-alloc.c:
* mc-alloc.c (mc_alloced_storage_size):
* mc-alloc.h:
No functionality change here. Collect the allocations-statistics
code that was scattered throughout alloc.c into one place. Add
remaining section headings so that all sections have headings
clearly identifying the start of the section and its purpose.
Expose mc_alloced_storage_size() even when not MEMORY_USAGE_STATS;
this fixes build problems and is related to the export of
lisp_object_storage_size() and malloced_storage_size() when
non-MEMORY_USAGE_STATS in the previous change set.
author | Ben Wing <ben@xemacs.org> |
---|---|
date | Sun, 21 Mar 2010 04:41:49 -0500 |
parents | f965e31a35f0 |
children | 6c6d78781d59 |
rev | line source |
---|---|
771 | 1 /* Text encoding conversion functions; coding-system object. |
2 #### rename me to coding-system.c or coding.c | |
428 | 3 Copyright (C) 1991, 1995 Free Software Foundation, Inc. |
4 Copyright (C) 1995 Sun Microsystems, Inc. | |
5127
a9c41067dd88
more cleanups, terminology clarification, lots of doc work
Ben Wing <ben@xemacs.org>
parents:
5126
diff
changeset
|
5 Copyright (C) 2000, 2001, 2002, 2003, 2005, 2010 Ben Wing. |
428 | 6 |
7 This file is part of XEmacs. | |
8 | |
9 XEmacs is free software; you can redistribute it and/or modify it | |
10 under the terms of the GNU General Public License as published by the | |
11 Free Software Foundation; either version 2, or (at your option) any | |
12 later version. | |
13 | |
14 XEmacs is distributed in the hope that it will be useful, but WITHOUT | |
15 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or | |
16 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License | |
17 for more details. | |
18 | |
19 You should have received a copy of the GNU General Public License | |
20 along with XEmacs; see the file COPYING. If not, write to | |
21 the Free Software Foundation, Inc., 59 Temple Place - Suite 330, | |
22 Boston, MA 02111-1307, USA. */ | |
23 | |
771 | 24 /* Synched up with: Not in FSF. */ |
25 | |
26 /* Authorship: | |
27 | |
28 Current primary author: Ben Wing <ben@xemacs.org> | |
29 | |
30 Rewritten by Ben Wing <ben@xemacs.org>, based originally on coding.c | |
31 from Mule 2.? but probably does not share one line of code with that | |
32 original source. Rewriting work started around Dec. 1994. or Jan. 1995. | |
33 Proceeded in earnest till Nov. 1995. | |
34 | |
35 Around Feb. 17, 1998, Andy Piper renamed what was then mule-coding.c to | |
36 file-coding.c, with the intention of using it to do end-of-line conversion | |
37 on non-MULE machines (specifically, on Windows machines). He separated | |
38 out the MULE stuff from non-MULE using ifdef's, and searched throughout | |
39 the rest of the source tree looking for coding-system-related code that | |
40 was ifdef MULE but should be ifdef HAVE_CODING_SYSTEMS. | |
41 | |
42 Sept. 4 - 8, 1998, Tomohiko Morioka added the UCS_4 and UTF_8 coding system | |
43 types, providing a primitive means of decoding and encoding externally- | |
44 formatted Unicode/UCS_4 and Unicode/UTF_8 data. | |
45 | |
46 January 25, 2000, Martin Buchholz redid and fleshed out the coding | |
47 system alias handling that was first added in prototype form by | |
48 Hrjove Niksic, April 15, 1999. | |
49 | |
50 April to May 2000, Ben Wing: More major reorganization. Adding features | |
51 needed for MS Windows (multibyte, unicode, unicode-to-multibyte), the | |
52 "chain" coding system for chaining two together, and doing a lot of | |
53 reorganization in preparation for properly abstracting out the different | |
54 coding system types. | |
55 | |
56 June 2001, Ben Wing: Added Unicode support. Eliminated previous | |
57 junky Unicode translation support. | |
58 | |
59 August 2001, Ben Wing: Moved Unicode support to unicode.c. Finished | |
60 abstracting everything except detection, which is hard to abstract (see | |
61 just below). | |
62 | |
63 September 2001, Ben Wing: Moved Mule code to mule-coding.c, Windows code | |
64 to intl-win32.c. Lots more rewriting; very little code is untouched | |
65 from before April 2000. Abstracted the detection code, added multiple | |
66 levels of likelihood to increase the reliability of the algorithm. | |
67 | |
68 October 2001, Ben Wing: HAVE_CODING_SYSTEMS is always now defined. | |
69 Removed the conditionals. | |
70 */ | |
71 | |
428 | 72 #include <config.h> |
73 #include "lisp.h" | |
74 | |
75 #include "buffer.h" | |
76 #include "elhash.h" | |
77 #include "insdel.h" | |
78 #include "lstream.h" | |
440 | 79 #include "opaque.h" |
771 | 80 #include "file-coding.h" |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
81 #include "extents.h" |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
82 #include "rangetab.h" |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
83 #include "chartab.h" |
771 | 84 |
85 #ifdef HAVE_ZLIB | |
86 #include "zlib.h" | |
428 | 87 #endif |
88 | |
89 Lisp_Object Vkeyboard_coding_system; | |
90 Lisp_Object Vterminal_coding_system; | |
91 Lisp_Object Vcoding_system_for_read; | |
92 Lisp_Object Vcoding_system_for_write; | |
93 Lisp_Object Vfile_name_coding_system; | |
94 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
95 Lisp_Object Qaliases, Qcharset_skip_chars_string; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
96 |
771 | 97 #ifdef DEBUG_XEMACS |
98 Lisp_Object Vdebug_coding_detection; | |
440 | 99 #endif |
771 | 100 |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
101 #ifdef MULE |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
102 extern Lisp_Object Vcharset_ascii, Vcharset_control_1, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
103 Vcharset_latin_iso8859_1; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
104 #endif |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
105 |
771 | 106 typedef struct coding_system_type_entry |
107 { | |
108 struct coding_system_methods *meths; | |
109 } coding_system_type_entry; | |
110 | |
111 typedef struct | |
112 { | |
113 Dynarr_declare (coding_system_type_entry); | |
114 } coding_system_type_entry_dynarr; | |
115 | |
116 static coding_system_type_entry_dynarr *the_coding_system_type_entry_dynarr; | |
117 | |
1204 | 118 static const struct memory_description cste_description_1[] = { |
2551 | 119 { XD_BLOCK_PTR, offsetof (coding_system_type_entry, meths), 1, |
120 { &coding_system_methods_description } }, | |
771 | 121 { XD_END } |
122 }; | |
123 | |
1204 | 124 static const struct sized_memory_description cste_description = { |
771 | 125 sizeof (coding_system_type_entry), |
126 cste_description_1 | |
127 }; | |
128 | |
1204 | 129 static const struct memory_description csted_description_1[] = { |
771 | 130 XD_DYNARR_DESC (coding_system_type_entry_dynarr, &cste_description), |
428 | 131 { XD_END } |
132 }; | |
133 | |
1204 | 134 static const struct sized_memory_description csted_description = { |
771 | 135 sizeof (coding_system_type_entry_dynarr), |
136 csted_description_1 | |
137 }; | |
138 | |
139 static Lisp_Object Vcoding_system_type_list; | |
140 | |
141 /* Coding system currently associated with each coding category. */ | |
142 Lisp_Object coding_category_system[MAX_DETECTOR_CATEGORIES]; | |
143 | |
144 /* Table of all coding categories in decreasing order of priority. | |
145 This describes a permutation of the possible coding categories. */ | |
146 int coding_category_by_priority[MAX_DETECTOR_CATEGORIES]; | |
147 | |
148 /* Value used with to give a unique name to nameless coding systems */ | |
149 int coding_system_tick; | |
150 | |
151 int coding_detector_count; | |
152 int coding_detector_category_count; | |
153 | |
154 detector_dynarr *all_coding_detectors; | |
155 | |
1204 | 156 static const struct memory_description struct_detector_category_description_1[] |
771 | 157 = |
158 { | |
159 { XD_LISP_OBJECT, offsetof (struct detector_category, sym) }, | |
160 { XD_END } | |
161 }; | |
162 | |
1204 | 163 static const struct sized_memory_description struct_detector_category_description = |
771 | 164 { |
165 sizeof (struct detector_category), | |
166 struct_detector_category_description_1 | |
428 | 167 }; |
168 | |
1204 | 169 static const struct memory_description detector_category_dynarr_description_1[] = |
771 | 170 { |
171 XD_DYNARR_DESC (detector_category_dynarr, | |
172 &struct_detector_category_description), | |
173 { XD_END } | |
174 }; | |
175 | |
1204 | 176 static const struct sized_memory_description detector_category_dynarr_description = { |
771 | 177 sizeof (detector_category_dynarr), |
178 detector_category_dynarr_description_1 | |
179 }; | |
180 | |
1204 | 181 static const struct memory_description struct_detector_description_1[] |
771 | 182 = |
183 { | |
2367 | 184 { XD_BLOCK_PTR, offsetof (struct detector, cats), 1, |
2551 | 185 { &detector_category_dynarr_description } }, |
771 | 186 { XD_END } |
187 }; | |
188 | |
1204 | 189 static const struct sized_memory_description struct_detector_description = |
771 | 190 { |
191 sizeof (struct detector), | |
192 struct_detector_description_1 | |
193 }; | |
194 | |
1204 | 195 static const struct memory_description detector_dynarr_description_1[] = |
771 | 196 { |
197 XD_DYNARR_DESC (detector_dynarr, &struct_detector_description), | |
198 { XD_END } | |
199 }; | |
200 | |
1204 | 201 static const struct sized_memory_description detector_dynarr_description = { |
771 | 202 sizeof (detector_dynarr), |
203 detector_dynarr_description_1 | |
204 }; | |
428 | 205 |
206 Lisp_Object Qcoding_systemp; | |
207 | |
771 | 208 Lisp_Object Qraw_text; |
428 | 209 |
210 Lisp_Object Qmnemonic, Qeol_type; | |
211 Lisp_Object Qcr, Qcrlf, Qlf; | |
212 Lisp_Object Qeol_cr, Qeol_crlf, Qeol_lf; | |
213 Lisp_Object Qpost_read_conversion; | |
214 Lisp_Object Qpre_write_conversion; | |
215 | |
771 | 216 Lisp_Object Qtranslation_table_for_decode; |
217 Lisp_Object Qtranslation_table_for_encode; | |
218 Lisp_Object Qsafe_chars; | |
219 Lisp_Object Qsafe_charsets; | |
220 Lisp_Object Qmime_charset; | |
221 Lisp_Object Qvalid_codes; | |
222 | |
223 Lisp_Object Qno_conversion; | |
224 Lisp_Object Qconvert_eol; | |
440 | 225 Lisp_Object Qescape_quoted; |
771 | 226 Lisp_Object Qencode, Qdecode; |
227 | |
228 Lisp_Object Qconvert_eol_lf, Qconvert_eol_cr, Qconvert_eol_crlf; | |
229 Lisp_Object Qconvert_eol_autodetect; | |
230 | |
231 Lisp_Object Qnear_certainty, Qquite_probable, Qsomewhat_likely; | |
1494 | 232 Lisp_Object Qslightly_likely; |
771 | 233 Lisp_Object Qas_likely_as_unlikely, Qsomewhat_unlikely, Qquite_improbable; |
234 Lisp_Object Qnearly_impossible; | |
235 | |
236 Lisp_Object Qdo_eol, Qdo_coding; | |
237 | |
238 Lisp_Object Qcanonicalize_after_coding; | |
239 | |
1347 | 240 Lisp_Object QScoding_system_cookie; |
241 | |
4303 | 242 Lisp_Object Qposix_charset_to_coding_system_hash; |
243 | |
771 | 244 /* This is used to convert autodetected coding systems into existing |
245 systems. For example, the chain undecided->convert-eol-autodetect may | |
246 have its separate parts detected as mswindows-multibyte and | |
247 convert-eol-crlf, and the result needs to be mapped to | |
248 mswindows-multibyte-dos. */ | |
249 /* #### It's not clear we need this whole chain-canonicalize mechanism | |
250 any more. */ | |
251 static Lisp_Object Vchain_canonicalize_hash_table; | |
252 | |
253 #ifdef HAVE_ZLIB | |
254 Lisp_Object Qgzip; | |
428 | 255 #endif |
771 | 256 |
2297 | 257 /* Maps symbols (coding system names) to either coding system objects or |
258 (for aliases) other names. */ | |
771 | 259 static Lisp_Object Vcoding_system_hash_table; |
428 | 260 |
261 int enable_multibyte_characters; | |
262 | |
263 EXFUN (Fcopy_coding_system, 2); | |
264 | |
265 | |
266 /************************************************************************/ | |
771 | 267 /* Coding system object methods */ |
428 | 268 /************************************************************************/ |
269 | |
270 static Lisp_Object | |
271 mark_coding_system (Lisp_Object obj) | |
272 { | |
273 Lisp_Coding_System *codesys = XCODING_SYSTEM (obj); | |
274 | |
1204 | 275 #define MARKED_SLOT(x) mark_object (codesys->x); |
276 #include "coding-system-slots.h" | |
771 | 277 |
278 MAYBE_CODESYSMETH (codesys, mark, (obj)); | |
428 | 279 |
1204 | 280 return Qnil; |
428 | 281 } |
282 | |
283 static void | |
771 | 284 print_coding_system_properties (Lisp_Object obj, Lisp_Object printcharfun) |
285 { | |
286 Lisp_Coding_System *c = XCODING_SYSTEM (obj); | |
287 print_internal (c->methods->type, printcharfun, 1); | |
288 MAYBE_CODESYSMETH (c, print, (obj, printcharfun, 1)); | |
289 if (CODING_SYSTEM_EOL_TYPE (c) != EOL_AUTODETECT) | |
290 write_fmt_string_lisp (printcharfun, " eol-type=%s", | |
291 1, Fcoding_system_property (obj, Qeol_type)); | |
292 } | |
293 | |
294 static void | |
428 | 295 print_coding_system (Lisp_Object obj, Lisp_Object printcharfun, |
2286 | 296 int UNUSED (escapeflag)) |
428 | 297 { |
298 Lisp_Coding_System *c = XCODING_SYSTEM (obj); | |
299 if (print_readably) | |
5142
f965e31a35f0
reduce lcrecord headers to 2 words, rename printing_unreadable_object
Ben Wing <ben@xemacs.org>
parents:
5128
diff
changeset
|
300 printing_unreadable_lisp_object (obj, 0); |
771 | 301 |
302 write_fmt_string_lisp (printcharfun, "#<coding-system %s ", 1, c->name); | |
303 print_coding_system_properties (obj, printcharfun); | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
304 write_ascstring (printcharfun, ">"); |
428 | 305 } |
306 | |
771 | 307 /* Print an abbreviated version of a coding system (but still containing |
308 all the information), for use within a coding system print method. */ | |
309 | |
310 static void | |
311 print_coding_system_in_print_method (Lisp_Object cs, Lisp_Object printcharfun, | |
2286 | 312 int UNUSED (escapeflag)) |
771 | 313 { |
800 | 314 write_fmt_string_lisp (printcharfun, "%s[", 1, XCODING_SYSTEM_NAME (cs)); |
771 | 315 print_coding_system_properties (cs, printcharfun); |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
316 write_ascstring (printcharfun, "]"); |
771 | 317 } |
318 | |
3263 | 319 #ifndef NEW_GC |
428 | 320 static void |
5127
a9c41067dd88
more cleanups, terminology clarification, lots of doc work
Ben Wing <ben@xemacs.org>
parents:
5126
diff
changeset
|
321 finalize_coding_system (Lisp_Object obj) |
428 | 322 { |
323 /* Since coding systems never go away, this function is not | |
324 necessary. But it would be necessary if we changed things | |
325 so that coding systems could go away. */ | |
5127
a9c41067dd88
more cleanups, terminology clarification, lots of doc work
Ben Wing <ben@xemacs.org>
parents:
5126
diff
changeset
|
326 MAYBE_XCODESYSMETH (obj, finalize, (obj)); |
771 | 327 } |
3263 | 328 #endif /* not NEW_GC */ |
771 | 329 |
330 static Bytecount | |
5127
a9c41067dd88
more cleanups, terminology clarification, lots of doc work
Ben Wing <ben@xemacs.org>
parents:
5126
diff
changeset
|
331 sizeof_coding_system (Lisp_Object obj) |
771 | 332 { |
5127
a9c41067dd88
more cleanups, terminology clarification, lots of doc work
Ben Wing <ben@xemacs.org>
parents:
5126
diff
changeset
|
333 const Lisp_Coding_System *p = XCODING_SYSTEM (obj); |
771 | 334 return offsetof (Lisp_Coding_System, data) + p->methods->extra_data_size; |
428 | 335 } |
336 | |
1204 | 337 static const struct memory_description coding_system_methods_description_1[] |
771 | 338 = { |
339 { XD_LISP_OBJECT, | |
340 offsetof (struct coding_system_methods, type) }, | |
341 { XD_LISP_OBJECT, | |
342 offsetof (struct coding_system_methods, predicate_symbol) }, | |
343 { XD_END } | |
344 }; | |
345 | |
1204 | 346 const struct sized_memory_description coding_system_methods_description = { |
771 | 347 sizeof (struct coding_system_methods), |
348 coding_system_methods_description_1 | |
349 }; | |
350 | |
1204 | 351 static const struct sized_memory_description coding_system_extra_description_map[] = |
352 { | |
353 { offsetof (Lisp_Coding_System, methods) }, | |
354 { offsetof (struct coding_system_methods, extra_description) }, | |
355 { -1 }, | |
771 | 356 }; |
357 | |
1204 | 358 static const struct memory_description coding_system_description[] = |
428 | 359 { |
2367 | 360 { XD_BLOCK_PTR, offsetof (Lisp_Coding_System, methods), 1, |
2551 | 361 { &coding_system_methods_description } }, |
1204 | 362 #define MARKED_SLOT(x) { XD_LISP_OBJECT, offsetof (Lisp_Coding_System, x) }, |
363 #define MARKED_SLOT_ARRAY(slot, size) \ | |
364 { XD_LISP_OBJECT_ARRAY, offsetof (Lisp_Coding_System, slot), size }, | |
365 #include "coding-system-slots.h" | |
2367 | 366 { XD_BLOCK_ARRAY, offsetof (Lisp_Coding_System, data), 1, |
2551 | 367 { coding_system_extra_description_map } }, |
1204 | 368 { XD_END } |
771 | 369 }; |
370 | |
1204 | 371 static const struct memory_description coding_system_empty_extra_description_1[] = |
372 { | |
373 { XD_END } | |
374 }; | |
375 | |
376 const struct sized_memory_description coding_system_empty_extra_description = { | |
377 0, coding_system_empty_extra_description_1 | |
378 }; | |
379 | |
3263 | 380 #ifdef NEW_GC |
5118
e0db3c197671
merge up to latest default branch, doesn't compile yet
Ben Wing <ben@xemacs.org>
diff
changeset
|
381 DEFINE_DUMPABLE_SIZABLE_LISP_OBJECT ("coding-system", coding_system, |
5124
623d57b7fbe8
separate regular and disksave finalization, print method fixes.
Ben Wing <ben@xemacs.org>
parents:
5120
diff
changeset
|
382 mark_coding_system, |
623d57b7fbe8
separate regular and disksave finalization, print method fixes.
Ben Wing <ben@xemacs.org>
parents:
5120
diff
changeset
|
383 print_coding_system, |
623d57b7fbe8
separate regular and disksave finalization, print method fixes.
Ben Wing <ben@xemacs.org>
parents:
5120
diff
changeset
|
384 0, 0, 0, coding_system_description, |
623d57b7fbe8
separate regular and disksave finalization, print method fixes.
Ben Wing <ben@xemacs.org>
parents:
5120
diff
changeset
|
385 sizeof_coding_system, |
623d57b7fbe8
separate regular and disksave finalization, print method fixes.
Ben Wing <ben@xemacs.org>
parents:
5120
diff
changeset
|
386 Lisp_Coding_System); |
3263 | 387 #else /* not NEW_GC */ |
5118
e0db3c197671
merge up to latest default branch, doesn't compile yet
Ben Wing <ben@xemacs.org>
diff
changeset
|
388 DEFINE_DUMPABLE_SIZABLE_LISP_OBJECT ("coding-system", coding_system, |
5124
623d57b7fbe8
separate regular and disksave finalization, print method fixes.
Ben Wing <ben@xemacs.org>
parents:
5120
diff
changeset
|
389 mark_coding_system, |
623d57b7fbe8
separate regular and disksave finalization, print method fixes.
Ben Wing <ben@xemacs.org>
parents:
5120
diff
changeset
|
390 print_coding_system, |
623d57b7fbe8
separate regular and disksave finalization, print method fixes.
Ben Wing <ben@xemacs.org>
parents:
5120
diff
changeset
|
391 finalize_coding_system, |
623d57b7fbe8
separate regular and disksave finalization, print method fixes.
Ben Wing <ben@xemacs.org>
parents:
5120
diff
changeset
|
392 0, 0, coding_system_description, |
623d57b7fbe8
separate regular and disksave finalization, print method fixes.
Ben Wing <ben@xemacs.org>
parents:
5120
diff
changeset
|
393 sizeof_coding_system, |
623d57b7fbe8
separate regular and disksave finalization, print method fixes.
Ben Wing <ben@xemacs.org>
parents:
5120
diff
changeset
|
394 Lisp_Coding_System); |
3263 | 395 #endif /* not NEW_GC */ |
771 | 396 |
397 /************************************************************************/ | |
398 /* Creating coding systems */ | |
399 /************************************************************************/ | |
400 | |
401 static struct coding_system_methods * | |
402 decode_coding_system_type (Lisp_Object type, Error_Behavior errb) | |
428 | 403 { |
771 | 404 int i; |
405 | |
406 for (i = 0; i < Dynarr_length (the_coding_system_type_entry_dynarr); i++) | |
428 | 407 { |
771 | 408 if (EQ (type, |
409 Dynarr_at (the_coding_system_type_entry_dynarr, i).meths->type)) | |
410 return Dynarr_at (the_coding_system_type_entry_dynarr, i).meths; | |
428 | 411 } |
771 | 412 |
413 maybe_invalid_constant ("Invalid coding system type", type, | |
414 Qcoding_system, errb); | |
415 | |
416 return 0; | |
428 | 417 } |
418 | |
771 | 419 static int |
420 valid_coding_system_type_p (Lisp_Object type) | |
428 | 421 { |
771 | 422 return decode_coding_system_type (type, ERROR_ME_NOT) != 0; |
423 } | |
424 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
425 #ifdef MULE |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
426 static Lisp_Object Vdefault_query_coding_region_chartab_cache; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
427 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
428 /* Non-static because it's used in INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA. */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
429 Lisp_Object |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
430 default_query_method (Lisp_Object codesys, struct buffer *buf, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
431 Charbpos end, int flags) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
432 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
433 Charbpos pos = BUF_PT (buf), fail_range_start, fail_range_end; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
434 Charbpos pos_byte = BYTE_BUF_PT (buf); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
435 Lisp_Object safe_charsets = XCODING_SYSTEM_SAFE_CHARSETS (codesys); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
436 Lisp_Object safe_chars = XCODING_SYSTEM_SAFE_CHARS (codesys), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
437 result = Qnil; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
438 enum query_coding_failure_reasons failed_reason, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
439 previous_failed_reason = query_coding_succeeded; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
440 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
441 /* safe-charsets of t means the coding system can encode everything. */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
442 if (EQ (Qnil, safe_chars)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
443 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
444 if (EQ (Qt, safe_charsets)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
445 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
446 return Qnil; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
447 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
448 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
449 /* If we've no information on what characters the coding system can |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
450 encode, give up. */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
451 if (EQ (Qnil, safe_charsets) && EQ (Qnil, safe_chars)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
452 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
453 return Qunbound; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
454 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
455 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
456 safe_chars = Fgethash (safe_charsets, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
457 Vdefault_query_coding_region_chartab_cache, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
458 Qnil); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
459 if (NILP (safe_chars)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
460 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
461 safe_chars = Fmake_char_table (Qgeneric); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
462 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
463 EXTERNAL_LIST_LOOP_2 (safe_charset, safe_charsets) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
464 Fput_char_table (safe_charset, Qt, safe_chars); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
465 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
466 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
467 Fputhash (safe_charsets, safe_chars, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
468 Vdefault_query_coding_region_chartab_cache); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
469 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
470 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
471 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
472 if (flags & QUERY_METHOD_HIGHLIGHT && |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
473 /* If we're being called really early, live without highlights getting |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
474 cleared properly: */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
475 !(UNBOUNDP (XSYMBOL (Qquery_coding_clear_highlights)->function))) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
476 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
477 /* It's okay to call Lisp here, the only non-stack object we may have |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
478 allocated up to this point is safe_chars, and that's |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
479 reachable from its entry in |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
480 Vdefault_query_coding_region_chartab_cache */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
481 call3 (Qquery_coding_clear_highlights, make_int (pos), make_int (end), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
482 wrap_buffer (buf)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
483 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
484 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
485 while (pos < end) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
486 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
487 Ichar ch = BYTE_BUF_FETCH_CHAR (buf, pos_byte); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
488 if (!EQ (Qnil, get_char_table (ch, safe_chars))) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
489 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
490 pos++; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
491 INC_BYTEBPOS (buf, pos_byte); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
492 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
493 else |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
494 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
495 fail_range_start = pos; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
496 while ((pos < end) && |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
497 (EQ (Qnil, get_char_table (ch, safe_chars)) |
4839
5f1c6ca4d05e
fix bad change to default_query_method, rewrite conditional for more correctness
Ben Wing <ben@xemacs.org>
parents:
4830
diff
changeset
|
498 && (failed_reason = query_coding_unencodable, |
5f1c6ca4d05e
fix bad change to default_query_method, rewrite conditional for more correctness
Ben Wing <ben@xemacs.org>
parents:
4830
diff
changeset
|
499 (previous_failed_reason == query_coding_succeeded |
5f1c6ca4d05e
fix bad change to default_query_method, rewrite conditional for more correctness
Ben Wing <ben@xemacs.org>
parents:
4830
diff
changeset
|
500 || previous_failed_reason == failed_reason)))) |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
501 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
502 pos++; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
503 INC_BYTEBPOS (buf, pos_byte); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
504 ch = BYTE_BUF_FETCH_CHAR (buf, pos_byte); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
505 previous_failed_reason = failed_reason; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
506 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
507 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
508 if (fail_range_start == pos) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
509 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
510 /* The character can actually be encoded; move on. */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
511 pos++; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
512 INC_BYTEBPOS (buf, pos_byte); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
513 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
514 else |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
515 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
516 assert (previous_failed_reason == query_coding_unencodable); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
517 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
518 if (flags & QUERY_METHOD_ERRORP) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
519 { |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
520 signal_error_2 |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
521 (Qtext_conversion_error, |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
522 "Cannot encode using coding system", |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
523 make_string_from_buffer (buf, fail_range_start, |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
524 pos - fail_range_start), |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
525 XCODING_SYSTEM_NAME (codesys)); |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
526 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
527 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
528 if (NILP (result)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
529 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
530 result = Fmake_range_table (Qstart_closed_end_open); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
531 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
532 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
533 fail_range_end = pos; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
534 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
535 Fput_range_table (make_int (fail_range_start), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
536 make_int (fail_range_end), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
537 Qunencodable, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
538 result); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
539 previous_failed_reason = query_coding_succeeded; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
540 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
541 if (flags & QUERY_METHOD_HIGHLIGHT) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
542 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
543 Lisp_Object extent |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
544 = Fmake_extent (make_int (fail_range_start), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
545 make_int (fail_range_end), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
546 wrap_buffer (buf)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
547 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
548 Fset_extent_priority |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
549 (extent, make_int (2 + mouse_highlight_priority)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
550 Fset_extent_face (extent, Qquery_coding_warning_face); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
551 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
552 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
553 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
554 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
555 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
556 return result; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
557 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
558 #else |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
559 Lisp_Object |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
560 default_query_method (Lisp_Object UNUSED (codesys), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
561 struct buffer * UNUSED (buf), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
562 Charbpos UNUSED (end), int UNUSED (flags)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
563 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
564 return Qnil; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
565 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
566 #endif /* defined MULE */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
567 |
771 | 568 DEFUN ("valid-coding-system-type-p", Fvalid_coding_system_type_p, 1, 1, 0, /* |
569 Given a CODING-SYSTEM-TYPE, return non-nil if it is valid. | |
570 Valid types depend on how XEmacs was compiled but may include | |
3025 | 571 `undecided', `chain', `integer', `ccl', `iso2022', `big5', `shift-jis', |
572 `utf-16', `ucs-4', `utf-8', etc. | |
771 | 573 */ |
574 (coding_system_type)) | |
575 { | |
576 return valid_coding_system_type_p (coding_system_type) ? Qt : Qnil; | |
577 } | |
578 | |
579 DEFUN ("coding-system-type-list", Fcoding_system_type_list, 0, 0, 0, /* | |
580 Return a list of valid coding system types. | |
581 */ | |
582 ()) | |
583 { | |
584 return Fcopy_sequence (Vcoding_system_type_list); | |
585 } | |
586 | |
587 void | |
588 add_entry_to_coding_system_type_list (struct coding_system_methods *meths) | |
589 { | |
590 struct coding_system_type_entry entry; | |
591 | |
592 entry.meths = meths; | |
593 Dynarr_add (the_coding_system_type_entry_dynarr, entry); | |
594 Vcoding_system_type_list = Fcons (meths->type, Vcoding_system_type_list); | |
428 | 595 } |
596 | |
597 DEFUN ("coding-system-p", Fcoding_system_p, 1, 1, 0, /* | |
598 Return t if OBJECT is a coding system. | |
599 A coding system is an object that defines how text containing multiple | |
600 character sets is encoded into a stream of (typically 8-bit) bytes. | |
601 The coding system is used to decode the stream into a series of | |
602 characters (which may be from multiple charsets) when the text is read | |
603 from a file or process, and is used to encode the text back into the | |
604 same format when it is written out to a file or process. | |
605 | |
606 For example, many ISO2022-compliant coding systems (such as Compound | |
607 Text, which is used for inter-client data under the X Window System) | |
608 use escape sequences to switch between different charsets -- Japanese | |
609 Kanji, for example, is invoked with "ESC $ ( B"; ASCII is invoked | |
610 with "ESC ( B"; and Cyrillic is invoked with "ESC - L". See | |
611 `make-coding-system' for more information. | |
612 | |
613 Coding systems are normally identified using a symbol, and the | |
614 symbol is accepted in place of the actual coding system object whenever | |
615 a coding system is called for. (This is similar to how faces work.) | |
616 */ | |
617 (object)) | |
618 { | |
619 return CODING_SYSTEMP (object) ? Qt : Qnil; | |
620 } | |
621 | |
4303 | 622 static Lisp_Object |
623 find_coding_system (Lisp_Object coding_system_or_name, | |
624 int do_autoloads) | |
625 { | |
626 Lisp_Object lookup; | |
627 | |
628 if (NILP (coding_system_or_name)) | |
629 coding_system_or_name = Qbinary; | |
630 else if (CODING_SYSTEMP (coding_system_or_name)) | |
631 return coding_system_or_name; | |
632 else | |
633 CHECK_SYMBOL (coding_system_or_name); | |
634 | |
635 while (1) | |
636 { | |
637 lookup = | |
638 Fgethash (coding_system_or_name, Vcoding_system_hash_table, Qnil); | |
639 | |
640 if (CONSP (lookup) && do_autoloads) | |
641 { | |
642 struct gcpro gcpro1; | |
643 int length; | |
644 DECLARE_EISTRING (desired_base); | |
645 DECLARE_EISTRING (warning_info); | |
646 | |
647 eicpy_lstr (desired_base, XSYMBOL_NAME (coding_system_or_name)); | |
648 | |
649 /* Work out the name of the base coding system. */ | |
650 length = eilen (desired_base); | |
651 if (length > (int)(sizeof ("-unix") - 1)) | |
652 { | |
653 if (0 == qxestrcmp ((UAscbyte *)"-unix", (eidata (desired_base)) | |
654 + (length - (sizeof ("-unix") - 1)))) | |
655 { | |
656 eidel (desired_base, length - (sizeof ("-unix") - 1), | |
657 -1, 5, 5); | |
658 } | |
659 } | |
660 else if (length > (int)(sizeof ("-dos") - 1)) | |
661 { | |
662 if ((0 == qxestrcmp ((UAscbyte *)"-dos", (eidata (desired_base)) | |
663 + (length - (sizeof ("-dos") - 1)))) || | |
664 (0 == qxestrcmp ((UAscbyte *)"-mac", (eidata (desired_base)) | |
665 + (length - (sizeof ("-mac") - 1))))) | |
666 { | |
667 eidel (desired_base, length - (sizeof ("-dos") - 1), -1, | |
668 4, 4); | |
669 } | |
670 } | |
671 | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
672 coding_system_or_name = intern_istring (eidata (desired_base)); |
4303 | 673 |
674 /* Remove this coding system and its subsidiary coding | |
675 systems from the hash, to avoid calling this code recursively. */ | |
676 Fremhash (coding_system_or_name, Vcoding_system_hash_table); | |
677 Fremhash (add_suffix_to_symbol(coding_system_or_name, "-unix"), | |
678 Vcoding_system_hash_table); | |
679 Fremhash (add_suffix_to_symbol(coding_system_or_name, "-dos"), | |
680 Vcoding_system_hash_table); | |
681 Fremhash (add_suffix_to_symbol(coding_system_or_name, "-mac"), | |
682 Vcoding_system_hash_table); | |
683 | |
684 eicpy_ascii (warning_info, "Error autoloading coding system "); | |
685 eicat_lstr (warning_info, XSYMBOL_NAME (coding_system_or_name)); | |
686 | |
687 /* Keep around the form so it doesn't disappear from under | |
688 #'eval's feet. */ | |
689 GCPRO1 (lookup); | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
690 call1_trapping_problems ((const CIbyte *) eidata (warning_info), |
4303 | 691 Qeval, lookup, 0); |
692 UNGCPRO; | |
693 | |
694 lookup = | |
695 Fgethash (coding_system_or_name, Vcoding_system_hash_table, Qnil); | |
696 } | |
697 | |
698 if (CODING_SYSTEMP (lookup) || NILP (lookup)) | |
699 return lookup; | |
700 | |
701 coding_system_or_name = lookup; | |
702 } | |
703 } | |
704 | |
428 | 705 DEFUN ("find-coding-system", Ffind_coding_system, 1, 1, 0, /* |
706 Retrieve the coding system of the given name. | |
707 | |
708 If CODING-SYSTEM-OR-NAME is a coding-system object, it is simply | |
709 returned. Otherwise, CODING-SYSTEM-OR-NAME should be a symbol. | |
710 If there is no such coding system, nil is returned. Otherwise the | |
711 associated coding system object is returned. | |
712 */ | |
713 (coding_system_or_name)) | |
714 { | |
4303 | 715 return find_coding_system(coding_system_or_name, 1); |
716 } | |
717 | |
718 DEFUN ("autoload-coding-system", Fautoload_coding_system, 2, 2, 0, /* | |
719 Define SYMBOL as a coding-system that is loaded on demand. | |
720 | |
721 FORM is a form to evaluate to define the coding-system. | |
722 */ | |
723 (symbol, form)) | |
724 { | |
725 Lisp_Object lookup; | |
726 | |
727 CHECK_SYMBOL (symbol); | |
728 CHECK_CONS (form); | |
729 | |
730 lookup = find_coding_system (symbol, 0); | |
731 | |
732 if (!NILP (lookup) && | |
733 /* Allow autoloads to be redefined. */ | |
734 !CONSP (lookup)) | |
440 | 735 { |
4303 | 736 invalid_operation ("Cannot redefine existing coding system", |
737 symbol); | |
440 | 738 } |
4303 | 739 |
740 Fputhash (symbol, form, Vcoding_system_hash_table); | |
741 Fputhash (add_suffix_to_symbol(symbol, "-unix"), form, | |
742 Vcoding_system_hash_table); | |
743 Fputhash (add_suffix_to_symbol(symbol, "-dos"), form, | |
744 Vcoding_system_hash_table); | |
745 Fputhash (add_suffix_to_symbol(symbol, "-mac"), form, | |
746 Vcoding_system_hash_table); | |
747 | |
748 /* Tell the POSIX locale infrastructure about this coding system (though | |
749 unfortunately it'll be too late for the startup locale sniffing. */ | |
750 if (!UNBOUNDP (Qposix_charset_to_coding_system_hash)) | |
751 { | |
752 Lisp_Object val = Fsymbol_value (Qposix_charset_to_coding_system_hash); | |
753 DECLARE_EISTRING (minimal_name); | |
754 Ibyte *full_name; | |
755 int len = XSTRING_LENGTH (XSYMBOL_NAME (symbol)), i; | |
756 | |
757 if (!NILP (val)) | |
758 { | |
759 full_name = XSTRING_DATA (XSYMBOL_NAME (symbol)); | |
760 for (i = 0; i < len; ++i) | |
761 { | |
762 if (full_name[i] >= '0' && full_name[i] <= '9') | |
763 { | |
764 eicat_ch (minimal_name, full_name[i]); | |
765 } | |
766 else if (full_name[i] >= 'a' && full_name[i] <= 'z') | |
767 { | |
768 eicat_ch (minimal_name, full_name[i]); | |
769 } | |
770 else if (full_name[i] >= 'A' && full_name[i] <= 'Z') | |
771 { | |
772 eicat_ch (minimal_name, full_name[i] + | |
773 ('a' - 'A')); | |
774 } | |
775 } | |
776 | |
777 if (eilen (minimal_name)) | |
778 { | |
779 CHECK_HASH_TABLE (val); | |
780 Fputhash (eimake_string(minimal_name), symbol, val); | |
781 } | |
782 } | |
783 } | |
784 | |
785 return Qt; | |
428 | 786 } |
787 | |
788 DEFUN ("get-coding-system", Fget_coding_system, 1, 1, 0, /* | |
789 Retrieve the coding system of the given name. | |
790 Same as `find-coding-system' except that if there is no such | |
791 coding system, an error is signaled instead of returning nil. | |
792 */ | |
793 (name)) | |
794 { | |
795 Lisp_Object coding_system = Ffind_coding_system (name); | |
796 | |
797 if (NILP (coding_system)) | |
563 | 798 invalid_argument ("No such coding system", name); |
428 | 799 return coding_system; |
800 } | |
801 | |
771 | 802 int |
803 coding_system_is_binary (Lisp_Object coding_system) | |
804 { | |
805 Lisp_Coding_System *cs = XCODING_SYSTEM (coding_system); | |
806 return | |
807 (EQ (CODING_SYSTEM_TYPE (cs), Qno_conversion) && | |
808 CODING_SYSTEM_EOL_TYPE (cs) == EOL_LF && | |
809 EQ (CODING_SYSTEM_POST_READ_CONVERSION (cs), Qnil) && | |
810 EQ (CODING_SYSTEM_PRE_WRITE_CONVERSION (cs), Qnil)); | |
811 } | |
812 | |
813 static Lisp_Object | |
814 coding_system_real_canonical (Lisp_Object cs) | |
815 { | |
816 if (!NILP (XCODING_SYSTEM_CANONICAL (cs))) | |
817 return XCODING_SYSTEM_CANONICAL (cs); | |
818 return cs; | |
819 } | |
820 | |
821 /* Return true if coding system is of the "standard" type that decodes | |
822 bytes into characters (suitable for decoding a text file). */ | |
823 int | |
824 coding_system_is_for_text_file (Lisp_Object coding_system) | |
825 { | |
826 return (XCODESYSMETH_OR_GIVEN | |
827 (coding_system, conversion_end_type, | |
828 (coding_system_real_canonical (coding_system)), | |
829 DECODES_BYTE_TO_CHARACTER) == | |
830 DECODES_BYTE_TO_CHARACTER); | |
831 } | |
832 | |
833 static int | |
834 decoding_source_sink_type_is_char (Lisp_Object cs, enum source_or_sink sex) | |
835 { | |
836 enum source_sink_type type = | |
837 XCODESYSMETH_OR_GIVEN (cs, conversion_end_type, | |
838 (coding_system_real_canonical (cs)), | |
839 DECODES_BYTE_TO_CHARACTER); | |
840 if (sex == CODING_SOURCE) | |
841 return (type == DECODES_CHARACTER_TO_CHARACTER || | |
842 type == DECODES_CHARACTER_TO_BYTE); | |
843 else | |
844 return (type == DECODES_CHARACTER_TO_CHARACTER || | |
845 type == DECODES_BYTE_TO_CHARACTER); | |
846 } | |
847 | |
848 static int | |
849 encoding_source_sink_type_is_char (Lisp_Object cs, enum source_or_sink sex) | |
850 { | |
851 return decoding_source_sink_type_is_char (cs, | |
852 /* Sex change */ | |
853 sex == CODING_SOURCE ? | |
854 CODING_SINK : CODING_SOURCE); | |
855 } | |
856 | |
857 /* Like Ffind_coding_system() but check that the coding system is of the | |
858 "standard" type that decodes bytes into characters (suitable for | |
859 decoding a text file), and if not, returns an appropriate wrapper that | |
860 does. Also, if EOL_WRAP is non-zero, check whether this coding system | |
861 wants EOL auto-detection, and if so, wrap with a convert-eol coding | |
862 system to do this. */ | |
863 | |
864 Lisp_Object | |
865 find_coding_system_for_text_file (Lisp_Object name, int eol_wrap) | |
866 { | |
867 Lisp_Object coding_system = Ffind_coding_system (name); | |
868 Lisp_Object wrapper = coding_system; | |
869 | |
870 if (NILP (coding_system)) | |
871 return Qnil; | |
872 if (!coding_system_is_for_text_file (coding_system)) | |
873 { | |
874 wrapper = XCODING_SYSTEM_TEXT_FILE_WRAPPER (coding_system); | |
875 if (NILP (wrapper)) | |
876 { | |
877 Lisp_Object chain; | |
878 if (!decoding_source_sink_type_is_char (coding_system, CODING_SINK)) | |
879 chain = list2 (coding_system, Qbinary); | |
880 else | |
881 chain = list1 (coding_system); | |
882 if (decoding_source_sink_type_is_char (coding_system, CODING_SOURCE)) | |
883 chain = Fcons (Qbinary, chain); | |
884 wrapper = | |
885 make_internal_coding_system | |
886 (coding_system, | |
887 "internal-text-file-wrapper", | |
888 Qchain, | |
889 Qunbound, list4 (Qchain, chain, | |
890 Qcanonicalize_after_coding, coding_system)); | |
891 XCODING_SYSTEM_TEXT_FILE_WRAPPER (coding_system) = wrapper; | |
892 } | |
893 } | |
894 | |
895 if (!eol_wrap || XCODING_SYSTEM_EOL_TYPE (coding_system) != EOL_AUTODETECT) | |
896 return wrapper; | |
897 | |
898 coding_system = wrapper; | |
899 wrapper = XCODING_SYSTEM_AUTO_EOL_WRAPPER (coding_system); | |
900 if (!NILP (wrapper)) | |
901 return wrapper; | |
902 wrapper = | |
903 make_internal_coding_system | |
904 (coding_system, | |
905 "internal-auto-eol-wrapper", | |
906 Qundecided, Qunbound, | |
907 list4 (Qcoding_system, coding_system, | |
908 Qdo_eol, Qt)); | |
909 XCODING_SYSTEM_AUTO_EOL_WRAPPER (coding_system) = wrapper; | |
910 return wrapper; | |
911 } | |
912 | |
913 /* Like Fget_coding_system() but verify that the coding system is of the | |
914 "standard" type that decodes bytes into characters (suitable for | |
915 decoding a text file), and if not, returns an appropriate wrapper that | |
916 does. Also, if EOL_WRAP is non-zero, check whether this coding system | |
917 wants EOL auto-detection, and if so, wrap with a convert-eol coding | |
918 system to do this. */ | |
919 | |
920 Lisp_Object | |
921 get_coding_system_for_text_file (Lisp_Object name, int eol_wrap) | |
922 { | |
923 Lisp_Object coding_system = find_coding_system_for_text_file (name, | |
924 eol_wrap); | |
925 if (NILP (coding_system)) | |
926 invalid_argument ("No such coding system", name); | |
927 return coding_system; | |
928 } | |
929 | |
930 /* We store the coding systems in hash tables with the names as the | |
931 key and the actual coding system object as the value. Occasionally | |
932 we need to use them in a list format. These routines provide us | |
933 with that. */ | |
428 | 934 struct coding_system_list_closure |
935 { | |
936 Lisp_Object *coding_system_list; | |
771 | 937 int normal; |
938 int internal; | |
428 | 939 }; |
940 | |
941 static int | |
4303 | 942 add_coding_system_to_list_mapper (Lisp_Object key, Lisp_Object value, |
428 | 943 void *coding_system_list_closure) |
944 { | |
945 /* This function can GC */ | |
946 struct coding_system_list_closure *cscl = | |
947 (struct coding_system_list_closure *) coding_system_list_closure; | |
948 Lisp_Object *coding_system_list = cscl->coding_system_list; | |
949 | |
771 | 950 /* We can't just use VALUE because KEY might be an alias, and we need |
4303 | 951 the real coding system object. |
952 | |
953 Autoloaded coding systems have conses for their values, and can't be | |
954 internal coding systems, or coding system aliases. */ | |
955 if (CONSP (value) || | |
956 (XCODING_SYSTEM (Ffind_coding_system (key))->internal_p ? | |
957 cscl->internal : cscl->normal)) | |
771 | 958 *coding_system_list = Fcons (key, *coding_system_list); |
428 | 959 return 0; |
960 } | |
961 | |
2297 | 962 /* #### should we specify a conventional for "all coding systems"? */ |
771 | 963 DEFUN ("coding-system-list", Fcoding_system_list, 0, 1, 0, /* |
428 | 964 Return a list of the names of all defined coding systems. |
771 | 965 If INTERNAL is nil, only the normal (non-internal) coding systems are |
966 included. (Internal coding systems are created for various internal | |
967 purposes, such as implementing EOL types of CRLF and CR; generally, you do | |
968 not want to see these.) If it is t, only the internal coding systems are | |
969 included. If it is any other non-nil value both normal and internal are | |
970 included. | |
428 | 971 */ |
771 | 972 (internal)) |
428 | 973 { |
974 Lisp_Object coding_system_list = Qnil; | |
975 struct gcpro gcpro1; | |
976 struct coding_system_list_closure coding_system_list_closure; | |
977 | |
978 GCPRO1 (coding_system_list); | |
979 coding_system_list_closure.coding_system_list = &coding_system_list; | |
771 | 980 coding_system_list_closure.normal = !EQ (internal, Qt); |
981 coding_system_list_closure.internal = !NILP (internal); | |
428 | 982 elisp_maphash (add_coding_system_to_list_mapper, Vcoding_system_hash_table, |
983 &coding_system_list_closure); | |
984 UNGCPRO; | |
985 | |
986 return coding_system_list; | |
987 } | |
988 | |
989 DEFUN ("coding-system-name", Fcoding_system_name, 1, 1, 0, /* | |
990 Return the name of the given coding system. | |
991 */ | |
992 (coding_system)) | |
993 { | |
994 coding_system = Fget_coding_system (coding_system); | |
995 return XCODING_SYSTEM_NAME (coding_system); | |
996 } | |
997 | |
998 static Lisp_Coding_System * | |
771 | 999 allocate_coding_system (struct coding_system_methods *codesys_meths, |
1000 Bytecount data_size, | |
1001 Lisp_Object name) | |
428 | 1002 { |
771 | 1003 Bytecount total_size = offsetof (Lisp_Coding_System, data) + data_size; |
5117
3742ea8250b5
Checking in final CVS version of workspace 'ben-lisp-object'
Ben Wing <ben@xemacs.org>
parents:
3025
diff
changeset
|
1004 Lisp_Object obj = ALLOC_SIZED_LISP_OBJECT (total_size, coding_system); |
3742ea8250b5
Checking in final CVS version of workspace 'ben-lisp-object'
Ben Wing <ben@xemacs.org>
parents:
3025
diff
changeset
|
1005 Lisp_Coding_System *codesys = XCODING_SYSTEM (obj); |
1204 | 1006 |
771 | 1007 codesys->methods = codesys_meths; |
1204 | 1008 #define MARKED_SLOT(x) codesys->x = Qnil; |
1009 #include "coding-system-slots.h" | |
1010 | |
771 | 1011 CODING_SYSTEM_EOL_TYPE (codesys) = EOL_LF; |
1012 CODING_SYSTEM_NAME (codesys) = name; | |
1013 | |
1014 MAYBE_CODESYSMETH (codesys, init, (wrap_coding_system (codesys))); | |
428 | 1015 |
1016 return codesys; | |
1017 } | |
1018 | |
771 | 1019 static enum eol_type |
1020 symbol_to_eol_type (Lisp_Object symbol) | |
1021 { | |
1022 CHECK_SYMBOL (symbol); | |
1023 if (NILP (symbol)) return EOL_AUTODETECT; | |
1024 if (EQ (symbol, Qlf)) return EOL_LF; | |
1025 if (EQ (symbol, Qcrlf)) return EOL_CRLF; | |
1026 if (EQ (symbol, Qcr)) return EOL_CR; | |
1027 | |
1028 invalid_constant ("Unrecognized eol type", symbol); | |
1204 | 1029 RETURN_NOT_REACHED (EOL_AUTODETECT); |
771 | 1030 } |
1031 | |
1032 static Lisp_Object | |
1033 eol_type_to_symbol (enum eol_type type) | |
1034 { | |
1035 switch (type) | |
1036 { | |
2500 | 1037 default: ABORT (); |
771 | 1038 case EOL_LF: return Qlf; |
1039 case EOL_CRLF: return Qcrlf; | |
1040 case EOL_CR: return Qcr; | |
1041 case EOL_AUTODETECT: return Qnil; | |
1042 } | |
1043 } | |
1044 | |
1045 struct subsidiary_type | |
1046 { | |
4528
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1047 const Ascbyte *extension; |
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1048 const Ascbyte *mnemonic_ext; |
771 | 1049 enum eol_type eol; |
1050 }; | |
1051 | |
1052 static struct subsidiary_type coding_subsidiary_list[] = | |
1053 { { "-unix", "", EOL_LF }, | |
1054 { "-dos", ":T", EOL_CRLF }, | |
1055 { "-mac", ":t", EOL_CR } }; | |
1056 | |
1057 /* kludge */ | |
428 | 1058 static void |
771 | 1059 setup_eol_coding_systems (Lisp_Object codesys) |
428 | 1060 { |
793 | 1061 int len = XSTRING_LENGTH (XSYMBOL (XCODING_SYSTEM_NAME (codesys))->name); |
2367 | 1062 Ibyte *codesys_name = alloca_ibytes (len + 7); |
771 | 1063 int mlen = -1; |
867 | 1064 Ibyte *codesys_mnemonic = 0; |
771 | 1065 Lisp_Object codesys_name_sym, sub_codesys; |
1066 int i; | |
1067 | |
1068 memcpy (codesys_name, | |
793 | 1069 XSTRING_DATA (XSYMBOL (XCODING_SYSTEM_NAME (codesys))->name), len); |
771 | 1070 |
1071 if (STRINGP (XCODING_SYSTEM_MNEMONIC (codesys))) | |
428 | 1072 { |
771 | 1073 mlen = XSTRING_LENGTH (XCODING_SYSTEM_MNEMONIC (codesys)); |
2367 | 1074 codesys_mnemonic = alloca_ibytes (mlen + 7); |
771 | 1075 memcpy (codesys_mnemonic, |
1076 XSTRING_DATA (XCODING_SYSTEM_MNEMONIC (codesys)), mlen); | |
1077 } | |
1078 | |
1079 /* Create three "subsidiary" coding systems, decoding data encoded using | |
1080 each of the three EOL types. We do this for each subsidiary by | |
1081 copying the original coding system, setting the EOL type | |
1082 appropriately, and setting the CANONICAL member of the new coding | |
1083 system to be a chain consisting of the original coding system followed | |
1084 by a convert-eol coding system to do the EOL decoding. For EOL type | |
1085 LF, however, we don't need any decoding, so we skip creating a | |
1086 CANONICAL. | |
1087 | |
1088 If the original coding system is not a text-type coding system | |
1089 (decodes byte->char), we need to coerce it to one by the appropriate | |
1090 wrapping in CANONICAL. */ | |
1091 | |
1092 for (i = 0; i < countof (coding_subsidiary_list); i++) | |
1093 { | |
4528
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1094 const Ascbyte *extension = coding_subsidiary_list[i].extension; |
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1095 const Ascbyte *mnemonic_ext = coding_subsidiary_list[i].mnemonic_ext; |
771 | 1096 enum eol_type eol = coding_subsidiary_list[i].eol; |
1097 | |
2367 | 1098 qxestrcpy_ascii (codesys_name + len, extension); |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1099 codesys_name_sym = intern_istring (codesys_name); |
771 | 1100 if (mlen != -1) |
2367 | 1101 qxestrcpy_ascii (codesys_mnemonic + mlen, mnemonic_ext); |
771 | 1102 |
1103 sub_codesys = Fcopy_coding_system (codesys, codesys_name_sym); | |
1104 if (mlen != -1) | |
1105 XCODING_SYSTEM_MNEMONIC (sub_codesys) = | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1106 build_istring (codesys_mnemonic); |
771 | 1107 |
1108 if (eol != EOL_LF) | |
1109 { | |
1110 Lisp_Object chain = list2 (get_coding_system_for_text_file | |
1111 (codesys, 0), | |
1112 eol == EOL_CR ? Qconvert_eol_cr : | |
1113 Qconvert_eol_crlf); | |
1114 Lisp_Object canon = | |
1115 make_internal_coding_system | |
1116 (sub_codesys, "internal-subsidiary-eol-wrapper", | |
1117 Qchain, Qunbound, | |
1118 mlen != -1 ? | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1119 list6 (Qmnemonic, build_istring (codesys_mnemonic), |
771 | 1120 Qchain, chain, |
1121 Qcanonicalize_after_coding, sub_codesys) : | |
1122 list4 (Qchain, chain, | |
1123 Qcanonicalize_after_coding, sub_codesys)); | |
1124 XCODING_SYSTEM_CANONICAL (sub_codesys) = canon; | |
1125 } | |
1126 XCODING_SYSTEM_EOL_TYPE (sub_codesys) = eol; | |
1127 XCODING_SYSTEM_SUBSIDIARY_PARENT (sub_codesys) = codesys; | |
1128 XCODING_SYSTEM (codesys)->eol[eol] = sub_codesys; | |
428 | 1129 } |
1130 } | |
1131 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1132 DEFUN ("coding-system-canonical-name-p", Fcoding_system_canonical_name_p, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1133 1, 1, 0, /* |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1134 Return t if OBJECT names a coding system, and is not a coding system alias. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1135 */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1136 (object)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1137 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1138 return CODING_SYSTEMP (Fgethash (object, Vcoding_system_hash_table, Qnil)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1139 ? Qt : Qnil; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1140 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1141 |
771 | 1142 /* Basic function to create new coding systems. For `make-coding-system', |
1143 NAME-OR-EXISTING is the NAME argument, PREFIX is null, and TYPE, | |
1144 DESCRIPTION, and PROPS are the same. All created coding systems are put | |
1145 in a hash table indexed by NAME. | |
1146 | |
1147 If PREFIX is a string, NAME-OR-EXISTING should specify an existing | |
1148 coding system (or nil), and an internal coding system will be created. | |
1149 The name of the coding system will be constructed by combining PREFIX | |
1150 with the name of the existing coding system (if given), and a number | |
1151 will be appended to insure uniqueness. In such a case, if Qunbound is | |
1152 given for DESCRIPTION, the description gets created based on the | |
1153 generated name. Also, if no mnemonic is given in the properties list, a | |
1154 mnemonic is created based on the generated name. | |
1155 | |
1156 For internal coding systems, the coding system is marked as internal | |
1157 (see `coding-system-list'), and no subsidiaries will be created or | |
1158 eol-wrapping will happen. Otherwise: | |
1159 | |
1160 -- if the eol-type property is `lf' or t, the coding system is merely | |
1161 created and returned. (For t, the coding system will be wrapped with | |
1162 an EOL autodetector when it's used to read a file.) | |
1163 | |
1164 -- if eol-type is `crlf' or `cr', after the coding system object is | |
1165 created, it will be wrapped in a chain with the appropriate | |
1166 convert-eol coding system (either `convert-eol-crlf' or | |
1167 `convert-eol-cr'), so that CRLF->LF or CR->LF conversion is done at | |
1168 decoding time, and the opposite at encoding time. The resulting | |
1169 chain becomes the CANONICAL field of the coding system object. | |
1170 | |
1171 -- if eol-type is nil or omitted, "subsidiaries" are generated: Three | |
1172 coding systems where the original coding system (before wrapping with | |
1173 convert-eol-autodetect) is either unwrapped or wrapped with | |
1174 convert-eol-crlf or convert-eol-cr, respectively, so that coding systems | |
1175 to handle LF, CRLF, and CR end-of-line indicators are created. (This | |
1176 crazy crap is based on existing behavior in other Mule versions, | |
1177 including FSF Emacs.) | |
1178 */ | |
428 | 1179 |
1180 static Lisp_Object | |
4528
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1181 make_coding_system_1 (Lisp_Object name_or_existing, const Ascbyte *prefix, |
771 | 1182 Lisp_Object type, Lisp_Object description, |
1183 Lisp_Object props) | |
428 | 1184 { |
771 | 1185 Lisp_Coding_System *cs; |
1186 int need_to_setup_eol_systems = 1; | |
1187 enum eol_type eol_wrapper = EOL_AUTODETECT; | |
1188 struct coding_system_methods *meths; | |
1189 Lisp_Object csobj; | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1190 Lisp_Object defmnem = Qnil, aliases = Qnil; |
771 | 1191 |
1192 if (NILP (type)) | |
1193 type = Qundecided; | |
1194 meths = decode_coding_system_type (type, ERROR_ME); | |
1195 | |
1196 if (prefix) | |
428 | 1197 { |
867 | 1198 Ibyte *newname = |
771 | 1199 emacs_sprintf_malloc (NULL, "%s-%s-%d", |
1200 prefix, | |
867 | 1201 NILP (name_or_existing) ? (Ibyte *) "nil" : |
771 | 1202 XSTRING_DATA (Fsymbol_name (XCODING_SYSTEM_NAME |
1203 (name_or_existing))), | |
1204 ++coding_system_tick); | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1205 name_or_existing = intern_istring (newname); |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
1206 xfree (newname); |
771 | 1207 |
1208 if (UNBOUNDP (description)) | |
1209 { | |
1210 newname = | |
1211 emacs_sprintf_malloc | |
1212 (NULL, "For Internal Use (%s)", | |
1213 XSTRING_DATA (Fsymbol_name (name_or_existing))); | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1214 description = build_istring (newname); |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
1215 xfree (newname); |
771 | 1216 } |
1217 | |
1218 newname = emacs_sprintf_malloc (NULL, "Int%d", coding_system_tick); | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1219 defmnem = build_istring (newname); |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
1220 xfree (newname); |
428 | 1221 } |
771 | 1222 else |
1223 CHECK_SYMBOL (name_or_existing); | |
1224 | |
4303 | 1225 /* See is there an entry for name_or_existing in the defined coding system |
1226 hash table. */ | |
1227 csobj = find_coding_system (name_or_existing, 0); | |
1228 /* Error if it's there and not an autoload form. */ | |
1229 if (!NILP (csobj) && !CONSP (csobj)) | |
771 | 1230 invalid_operation ("Cannot redefine existing coding system", |
4303 | 1231 name_or_existing); |
771 | 1232 |
1233 cs = allocate_coding_system (meths, meths->extra_data_size, | |
1234 name_or_existing); | |
793 | 1235 csobj = wrap_coding_system (cs); |
771 | 1236 |
1237 cs->internal_p = !!prefix; | |
1238 | |
1239 if (NILP (description)) | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
1240 description = build_ascstring (""); |
771 | 1241 else |
1242 CHECK_STRING (description); | |
1243 CODING_SYSTEM_DESCRIPTION (cs) = description; | |
1244 | |
1245 if (!NILP (defmnem)) | |
1246 CODING_SYSTEM_MNEMONIC (cs) = defmnem; | |
1247 | |
1248 { | |
1249 EXTERNAL_PROPERTY_LIST_LOOP_3 (key, value, props) | |
1250 { | |
1251 int recognized = 1; | |
1252 | |
1253 if (EQ (key, Qmnemonic)) | |
1254 { | |
1255 if (!NILP (value)) | |
1256 CHECK_STRING (value); | |
1257 CODING_SYSTEM_MNEMONIC (cs) = value; | |
1258 } | |
1259 | |
1260 else if (EQ (key, Qdocumentation)) | |
1261 { | |
1262 if (!NILP (value)) | |
1263 CHECK_STRING (value); | |
1264 CODING_SYSTEM_DOCUMENTATION (cs) = value; | |
1265 } | |
1266 | |
1267 else if (EQ (key, Qeol_type)) | |
1268 { | |
1269 need_to_setup_eol_systems = NILP (value); | |
1270 if (EQ (value, Qt)) | |
1271 value = Qnil; | |
1272 eol_wrapper = symbol_to_eol_type (value); | |
1273 } | |
1274 | |
1275 else if (EQ (key, Qpost_read_conversion)) | |
1276 CODING_SYSTEM_POST_READ_CONVERSION (cs) = value; | |
1277 else if (EQ (key, Qpre_write_conversion)) | |
1278 CODING_SYSTEM_PRE_WRITE_CONVERSION (cs) = value; | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1279 else if (EQ (key, Qaliases)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1280 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1281 EXTERNAL_LIST_LOOP_2 (alias, value) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1282 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1283 CHECK_SYMBOL (alias); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1284 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1285 if (!NILP (Fcoding_system_canonical_name_p (alias))) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1286 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1287 invalid_change ("Symbol is the canonical name of a " |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1288 "coding system and cannot be redefined", |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1289 alias); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1290 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1291 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1292 aliases = value; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1293 } |
771 | 1294 /* FSF compatibility */ |
1295 else if (EQ (key, Qtranslation_table_for_decode)) | |
1296 ; | |
1297 else if (EQ (key, Qtranslation_table_for_encode)) | |
1298 ; | |
1299 else if (EQ (key, Qsafe_chars)) | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1300 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1301 CHECK_CHAR_TABLE (value); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1302 CODING_SYSTEM_SAFE_CHARS (cs) = value; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1303 } |
771 | 1304 else if (EQ (key, Qsafe_charsets)) |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1305 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1306 if (!EQ (Qt, value) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1307 /* Would be nice to actually do this check, but there are |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1308 some order conflicts with japanese.el and |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1309 mule-coding.el */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1310 && 0) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1311 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1312 #ifdef MULE |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1313 EXTERNAL_LIST_LOOP_2 (safe_charset, value) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1314 CHECK_CHARSET (Ffind_charset (safe_charset)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1315 #endif |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1316 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1317 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1318 CODING_SYSTEM_SAFE_CHARSETS (cs) = value; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1319 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1320 else if (EQ (key, Qcategory)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1321 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1322 Fput (name_or_existing, intern ("coding-system-property"), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1323 Fplist_put (Fget (name_or_existing, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1324 intern ("coding-system-property"), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1325 Qnil), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1326 Qcategory, value)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1327 } |
771 | 1328 else if (EQ (key, Qmime_charset)) |
1329 ; | |
1330 else if (EQ (key, Qvalid_codes)) | |
1331 ; | |
1332 else | |
1333 recognized = CODESYSMETH_OR_GIVEN (cs, putprop, | |
1334 (csobj, key, value), 0); | |
1335 | |
1336 if (!recognized) | |
1337 invalid_constant ("Unrecognized property", key); | |
1338 } | |
1339 } | |
1340 | |
1341 { | |
1342 XCODING_SYSTEM_CANONICAL (csobj) = | |
1343 CODESYSMETH_OR_GIVEN (cs, canonicalize, (csobj), Qnil); | |
1344 XCODING_SYSTEM_EOL_TYPE (csobj) = EOL_AUTODETECT; /* for copy-coding-system | |
1345 below */ | |
1346 | |
4303 | 1347 Fputhash (name_or_existing, csobj, Vcoding_system_hash_table); |
1348 | |
771 | 1349 if (need_to_setup_eol_systems && !cs->internal_p) |
1350 setup_eol_coding_systems (csobj); | |
1351 else if (eol_wrapper == EOL_CR || eol_wrapper == EOL_CRLF) | |
1352 { | |
1353 /* If a specific eol-type (other than LF) was specified, we handle | |
1354 this by converting the coding system into a chain that wraps the | |
1355 coding system along with a convert-eol system after it, in | |
1356 exactly that same switcheroo fashion that the normal | |
1357 canonicalize method works -- BUT we will run into a problem if | |
1358 we do it the obvious way, because when `chain' creates its | |
1359 substreams, the substream containing the coding system we're | |
1360 creating will have canonicalization expansion done on it, | |
1361 leading to infinite recursion. So we have to generate a new, | |
1362 internal coding system with the previous value of CANONICAL. */ | |
867 | 1363 Ibyte *newname = |
771 | 1364 emacs_sprintf_malloc |
1365 (NULL, "internal-eol-copy-%s-%d", | |
1366 XSTRING_DATA (Fsymbol_name (name_or_existing)), | |
1367 ++coding_system_tick); | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1368 Lisp_Object newnamesym = intern_istring (newname); |
771 | 1369 Lisp_Object copied = Fcopy_coding_system (csobj, newnamesym); |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
1370 xfree (newname); |
771 | 1371 |
1372 XCODING_SYSTEM_CANONICAL (csobj) = | |
1373 make_internal_coding_system | |
1374 (csobj, | |
1375 "internal-eol-wrapper", | |
1376 Qchain, Qunbound, | |
1377 list4 (Qchain, | |
1378 list2 (copied, | |
1379 eol_wrapper == EOL_CR ? | |
1380 Qconvert_eol_cr : | |
1381 Qconvert_eol_crlf), | |
1382 Qcanonicalize_after_coding, | |
1383 csobj)); | |
1384 } | |
1385 XCODING_SYSTEM_EOL_TYPE (csobj) = eol_wrapper; | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1386 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1387 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1388 EXTERNAL_LIST_LOOP_2 (alias, aliases) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1389 Fdefine_coding_system_alias (alias, csobj); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1390 } |
771 | 1391 } |
1392 | |
1393 return csobj; | |
428 | 1394 } |
1395 | |
771 | 1396 Lisp_Object |
4528
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1397 make_internal_coding_system (Lisp_Object existing, const Ascbyte *prefix, |
771 | 1398 Lisp_Object type, Lisp_Object description, |
1399 Lisp_Object props) | |
1400 { | |
1401 return make_coding_system_1 (existing, prefix, type, description, props); | |
1402 } | |
428 | 1403 |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1404 DEFUN ("make-coding-system-internal", Fmake_coding_system_internal, 2, 4, 0, /* |
5083
88f955fa5a7f
Back out revision c673987f5f3d, undump mule/make-coding-system.el.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5038
diff
changeset
|
1405 Create a new coding system object, and register NAME as its name. |
88f955fa5a7f
Back out revision c673987f5f3d, undump mule/make-coding-system.el.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5038
diff
changeset
|
1406 |
88f955fa5a7f
Back out revision c673987f5f3d, undump mule/make-coding-system.el.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5038
diff
changeset
|
1407 With Mule support, this does much of the work of `make-coding-system'. |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1408 Without Mule support, it does all the work of that function, and an alias |
5083
88f955fa5a7f
Back out revision c673987f5f3d, undump mule/make-coding-system.el.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5038
diff
changeset
|
1409 exists, mapping `make-coding-system' to `make-coding-system-internal'. |
88f955fa5a7f
Back out revision c673987f5f3d, undump mule/make-coding-system.el.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5038
diff
changeset
|
1410 |
88f955fa5a7f
Back out revision c673987f5f3d, undump mule/make-coding-system.el.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5038
diff
changeset
|
1411 You'll need a Mule XEmacs to read the complete docstring. Or you can |
88f955fa5a7f
Back out revision c673987f5f3d, undump mule/make-coding-system.el.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5038
diff
changeset
|
1412 just read it in make-coding-system.el; something like the following |
88f955fa5a7f
Back out revision c673987f5f3d, undump mule/make-coding-system.el.
Aidan Kehoe <kehoea@parhasard.net>
parents:
5038
diff
changeset
|
1413 should work: |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1414 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1415 \\[find-function-other-window] find-file RET \\[find-file] mule/make-coding-system.el RET |
771 | 1416 |
428 | 1417 */ |
771 | 1418 (name, type, description, props)) |
428 | 1419 { |
771 | 1420 return make_coding_system_1 (name, 0, type, description, props); |
428 | 1421 } |
1422 | |
1423 DEFUN ("copy-coding-system", Fcopy_coding_system, 2, 2, 0, /* | |
1424 Copy OLD-CODING-SYSTEM to NEW-NAME. | |
1425 If NEW-NAME does not name an existing coding system, a new one will | |
1426 be created. | |
771 | 1427 If you are using this function to create an alias, think again: |
1428 Use `define-coding-system-alias' instead. | |
428 | 1429 */ |
1430 (old_coding_system, new_name)) | |
1431 { | |
1432 Lisp_Object new_coding_system; | |
1433 old_coding_system = Fget_coding_system (old_coding_system); | |
771 | 1434 new_coding_system = |
4303 | 1435 UNBOUNDP (new_name) ? Qnil : find_coding_system (new_name, 0); |
428 | 1436 if (NILP (new_coding_system)) |
1437 { | |
793 | 1438 new_coding_system = |
1439 wrap_coding_system | |
1440 (allocate_coding_system | |
1441 (XCODING_SYSTEM (old_coding_system)->methods, | |
1442 XCODING_SYSTEM (old_coding_system)->methods->extra_data_size, | |
1443 new_name)); | |
771 | 1444 if (!UNBOUNDP (new_name)) |
1445 Fputhash (new_name, new_coding_system, Vcoding_system_hash_table); | |
428 | 1446 } |
771 | 1447 else if (XCODING_SYSTEM (old_coding_system)->methods != |
1448 XCODING_SYSTEM (new_coding_system)->methods) | |
1449 invalid_operation_2 ("Coding systems not same type", | |
1450 old_coding_system, new_coding_system); | |
428 | 1451 |
5127
a9c41067dd88
more cleanups, terminology clarification, lots of doc work
Ben Wing <ben@xemacs.org>
parents:
5126
diff
changeset
|
1452 copy_lisp_object (new_coding_system, old_coding_system); |
a9c41067dd88
more cleanups, terminology clarification, lots of doc work
Ben Wing <ben@xemacs.org>
parents:
5126
diff
changeset
|
1453 XCODING_SYSTEM (new_coding_system)->name = new_name; |
428 | 1454 return new_coding_system; |
1455 } | |
1456 | |
2297 | 1457 /* #### Shouldn't this really be a find/get pair? */ |
1458 | |
440 | 1459 DEFUN ("coding-system-alias-p", Fcoding_system_alias_p, 1, 1, 0, /* |
1460 Return t if OBJECT is a coding system alias. | |
1461 All coding system aliases are created by `define-coding-system-alias'. | |
1462 */ | |
1463 (object)) | |
428 | 1464 { |
440 | 1465 return SYMBOLP (Fgethash (object, Vcoding_system_hash_table, Qzero)) |
1466 ? Qt : Qnil; | |
1467 } | |
1468 | |
1469 DEFUN ("coding-system-aliasee", Fcoding_system_aliasee, 1, 1, 0, /* | |
1470 Return the coding-system symbol for which symbol ALIAS is an alias. | |
1471 */ | |
1472 (alias)) | |
1473 { | |
1474 Lisp_Object aliasee = Fgethash (alias, Vcoding_system_hash_table, Qnil); | |
1475 if (SYMBOLP (aliasee)) | |
1476 return aliasee; | |
1477 else | |
563 | 1478 invalid_argument ("Symbol is not a coding system alias", alias); |
1204 | 1479 RETURN_NOT_REACHED (Qnil); |
440 | 1480 } |
1481 | |
1482 /* A maphash function, for removing dangling coding system aliases. */ | |
1483 static int | |
2286 | 1484 dangling_coding_system_alias_p (Lisp_Object UNUSED (alias), |
440 | 1485 Lisp_Object aliasee, |
1486 void *dangling_aliases) | |
1487 { | |
1488 if (SYMBOLP (aliasee) | |
1489 && NILP (Fgethash (aliasee, Vcoding_system_hash_table, Qnil))) | |
428 | 1490 { |
440 | 1491 (*(int *) dangling_aliases)++; |
1492 return 1; | |
428 | 1493 } |
440 | 1494 else |
1495 return 0; | |
1496 } | |
1497 | |
1498 DEFUN ("define-coding-system-alias", Fdefine_coding_system_alias, 2, 2, 0, /* | |
1499 Define symbol ALIAS as an alias for coding system ALIASEE. | |
1500 | |
1501 You can use this function to redefine an alias that has already been defined, | |
1502 but you cannot redefine a name which is the canonical name for a coding system. | |
1503 \(a canonical name of a coding system is what is returned when you call | |
1504 `coding-system-name' on a coding system). | |
1505 | |
1506 ALIASEE itself can be an alias, which allows you to define nested aliases. | |
1507 | |
1508 You are forbidden, however, from creating alias loops or `dangling' aliases. | |
1509 These will be detected, and an error will be signaled if you attempt to do so. | |
1510 | |
1511 If ALIASEE is nil, then ALIAS will simply be undefined. | |
1512 | |
1513 See also `coding-system-alias-p', `coding-system-aliasee', | |
1514 and `coding-system-canonical-name-p'. | |
1515 */ | |
1516 (alias, aliasee)) | |
1517 { | |
2286 | 1518 Lisp_Object probe; |
440 | 1519 |
1520 CHECK_SYMBOL (alias); | |
1521 | |
1522 if (!NILP (Fcoding_system_canonical_name_p (alias))) | |
563 | 1523 invalid_change |
440 | 1524 ("Symbol is the canonical name of a coding system and cannot be redefined", |
1525 alias); | |
1526 | |
1527 if (NILP (aliasee)) | |
1528 { | |
771 | 1529 Lisp_Object subsidiary_unix = add_suffix_to_symbol (alias, "-unix"); |
1530 Lisp_Object subsidiary_dos = add_suffix_to_symbol (alias, "-dos"); | |
1531 Lisp_Object subsidiary_mac = add_suffix_to_symbol (alias, "-mac"); | |
440 | 1532 |
1533 Fremhash (alias, Vcoding_system_hash_table); | |
1534 | |
1535 /* Undefine subsidiary aliases, | |
1536 presumably created by a previous call to this function */ | |
1537 if (! NILP (Fcoding_system_alias_p (subsidiary_unix)) && | |
1538 ! NILP (Fcoding_system_alias_p (subsidiary_dos)) && | |
1539 ! NILP (Fcoding_system_alias_p (subsidiary_mac))) | |
1540 { | |
1541 Fdefine_coding_system_alias (subsidiary_unix, Qnil); | |
1542 Fdefine_coding_system_alias (subsidiary_dos, Qnil); | |
1543 Fdefine_coding_system_alias (subsidiary_mac, Qnil); | |
1544 } | |
1545 | |
1546 /* Undefine dangling coding system aliases. */ | |
1547 { | |
1548 int dangling_aliases; | |
1549 | |
1550 do { | |
1551 dangling_aliases = 0; | |
1552 elisp_map_remhash (dangling_coding_system_alias_p, | |
1553 Vcoding_system_hash_table, | |
1554 &dangling_aliases); | |
1555 } while (dangling_aliases > 0); | |
1556 } | |
1557 | |
1558 return Qnil; | |
1559 } | |
1560 | |
1561 if (CODING_SYSTEMP (aliasee)) | |
1562 aliasee = XCODING_SYSTEM_NAME (aliasee); | |
1563 | |
1564 /* Checks that aliasee names a coding-system */ | |
2286 | 1565 (void) Fget_coding_system (aliasee); |
440 | 1566 |
1567 /* Check for coding system alias loops */ | |
1568 if (EQ (alias, aliasee)) | |
563 | 1569 alias_loop: invalid_operation_2 |
440 | 1570 ("Attempt to create a coding system alias loop", alias, aliasee); |
1571 | |
1572 for (probe = aliasee; | |
1573 SYMBOLP (probe); | |
1574 probe = Fgethash (probe, Vcoding_system_hash_table, Qzero)) | |
1575 { | |
1576 if (EQ (probe, alias)) | |
1577 goto alias_loop; | |
1578 } | |
1579 | |
1580 Fputhash (alias, aliasee, Vcoding_system_hash_table); | |
1581 | |
1582 /* Set up aliases for subsidiaries. | |
2297 | 1583 #### There must be a better way to handle subsidiary coding systems. |
1584 Inquiring Minds Want To Know: shouldn't they always be chains? */ | |
440 | 1585 { |
1586 static const char *suffixes[] = { "-unix", "-dos", "-mac" }; | |
1587 int i; | |
1588 for (i = 0; i < countof (suffixes); i++) | |
1589 { | |
1590 Lisp_Object alias_subsidiary = | |
771 | 1591 add_suffix_to_symbol (alias, suffixes[i]); |
440 | 1592 Lisp_Object aliasee_subsidiary = |
771 | 1593 add_suffix_to_symbol (aliasee, suffixes[i]); |
440 | 1594 |
1595 if (! NILP (Ffind_coding_system (aliasee_subsidiary))) | |
1596 Fdefine_coding_system_alias (alias_subsidiary, aliasee_subsidiary); | |
1597 } | |
1598 } | |
428 | 1599 /* FSF return value is a vector of [ALIAS-unix ALIAS-dos ALIAS-mac], |
1600 but it doesn't look intentional, so I'd rather return something | |
1601 meaningful or nothing at all. */ | |
1602 return Qnil; | |
1603 } | |
1604 | |
1605 static Lisp_Object | |
771 | 1606 subsidiary_coding_system (Lisp_Object coding_system, enum eol_type type) |
428 | 1607 { |
1608 Lisp_Coding_System *cs = XCODING_SYSTEM (coding_system); | |
1609 Lisp_Object new_coding_system; | |
1610 | |
1611 switch (type) | |
1612 { | |
1613 case EOL_AUTODETECT: return coding_system; | |
1614 case EOL_LF: new_coding_system = CODING_SYSTEM_EOL_LF (cs); break; | |
1615 case EOL_CR: new_coding_system = CODING_SYSTEM_EOL_CR (cs); break; | |
1616 case EOL_CRLF: new_coding_system = CODING_SYSTEM_EOL_CRLF (cs); break; | |
2500 | 1617 default: ABORT (); return Qnil; |
428 | 1618 } |
1619 | |
1620 return NILP (new_coding_system) ? coding_system : new_coding_system; | |
1621 } | |
1622 | |
1623 DEFUN ("subsidiary-coding-system", Fsubsidiary_coding_system, 2, 2, 0, /* | |
1624 Return the subsidiary coding system of CODING-SYSTEM with eol type EOL-TYPE. | |
771 | 1625 The logically opposite operation is `coding-system-base'. |
428 | 1626 */ |
1627 (coding_system, eol_type)) | |
1628 { | |
771 | 1629 coding_system = get_coding_system_for_text_file (coding_system, 0); |
428 | 1630 |
1631 return subsidiary_coding_system (coding_system, | |
1632 symbol_to_eol_type (eol_type)); | |
1633 } | |
1634 | |
771 | 1635 DEFUN ("coding-system-base", Fcoding_system_base, |
1636 1, 1, 0, /* | |
1637 Return the base coding system of CODING-SYSTEM. | |
1638 If CODING-SYSTEM is a subsidiary, this returns its parent; otherwise, it | |
1639 returns CODING-SYSTEM. | |
1640 The logically opposite operation is `subsidiary-coding-system'. | |
1641 */ | |
1642 (coding_system)) | |
1643 { | |
1644 Lisp_Object base; | |
1645 | |
1646 coding_system = Fget_coding_system (coding_system); | |
1647 if (EQ (XCODING_SYSTEM_NAME (coding_system), Qbinary)) | |
1648 return Fget_coding_system (Qraw_text); /* hack! */ | |
1649 base = XCODING_SYSTEM_SUBSIDIARY_PARENT (coding_system); | |
1650 if (!NILP (base)) | |
1651 return base; | |
1652 return coding_system; | |
1653 } | |
1654 | |
1655 DEFUN ("coding-system-used-for-io", Fcoding_system_used_for_io, | |
1656 1, 1, 0, /* | |
1657 Return the coding system actually used for I/O. | |
1658 In some cases (e.g. when a particular EOL type is specified) this won't be | |
2297 | 1659 the coding system itself. This can be useful when trying to determine |
1660 precisely how data was decoded. | |
771 | 1661 */ |
1662 (coding_system)) | |
1663 { | |
1664 Lisp_Object canon; | |
1665 | |
1666 coding_system = Fget_coding_system (coding_system); | |
1667 canon = XCODING_SYSTEM_CANONICAL (coding_system); | |
1668 if (!NILP (canon)) | |
1669 return canon; | |
1670 return coding_system; | |
1671 } | |
1672 | |
428 | 1673 |
1674 /************************************************************************/ | |
1675 /* Coding system accessors */ | |
1676 /************************************************************************/ | |
1677 | |
771 | 1678 DEFUN ("coding-system-description", Fcoding_system_description, 1, 1, 0, /* |
1679 Return the description for CODING-SYSTEM. | |
1680 The `description' of a coding system is a short English phrase giving the | |
1681 name rendered according to English punctuation rules, plus possibly some | |
1682 explanatory text (typically in the form of a parenthetical phrase). The | |
1683 description is intended to be short enough that it can appear as a menu item, | |
1684 and clear enough to be recognizable even to someone who is assumed to have | |
1685 some basic familiarity with different encodings but may not know all the | |
1686 technical names; thus, for `cn-gb-2312' is described as "Chinese EUC" and | |
1687 `hz-gb-2312' is described as "Hz/ZW (Chinese)", where the actual name of | |
1688 the encoding is given, followed by a note that this is a Chinese encoding, | |
1689 because the great majority of people encountering this would have no idea | |
1690 what it is, and giving the language indicates whether the encoding should | |
1691 just be ignored or (conceivably) investigated more thoroughly. | |
428 | 1692 */ |
1693 (coding_system)) | |
1694 { | |
1695 coding_system = Fget_coding_system (coding_system); | |
771 | 1696 return XCODING_SYSTEM_DESCRIPTION (coding_system); |
428 | 1697 } |
1698 | |
1699 DEFUN ("coding-system-type", Fcoding_system_type, 1, 1, 0, /* | |
1700 Return the type of CODING-SYSTEM. | |
1701 */ | |
1702 (coding_system)) | |
1703 { | |
771 | 1704 coding_system = Fget_coding_system (coding_system); |
1705 return XCODING_SYSTEM_TYPE (coding_system); | |
428 | 1706 } |
1707 | |
1708 DEFUN ("coding-system-property", Fcoding_system_property, 2, 2, 0, /* | |
1709 Return the PROP property of CODING-SYSTEM. | |
1710 */ | |
1711 (coding_system, prop)) | |
1712 { | |
1713 coding_system = Fget_coding_system (coding_system); | |
1714 CHECK_SYMBOL (prop); | |
1715 | |
1716 if (EQ (prop, Qname)) | |
1717 return XCODING_SYSTEM_NAME (coding_system); | |
1718 else if (EQ (prop, Qtype)) | |
1719 return Fcoding_system_type (coding_system); | |
771 | 1720 else if (EQ (prop, Qdescription)) |
1721 return XCODING_SYSTEM_DESCRIPTION (coding_system); | |
428 | 1722 else if (EQ (prop, Qmnemonic)) |
1723 return XCODING_SYSTEM_MNEMONIC (coding_system); | |
771 | 1724 else if (EQ (prop, Qdocumentation)) |
1725 return XCODING_SYSTEM_DOCUMENTATION (coding_system); | |
428 | 1726 else if (EQ (prop, Qeol_type)) |
771 | 1727 return eol_type_to_symbol (XCODING_SYSTEM_EOL_TYPE |
1728 (coding_system)); | |
428 | 1729 else if (EQ (prop, Qeol_lf)) |
1730 return XCODING_SYSTEM_EOL_LF (coding_system); | |
1731 else if (EQ (prop, Qeol_crlf)) | |
1732 return XCODING_SYSTEM_EOL_CRLF (coding_system); | |
1733 else if (EQ (prop, Qeol_cr)) | |
1734 return XCODING_SYSTEM_EOL_CR (coding_system); | |
1735 else if (EQ (prop, Qpost_read_conversion)) | |
1736 return XCODING_SYSTEM_POST_READ_CONVERSION (coding_system); | |
1737 else if (EQ (prop, Qpre_write_conversion)) | |
1738 return XCODING_SYSTEM_PRE_WRITE_CONVERSION (coding_system); | |
4568
1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4303
diff
changeset
|
1739 else if (EQ (prop, Qsafe_charsets)) |
1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4303
diff
changeset
|
1740 return XCODING_SYSTEM_SAFE_CHARSETS (coding_system); |
1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4303
diff
changeset
|
1741 else if (EQ (prop, Qsafe_chars)) |
1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4303
diff
changeset
|
1742 return XCODING_SYSTEM_SAFE_CHARS (coding_system); |
771 | 1743 else |
1744 { | |
1745 Lisp_Object value = CODESYSMETH_OR_GIVEN (XCODING_SYSTEM (coding_system), | |
1746 getprop, | |
1747 (coding_system, prop), | |
1748 Qunbound); | |
1749 if (UNBOUNDP (value)) | |
1750 invalid_constant ("Unrecognized property", prop); | |
1751 return value; | |
1752 } | |
1753 } | |
1754 | |
1755 | |
1756 /************************************************************************/ | |
1757 /* Coding stream functions */ | |
1758 /************************************************************************/ | |
1759 | |
1760 /* A coding stream is a stream used for encoding or decoding text. The | |
1761 coding-stream object keeps track of the actual coding system, the stream | |
1762 that is at the other end, and data that needs to be persistent across | |
1763 the lifetime of the stream. */ | |
1764 | |
1204 | 1765 extern const struct sized_memory_description chain_coding_stream_description; |
1766 extern const struct sized_memory_description undecided_coding_stream_description; | |
1767 | |
1768 static const struct memory_description coding_stream_data_description_1 []= { | |
2551 | 1769 { XD_BLOCK_PTR, chain_coding_system, 1, |
1770 { &chain_coding_stream_description } }, | |
1771 { XD_BLOCK_PTR, undecided_coding_system, 1, | |
1772 { &undecided_coding_stream_description } }, | |
1204 | 1773 { XD_END } |
1774 }; | |
1775 | |
1776 static const struct sized_memory_description coding_stream_data_description = { | |
1777 sizeof (void *), coding_stream_data_description_1 | |
1778 }; | |
1779 | |
1780 static const struct memory_description coding_lstream_description[] = { | |
1781 { XD_INT, offsetof (struct coding_stream, type) }, | |
1782 { XD_LISP_OBJECT, offsetof (struct coding_stream, orig_codesys) }, | |
1783 { XD_LISP_OBJECT, offsetof (struct coding_stream, codesys) }, | |
1784 { XD_LISP_OBJECT, offsetof (struct coding_stream, other_end) }, | |
1785 { XD_UNION, offsetof (struct coding_stream, data), | |
2551 | 1786 XD_INDIRECT (0, 0), { &coding_stream_data_description } }, |
1204 | 1787 { XD_END } |
1788 }; | |
1789 | |
1790 DEFINE_LSTREAM_IMPLEMENTATION_WITH_DATA ("coding", coding); | |
771 | 1791 |
1792 /* Encoding and decoding are parallel operations, so we create just one | |
1793 stream for both. "Decoding" may involve the extra step of autodetection | |
1794 of the data format, but that's only because of the conventional | |
1795 definition of decoding as converting from external- to | |
1796 internal-formatted data. | |
1797 | |
2297 | 1798 [[ REWRITE ME! ]] |
1799 | |
771 | 1800 #### We really need to abstract out the concept of "data formats" and |
1801 define "converters" that convert from and to specified formats, | |
1802 eliminating the idea of decoding and encoding. When specifying a | |
1803 conversion process, we need to give the data formats themselves, not the | |
1804 conversion processes -- e.g. a coding system called "Unicode->multibyte" | |
1805 converts in both directions, and we could auto-detect the format of data | |
1806 at either end. */ | |
1807 | |
1808 static Bytecount | |
1809 coding_reader (Lstream *stream, unsigned char *data, Bytecount size) | |
1810 { | |
1811 unsigned char *orig_data = data; | |
1812 Bytecount read_size; | |
1813 int error_occurred = 0; | |
1814 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
1815 | |
1816 /* We need to interface to coding_{de,en}code_1(), which expects to take | |
1817 some amount of data and store the result into a Dynarr. We have | |
1818 coding_{de,en}code_1() store into c->runoff, and take data from there | |
1819 as necessary. */ | |
1820 | |
1821 /* We loop until we have enough data, reading chunks from the other | |
1822 end and converting it. */ | |
1823 while (1) | |
1824 { | |
1825 /* Take data from convert_to if we can. Make sure to take at | |
1826 most SIZE bytes, and delete the data from convert_to. */ | |
1827 if (Dynarr_length (str->convert_to) > 0) | |
1828 { | |
1829 Bytecount chunk = | |
1830 min (size, (Bytecount) Dynarr_length (str->convert_to)); | |
4967 | 1831 memcpy (data, Dynarr_begin (str->convert_to), chunk); |
771 | 1832 Dynarr_delete_many (str->convert_to, 0, chunk); |
1833 data += chunk; | |
1834 size -= chunk; | |
1835 } | |
1836 | |
1837 if (size == 0) | |
1838 break; /* No more room for data */ | |
1839 | |
1840 if (str->eof) | |
1841 break; | |
1842 | |
1843 { | |
1844 /* Exhausted convert_to, so get some more. Read into convert_from, | |
1845 after existing "rejected" data from the last conversion. */ | |
1846 Bytecount rejected = Dynarr_length (str->convert_from); | |
1847 /* #### 1024 is arbitrary; we really need to separate 0 from EOF, | |
1848 and when we get 0, keep taking more data until we don't get 0 -- | |
1849 we don't know how much data the conversion routine might need | |
2297 | 1850 before it can generate any data of its own (eg, bzip2). */ |
814 | 1851 Bytecount readmore = |
1852 str->one_byte_at_a_time ? (Bytecount) 1 : | |
1853 max (size, (Bytecount) 1024); | |
771 | 1854 |
1855 Dynarr_add_many (str->convert_from, 0, readmore); | |
1856 read_size = Lstream_read (str->other_end, | |
1857 Dynarr_atp (str->convert_from, rejected), | |
1858 readmore); | |
1859 /* Trim size down to how much we actually got */ | |
5038 | 1860 Dynarr_set_lengthr (str->convert_from, rejected + max (0, read_size)); |
771 | 1861 } |
1862 | |
1863 if (read_size < 0) /* LSTREAM_ERROR */ | |
1864 { | |
1865 error_occurred = 1; | |
1866 break; | |
1867 } | |
1868 if (read_size == 0) /* LSTREAM_EOF */ | |
1869 /* There might be some more end data produced in the translation, | |
1870 so we set a flag and call the conversion method once more to | |
1871 output any final stuff it may be holding, any "go back to a sane | |
1872 state" escape sequences, etc. The conversion method is free to | |
1873 look at this flag, and we use it above to stop looping. */ | |
1874 str->eof = 1; | |
1875 { | |
1876 Bytecount processed; | |
1877 Bytecount to_process = Dynarr_length (str->convert_from); | |
1878 | |
1879 /* Convert the data, and save any rejected data in convert_from */ | |
1880 processed = | |
1881 XCODESYSMETH (str->codesys, convert, | |
4967 | 1882 (str, Dynarr_begin (str->convert_from), |
771 | 1883 str->convert_to, to_process)); |
1884 if (processed < 0) | |
1885 { | |
1886 error_occurred = 1; | |
1887 break; | |
1888 } | |
1889 assert (processed <= to_process); | |
1890 if (processed < to_process) | |
4967 | 1891 memmove (Dynarr_begin (str->convert_from), |
771 | 1892 Dynarr_atp (str->convert_from, processed), |
1893 to_process - processed); | |
5038 | 1894 Dynarr_set_lengthr (str->convert_from, to_process - processed); |
771 | 1895 } |
1896 } | |
1897 | |
1898 if (data - orig_data == 0) | |
1899 return error_occurred ? -1 : 0; | |
1900 else | |
1901 return data - orig_data; | |
1902 } | |
1903 | |
1904 static Bytecount | |
1905 coding_writer (Lstream *stream, const unsigned char *data, Bytecount size) | |
1906 { | |
1907 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
1908 | |
1909 /* Convert all our data into convert_to, and then attempt to write | |
1910 it all out to the other end. */ | |
1911 Dynarr_reset (str->convert_to); | |
1912 size = XCODESYSMETH (str->codesys, convert, | |
1913 (str, data, str->convert_to, size)); | |
4967 | 1914 if (Lstream_write (str->other_end, Dynarr_begin (str->convert_to), |
771 | 1915 Dynarr_length (str->convert_to)) < 0) |
1916 return -1; | |
1917 else | |
1918 /* The return value indicates how much of the incoming data was | |
1919 processed, not how many bytes were written. */ | |
1920 return size; | |
1921 } | |
1922 | |
1923 static int | |
1924 encode_decode_source_sink_type_is_char (Lisp_Object cs, | |
1925 enum source_or_sink sex, | |
1926 enum encode_decode direction) | |
1927 { | |
1928 return (direction == CODING_DECODE ? | |
1929 decoding_source_sink_type_is_char (cs, sex) : | |
1930 encoding_source_sink_type_is_char (cs, sex)); | |
1931 } | |
1932 | |
1933 /* Ensure that the convert methods only get full characters sent to them to | |
1934 convert if the source of that conversion is characters; and that no such | |
1935 full-character checking happens when the source is bytes. Keep in mind | |
1936 that (1) the conversion_end_type return values take the perspective of | |
1937 encoding; (2) the source for decoding is the same as the sink for | |
1938 encoding; (3) when writing, the data is given to us, and we set our own | |
1939 stream to be character mode or not; (4) when reading, the data comes | |
1940 from the other_end stream, and we set that one to be character mode or | |
1941 not. This is consistent with the comment above the prototype for | |
1942 Lstream_set_character_mode(), which lays out rules for who is allowed to | |
1943 modify the character type mode on a stream. | |
1944 | |
814 | 1945 If we're a read stream, we're always setting character mode on the |
1946 source, but we also set it on ourselves consistent with the flag that | |
1947 can disable this (see again the comment above | |
1948 Lstream_set_character_mode()). | |
1949 */ | |
771 | 1950 |
1951 static void | |
1952 set_coding_character_mode (Lstream *stream) | |
1953 { | |
1954 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
1955 Lstream *stream_to_set = | |
1956 stream->flags & LSTREAM_FL_WRITE ? stream : str->other_end; | |
1957 if (encode_decode_source_sink_type_is_char | |
1958 (str->codesys, CODING_SOURCE, str->direction)) | |
1959 Lstream_set_character_mode (stream_to_set); | |
1960 else | |
1961 Lstream_unset_character_mode (stream_to_set); | |
814 | 1962 if (str->set_char_mode_on_us_when_reading && |
1963 (stream->flags & LSTREAM_FL_READ)) | |
1964 { | |
1965 if (encode_decode_source_sink_type_is_char | |
1966 (str->codesys, CODING_SINK, str->direction)) | |
1967 Lstream_set_character_mode (stream); | |
1968 else | |
1969 Lstream_unset_character_mode (stream); | |
1970 } | |
771 | 1971 } |
1972 | |
1973 static Lisp_Object | |
1974 coding_marker (Lisp_Object stream) | |
1975 { | |
1976 struct coding_stream *str = CODING_STREAM_DATA (XLSTREAM (stream)); | |
1977 | |
1978 mark_object (str->orig_codesys); | |
1979 mark_object (str->codesys); | |
1980 MAYBE_XCODESYSMETH (str->codesys, mark_coding_stream, (str)); | |
1981 return wrap_lstream (str->other_end); | |
1982 } | |
1983 | |
1984 static int | |
1985 coding_rewinder (Lstream *stream) | |
1986 { | |
1987 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
1988 MAYBE_XCODESYSMETH (str->codesys, rewind_coding_stream, (str)); | |
1989 | |
1990 str->ch = 0; | |
1991 Dynarr_reset (str->convert_to); | |
1992 Dynarr_reset (str->convert_from); | |
1993 return Lstream_rewind (str->other_end); | |
1994 } | |
1995 | |
1996 static int | |
1997 coding_seekable_p (Lstream *stream) | |
1998 { | |
1999 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
2000 return Lstream_seekable_p (str->other_end); | |
2001 } | |
2002 | |
2003 static int | |
2004 coding_flusher (Lstream *stream) | |
2005 { | |
2006 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
2007 return Lstream_flush (str->other_end); | |
2008 } | |
2009 | |
2010 static int | |
2011 coding_closer (Lstream *stream) | |
2012 { | |
2013 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
2014 if (stream->flags & LSTREAM_FL_WRITE) | |
2015 { | |
2016 str->eof = 1; | |
2017 coding_writer (stream, 0, 0); | |
2018 str->eof = 0; | |
2019 } | |
2020 /* It's safe to free the runoff dynarrs now because they are used only | |
2021 during conversion. We need to keep the type-specific data around, | |
2022 though, because of canonicalize_after_coding. */ | |
2023 if (str->convert_to) | |
2024 { | |
2025 Dynarr_free (str->convert_to); | |
2026 str->convert_to = 0; | |
2027 } | |
2028 if (str->convert_from) | |
428 | 2029 { |
771 | 2030 Dynarr_free (str->convert_from); |
2031 str->convert_from = 0; | |
2032 } | |
2033 | |
800 | 2034 if (str->no_close_other) |
2035 return Lstream_flush (str->other_end); | |
2036 else | |
2037 return Lstream_close (str->other_end); | |
771 | 2038 } |
2039 | |
2040 static void | |
2041 coding_finalizer (Lstream *stream) | |
2042 { | |
2043 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
2044 | |
2045 assert (!str->finalized); | |
2046 MAYBE_XCODESYSMETH (str->codesys, finalize_coding_stream, (str)); | |
2047 if (str->data) | |
2048 { | |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
2049 xfree (str->data); |
771 | 2050 str->data = 0; |
2051 } | |
2052 str->finalized = 1; | |
2053 } | |
2054 | |
2055 static Lisp_Object | |
2056 coding_stream_canonicalize_after_coding (Lstream *stream) | |
2057 { | |
2058 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
2059 | |
2060 return XCODESYSMETH_OR_GIVEN (str->codesys, canonicalize_after_coding, | |
2061 (str), str->codesys); | |
2062 } | |
2063 | |
2064 Lisp_Object | |
2065 coding_stream_detected_coding_system (Lstream *stream) | |
2066 { | |
2067 Lisp_Object codesys = | |
2068 coding_stream_canonicalize_after_coding (stream); | |
2069 if (NILP (codesys)) | |
2070 return Fget_coding_system (Qidentity); | |
2071 return codesys; | |
2072 } | |
2073 | |
2074 Lisp_Object | |
2075 coding_stream_coding_system (Lstream *stream) | |
2076 { | |
2077 return CODING_STREAM_DATA (stream)->codesys; | |
2078 } | |
2079 | |
2080 /* Change the coding system associated with a stream. */ | |
2081 | |
2082 void | |
2083 set_coding_stream_coding_system (Lstream *lstr, Lisp_Object codesys) | |
2084 { | |
2085 struct coding_stream *str = CODING_STREAM_DATA (lstr); | |
2086 if (EQ (str->orig_codesys, codesys)) | |
2087 return; | |
2088 /* We do the equivalent of closing the stream, destroying it, and | |
2089 reinitializing it. This includes flushing out the data and signalling | |
2090 EOF, if we're a writing stream; we also replace the type-specific data | |
2091 with the data appropriate for the new coding system. */ | |
2092 if (!NILP (str->codesys)) | |
2093 { | |
2094 if (lstr->flags & LSTREAM_FL_WRITE) | |
2095 { | |
2096 Lstream_flush (lstr); | |
2097 str->eof = 1; | |
2098 coding_writer (lstr, 0, 0); | |
2099 str->eof = 0; | |
2100 } | |
2101 MAYBE_XCODESYSMETH (str->codesys, finalize_coding_stream, (str)); | |
2102 } | |
2103 str->orig_codesys = codesys; | |
2104 str->codesys = coding_system_real_canonical (codesys); | |
2105 | |
2106 if (str->data) | |
2107 { | |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
2108 xfree (str->data); |
771 | 2109 str->data = 0; |
428 | 2110 } |
771 | 2111 if (XCODING_SYSTEM_METHODS (str->codesys)->coding_data_size) |
1204 | 2112 { |
2113 str->data = | |
2114 xmalloc_and_zero (XCODING_SYSTEM_METHODS (str->codesys)-> | |
2115 coding_data_size); | |
2116 str->type = XCODING_SYSTEM_METHODS (str->codesys)->enumtype; | |
2117 } | |
771 | 2118 MAYBE_XCODESYSMETH (str->codesys, init_coding_stream, (str)); |
2119 /* The new coding system may have different ideas regarding whether its | |
2120 ends are characters or bytes. */ | |
2121 set_coding_character_mode (lstr); | |
2122 } | |
2123 | |
2124 /* WARNING WARNING WARNING WARNING!!!!! If you open up a coding | |
2125 stream for writing, no automatic code detection will be performed. | |
2126 The reason for this is that automatic code detection requires a | |
2127 seekable input. Things will also fail if you open a coding | |
2128 stream for reading using a non-fully-specified coding system and | |
2129 a non-seekable input stream. */ | |
2130 | |
2131 static Lisp_Object | |
2132 make_coding_stream_1 (Lstream *stream, Lisp_Object codesys, | |
800 | 2133 const char *mode, enum encode_decode direction, |
802 | 2134 int flags) |
771 | 2135 { |
2136 Lstream *lstr = Lstream_new (lstream_coding, mode); | |
2137 struct coding_stream *str = CODING_STREAM_DATA (lstr); | |
2138 | |
2139 codesys = Fget_coding_system (codesys); | |
2140 xzero (*str); | |
2141 str->codesys = Qnil; | |
2142 str->orig_codesys = Qnil; | |
2143 str->us = lstr; | |
2144 str->other_end = stream; | |
2145 str->convert_to = Dynarr_new (unsigned_char); | |
2146 str->convert_from = Dynarr_new (unsigned_char); | |
2147 str->direction = direction; | |
814 | 2148 if (flags & LSTREAM_FL_NO_CLOSE_OTHER) |
802 | 2149 str->no_close_other = 1; |
814 | 2150 if (flags & LSTREAM_FL_READ_ONE_BYTE_AT_A_TIME) |
802 | 2151 str->one_byte_at_a_time = 1; |
814 | 2152 if (!(flags & LSTREAM_FL_NO_INIT_CHAR_MODE_WHEN_READING)) |
2153 str->set_char_mode_on_us_when_reading = 1; | |
802 | 2154 |
771 | 2155 set_coding_stream_coding_system (lstr, codesys); |
793 | 2156 return wrap_lstream (lstr); |
771 | 2157 } |
2158 | |
814 | 2159 /* FLAGS: |
2160 | |
2161 LSTREAM_FL_NO_CLOSE_OTHER | |
2162 Don't close STREAM (the stream at the other end) when this stream is | |
2163 closed. | |
2164 | |
2165 LSTREAM_FL_READ_ONE_BYTE_AT_A_TIME | |
2166 When reading from STREAM, read and process one byte at a time rather | |
2167 than in large chunks. This is for reading from TTY's, so we don't | |
2168 block. #### We should instead create a non-blocking filedesc stream | |
2169 that emulates the behavior as necessary using select(), when the | |
2170 fcntls don't work. (As seems to be the case on Cygwin.) | |
2171 | |
2172 LSTREAM_FL_NO_INIT_CHAR_MODE_WHEN_READING | |
2173 When reading from STREAM, read and process one byte at a time rather | |
2174 than in large chunks. This is for reading from TTY's, so we don't | |
2175 block. #### We should instead create a non-blocking filedesc stream | |
2176 that emulates the behavior as necessary using select(), when the | |
2177 fcntls don't work. (As seems to be the case on Cygwin.) | |
2178 */ | |
771 | 2179 Lisp_Object |
2180 make_coding_input_stream (Lstream *stream, Lisp_Object codesys, | |
802 | 2181 enum encode_decode direction, int flags) |
771 | 2182 { |
800 | 2183 return make_coding_stream_1 (stream, codesys, "r", direction, |
802 | 2184 flags); |
771 | 2185 } |
2186 | |
814 | 2187 /* FLAGS: |
2188 | |
2189 LSTREAM_FL_NO_CLOSE_OTHER | |
2190 Don't close STREAM (the stream at the other end) when this stream is | |
2191 closed. | |
2192 */ | |
771 | 2193 Lisp_Object |
2194 make_coding_output_stream (Lstream *stream, Lisp_Object codesys, | |
802 | 2195 enum encode_decode direction, int flags) |
771 | 2196 { |
800 | 2197 return make_coding_stream_1 (stream, codesys, "w", direction, |
802 | 2198 flags); |
771 | 2199 } |
2200 | |
2201 static Lisp_Object | |
2202 encode_decode_coding_region (Lisp_Object start, Lisp_Object end, | |
2203 Lisp_Object coding_system, Lisp_Object buffer, | |
2204 enum encode_decode direction) | |
2205 { | |
2206 Charbpos b, e; | |
2207 struct buffer *buf = decode_buffer (buffer, 0); | |
2208 Lisp_Object instream = Qnil, to_outstream = Qnil, outstream = Qnil; | |
2209 Lisp_Object from_outstream = Qnil, auto_outstream = Qnil; | |
2210 Lisp_Object lb_outstream = Qnil; | |
2211 Lisp_Object next; | |
2212 Lstream *istr, *ostr; | |
2213 struct gcpro gcpro1, gcpro2, gcpro3, gcpro4, gcpro5; | |
2214 struct gcpro ngcpro1; | |
2215 int source_char, sink_char; | |
2216 | |
2217 get_buffer_range_char (buf, start, end, &b, &e, 0); | |
2218 barf_if_buffer_read_only (buf, b, e); | |
2219 | |
2220 GCPRO5 (instream, to_outstream, outstream, from_outstream, lb_outstream); | |
2221 NGCPRO1 (auto_outstream); | |
2222 | |
2223 coding_system = Fget_coding_system (coding_system); | |
2224 source_char = encode_decode_source_sink_type_is_char (coding_system, | |
2225 CODING_SOURCE, | |
2226 direction); | |
2227 sink_char = encode_decode_source_sink_type_is_char (coding_system, | |
2228 CODING_SINK, | |
2229 direction); | |
2230 | |
2231 /* Order is IN <---> [TO] -> OUT -> [FROM] -> [AUTODETECT-EOL] -> LB */ | |
2232 instream = make_lisp_buffer_input_stream (buf, b, e, 0); | |
2233 next = lb_outstream = make_lisp_buffer_output_stream (buf, b, 0); | |
2234 | |
2235 if (direction == CODING_DECODE && | |
2236 XCODING_SYSTEM_EOL_TYPE (coding_system) == EOL_AUTODETECT) | |
2237 next = auto_outstream = | |
2238 make_coding_output_stream | |
800 | 2239 (XLSTREAM (next), Fget_coding_system (Qconvert_eol_autodetect), |
2240 CODING_DECODE, 0); | |
771 | 2241 |
2242 if (!sink_char) | |
2243 next = from_outstream = | |
800 | 2244 make_coding_output_stream (XLSTREAM (next), Qbinary, CODING_DECODE, 0); |
771 | 2245 outstream = make_coding_output_stream (XLSTREAM (next), coding_system, |
800 | 2246 direction, 0); |
771 | 2247 if (!source_char) |
428 | 2248 { |
771 | 2249 to_outstream = |
2250 make_coding_output_stream (XLSTREAM (outstream), | |
800 | 2251 Qbinary, CODING_ENCODE, 0); |
771 | 2252 ostr = XLSTREAM (to_outstream); |
2253 } | |
2254 else | |
2255 ostr = XLSTREAM (outstream); | |
2256 istr = XLSTREAM (instream); | |
2257 | |
2258 /* The chain of streams looks like this: | |
2259 | |
2297 | 2260 [BUFFER] <----- (( read from/send to loop )) |
771 | 2261 ------> [CHAR->BYTE i.e. ENCODE AS BINARY if source is |
2262 in bytes] | |
2263 ------> [ENCODE/DECODE AS SPECIFIED] | |
2264 ------> [BYTE->CHAR i.e. DECODE AS BINARY | |
2265 if sink is in bytes] | |
2266 ------> [AUTODETECT EOL if | |
2267 we're decoding and | |
2268 coding system calls | |
2269 for this] | |
2270 ------> [BUFFER] | |
2271 */ | |
2367 | 2272 |
2273 /* #### See comment | |
2274 | |
2275 EFFICIENCY OF CODING CONVERSION WITH MULTIPLE COPIES/CHAINS | |
2276 | |
2277 in text.c. | |
2278 */ | |
2279 | |
771 | 2280 while (1) |
2281 { | |
2282 char tempbuf[1024]; /* some random amount */ | |
2283 Charbpos newpos, even_newer_pos; | |
2284 Charbpos oldpos = lisp_buffer_stream_startpos (istr); | |
2285 Bytecount size_in_bytes = | |
2286 Lstream_read (istr, tempbuf, sizeof (tempbuf)); | |
2287 | |
2288 if (!size_in_bytes) | |
2289 break; | |
2290 newpos = lisp_buffer_stream_startpos (istr); | |
2291 Lstream_write (ostr, tempbuf, size_in_bytes); | |
2292 even_newer_pos = lisp_buffer_stream_startpos (istr); | |
2293 buffer_delete_range (buf, even_newer_pos - (newpos - oldpos), | |
2294 even_newer_pos, 0); | |
428 | 2295 } |
771 | 2296 |
2297 { | |
2298 Charcount retlen = | |
2299 lisp_buffer_stream_startpos (XLSTREAM (instream)) - b; | |
2300 Lstream_close (istr); | |
2301 Lstream_close (ostr); | |
2302 NUNGCPRO; | |
2303 UNGCPRO; | |
2304 Lstream_delete (istr); | |
2305 if (!NILP (from_outstream)) | |
2306 Lstream_delete (XLSTREAM (from_outstream)); | |
2307 Lstream_delete (XLSTREAM (outstream)); | |
2308 if (!NILP (to_outstream)) | |
2309 Lstream_delete (XLSTREAM (to_outstream)); | |
2310 if (!NILP (auto_outstream)) | |
2311 Lstream_delete (XLSTREAM (auto_outstream)); | |
2312 Lstream_delete (XLSTREAM (lb_outstream)); | |
2313 return make_int (retlen); | |
2314 } | |
2315 } | |
2316 | |
3302 | 2317 DEFUN ("decode-coding-region", Fdecode_coding_region, 3, 4, |
2318 "*r\nzDecode from coding system: \ni", /* | |
771 | 2319 Decode the text between START and END which is encoded in CODING-SYSTEM. |
2320 This is useful if you've read in encoded text from a file without decoding | |
2321 it (e.g. you read in a JIS-formatted file but used the `binary' or | |
2322 `no-conversion' coding system, so that it shows up as "^[$B!<!+^[(B"). | |
2323 Return length of decoded text. | |
3302 | 2324 BUFFER defaults to the current buffer if unspecified, and when interactive. |
771 | 2325 */ |
2326 (start, end, coding_system, buffer)) | |
2327 { | |
2328 return encode_decode_coding_region (start, end, coding_system, buffer, | |
2329 CODING_DECODE); | |
2330 } | |
2331 | |
3302 | 2332 DEFUN ("encode-coding-region", Fencode_coding_region, 3, 4, |
2333 "*r\nzEncode to coding system: \ni", /* | |
771 | 2334 Encode the text between START and END using CODING-SYSTEM. |
2335 This will, for example, convert Japanese characters into stuff such as | |
3302 | 2336 "^[$B!<!+^[(B" if you use the JIS encoding. Return length of encoded text. |
2337 BUFFER defaults to the current buffer if unspecified, and when interactive. | |
771 | 2338 */ |
2339 (start, end, coding_system, buffer)) | |
2340 { | |
2341 return encode_decode_coding_region (start, end, coding_system, buffer, | |
2342 CODING_ENCODE); | |
428 | 2343 } |
2344 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2345 DEFUN ("query-coding-region", Fquery_coding_region, 3, 7, 0, /* |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2346 Work out whether CODING-SYSTEM can losslessly encode a region. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2347 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2348 START and END are the beginning and end of the region to check. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2349 CODING-SYSTEM is the coding system to try. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2350 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2351 Optional argument BUFFER is the buffer to check, and defaults to the current |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2352 buffer. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2353 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2354 IGNORE-INVALID-SEQUENCESP, also an optional argument, says to treat XEmacs |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2355 characters which have an unambiguous encoded representation, despite being |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2356 undefined in what they represent, as encodable. These chiefly arise with |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2357 variable-length encodings like UTF-8 and UTF-16, where an invalid sequence |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2358 is passed through to XEmacs as a sequence of characters with a defined |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2359 correspondence to the octets on disk, but no non-error semantics; see the |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2360 `invalid-sequence-coding-system' argument to `set-language-info'. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2361 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2362 They can also arise with fixed-length encodings like ISO 8859-7, where |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2363 certain octets on disk have undefined values, and treating them as |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2364 corresponding to the ISO 8859-1 characters with the same numerical values |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2365 may lead to data that is not understood by other applications. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2366 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2367 Optional argument ERRORP says to signal a `text-conversion-error' if some |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2368 character in the region cannot be encoded, and defaults to nil. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2369 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2370 Optional argument HIGHLIGHT says to display unencodable characters in the |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2371 region using `query-coding-warning-face'. It defaults to nil. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2372 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2373 This function can return multiple values; the intention is that callers use |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2374 `multiple-value-bind' or the related CL multiple value functions to deal |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2375 with it. The first result is `t' if the region can be encoded using |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2376 CODING-SYSTEM, or `nil' if not. If the region cannot be encoded using |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2377 CODING-SYSTEM, the second result is a range table describing the positions |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2378 of the unencodable characters. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2379 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2380 Ranges that describe characters that would be ignored were |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2381 IGNORE-INVALID-SEQUENCESP non-nil map to the symbol `invalid-sequence'; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2382 other ranges map to the symbol `unencodable'. If IGNORE-INVALID-SEQUENCESP |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2383 is non-nil, all ranges will map to the symbol `unencodable'. See |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2384 `make-range-table' for more details of range tables. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2385 */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2386 (start, end, coding_system, buffer, ignore_invalid_sequencesp, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2387 errorp, highlight)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2388 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2389 Charbpos b, e; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2390 struct buffer *buf = decode_buffer (buffer, 1); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2391 Lisp_Object result; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2392 int flags = 0, speccount = specpdl_depth (); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2393 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2394 coding_system = Fget_coding_system (coding_system); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2395 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2396 get_buffer_range_char (buf, start, end, &b, &e, 0); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2397 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2398 if (buf != current_buffer) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2399 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2400 record_unwind_protect (save_current_buffer_restore, Fcurrent_buffer ()); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2401 set_buffer_internal (buf); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2402 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2403 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2404 record_unwind_protect (save_excursion_restore, save_excursion_save ()); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2405 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2406 BUF_SET_PT (buf, b); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2407 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2408 if (!NILP (ignore_invalid_sequencesp)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2409 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2410 flags |= QUERY_METHOD_IGNORE_INVALID_SEQUENCES; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2411 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2412 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2413 if (!NILP (errorp)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2414 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2415 flags |= QUERY_METHOD_ERRORP; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2416 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2417 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2418 if (!NILP (highlight)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2419 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2420 flags |= QUERY_METHOD_HIGHLIGHT; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2421 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2422 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2423 result = XCODESYSMETH_OR_GIVEN (coding_system, query, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2424 (coding_system, buf, e, flags), Qunbound); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2425 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2426 if (UNBOUNDP (result)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2427 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2428 signal_error (Qtext_conversion_error, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2429 "Coding system doesn't say what it can encode", |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2430 XCODING_SYSTEM_NAME (coding_system)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2431 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2432 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2433 result = (NILP (result)) ? Qt : values2 (Qnil, result); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2434 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2435 return unbind_to_1 (speccount, result); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2436 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2437 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2438 |
428 | 2439 |
2440 /************************************************************************/ | |
771 | 2441 /* Chain methods */ |
428 | 2442 /************************************************************************/ |
2443 | |
771 | 2444 /* #### Need a way to create "opposite-direction" coding systems. */ |
2445 | |
2446 /* Chain two or more coding systems together to make a combination coding | |
2447 system. */ | |
2448 | |
2449 struct chain_coding_system | |
2450 { | |
2451 /* List of coding systems, in decode order */ | |
2452 Lisp_Object *chain; | |
2453 /* Number of coding systems in list */ | |
2454 int count; | |
2455 /* Coding system to return as a result of canonicalize-after-coding */ | |
2456 Lisp_Object canonicalize_after_coding; | |
2457 }; | |
2458 | |
2459 struct chain_coding_stream | |
2460 { | |
2461 int initted; | |
2462 /* Lstreams for chain coding system */ | |
2463 Lisp_Object *lstreams; | |
2464 int lstream_count; | |
2465 }; | |
2466 | |
1204 | 2467 static const struct memory_description chain_coding_system_description[] = { |
2468 { XD_INT, offsetof (struct chain_coding_system, count) }, | |
2367 | 2469 { XD_BLOCK_PTR, offsetof (struct chain_coding_system, chain), |
2551 | 2470 XD_INDIRECT (0, 0), { &lisp_object_description } }, |
1204 | 2471 { XD_LISP_OBJECT, offsetof (struct chain_coding_system, |
2472 canonicalize_after_coding) }, | |
771 | 2473 { XD_END } |
2474 }; | |
2475 | |
1204 | 2476 static const struct memory_description chain_coding_stream_description_1 [] = { |
2477 { XD_INT, offsetof (struct chain_coding_stream, lstream_count) }, | |
2367 | 2478 { XD_BLOCK_PTR, offsetof (struct chain_coding_stream, lstreams), |
2551 | 2479 XD_INDIRECT (0, 0), { &lisp_object_description } }, |
771 | 2480 { XD_END } |
2481 }; | |
2482 | |
1204 | 2483 const struct sized_memory_description chain_coding_stream_description = { |
2484 sizeof (struct chain_coding_stream), chain_coding_stream_description_1 | |
2485 }; | |
2486 | |
2487 DEFINE_CODING_SYSTEM_TYPE_WITH_DATA (chain); | |
2488 | |
771 | 2489 static Lisp_Object |
2490 chain_canonicalize (Lisp_Object codesys) | |
2491 { | |
2492 /* We make use of the fact that this method is called at init time, after | |
2493 properties have been parsed. init_method is called too early. */ | |
2494 /* #### It's not clear we need this whole chain-canonicalize mechanism | |
2495 any more. */ | |
2496 Lisp_Object chain = Flist (XCODING_SYSTEM_CHAIN_COUNT (codesys), | |
2497 XCODING_SYSTEM_CHAIN_CHAIN (codesys)); | |
2498 chain = Fcons (XCODING_SYSTEM_PRE_WRITE_CONVERSION (codesys), | |
2499 Fcons (XCODING_SYSTEM_POST_READ_CONVERSION (codesys), | |
2500 chain)); | |
2501 Fputhash (chain, codesys, Vchain_canonicalize_hash_table); | |
2502 return codesys; | |
2503 } | |
2504 | |
2505 static Lisp_Object | |
2506 chain_canonicalize_after_coding (struct coding_stream *str) | |
2507 { | |
2508 Lisp_Object cac = | |
2509 XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (str->codesys); | |
2510 if (!NILP (cac)) | |
2511 return cac; | |
2512 return str->codesys; | |
2513 #if 0 | |
2514 struct chain_coding_stream *data = CODING_STREAM_TYPE_DATA (str, chain); | |
2515 Lisp_Object us = str->codesys, codesys; | |
2516 int i; | |
2517 Lisp_Object chain; | |
2518 Lisp_Object tail; | |
2519 int changed = 0; | |
2520 | |
2521 /* #### It's not clear we need this whole chain-canonicalize mechanism | |
2522 any more. */ | |
2523 if (str->direction == CODING_ENCODE || !data->initted) | |
2524 return us; | |
2525 | |
2526 chain = Flist (XCODING_SYSTEM_CHAIN_COUNT (us), | |
2527 XCODING_SYSTEM_CHAIN_CHAIN (us)); | |
2528 | |
2529 tail = chain; | |
2530 for (i = 0; i < XCODING_SYSTEM_CHAIN_COUNT (us); i++) | |
2531 { | |
2532 codesys = (coding_stream_canonicalize_after_coding | |
2533 (XLSTREAM (data->lstreams[i]))); | |
2534 if (!EQ (codesys, XCAR (tail))) | |
2535 changed = 1; | |
2536 XCAR (tail) = codesys; | |
2537 tail = XCDR (tail); | |
2538 } | |
2539 | |
2540 if (!changed) | |
2541 return us; | |
2542 | |
2543 chain = delq_no_quit (Qnil, chain); | |
2544 | |
2545 if (NILP (XCODING_SYSTEM_PRE_WRITE_CONVERSION (us)) && | |
2546 NILP (XCODING_SYSTEM_POST_READ_CONVERSION (us))) | |
2547 { | |
2548 if (NILP (chain)) | |
2549 return Qnil; | |
2550 if (NILP (XCDR (chain))) | |
2551 return XCAR (chain); | |
2552 } | |
2553 | |
2554 codesys = Fgethash (Fcons (XCODING_SYSTEM_PRE_WRITE_CONVERSION (us), | |
2555 Fcons (XCODING_SYSTEM_POST_READ_CONVERSION (us), | |
2556 chain)), Vchain_canonicalize_hash_table, | |
2557 Qnil); | |
2558 if (!NILP (codesys)) | |
2559 return codesys; | |
2560 return make_internal_coding_system | |
2561 (us, "internal-chain-canonicalizer-wrapper", | |
2562 Qchain, Qunbound, list2 (Qchain, chain)); | |
2563 #endif /* 0 */ | |
2564 } | |
2565 | |
2566 static void | |
2567 chain_init (Lisp_Object codesys) | |
2568 { | |
2569 XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (codesys) = Qnil; | |
2570 } | |
2571 | |
2572 static void | |
2573 chain_mark (Lisp_Object codesys) | |
2574 { | |
2575 int i; | |
2576 | |
2577 for (i = 0; i < XCODING_SYSTEM_CHAIN_COUNT (codesys); i++) | |
2578 mark_object (XCODING_SYSTEM_CHAIN_CHAIN (codesys)[i]); | |
2579 mark_object (XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (codesys)); | |
2580 } | |
2581 | |
2582 static void | |
2583 chain_mark_coding_stream_1 (struct chain_coding_stream *data) | |
2584 { | |
2585 int i; | |
2586 | |
2587 for (i = 0; i < data->lstream_count; i++) | |
2588 mark_object (data->lstreams[i]); | |
2589 } | |
2590 | |
2591 static void | |
2592 chain_mark_coding_stream (struct coding_stream *str) | |
2593 { | |
2594 chain_mark_coding_stream_1 (CODING_STREAM_TYPE_DATA (str, chain)); | |
2595 } | |
2596 | |
2597 static void | |
2598 chain_print (Lisp_Object cs, Lisp_Object printcharfun, int escapeflag) | |
2599 { | |
2600 int i; | |
2601 | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
2602 write_ascstring (printcharfun, "("); |
771 | 2603 for (i = 0; i < XCODING_SYSTEM_CHAIN_COUNT (cs); i++) |
2604 { | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
2605 write_ascstring (printcharfun, i == 0 ? "" : "->"); |
771 | 2606 print_coding_system_in_print_method (XCODING_SYSTEM_CHAIN_CHAIN (cs)[i], |
2607 printcharfun, escapeflag); | |
2608 } | |
2609 { | |
2610 Lisp_Object cac = XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (cs); | |
2611 if (!NILP (cac)) | |
2612 { | |
2613 if (i > 0) | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
2614 write_ascstring (printcharfun, " "); |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
2615 write_ascstring (printcharfun, "canonicalize-after-coding="); |
771 | 2616 print_coding_system_in_print_method (cac, printcharfun, escapeflag); |
2617 } | |
2618 } | |
2619 | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
2620 write_ascstring (printcharfun, ")"); |
771 | 2621 } |
2622 | |
2623 static void | |
2624 chain_rewind_coding_stream_1 (struct chain_coding_stream *data) | |
2625 { | |
2626 /* Each will rewind the next; there is always at least one stream (the | |
2627 dynarr stream at the end) if we're initted */ | |
2628 if (data->initted) | |
2629 Lstream_rewind (XLSTREAM (data->lstreams[0])); | |
2630 } | |
2631 | |
2632 static void | |
2633 chain_rewind_coding_stream (struct coding_stream *str) | |
2634 { | |
2635 chain_rewind_coding_stream_1 (CODING_STREAM_TYPE_DATA (str, chain)); | |
2636 } | |
2637 | |
2638 static void | |
2639 chain_init_coding_streams_1 (struct chain_coding_stream *data, | |
2640 unsigned_char_dynarr *dst, | |
2641 int ncodesys, Lisp_Object *codesys, | |
2642 enum encode_decode direction) | |
2643 { | |
2644 int i; | |
2645 Lisp_Object lstream_out; | |
2646 | |
2647 data->lstream_count = ncodesys + 1; | |
2648 data->lstreams = xnew_array (Lisp_Object, data->lstream_count); | |
2649 | |
2650 lstream_out = make_dynarr_output_stream (dst); | |
2651 Lstream_set_buffering (XLSTREAM (lstream_out), LSTREAM_UNBUFFERED, 0); | |
2652 data->lstreams[data->lstream_count - 1] = lstream_out; | |
2653 | |
2654 for (i = ncodesys - 1; i >= 0; i--) | |
2655 { | |
2656 data->lstreams[i] = | |
2657 make_coding_output_stream | |
2658 (XLSTREAM (lstream_out), | |
2659 codesys[direction == CODING_ENCODE ? ncodesys - (i + 1) : i], | |
800 | 2660 direction, 0); |
771 | 2661 lstream_out = data->lstreams[i]; |
2662 Lstream_set_buffering (XLSTREAM (lstream_out), LSTREAM_UNBUFFERED, | |
2663 0); | |
2664 } | |
2665 data->initted = 1; | |
2666 } | |
2667 | |
2668 static Bytecount | |
2669 chain_convert (struct coding_stream *str, const UExtbyte *src, | |
2670 unsigned_char_dynarr *dst, Bytecount n) | |
2671 { | |
2672 struct chain_coding_stream *data = CODING_STREAM_TYPE_DATA (str, chain); | |
2673 | |
2674 if (str->eof) | |
2675 { | |
2676 /* Each will close the next; there is always at least one stream (the | |
2677 dynarr stream at the end) if we're initted. We need to close now | |
2678 because more data may be generated. */ | |
2679 if (data->initted) | |
2680 Lstream_close (XLSTREAM (data->lstreams[0])); | |
2681 return n; | |
2682 } | |
2683 | |
2684 if (!data->initted) | |
2685 chain_init_coding_streams_1 | |
2686 (data, dst, XCODING_SYSTEM_CHAIN_COUNT (str->codesys), | |
2687 XCODING_SYSTEM_CHAIN_CHAIN (str->codesys), str->direction); | |
2688 | |
2689 if (Lstream_write (XLSTREAM (data->lstreams[0]), src, n) < 0) | |
2690 return -1; | |
2691 return n; | |
2692 } | |
2693 | |
2694 static void | |
2695 chain_finalize_coding_stream_1 (struct chain_coding_stream *data) | |
2696 { | |
2697 if (data->lstreams) | |
2698 { | |
2297 | 2699 /* During GC, these objects are unmarked, and are about to be freed. |
2700 We do NOT want them on the free list, and that will cause lots of | |
2701 nastiness including crashes. Just let them be freed normally. */ | |
771 | 2702 if (!gc_in_progress) |
2703 { | |
2704 int i; | |
2297 | 2705 /* Order of deletion is important here! Delete from the head of |
2706 the chain and work your way towards the tail. In general, | |
2707 when you delete an object, there should be *NO* pointers to it | |
2708 anywhere. Deleting back-to-front would be a problem because | |
2709 there are pointers going forward. If there were pointers in | |
2710 both directions, you'd have to disconnect the pointers to a | |
2711 particular object before deleting it. */ | |
771 | 2712 for (i = 0; i < data->lstream_count; i++) |
2713 Lstream_delete (XLSTREAM ((data->lstreams)[i])); | |
2714 } | |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
2715 xfree (data->lstreams); |
771 | 2716 } |
2717 } | |
2718 | |
2719 static void | |
2720 chain_finalize_coding_stream (struct coding_stream *str) | |
2721 { | |
2722 chain_finalize_coding_stream_1 (CODING_STREAM_TYPE_DATA (str, chain)); | |
2723 } | |
2724 | |
2725 static void | |
2726 chain_finalize (Lisp_Object c) | |
2727 { | |
2728 if (XCODING_SYSTEM_CHAIN_CHAIN (c)) | |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
2729 xfree (XCODING_SYSTEM_CHAIN_CHAIN (c)); |
771 | 2730 } |
2731 | |
428 | 2732 static int |
771 | 2733 chain_putprop (Lisp_Object codesys, Lisp_Object key, Lisp_Object value) |
2734 { | |
2735 if (EQ (key, Qchain)) | |
2736 { | |
2737 Lisp_Object *cslist; | |
2738 int count = 0; | |
2739 int i; | |
2740 | |
2367 | 2741 { |
2742 EXTERNAL_LIST_LOOP_2 (elt, value) | |
2743 { | |
2744 Fget_coding_system (elt); | |
2745 count++; | |
2746 } | |
2747 } | |
771 | 2748 |
2749 cslist = xnew_array (Lisp_Object, count); | |
2750 XCODING_SYSTEM_CHAIN_CHAIN (codesys) = cslist; | |
2751 | |
2752 count = 0; | |
2367 | 2753 { |
2754 EXTERNAL_LIST_LOOP_2 (elt, value) | |
2755 { | |
2756 cslist[count] = Fget_coding_system (elt); | |
2757 count++; | |
2758 } | |
2759 } | |
771 | 2760 |
2761 XCODING_SYSTEM_CHAIN_COUNT (codesys) = count; | |
2762 | |
2763 for (i = 0; i < count - 1; i++) | |
2764 { | |
2765 if (decoding_source_sink_type_is_char (cslist[i], CODING_SINK) != | |
2766 decoding_source_sink_type_is_char (cslist[i + 1], CODING_SOURCE)) | |
2767 invalid_argument_2 ("Sink of first must match source of second", | |
2768 cslist[i], cslist[i + 1]); | |
2769 } | |
2770 } | |
2771 else if (EQ (key, Qcanonicalize_after_coding)) | |
2772 XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (codesys) = | |
2773 Fget_coding_system (value); | |
2774 else | |
2775 return 0; | |
2776 return 1; | |
2777 } | |
2778 | |
2779 static Lisp_Object | |
2780 chain_getprop (Lisp_Object coding_system, Lisp_Object prop) | |
2781 { | |
2782 if (EQ (prop, Qchain)) | |
2783 { | |
2784 Lisp_Object result = Qnil; | |
2785 int i; | |
2786 | |
2787 for (i = 0; i < XCODING_SYSTEM_CHAIN_COUNT (coding_system); i++) | |
2788 result = Fcons (XCODING_SYSTEM_CHAIN_CHAIN (coding_system)[i], | |
2789 result); | |
2790 | |
2791 return Fnreverse (result); | |
2792 } | |
2793 else if (EQ (prop, Qcanonicalize_after_coding)) | |
2794 return XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (coding_system); | |
2795 else | |
2796 return Qunbound; | |
2797 } | |
2798 | |
2799 static enum source_sink_type | |
2800 chain_conversion_end_type (Lisp_Object codesys) | |
2801 { | |
2802 Lisp_Object *cslist = XCODING_SYSTEM_CHAIN_CHAIN (codesys); | |
2803 int n = XCODING_SYSTEM_CHAIN_COUNT (codesys); | |
2804 int charp_source, charp_sink; | |
2805 | |
2806 if (n == 0) | |
2807 return DECODES_BYTE_TO_BYTE; /* arbitrary */ | |
2808 charp_source = decoding_source_sink_type_is_char (cslist[0], CODING_SOURCE); | |
2809 charp_sink = decoding_source_sink_type_is_char (cslist[n - 1], CODING_SINK); | |
2810 | |
2811 switch (charp_source * 2 + charp_sink) | |
2812 { | |
2813 case 0: return DECODES_BYTE_TO_BYTE; | |
2814 case 1: return DECODES_BYTE_TO_CHARACTER; | |
2815 case 2: return DECODES_CHARACTER_TO_BYTE; | |
2816 case 3: return DECODES_CHARACTER_TO_CHARACTER; | |
2817 } | |
2818 | |
2500 | 2819 ABORT (); |
771 | 2820 return DECODES_BYTE_TO_BYTE; |
2821 } | |
2822 | |
2823 | |
2824 /************************************************************************/ | |
2825 /* No-conversion methods */ | |
2826 /************************************************************************/ | |
2827 | |
2828 /* "No conversion"; used for binary files. We use quotes because there | |
2829 really is some conversion being applied (it does byte<->char | |
2830 conversion), but it appears to the user as if the text is read in | |
2297 | 2831 without conversion. |
2832 | |
2833 #### Shouldn't we _call_ it that, then? And while we're at it, | |
2834 separate it into "to_internal" and "to_external"? */ | |
771 | 2835 DEFINE_CODING_SYSTEM_TYPE (no_conversion); |
2836 | |
2837 /* This is used when reading in "binary" files -- i.e. files that may | |
2838 contain all 256 possible byte values and that are not to be | |
2839 interpreted as being in any particular encoding. */ | |
2840 static Bytecount | |
2841 no_conversion_convert (struct coding_stream *str, | |
2842 const UExtbyte *src, | |
2843 unsigned_char_dynarr *dst, Bytecount n) | |
2844 { | |
2845 UExtbyte c; | |
2846 unsigned int ch = str->ch; | |
2847 Bytecount orign = n; | |
2848 | |
2849 if (str->direction == CODING_DECODE) | |
2850 { | |
2851 while (n--) | |
2852 { | |
2853 c = *src++; | |
2854 | |
2855 DECODE_ADD_BINARY_CHAR (c, dst); | |
2856 } | |
2857 | |
2858 if (str->eof) | |
2859 DECODE_OUTPUT_PARTIAL_CHAR (ch, dst); | |
2860 } | |
2861 else | |
2862 { | |
2863 | |
2864 while (n--) | |
2865 { | |
2866 c = *src++; | |
826 | 2867 if (byte_ascii_p (c)) |
771 | 2868 { |
2869 assert (ch == 0); | |
2870 Dynarr_add (dst, c); | |
2871 } | |
2872 #ifdef MULE | |
867 | 2873 else if (ibyte_leading_byte_p (c)) |
771 | 2874 { |
2875 assert (ch == 0); | |
2876 if (c == LEADING_BYTE_LATIN_ISO8859_1 || | |
2877 c == LEADING_BYTE_CONTROL_1) | |
2878 ch = c; | |
2879 else | |
2297 | 2880 /* #### This is just plain unacceptable. */ |
771 | 2881 Dynarr_add (dst, '~'); /* untranslatable character */ |
2882 } | |
2883 else | |
2884 { | |
2885 if (ch == LEADING_BYTE_LATIN_ISO8859_1) | |
2886 Dynarr_add (dst, c); | |
2887 else if (ch == LEADING_BYTE_CONTROL_1) | |
2888 { | |
2889 assert (c < 0xC0); | |
2890 Dynarr_add (dst, c - 0x20); | |
2891 } | |
2892 /* else it should be the second or third byte of an | |
2893 untranslatable character, so ignore it */ | |
2894 ch = 0; | |
2895 } | |
2896 #endif /* MULE */ | |
2897 | |
2898 } | |
2899 } | |
2900 | |
2901 str->ch = ch; | |
2902 return orign; | |
2903 } | |
2904 | |
2905 DEFINE_DETECTOR (no_conversion); | |
2906 DEFINE_DETECTOR_CATEGORY (no_conversion, no_conversion); | |
2907 | |
2908 struct no_conversion_detector | |
2909 { | |
2910 int dummy; | |
2911 }; | |
2912 | |
2913 static void | |
2286 | 2914 no_conversion_detect (struct detection_state *st, const UExtbyte *UNUSED (src), |
2915 Bytecount UNUSED (n)) | |
771 | 2916 { |
2917 /* Hack until we get better handling of this stuff! */ | |
2918 DET_RESULT (st, no_conversion) = DET_SLIGHTLY_LIKELY; | |
2919 } | |
2920 | |
2921 | |
2922 /************************************************************************/ | |
2923 /* Convert-eol methods */ | |
2924 /************************************************************************/ | |
2925 | |
2926 /* This is used to handle end-of-line (EOL) differences. It is | |
2819 | 2927 character-to-character, and works (when encoding) *BEFORE* sending data to |
2928 the main encoding routine -- thus, that routine must handle different EOL | |
2929 types itself if it does line-oriented type processing. This is unavoidable | |
2930 because we don't know whether the output of the main encoding routine is | |
2931 ASCII compatible (UTF-16 is definitely not, for example). [[ sjt sez this | |
2932 is bogus. There should be _no_ EOL processing (or processing of any kind) | |
2933 after conversion to external. ]] | |
771 | 2934 |
793 | 2935 There is one parameter: `subtype', either `cr', `lf', `crlf', or nil. |
771 | 2936 */ |
2937 | |
2938 struct convert_eol_coding_system | |
2939 { | |
2940 enum eol_type subtype; | |
2132 | 2941 int dummy; /* On some architectures (eg ia64) the portable dumper can |
2942 produce unaligned access errors without this field. Probably | |
2943 because the combined structure of this structure and | |
2944 Lisp_Coding_System is not properly aligned. */ | |
771 | 2945 }; |
2946 | |
2947 #define CODING_SYSTEM_CONVERT_EOL_SUBTYPE(codesys) \ | |
2948 (CODING_SYSTEM_TYPE_DATA (codesys, convert_eol)->subtype) | |
2949 #define XCODING_SYSTEM_CONVERT_EOL_SUBTYPE(codesys) \ | |
2950 (XCODING_SYSTEM_TYPE_DATA (codesys, convert_eol)->subtype) | |
2951 | |
2952 struct convert_eol_coding_stream | |
2953 { | |
2954 enum eol_type actual; | |
2955 }; | |
2956 | |
1204 | 2957 static const struct memory_description |
771 | 2958 convert_eol_coding_system_description[] = { |
2959 { XD_END } | |
2960 }; | |
2961 | |
1204 | 2962 DEFINE_CODING_SYSTEM_TYPE_WITH_DATA (convert_eol); |
2963 | |
771 | 2964 static void |
2286 | 2965 convert_eol_print (Lisp_Object cs, Lisp_Object printcharfun, |
2966 int UNUSED (escapeflag)) | |
771 | 2967 { |
2968 struct convert_eol_coding_system *data = | |
2969 XCODING_SYSTEM_TYPE_DATA (cs, convert_eol); | |
2970 | |
2971 write_fmt_string (printcharfun, "(%s)", | |
2972 data->subtype == EOL_LF ? "lf" : | |
2973 data->subtype == EOL_CRLF ? "crlf" : | |
2974 data->subtype == EOL_CR ? "cr" : | |
793 | 2975 data->subtype == EOL_AUTODETECT ? "nil" : |
2500 | 2976 (ABORT(), "")); |
771 | 2977 } |
2978 | |
2979 static enum source_sink_type | |
2286 | 2980 convert_eol_conversion_end_type (Lisp_Object UNUSED (codesys)) |
771 | 2981 { |
2982 return DECODES_CHARACTER_TO_CHARACTER; | |
2983 } | |
2984 | |
2985 static int | |
2986 convert_eol_putprop (Lisp_Object codesys, | |
2987 Lisp_Object key, | |
2988 Lisp_Object value) | |
2989 { | |
2990 struct convert_eol_coding_system *data = | |
2991 XCODING_SYSTEM_TYPE_DATA (codesys, convert_eol); | |
2992 | |
2993 if (EQ (key, Qsubtype)) | |
2994 { | |
2995 if (EQ (value, Qlf) /* || EQ (value, Qunix) */) | |
2996 data->subtype = EOL_LF; | |
2997 else if (EQ (value, Qcrlf) /* || EQ (value, Qdos) */) | |
2998 data->subtype = EOL_CRLF; | |
2999 else if (EQ (value, Qcr) /* || EQ (value, Qmac) */) | |
3000 data->subtype = EOL_CR; | |
793 | 3001 else if (EQ (value, Qnil)) |
771 | 3002 data->subtype = EOL_AUTODETECT; |
3003 else invalid_constant ("Unrecognized eol type", value); | |
3004 } | |
3005 else | |
3006 return 0; | |
3007 return 1; | |
3008 } | |
3009 | |
3010 static Lisp_Object | |
3011 convert_eol_getprop (Lisp_Object coding_system, Lisp_Object prop) | |
3012 { | |
3013 struct convert_eol_coding_system *data = | |
3014 XCODING_SYSTEM_TYPE_DATA (coding_system, convert_eol); | |
3015 | |
3016 if (EQ (prop, Qsubtype)) | |
3017 { | |
3018 switch (data->subtype) | |
3019 { | |
3020 case EOL_LF: return Qlf; | |
3021 case EOL_CRLF: return Qcrlf; | |
3022 case EOL_CR: return Qcr; | |
793 | 3023 case EOL_AUTODETECT: return Qnil; |
2500 | 3024 default: ABORT (); |
771 | 3025 } |
3026 } | |
3027 | |
3028 return Qunbound; | |
3029 } | |
3030 | |
3031 static void | |
3032 convert_eol_init_coding_stream (struct coding_stream *str) | |
3033 { | |
3034 struct convert_eol_coding_stream *data = | |
3035 CODING_STREAM_TYPE_DATA (str, convert_eol); | |
3036 data->actual = XCODING_SYSTEM_CONVERT_EOL_SUBTYPE (str->codesys); | |
3037 } | |
3038 | |
3039 static Bytecount | |
867 | 3040 convert_eol_convert (struct coding_stream *str, const Ibyte *src, |
771 | 3041 unsigned_char_dynarr *dst, Bytecount n) |
3042 { | |
3043 if (str->direction == CODING_DECODE) | |
3044 { | |
3045 struct convert_eol_coding_stream *data = | |
3046 CODING_STREAM_TYPE_DATA (str, convert_eol); | |
3047 | |
3048 if (data->actual == EOL_AUTODETECT) | |
3049 { | |
3050 Bytecount n2 = n; | |
867 | 3051 const Ibyte *src2 = src; |
771 | 3052 |
3053 for (; n2; n2--) | |
3054 { | |
867 | 3055 Ibyte c = *src2++; |
771 | 3056 if (c == '\n') |
3057 { | |
3058 data->actual = EOL_LF; | |
3059 break; | |
3060 } | |
3061 else if (c == '\r') | |
3062 { | |
3063 if (n2 == 1) | |
3064 { | |
3065 /* If we're seeing a '\r' at the end of the data, then | |
3066 reject the '\r' right now so it doesn't become an | |
3067 issue in the code below -- unless we're at the end of | |
3068 the stream, in which case we can't do that (because | |
3069 then the '\r' will never get written out), and in any | |
3070 case we should be recognizing it at EOL_CR format. */ | |
3071 if (str->eof) | |
3072 data->actual = EOL_CR; | |
3073 else | |
3074 n--; | |
3075 break; | |
3076 } | |
3077 else if (*src2 == '\n') | |
3078 data->actual = EOL_CRLF; | |
3079 else | |
3080 data->actual = EOL_CR; | |
3081 break; | |
3082 } | |
3083 } | |
3084 } | |
3085 | |
3086 /* str->eof is set, the caller reached EOF on the other end and has | |
3087 no new data to give us. The only data we get is the data we | |
3088 rejected from last time. */ | |
3089 if (data->actual == EOL_LF || data->actual == EOL_AUTODETECT || | |
3090 (str->eof)) | |
3091 Dynarr_add_many (dst, src, n); | |
3092 else | |
3093 { | |
867 | 3094 const Ibyte *end = src + n; |
771 | 3095 while (1) |
3096 { | |
3097 /* Find the next section with no \r and add it. */ | |
867 | 3098 const Ibyte *runstart = src; |
3099 src = (Ibyte *) memchr (src, '\r', end - src); | |
771 | 3100 if (!src) |
3101 src = end; | |
3102 Dynarr_add_many (dst, runstart, src - runstart); | |
3103 /* Stop if at end ... */ | |
3104 if (src == end) | |
3105 break; | |
3106 /* ... else, translate as necessary. */ | |
3107 src++; | |
3108 if (data->actual == EOL_CR) | |
3109 Dynarr_add (dst, '\n'); | |
3110 /* We need to be careful here with CRLF. If we see a CR at the | |
3111 end of the data, we don't know if it's part of a CRLF, so we | |
3112 reject it. Otherwise: If it's part of a CRLF, eat it and | |
3113 loop; the following LF gets added next time around. If it's | |
3114 not part of a CRLF, add the CR and loop. The following | |
3115 character will be processed in the next loop iteration. This | |
3116 correctly handles a sequence like CR+CR+LF. */ | |
3117 else if (src == end) | |
3118 return n - 1; /* reject the CR at the end; we'll get it again | |
3119 next time the convert method is called */ | |
3120 else if (*src != '\n') | |
3121 Dynarr_add (dst, '\r'); | |
3122 } | |
3123 } | |
3124 | |
3125 return n; | |
3126 } | |
3127 else | |
3128 { | |
3129 enum eol_type subtype = | |
3130 XCODING_SYSTEM_CONVERT_EOL_SUBTYPE (str->codesys); | |
867 | 3131 const Ibyte *end = src + n; |
771 | 3132 |
3133 /* We try to be relatively efficient here. */ | |
3134 if (subtype == EOL_LF) | |
3135 Dynarr_add_many (dst, src, n); | |
3136 else | |
3137 { | |
3138 while (1) | |
3139 { | |
3140 /* Find the next section with no \n and add it. */ | |
867 | 3141 const Ibyte *runstart = src; |
3142 src = (Ibyte *) memchr (src, '\n', end - src); | |
771 | 3143 if (!src) |
3144 src = end; | |
3145 Dynarr_add_many (dst, runstart, src - runstart); | |
3146 /* Stop if at end ... */ | |
3147 if (src == end) | |
3148 break; | |
3149 /* ... else, skip over \n and add its translation. */ | |
3150 src++; | |
3151 Dynarr_add (dst, '\r'); | |
3152 if (subtype == EOL_CRLF) | |
3153 Dynarr_add (dst, '\n'); | |
3154 } | |
3155 } | |
3156 | |
3157 return n; | |
3158 } | |
3159 } | |
3160 | |
3161 static Lisp_Object | |
3162 convert_eol_canonicalize_after_coding (struct coding_stream *str) | |
3163 { | |
3164 struct convert_eol_coding_stream *data = | |
3165 CODING_STREAM_TYPE_DATA (str, convert_eol); | |
3166 | |
3167 if (str->direction == CODING_ENCODE) | |
3168 return str->codesys; | |
3169 | |
3170 switch (data->actual) | |
3171 { | |
3172 case EOL_LF: return Fget_coding_system (Qconvert_eol_lf); | |
3173 case EOL_CRLF: return Fget_coding_system (Qconvert_eol_crlf); | |
3174 case EOL_CR: return Fget_coding_system (Qconvert_eol_cr); | |
3175 case EOL_AUTODETECT: return str->codesys; | |
2500 | 3176 default: ABORT (); return Qnil; |
771 | 3177 } |
3178 } | |
3179 | |
3180 | |
3181 /************************************************************************/ | |
3182 /* Undecided methods */ | |
3183 /************************************************************************/ | |
3184 | |
3185 /* Do autodetection. We can autodetect the EOL type only, the coding | |
3186 system only, or both. We only do autodetection when decoding; when | |
3187 encoding, we just pass the data through. | |
3188 | |
3189 When doing just EOL detection, a coding system can be specified; if so, | |
3190 we will decode this data through the coding system before doing EOL | |
3191 detection. The reason for specifying this is so that | |
3192 canonicalize-after-coding works: We will canonicalize the specified | |
3193 coding system into the appropriate EOL type. When doing both coding and | |
3194 EOL detection, we do similar canonicalization, and also catch situations | |
3195 where the EOL type is overspecified, i.e. the detected coding system | |
3196 specifies an EOL type, and either switch to the equivalent | |
3197 non-EOL-processing coding system (if possible), or terminate EOL | |
3198 detection and use the specified EOL type. This prevents data from being | |
3199 EOL-processed twice. | |
3200 */ | |
3201 | |
3202 struct undecided_coding_system | |
3203 { | |
3204 int do_eol, do_coding; | |
3205 Lisp_Object cs; | |
3206 }; | |
3207 | |
3208 struct undecided_coding_stream | |
3209 { | |
3210 Lisp_Object actual; | |
3211 /* Either 2 or 3 lstreams here; see undecided_convert */ | |
3212 struct chain_coding_stream c; | |
3213 | |
3214 struct detection_state *st; | |
3215 }; | |
3216 | |
1204 | 3217 static const struct memory_description undecided_coding_system_description[] = { |
3218 { XD_LISP_OBJECT, offsetof (struct undecided_coding_system, cs) }, | |
771 | 3219 { XD_END } |
3220 }; | |
3221 | |
1204 | 3222 static const struct memory_description undecided_coding_stream_description_1 [] = { |
3223 { XD_LISP_OBJECT, offsetof (struct undecided_coding_stream, actual) }, | |
2367 | 3224 { XD_BLOCK_ARRAY, offsetof (struct undecided_coding_stream, c), |
2551 | 3225 1, { &chain_coding_stream_description } }, |
1204 | 3226 { XD_END } |
3227 }; | |
3228 | |
3229 const struct sized_memory_description undecided_coding_stream_description = { | |
3230 sizeof (struct undecided_coding_stream), undecided_coding_stream_description_1 | |
3231 }; | |
3232 | |
3233 DEFINE_CODING_SYSTEM_TYPE_WITH_DATA (undecided); | |
3234 | |
771 | 3235 static void |
3236 undecided_init (Lisp_Object codesys) | |
3237 { | |
3238 struct undecided_coding_system *data = | |
3239 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
3240 | |
3241 data->cs = Qnil; | |
3242 } | |
3243 | |
3244 static void | |
3245 undecided_mark (Lisp_Object codesys) | |
3246 { | |
3247 struct undecided_coding_system *data = | |
3248 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
3249 | |
3250 mark_object (data->cs); | |
3251 } | |
3252 | |
3253 static void | |
3254 undecided_print (Lisp_Object cs, Lisp_Object printcharfun, int escapeflag) | |
3255 { | |
3256 struct undecided_coding_system *data = | |
3257 XCODING_SYSTEM_TYPE_DATA (cs, undecided); | |
3258 int need_space = 0; | |
3259 | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3260 write_ascstring (printcharfun, "("); |
771 | 3261 if (data->do_eol) |
3262 { | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3263 write_ascstring (printcharfun, "do-eol"); |
771 | 3264 need_space = 1; |
3265 } | |
3266 if (data->do_coding) | |
3267 { | |
3268 if (need_space) | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3269 write_ascstring (printcharfun, " "); |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3270 write_ascstring (printcharfun, "do-coding"); |
771 | 3271 need_space = 1; |
3272 } | |
3273 if (!NILP (data->cs)) | |
3274 { | |
3275 if (need_space) | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3276 write_ascstring (printcharfun, " "); |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3277 write_ascstring (printcharfun, "coding-system="); |
771 | 3278 print_coding_system_in_print_method (data->cs, printcharfun, escapeflag); |
3279 } | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3280 write_ascstring (printcharfun, ")"); |
771 | 3281 } |
3282 | |
3283 static void | |
3284 undecided_mark_coding_stream (struct coding_stream *str) | |
3285 { | |
1204 | 3286 mark_object (CODING_STREAM_TYPE_DATA (str, undecided)->actual); |
771 | 3287 chain_mark_coding_stream_1 (&CODING_STREAM_TYPE_DATA (str, undecided)->c); |
3288 } | |
3289 | |
3290 static int | |
3291 undecided_putprop (Lisp_Object codesys, Lisp_Object key, Lisp_Object value) | |
3292 { | |
3293 struct undecided_coding_system *data = | |
3294 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
3295 | |
3296 if (EQ (key, Qdo_eol)) | |
3297 data->do_eol = 1; | |
3298 else if (EQ (key, Qdo_coding)) | |
3299 data->do_coding = 1; | |
3300 else if (EQ (key, Qcoding_system)) | |
3301 data->cs = get_coding_system_for_text_file (value, 0); | |
3302 else | |
3303 return 0; | |
3304 return 1; | |
3305 } | |
3306 | |
3307 static Lisp_Object | |
3308 undecided_getprop (Lisp_Object codesys, Lisp_Object prop) | |
3309 { | |
3310 struct undecided_coding_system *data = | |
3311 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
3312 | |
3313 if (EQ (prop, Qdo_eol)) | |
3314 return data->do_eol ? Qt : Qnil; | |
3315 if (EQ (prop, Qdo_coding)) | |
3316 return data->do_coding ? Qt : Qnil; | |
3317 if (EQ (prop, Qcoding_system)) | |
3318 return data->cs; | |
3319 return Qunbound; | |
3320 } | |
3321 | |
3322 static struct detection_state * | |
3323 allocate_detection_state (void) | |
3324 { | |
3325 int i; | |
3326 Bytecount size = MAX_ALIGN_SIZE (sizeof (struct detection_state)); | |
3327 struct detection_state *block; | |
3328 | |
3329 for (i = 0; i < coding_detector_count; i++) | |
3330 size += MAX_ALIGN_SIZE (Dynarr_at (all_coding_detectors, i).data_size); | |
3331 | |
3332 block = (struct detection_state *) xmalloc_and_zero (size); | |
3333 | |
3334 size = MAX_ALIGN_SIZE (sizeof (struct detection_state)); | |
3335 for (i = 0; i < coding_detector_count; i++) | |
3336 { | |
3337 block->data_offset[i] = size; | |
3338 size += MAX_ALIGN_SIZE (Dynarr_at (all_coding_detectors, i).data_size); | |
3339 } | |
3340 | |
3341 return block; | |
3342 } | |
3343 | |
3344 static void | |
3345 free_detection_state (struct detection_state *st) | |
3346 { | |
3347 int i; | |
3348 | |
3349 for (i = 0; i < coding_detector_count; i++) | |
3350 { | |
3351 if (Dynarr_at (all_coding_detectors, i).finalize_detection_state_method) | |
3352 Dynarr_at (all_coding_detectors, i).finalize_detection_state_method | |
3353 (st); | |
3354 } | |
3355 | |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
3356 xfree (st); |
771 | 3357 } |
3358 | |
3359 static int | |
3360 coding_category_symbol_to_id (Lisp_Object symbol) | |
428 | 3361 { |
3362 int i; | |
3363 | |
3364 CHECK_SYMBOL (symbol); | |
771 | 3365 for (i = 0; i < coding_detector_count; i++) |
3366 { | |
3367 detector_category_dynarr *cats = | |
3368 Dynarr_at (all_coding_detectors, i).cats; | |
3369 int j; | |
3370 | |
3371 for (j = 0; j < Dynarr_length (cats); j++) | |
3372 if (EQ (Dynarr_at (cats, j).sym, symbol)) | |
3373 return Dynarr_at (cats, j).id; | |
3374 } | |
3375 | |
563 | 3376 invalid_constant ("Unrecognized coding category", symbol); |
1204 | 3377 RETURN_NOT_REACHED (0); |
428 | 3378 } |
3379 | |
771 | 3380 static Lisp_Object |
3381 coding_category_id_to_symbol (int id) | |
428 | 3382 { |
3383 int i; | |
771 | 3384 |
3385 for (i = 0; i < coding_detector_count; i++) | |
3386 { | |
3387 detector_category_dynarr *cats = | |
3388 Dynarr_at (all_coding_detectors, i).cats; | |
3389 int j; | |
3390 | |
3391 for (j = 0; j < Dynarr_length (cats); j++) | |
3392 if (id == Dynarr_at (cats, j).id) | |
3393 return Dynarr_at (cats, j).sym; | |
3394 } | |
3395 | |
2500 | 3396 ABORT (); |
771 | 3397 return Qnil; /* (usually) not reached */ |
428 | 3398 } |
3399 | |
771 | 3400 static Lisp_Object |
3401 detection_result_number_to_symbol (enum detection_result result) | |
428 | 3402 { |
1494 | 3403 /* let compiler warn if not all enumerators are handled */ |
3404 switch (result) { | |
3405 #define FROB(sym, num) case num: return (sym) | |
771 | 3406 FROB (Qnear_certainty, DET_NEAR_CERTAINTY); |
3407 FROB (Qquite_probable, DET_QUITE_PROBABLE); | |
3408 FROB (Qsomewhat_likely, DET_SOMEWHAT_LIKELY); | |
1494 | 3409 FROB (Qslightly_likely, DET_SLIGHTLY_LIKELY); |
771 | 3410 FROB (Qas_likely_as_unlikely, DET_AS_LIKELY_AS_UNLIKELY); |
3411 FROB (Qsomewhat_unlikely, DET_SOMEWHAT_UNLIKELY); | |
3412 FROB (Qquite_improbable, DET_QUITE_IMPROBABLE); | |
3413 FROB (Qnearly_impossible, DET_NEARLY_IMPOSSIBLE); | |
3414 #undef FROB | |
1494 | 3415 } |
771 | 3416 |
2500 | 3417 ABORT (); |
771 | 3418 return Qnil; /* (usually) not reached */ |
3419 } | |
3420 | |
778 | 3421 #if 0 /* not used */ |
771 | 3422 static enum detection_result |
3423 detection_result_symbol_to_number (Lisp_Object symbol) | |
3424 { | |
1494 | 3425 /* using switch here would be bad style, and doesn't help */ |
771 | 3426 #define FROB(sym, num) if (EQ (symbol, sym)) return (num) |
3427 FROB (Qnear_certainty, DET_NEAR_CERTAINTY); | |
3428 FROB (Qquite_probable, DET_QUITE_PROBABLE); | |
3429 FROB (Qsomewhat_likely, DET_SOMEWHAT_LIKELY); | |
1494 | 3430 FROB (Qslightly_likely, DET_SLIGHTLY_LIKELY); |
771 | 3431 FROB (Qas_likely_as_unlikely, DET_AS_LIKELY_AS_UNLIKELY); |
3432 FROB (Qsomewhat_unlikely, DET_SOMEWHAT_UNLIKELY); | |
3433 FROB (Qquite_improbable, DET_QUITE_IMPROBABLE); | |
3434 FROB (Qnearly_impossible, DET_NEARLY_IMPOSSIBLE); | |
3435 #undef FROB | |
3436 | |
3437 invalid_constant ("Unrecognized detection result", symbol); | |
3438 return ((enum detection_result) 0); /* not reached */ | |
3439 } | |
778 | 3440 #endif /* 0 */ |
771 | 3441 |
3442 /* Set all detection results for a given detector to a specified value. */ | |
3443 void | |
3444 set_detection_results (struct detection_state *st, int detector, int given) | |
3445 { | |
3446 detector_category_dynarr *cats = | |
3447 Dynarr_at (all_coding_detectors, detector).cats; | |
3448 int i; | |
3449 | |
3450 for (i = 0; i < Dynarr_length (cats); i++) | |
3451 st->categories[Dynarr_at (cats, i).id] = given; | |
3452 } | |
428 | 3453 |
3454 static int | |
3455 acceptable_control_char_p (int c) | |
3456 { | |
3457 switch (c) | |
3458 { | |
3459 /* Allow and ignore control characters that you might | |
3460 reasonably see in a text file */ | |
3461 case '\r': | |
3462 case '\n': | |
3463 case '\t': | |
3464 case 7: /* bell */ | |
3465 case 8: /* backspace */ | |
3466 case 11: /* vertical tab */ | |
3467 case 12: /* form feed */ | |
3468 case 26: /* MS-DOS C-z junk */ | |
3469 case 31: /* '^_' -- for info */ | |
3470 return 1; | |
3471 default: | |
3472 return 0; | |
3473 } | |
3474 } | |
3475 | |
771 | 3476 #ifdef DEBUG_XEMACS |
3477 | |
3478 static UExtbyte | |
3479 hex_digit_to_char (int digit) | |
428 | 3480 { |
771 | 3481 if (digit < 10) |
3482 return digit + '0'; | |
3483 else | |
3484 return digit - 10 + 'A'; | |
428 | 3485 } |
3486 | |
771 | 3487 static void |
3488 output_bytes_in_ascii_and_hex (const UExtbyte *src, Bytecount n) | |
428 | 3489 { |
3425 | 3490 Extbyte *ascii = alloca_array (Extbyte, n + 1); |
3491 Extbyte *hex = alloca_array (Extbyte, 3 * n + 1); | |
771 | 3492 int i; |
3413 | 3493 DECLARE_EISTRING (eistr_ascii); |
3494 DECLARE_EISTRING (eistr_hex); | |
771 | 3495 |
3496 for (i = 0; i < n; i++) | |
428 | 3497 { |
3425 | 3498 Extbyte c = src[i]; |
771 | 3499 if (c < 0x20) |
3500 ascii[i] = '.'; | |
428 | 3501 else |
771 | 3502 ascii[i] = c; |
3503 hex[3 * i] = hex_digit_to_char (c >> 4); | |
3504 hex[3 * i + 1] = hex_digit_to_char (c & 0xF); | |
3505 hex[3 * i + 2] = ' '; | |
428 | 3506 } |
771 | 3507 ascii[i] = '\0'; |
3508 hex[3 * i - 1] = '\0'; | |
3413 | 3509 |
3510 eicpy_ext(eistr_hex, hex, Qbinary); | |
3511 eicpy_ext(eistr_ascii, ascii, Qbinary); | |
3512 | |
3425 | 3513 stderr_out ("%s %s", eidata(eistr_ascii), eidata(eistr_hex)); |
428 | 3514 } |
3515 | |
771 | 3516 #endif /* DEBUG_XEMACS */ |
3517 | |
3518 /* Attempt to determine the encoding of the given text. Before calling | |
3519 this function for the first time, you must zero out the detection state. | |
428 | 3520 |
3521 Returns: | |
3522 | |
771 | 3523 0 == keep going |
3524 1 == stop | |
428 | 3525 */ |
3526 | |
3527 static int | |
771 | 3528 detect_coding_type (struct detection_state *st, const UExtbyte *src, |
3529 Bytecount n) | |
428 | 3530 { |
771 | 3531 Bytecount n2 = n; |
3532 const UExtbyte *src2 = src; | |
3533 int i; | |
3534 | |
3535 #ifdef DEBUG_XEMACS | |
3536 if (!NILP (Vdebug_coding_detection)) | |
3537 { | |
3538 int bytes = min (16, n); | |
3539 stderr_out ("detect_coding_type: processing %ld bytes\n", n); | |
3540 stderr_out ("First %d: ", bytes); | |
3541 output_bytes_in_ascii_and_hex (src, bytes); | |
3542 stderr_out ("\nLast %d: ", bytes); | |
3543 output_bytes_in_ascii_and_hex (src + n - bytes, bytes); | |
3544 stderr_out ("\n"); | |
3545 } | |
3546 #endif /* DEBUG_XEMACS */ | |
428 | 3547 if (!st->seen_non_ascii) |
3548 { | |
771 | 3549 for (; n2; n2--, src2++) |
428 | 3550 { |
771 | 3551 UExtbyte c = *src2; |
428 | 3552 if ((c < 0x20 && !acceptable_control_char_p (c)) || c >= 0x80) |
3553 { | |
3554 st->seen_non_ascii = 1; | |
3555 break; | |
3556 } | |
3557 } | |
3558 } | |
3559 | |
771 | 3560 for (i = 0; i < coding_detector_count; i++) |
3561 Dynarr_at (all_coding_detectors, i).detect_method (st, src, n); | |
3562 | |
3563 st->bytes_seen += n; | |
3564 | |
3565 #ifdef DEBUG_XEMACS | |
3566 if (!NILP (Vdebug_coding_detection)) | |
3567 { | |
3568 stderr_out ("seen_non_ascii: %d\n", st->seen_non_ascii); | |
1494 | 3569 if (coding_detector_category_count <= 0) |
3570 stderr_out ("found %d detector categories\n", | |
3571 coding_detector_category_count); | |
771 | 3572 for (i = 0; i < coding_detector_category_count; i++) |
3573 stderr_out_lisp | |
3574 ("%s: %s\n", | |
3575 2, | |
3576 coding_category_id_to_symbol (i), | |
3577 detection_result_number_to_symbol ((enum detection_result) | |
3578 st->categories[i])); | |
3579 } | |
3580 #endif /* DEBUG_XEMACS */ | |
3581 | |
3582 { | |
3583 int not_unlikely = 0; | |
3584 int retval; | |
3585 | |
3586 for (i = 0; i < coding_detector_category_count; i++) | |
3587 if (st->categories[i] >= 0) | |
3588 not_unlikely++; | |
3589 | |
3590 retval = (not_unlikely <= 1 | |
3591 #if 0 /* this is bogus */ | |
3592 || st->bytes_seen >= MAX_BYTES_PROCESSED_FOR_DETECTION | |
428 | 3593 #endif |
771 | 3594 ); |
3595 | |
3596 #ifdef DEBUG_XEMACS | |
3597 if (!NILP (Vdebug_coding_detection)) | |
3598 stderr_out ("detect_coding_type: returning %d (%s)\n", | |
3599 retval, retval ? "stop" : "keep going"); | |
3600 #endif /* DEBUG_XEMACS */ | |
3601 | |
3602 return retval; | |
428 | 3603 } |
3604 } | |
3605 | |
3606 static Lisp_Object | |
771 | 3607 detected_coding_system (struct detection_state *st) |
428 | 3608 { |
771 | 3609 int i; |
3610 int even = 1; | |
3611 | |
3612 if (st->seen_non_ascii) | |
3613 { | |
3614 for (i = 0; i < coding_detector_category_count; i++) | |
3615 if (st->categories[i] != DET_AS_LIKELY_AS_UNLIKELY) | |
3616 { | |
3617 even = 0; | |
3618 break; | |
3619 } | |
3620 } | |
3621 | |
3622 /* #### Here we are ignoring the results of detection when it's all | |
3623 ASCII. This is obviously a bad thing. But we need to fix up the | |
3624 existing detection methods somewhat before we can switch. */ | |
3625 if (even) | |
428 | 3626 { |
3627 /* If the file was entirely or basically ASCII, use the | |
3628 default value of `buffer-file-coding-system'. */ | |
3629 Lisp_Object retval = | |
3630 XBUFFER (Vbuffer_defaults)->buffer_file_coding_system; | |
3631 if (!NILP (retval)) | |
3632 { | |
771 | 3633 retval = find_coding_system_for_text_file (retval, 0); |
428 | 3634 if (NILP (retval)) |
3635 { | |
3636 warn_when_safe | |
3637 (Qbad_variable, Qwarning, | |
3638 "Invalid `default-buffer-file-coding-system', set to nil"); | |
3639 XBUFFER (Vbuffer_defaults)->buffer_file_coding_system = Qnil; | |
3640 } | |
3641 } | |
3642 if (NILP (retval)) | |
4100 | 3643 retval = Fget_coding_system (Qbinary); |
428 | 3644 return retval; |
3645 } | |
3646 else | |
3647 { | |
771 | 3648 int likelihood; |
3649 Lisp_Object retval = Qnil; | |
3650 | |
3651 /* Look through the coding categories first by likelihood and then by | |
3652 priority and find the first one that is allowed. */ | |
3653 | |
3654 for (likelihood = DET_HIGHEST; likelihood >= DET_LOWEST; likelihood--) | |
428 | 3655 { |
771 | 3656 for (i = 0; i < coding_detector_category_count; i++) |
3657 { | |
3658 int cat = coding_category_by_priority[i]; | |
3659 if (st->categories[cat] == likelihood && | |
3660 !NILP (coding_category_system[cat])) | |
3661 { | |
3662 retval = (get_coding_system_for_text_file | |
3663 (coding_category_system[cat], 0)); | |
3664 if (likelihood < DET_AS_LIKELY_AS_UNLIKELY) | |
3665 warn_when_safe_lispobj | |
3666 (intern ("detection"), | |
793 | 3667 Qwarning, |
771 | 3668 emacs_sprintf_string_lisp |
3669 ( | |
3670 "Detected coding %s is unlikely to be correct (likelihood == `%s')", | |
3671 Qnil, 2, XCODING_SYSTEM_NAME (retval), | |
3672 detection_result_number_to_symbol | |
3673 ((enum detection_result) likelihood))); | |
3674 return retval; | |
3675 } | |
3676 } | |
428 | 3677 } |
771 | 3678 |
3679 return Fget_coding_system (Qraw_text); | |
428 | 3680 } |
3681 } | |
3682 | |
1347 | 3683 /* Look for a coding system in the string (skipping over leading |
3684 blanks). If found, return it, otherwise nil. */ | |
3685 | |
3686 static Lisp_Object | |
2531 | 3687 snarf_coding_system (const UExtbyte *p, Bytecount len) |
1347 | 3688 { |
3689 Bytecount n; | |
2531 | 3690 UExtbyte *name; |
1347 | 3691 |
3692 while (*p == ' ' || *p == '\t') p++, len--; | |
3693 len = min (len, 1000); | |
3694 name = alloca_ibytes (len + 1); | |
3695 memcpy (name, p, len); | |
3696 name[len] = '\0'; | |
3697 | |
3698 /* Get coding system name */ | |
3699 /* Characters valid in a MIME charset name (rfc 1521), | |
3700 and in a Lisp symbol name. */ | |
3701 n = qxestrspn (name, | |
3702 "ABCDEFGHIJKLMNOPQRSTUVWXYZ" | |
3703 "abcdefghijklmnopqrstuvwxyz" | |
3704 "0123456789" | |
3705 "!$%&*+-.^_{|}~"); | |
3706 if (n > 0) | |
3707 { | |
3708 name[n] = '\0'; | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
3709 /* This call to intern_istring() is OK because we already verified that |
2531 | 3710 there are only ASCII characters in the string */ |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
3711 return find_coding_system_for_text_file (intern_istring ((Ibyte *) name), 0); |
1347 | 3712 } |
3713 | |
3714 return Qnil; | |
3715 } | |
3716 | |
428 | 3717 /* Given a seekable read stream and potential coding system and EOL type |
3718 as specified, do any autodetection that is called for. If the | |
3719 coding system and/or EOL type are not `autodetect', they will be left | |
3720 alone; but this function will never return an autodetect coding system | |
3721 or EOL type. | |
3722 | |
3723 This function does not automatically fetch subsidiary coding systems; | |
3724 that should be unnecessary with the explicit eol-type argument. */ | |
3725 | |
3726 #define LENGTH(string_constant) (sizeof (string_constant) - 1) | |
3727 | |
771 | 3728 static Lisp_Object |
3729 unwind_free_detection_state (Lisp_Object opaque) | |
3730 { | |
3731 struct detection_state *st = | |
3732 (struct detection_state *) get_opaque_ptr (opaque); | |
3733 free_detection_state (st); | |
3734 free_opaque_ptr (opaque); | |
3735 return Qnil; | |
3736 } | |
3737 | |
1347 | 3738 /* #### This duplicates code in `find-coding-system-magic-cookie-in-file' |
3739 in files.el. Look into combining them. */ | |
3740 | |
771 | 3741 static Lisp_Object |
3742 look_for_coding_system_magic_cookie (const UExtbyte *data, Bytecount len) | |
428 | 3743 { |
771 | 3744 const UExtbyte *p; |
3745 const UExtbyte *scan_end; | |
2531 | 3746 Bytecount cookie_len; |
771 | 3747 |
3748 /* Look for initial "-*-"; mode line prefix */ | |
3749 for (p = data, | |
3750 scan_end = data + len - LENGTH ("-*-coding:?-*-"); | |
3751 p <= scan_end | |
3752 && *p != '\n' | |
3753 && *p != '\r'; | |
3754 p++) | |
3755 if (*p == '-' && *(p+1) == '*' && *(p+2) == '-') | |
3756 { | |
3757 const UExtbyte *local_vars_beg = p + 3; | |
3758 /* Look for final "-*-"; mode line suffix */ | |
3759 for (p = local_vars_beg, | |
3760 scan_end = data + len - LENGTH ("-*-"); | |
3761 p <= scan_end | |
428 | 3762 && *p != '\n' |
3763 && *p != '\r'; | |
771 | 3764 p++) |
3765 if (*p == '-' && *(p+1) == '*' && *(p+2) == '-') | |
3766 { | |
3767 const UExtbyte *suffix = p; | |
3768 /* Look for "coding:" */ | |
3769 for (p = local_vars_beg, | |
3770 scan_end = suffix - LENGTH ("coding:?"); | |
3771 p <= scan_end; | |
3772 p++) | |
3773 if (memcmp ("coding:", p, LENGTH ("coding:")) == 0 | |
3774 && (p == local_vars_beg | |
3775 || (*(p-1) == ' ' || | |
3776 *(p-1) == '\t' || | |
3777 *(p-1) == ';'))) | |
3778 { | |
3779 p += LENGTH ("coding:"); | |
1347 | 3780 return snarf_coding_system (p, suffix - p); |
771 | 3781 break; |
3782 } | |
3783 break; | |
3784 } | |
3785 break; | |
3786 } | |
3787 | |
2531 | 3788 /* Look for ;;;###coding system */ |
3789 | |
3790 cookie_len = LENGTH (";;;###coding system: "); | |
3791 | |
3792 for (p = data, | |
3793 scan_end = data + len - cookie_len; | |
3794 p <= scan_end; | |
3795 p++) | |
1347 | 3796 { |
2531 | 3797 if (*p == ';' && !memcmp (p, ";;;###coding system: ", cookie_len)) |
3798 { | |
3799 const UExtbyte *suffix; | |
3800 | |
3801 p += cookie_len; | |
3802 suffix = p; | |
3803 while (suffix < scan_end && !isspace (*suffix)) | |
3804 suffix++; | |
3805 return snarf_coding_system (p, suffix - p); | |
3806 } | |
1347 | 3807 } |
3808 | |
3809 return Qnil; | |
771 | 3810 } |
3811 | |
3812 static Lisp_Object | |
3813 determine_real_coding_system (Lstream *stream) | |
3814 { | |
3815 struct detection_state *st = allocate_detection_state (); | |
3816 int depth = record_unwind_protect (unwind_free_detection_state, | |
3817 make_opaque_ptr (st)); | |
3818 UExtbyte buf[4096]; | |
3819 Bytecount nread = Lstream_read (stream, buf, sizeof (buf)); | |
3820 Lisp_Object coding_system = look_for_coding_system_magic_cookie (buf, nread); | |
3821 | |
3822 if (NILP (coding_system)) | |
3823 { | |
3824 while (1) | |
3825 { | |
3826 if (detect_coding_type (st, buf, nread)) | |
428 | 3827 break; |
771 | 3828 nread = Lstream_read (stream, buf, sizeof (buf)); |
3829 if (nread == 0) | |
3830 break; | |
428 | 3831 } |
771 | 3832 |
3833 coding_system = detected_coding_system (st); | |
428 | 3834 } |
3835 | |
3836 Lstream_rewind (stream); | |
771 | 3837 |
3838 unbind_to (depth); | |
3839 return coding_system; | |
3840 } | |
3841 | |
3842 static void | |
3843 undecided_init_coding_stream (struct coding_stream *str) | |
3844 { | |
3845 struct undecided_coding_stream *data = | |
3846 CODING_STREAM_TYPE_DATA (str, undecided); | |
3847 struct undecided_coding_system *csdata = | |
3848 XCODING_SYSTEM_TYPE_DATA (str->codesys, undecided); | |
3849 | |
3850 data->actual = Qnil; | |
3851 | |
3852 if (str->direction == CODING_DECODE) | |
3853 { | |
3854 Lstream *lst = str->other_end; | |
3855 | |
3856 if ((lst->flags & LSTREAM_FL_READ) && | |
3857 Lstream_seekable_p (lst) && | |
3858 csdata->do_coding) | |
3859 /* We can determine the coding system now. */ | |
3860 data->actual = determine_real_coding_system (lst); | |
3861 } | |
1494 | 3862 |
3863 #ifdef DEBUG_XEMACS | |
3864 if (!NILP (Vdebug_coding_detection)) | |
3865 stderr_out_lisp ("detected coding system: %s\n", 1, data->actual); | |
3866 #endif /* DEBUG_XEMACS */ | |
771 | 3867 } |
3868 | |
3869 static void | |
3870 undecided_rewind_coding_stream (struct coding_stream *str) | |
3871 { | |
3872 chain_rewind_coding_stream_1 (&CODING_STREAM_TYPE_DATA (str, undecided)->c); | |
3873 } | |
3874 | |
3875 static void | |
3876 undecided_finalize_coding_stream (struct coding_stream *str) | |
3877 { | |
3878 struct undecided_coding_stream *data = | |
3879 CODING_STREAM_TYPE_DATA (str, undecided); | |
3880 | |
3881 chain_finalize_coding_stream_1 | |
3882 (&CODING_STREAM_TYPE_DATA (str, undecided)->c); | |
3883 if (data->st) | |
3884 free_detection_state (data->st); | |
3885 } | |
3886 | |
3887 static Lisp_Object | |
3888 undecided_canonicalize (Lisp_Object codesys) | |
3889 { | |
3890 struct undecided_coding_system *csdata = | |
3891 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
3892 if (!csdata->do_eol && !csdata->do_coding) | |
3893 return NILP (csdata->cs) ? Fget_coding_system (Qbinary) : csdata->cs; | |
3894 if (csdata->do_eol && !csdata->do_coding && NILP (csdata->cs)) | |
3895 return Fget_coding_system (Qconvert_eol_autodetect); | |
3896 return codesys; | |
3897 } | |
3898 | |
3899 static Bytecount | |
3900 undecided_convert (struct coding_stream *str, const UExtbyte *src, | |
3901 unsigned_char_dynarr *dst, Bytecount n) | |
3902 { | |
3903 int first_time = 0; | |
3904 | |
3905 if (str->direction == CODING_DECODE) | |
3906 { | |
3907 /* At this point, we have only the following possibilities: | |
3908 | |
3909 do_eol && do_coding | |
3910 do_coding only | |
3911 do_eol only and a coding system was specified | |
3912 | |
3913 Other possibilities are removed during undecided_canonicalize. | |
3914 | |
3915 Therefore, our substreams are either | |
3916 | |
3917 lstream_coding -> lstream_dynarr, or | |
3918 lstream_coding -> lstream_eol -> lstream_dynarr. | |
3919 */ | |
3920 struct undecided_coding_system *csdata = | |
3921 XCODING_SYSTEM_TYPE_DATA (str->codesys, undecided); | |
3922 struct undecided_coding_stream *data = | |
3923 CODING_STREAM_TYPE_DATA (str, undecided); | |
3924 | |
3925 if (str->eof) | |
3926 { | |
3927 /* Each will close the next. We need to close now because more | |
3928 data may be generated. */ | |
3929 if (data->c.initted) | |
3930 Lstream_close (XLSTREAM (data->c.lstreams[0])); | |
3931 return n; | |
3932 } | |
3933 | |
3934 if (!data->c.initted) | |
3935 { | |
3936 data->c.lstream_count = csdata->do_eol ? 3 : 2; | |
3937 data->c.lstreams = xnew_array (Lisp_Object, data->c.lstream_count); | |
3938 | |
3939 data->c.lstreams[data->c.lstream_count - 1] = | |
3940 make_dynarr_output_stream (dst); | |
3941 Lstream_set_buffering | |
3942 (XLSTREAM (data->c.lstreams[data->c.lstream_count - 1]), | |
3943 LSTREAM_UNBUFFERED, 0); | |
3944 if (csdata->do_eol) | |
3945 { | |
3946 data->c.lstreams[1] = | |
3947 make_coding_output_stream | |
3948 (XLSTREAM (data->c.lstreams[data->c.lstream_count - 1]), | |
3949 Fget_coding_system (Qconvert_eol_autodetect), | |
800 | 3950 CODING_DECODE, 0); |
771 | 3951 Lstream_set_buffering |
3952 (XLSTREAM (data->c.lstreams[1]), | |
3953 LSTREAM_UNBUFFERED, 0); | |
3954 } | |
3955 | |
3956 data->c.lstreams[0] = | |
3957 make_coding_output_stream | |
3958 (XLSTREAM (data->c.lstreams[1]), | |
3959 /* Substitute binary if we need to detect the encoding */ | |
3960 csdata->do_coding ? Qbinary : csdata->cs, | |
800 | 3961 CODING_DECODE, 0); |
771 | 3962 Lstream_set_buffering (XLSTREAM (data->c.lstreams[0]), |
3963 LSTREAM_UNBUFFERED, 0); | |
3964 | |
3965 first_time = 1; | |
3966 data->c.initted = 1; | |
3967 } | |
3968 | |
3969 /* If necessary, do encoding-detection now. We do this when we're a | |
3970 writing stream or a non-seekable reading stream, meaning that we | |
3971 can't just process the whole input, rewind, and start over. */ | |
3972 | |
3973 if (csdata->do_coding) | |
3974 { | |
3975 int actual_was_nil = NILP (data->actual); | |
3976 if (NILP (data->actual)) | |
3977 { | |
3978 if (!data->st) | |
3979 data->st = allocate_detection_state (); | |
3980 if (first_time) | |
3981 /* #### This is cheesy. What we really ought to do is buffer | |
3982 up a certain minimum amount of data to get a better result. | |
3983 */ | |
3984 data->actual = look_for_coding_system_magic_cookie (src, n); | |
3985 if (NILP (data->actual)) | |
3986 { | |
3987 /* #### This is cheesy. What we really ought to do is buffer | |
3988 up a certain minimum amount of data so as to get a less | |
3989 random result when doing subprocess detection. */ | |
3990 detect_coding_type (data->st, src, n); | |
3991 data->actual = detected_coding_system (data->st); | |
4100 | 3992 /* kludge to prevent infinite recursion */ |
3993 if (XCODING_SYSTEM(data->actual)->methods->enumtype == undecided_coding_system) | |
3994 data->actual = Fget_coding_system (Qbinary); | |
771 | 3995 } |
3996 } | |
3997 /* We need to set the detected coding system if we actually have | |
3998 such a coding system but didn't before. That is the case | |
3999 either when we just detected it in the previous code or when | |
4000 it was detected during undecided_init_coding_stream(). We | |
4001 can check for that using first_time. */ | |
4002 if (!NILP (data->actual) && (actual_was_nil || first_time)) | |
4003 { | |
4004 /* If the detected coding system doesn't allow for EOL | |
4005 autodetection, try to get the equivalent that does; | |
4006 otherwise, disable EOL detection (overriding whatever | |
4007 may already have been detected). */ | |
4008 if (XCODING_SYSTEM_EOL_TYPE (data->actual) != EOL_AUTODETECT) | |
4009 { | |
4010 if (!NILP (XCODING_SYSTEM_SUBSIDIARY_PARENT (data->actual))) | |
4011 data->actual = | |
4012 XCODING_SYSTEM_SUBSIDIARY_PARENT (data->actual); | |
4013 else if (data->c.lstream_count == 3) | |
4014 set_coding_stream_coding_system | |
4015 (XLSTREAM (data->c.lstreams[1]), | |
4016 Fget_coding_system (Qidentity)); | |
4017 } | |
4018 set_coding_stream_coding_system | |
4019 (XLSTREAM (data->c.lstreams[0]), data->actual); | |
4020 } | |
4021 } | |
4022 | |
4023 if (Lstream_write (XLSTREAM (data->c.lstreams[0]), src, n) < 0) | |
4024 return -1; | |
4025 return n; | |
4026 } | |
4027 else | |
4028 return no_conversion_convert (str, src, dst, n); | |
4029 } | |
4030 | |
4031 static Lisp_Object | |
4032 undecided_canonicalize_after_coding (struct coding_stream *str) | |
4033 { | |
4034 struct undecided_coding_stream *data = | |
4035 CODING_STREAM_TYPE_DATA (str, undecided); | |
4036 Lisp_Object ret, eolret; | |
4037 | |
4038 if (str->direction == CODING_ENCODE) | |
4039 return str->codesys; | |
4040 | |
4041 if (!data->c.initted) | |
4647
e4ed58cb0e5b
Fix bugs with #'find-file, 0-length files, & coding-system-for-read specified.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
4042 return str->codesys; |
771 | 4043 |
4044 ret = coding_stream_canonicalize_after_coding | |
4045 (XLSTREAM (data->c.lstreams[0])); | |
4046 if (NILP (ret)) | |
4647
e4ed58cb0e5b
Fix bugs with #'find-file, 0-length files, & coding-system-for-read specified.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
4047 ret = str->codesys; |
771 | 4048 if (XCODING_SYSTEM_EOL_TYPE (ret) != EOL_AUTODETECT) |
4049 return ret; | |
4050 eolret = coding_stream_canonicalize_after_coding | |
4051 (XLSTREAM (data->c.lstreams[1])); | |
4052 if (!EQ (XCODING_SYSTEM_TYPE (eolret), Qconvert_eol)) | |
4053 return ret; | |
4054 return | |
4055 Fsubsidiary_coding_system (ret, Fcoding_system_property (eolret, | |
4056 Qsubtype)); | |
4057 } | |
4058 | |
4059 | |
4060 /************************************************************************/ | |
4061 /* Lisp interface: Coding category functions and detection */ | |
4062 /************************************************************************/ | |
4063 | |
4064 DEFUN ("coding-category-list", Fcoding_category_list, 0, 0, 0, /* | |
4065 Return a list of all recognized coding categories. | |
4066 */ | |
4067 ()) | |
4068 { | |
4069 int i; | |
4070 Lisp_Object list = Qnil; | |
4071 | |
4072 for (i = 0; i < coding_detector_count; i++) | |
4073 { | |
4074 detector_category_dynarr *cats = | |
4075 Dynarr_at (all_coding_detectors, i).cats; | |
4076 int j; | |
4077 | |
4078 for (j = 0; j < Dynarr_length (cats); j++) | |
4079 list = Fcons (Dynarr_at (cats, j).sym, list); | |
4080 } | |
4081 | |
4082 return Fnreverse (list); | |
4083 } | |
4084 | |
4085 DEFUN ("set-coding-priority-list", Fset_coding_priority_list, 1, 1, 0, /* | |
4086 Change the priority order of the coding categories. | |
4087 LIST should be list of coding categories, in descending order of | |
4088 priority. Unspecified coding categories will be lower in priority | |
4089 than all specified ones, in the same relative order they were in | |
4090 previously. | |
4091 */ | |
4092 (list)) | |
4093 { | |
4094 int *category_to_priority = | |
4095 alloca_array (int, coding_detector_category_count); | |
4096 int i, j; | |
4097 | |
4098 /* First generate a list that maps coding categories to priorities. */ | |
4099 | |
4100 for (i = 0; i < coding_detector_category_count; i++) | |
4101 category_to_priority[i] = -1; | |
4102 | |
4103 /* Highest priority comes from the specified list. */ | |
4104 i = 0; | |
2367 | 4105 { |
4106 EXTERNAL_LIST_LOOP_2 (elt, list) | |
4107 { | |
4108 int cat = coding_category_symbol_to_id (elt); | |
4109 | |
4110 if (category_to_priority[cat] >= 0) | |
4111 sferror ("Duplicate coding category in list", elt); | |
4112 category_to_priority[cat] = i++; | |
4113 } | |
4114 } | |
771 | 4115 |
4116 /* Now go through the existing categories by priority to retrieve | |
4117 the categories not yet specified and preserve their priority | |
4118 order. */ | |
4119 for (j = 0; j < coding_detector_category_count; j++) | |
4120 { | |
4121 int cat = coding_category_by_priority[j]; | |
4122 if (category_to_priority[cat] < 0) | |
4123 category_to_priority[cat] = i++; | |
4124 } | |
4125 | |
4126 /* Now we need to construct the inverse of the mapping we just | |
4127 constructed. */ | |
4128 | |
4129 for (i = 0; i < coding_detector_category_count; i++) | |
4130 coding_category_by_priority[category_to_priority[i]] = i; | |
4131 | |
4132 /* Phew! That was confusing. */ | |
4133 return Qnil; | |
4134 } | |
4135 | |
4136 DEFUN ("coding-priority-list", Fcoding_priority_list, 0, 0, 0, /* | |
4137 Return a list of coding categories in descending order of priority. | |
4138 */ | |
4139 ()) | |
4140 { | |
4141 int i; | |
4142 Lisp_Object list = Qnil; | |
4143 | |
4144 for (i = 0; i < coding_detector_category_count; i++) | |
4145 list = | |
4146 Fcons (coding_category_id_to_symbol (coding_category_by_priority[i]), | |
4147 list); | |
4148 return Fnreverse (list); | |
4149 } | |
4150 | |
4151 DEFUN ("set-coding-category-system", Fset_coding_category_system, 2, 2, 0, /* | |
4152 Change the coding system associated with a coding category. | |
4153 */ | |
4154 (coding_category, coding_system)) | |
4155 { | |
4156 coding_category_system[coding_category_symbol_to_id (coding_category)] = | |
4157 Fget_coding_system (coding_system); | |
4158 return Qnil; | |
4159 } | |
4160 | |
4161 DEFUN ("coding-category-system", Fcoding_category_system, 1, 1, 0, /* | |
4162 Return the coding system associated with a coding category. | |
4163 */ | |
4164 (coding_category)) | |
4165 { | |
4166 Lisp_Object sys = | |
4167 coding_category_system[coding_category_symbol_to_id (coding_category)]; | |
4168 | |
4169 if (!NILP (sys)) | |
4170 return XCODING_SYSTEM_NAME (sys); | |
4171 return Qnil; | |
4172 } | |
4173 | |
800 | 4174 /* Detect the encoding of STREAM. Assumes stream is at the begnning and will |
4175 read through to the end of STREAM, leaving it there but open. */ | |
4176 | |
771 | 4177 Lisp_Object |
4178 detect_coding_stream (Lisp_Object stream) | |
4179 { | |
4180 Lisp_Object val = Qnil; | |
4181 struct gcpro gcpro1, gcpro2, gcpro3; | |
4182 UExtbyte random_buffer[65536]; | |
4183 Lisp_Object binary_instream = | |
4184 make_coding_input_stream | |
4185 (XLSTREAM (stream), Qbinary, | |
814 | 4186 CODING_ENCODE, LSTREAM_FL_NO_CLOSE_OTHER); |
771 | 4187 Lisp_Object decstream = |
4188 make_coding_input_stream | |
4189 (XLSTREAM (binary_instream), | |
800 | 4190 Qundecided, CODING_DECODE, 0); |
771 | 4191 Lstream *decstr = XLSTREAM (decstream); |
4192 | |
4193 GCPRO3 (decstream, stream, binary_instream); | |
4194 /* Read and discard all data; detection happens as a side effect of this, | |
4195 and we examine what was detected afterwards. */ | |
4196 while (Lstream_read (decstr, random_buffer, sizeof (random_buffer)) > 0) | |
4197 ; | |
4198 | |
4199 val = coding_stream_detected_coding_system (decstr); | |
4200 Lstream_close (decstr); | |
4201 Lstream_delete (decstr); | |
4202 Lstream_delete (XLSTREAM (binary_instream)); | |
4203 UNGCPRO; | |
4204 return val; | |
428 | 4205 } |
4206 | |
4207 DEFUN ("detect-coding-region", Fdetect_coding_region, 2, 3, 0, /* | |
4208 Detect coding system of the text in the region between START and END. | |
444 | 4209 Return a list of possible coding systems ordered by priority. |
3025 | 4210 If only ASCII characters are found, return `undecided' or one of |
428 | 4211 its subsidiary coding systems according to a detected end-of-line |
4212 type. Optional arg BUFFER defaults to the current buffer. | |
4213 */ | |
4214 (start, end, buffer)) | |
4215 { | |
4216 Lisp_Object val = Qnil; | |
4217 struct buffer *buf = decode_buffer (buffer, 0); | |
665 | 4218 Charbpos b, e; |
771 | 4219 Lisp_Object lb_instream; |
428 | 4220 |
4221 get_buffer_range_char (buf, start, end, &b, &e, 0); | |
4222 lb_instream = make_lisp_buffer_input_stream (buf, b, e, 0); | |
771 | 4223 |
4224 val = detect_coding_stream (lb_instream); | |
4225 Lstream_delete (XLSTREAM (lb_instream)); | |
428 | 4226 return val; |
4227 } | |
4228 | |
4229 | |
771 | 4230 |
4231 #ifdef DEBUG_XEMACS | |
4232 | |
428 | 4233 /************************************************************************/ |
771 | 4234 /* Internal methods */ |
4235 /************************************************************************/ | |
4236 | |
4237 /* Raw (internally-formatted) data. */ | |
4238 DEFINE_CODING_SYSTEM_TYPE (internal); | |
428 | 4239 |
665 | 4240 static Bytecount |
2286 | 4241 internal_convert (struct coding_stream *UNUSED (str), const UExtbyte *src, |
771 | 4242 unsigned_char_dynarr *dst, Bytecount n) |
4243 { | |
4244 Bytecount orign = n; | |
4245 Dynarr_add_many (dst, src, n); | |
4246 return orign; | |
4247 } | |
4248 | |
4249 #endif /* DEBUG_XEMACS */ | |
4250 | |
4251 | |
4252 | |
4253 #ifdef HAVE_ZLIB | |
4254 | |
4255 /************************************************************************/ | |
4256 /* Gzip methods */ | |
4257 /************************************************************************/ | |
4258 | |
4259 struct gzip_coding_system | |
428 | 4260 { |
771 | 4261 int level; /* 0 through 9, or -1 for default */ |
4262 }; | |
4263 | |
4264 #define CODING_SYSTEM_GZIP_LEVEL(codesys) \ | |
4265 (CODING_SYSTEM_TYPE_DATA (codesys, gzip)->level) | |
4266 #define XCODING_SYSTEM_GZIP_LEVEL(codesys) \ | |
4267 (XCODING_SYSTEM_TYPE_DATA (codesys, gzip)->level) | |
4268 | |
4269 struct gzip_coding_stream | |
428 | 4270 { |
771 | 4271 z_stream stream; |
4272 int stream_initted; | |
4273 int reached_eof; /* #### this should be handled by the caller, once we | |
4274 return LSTREAM_EOF */ | |
4275 }; | |
4276 | |
1204 | 4277 static const struct memory_description |
771 | 4278 gzip_coding_system_description[] = { |
4279 { XD_END } | |
4280 }; | |
4281 | |
1204 | 4282 DEFINE_CODING_SYSTEM_TYPE_WITH_DATA (gzip); |
4283 | |
771 | 4284 enum source_sink_type |
4285 gzip_conversion_end_type (Lisp_Object codesys) | |
4286 { | |
4287 return DECODES_BYTE_TO_BYTE; | |
428 | 4288 } |
4289 | |
4290 static void | |
771 | 4291 gzip_init (Lisp_Object codesys) |
4292 { | |
4293 struct gzip_coding_system *data = XCODING_SYSTEM_TYPE_DATA (codesys, gzip); | |
4294 data->level = -1; | |
4295 } | |
4296 | |
4297 static void | |
4298 gzip_print (Lisp_Object cs, Lisp_Object printcharfun, int escapeflag) | |
428 | 4299 { |
771 | 4300 struct gzip_coding_system *data = XCODING_SYSTEM_TYPE_DATA (cs, gzip); |
4301 | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4302 write_ascstring (printcharfun, "("); |
771 | 4303 if (data->level == -1) |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4304 write_ascstring (printcharfun, "default"); |
771 | 4305 else |
4306 print_internal (make_int (data->level), printcharfun, 0); | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4307 write_ascstring (printcharfun, ")"); |
428 | 4308 } |
4309 | |
4310 static int | |
771 | 4311 gzip_putprop (Lisp_Object codesys, Lisp_Object key, Lisp_Object value) |
428 | 4312 { |
771 | 4313 struct gzip_coding_system *data = XCODING_SYSTEM_TYPE_DATA (codesys, gzip); |
4314 | |
4315 if (EQ (key, Qlevel)) | |
428 | 4316 { |
771 | 4317 if (EQ (value, Qdefault)) |
4318 data->level = -1; | |
4319 else | |
428 | 4320 { |
771 | 4321 CHECK_INT (value); |
4322 check_int_range (XINT (value), 0, 9); | |
4323 data->level = XINT (value); | |
428 | 4324 } |
4325 } | |
4326 else | |
771 | 4327 return 0; |
4328 return 1; | |
428 | 4329 } |
4330 | |
4331 static Lisp_Object | |
771 | 4332 gzip_getprop (Lisp_Object coding_system, Lisp_Object prop) |
428 | 4333 { |
771 | 4334 struct gzip_coding_system *data = |
4335 XCODING_SYSTEM_TYPE_DATA (coding_system, gzip); | |
4336 | |
4337 if (EQ (prop, Qlevel)) | |
428 | 4338 { |
771 | 4339 if (data->level == -1) |
4340 return Qdefault; | |
4341 return make_int (data->level); | |
428 | 4342 } |
771 | 4343 |
4344 return Qunbound; | |
428 | 4345 } |
4346 | |
4347 static void | |
771 | 4348 gzip_init_coding_stream (struct coding_stream *str) |
428 | 4349 { |
771 | 4350 struct gzip_coding_stream *data = CODING_STREAM_TYPE_DATA (str, gzip); |
4351 if (data->stream_initted) | |
428 | 4352 { |
771 | 4353 if (str->direction == CODING_DECODE) |
4354 inflateEnd (&data->stream); | |
4355 else | |
4356 deflateEnd (&data->stream); | |
4357 data->stream_initted = 0; | |
428 | 4358 } |
771 | 4359 data->reached_eof = 0; |
428 | 4360 } |
4361 | |
4362 static void | |
771 | 4363 gzip_rewind_coding_stream (struct coding_stream *str) |
428 | 4364 { |
771 | 4365 gzip_init_coding_stream (str); |
428 | 4366 } |
4367 | |
771 | 4368 static Bytecount |
4369 gzip_convert (struct coding_stream *str, | |
4370 const UExtbyte *src, | |
4371 unsigned_char_dynarr *dst, Bytecount n) | |
428 | 4372 { |
771 | 4373 struct gzip_coding_stream *data = CODING_STREAM_TYPE_DATA (str, gzip); |
4374 int zerr; | |
4375 if (str->direction == CODING_DECODE) | |
428 | 4376 { |
771 | 4377 if (data->reached_eof) |
4378 return n; /* eat the data */ | |
4379 | |
4380 if (!data->stream_initted) | |
428 | 4381 { |
771 | 4382 xzero (data->stream); |
4383 if (inflateInit (&data->stream) != Z_OK) | |
4384 return LSTREAM_ERROR; | |
4385 data->stream_initted = 1; | |
428 | 4386 } |
771 | 4387 |
4388 data->stream.next_in = (Bytef *) src; | |
4389 data->stream.avail_in = n; | |
4390 | |
4391 /* Normally we stop when we've fed all data to the decompressor; but | |
4392 if we're at the end of the input, and the decompressor hasn't | |
4393 reported EOF, we need to keep going, as there might be more output | |
4394 to generate. Z_OK from the decompressor means input was processed | |
4395 or output was generated; if neither, we break out of the loop. | |
4396 Other return values are: | |
4397 | |
4398 Z_STREAM_END EOF from decompressor | |
4399 Z_DATA_ERROR Corrupted data | |
4400 Z_BUF_ERROR No progress possible (this should happen if | |
4401 we try to feed it an incomplete file) | |
4402 Z_MEM_ERROR Out of memory | |
4403 Z_STREAM_ERROR (should never happen) | |
4404 Z_NEED_DICT (#### when will this happen?) | |
4405 */ | |
4406 while (data->stream.avail_in > 0 || str->eof) | |
4407 { | |
4408 /* Reserve an output buffer of the same size as the input buffer; | |
4409 if that's not enough, we keep reserving the same size. */ | |
4410 Bytecount reserved = n; | |
4411 Dynarr_add_many (dst, 0, reserved); | |
4412 /* Careful here! Don't retrieve the pointer until after | |
4413 reserving the space, or it might be bogus */ | |
4414 data->stream.next_out = | |
4415 Dynarr_atp (dst, Dynarr_length (dst) - reserved); | |
4416 data->stream.avail_out = reserved; | |
4417 zerr = inflate (&data->stream, Z_NO_FLUSH); | |
4418 /* Lop off the unused portion */ | |
5038 | 4419 Dynarr_set_lengthr (dst, Dynarr_length (dst) - data->stream.avail_out); |
771 | 4420 if (zerr != Z_OK) |
4421 break; | |
4422 } | |
4423 | |
4424 if (zerr == Z_STREAM_END) | |
4425 data->reached_eof = 1; | |
4426 | |
4427 if ((Bytecount) data->stream.avail_in < n) | |
4428 return n - data->stream.avail_in; | |
4429 | |
4430 if (zerr == Z_OK || zerr == Z_STREAM_END) | |
4431 return 0; | |
4432 | |
4433 return LSTREAM_ERROR; | |
428 | 4434 } |
4435 else | |
4436 { | |
771 | 4437 if (!data->stream_initted) |
4438 { | |
4439 int level = XCODING_SYSTEM_GZIP_LEVEL (str->codesys); | |
4440 xzero (data->stream); | |
4441 if (deflateInit (&data->stream, | |
4442 level == -1 ? Z_DEFAULT_COMPRESSION : level) != | |
4443 Z_OK) | |
4444 return LSTREAM_ERROR; | |
4445 data->stream_initted = 1; | |
428 | 4446 } |
771 | 4447 |
4448 data->stream.next_in = (Bytef *) src; | |
4449 data->stream.avail_in = n; | |
4450 | |
4451 /* Normally we stop when we've fed all data to the compressor; but if | |
4452 we're at the end of the input, and the compressor hasn't reported | |
4453 EOF, we need to keep going, as there might be more output to | |
4454 generate. (To signal EOF on our end, we set the FLUSH parameter | |
4455 to Z_FINISH; when all data is output, Z_STREAM_END will be | |
4456 returned.) Z_OK from the compressor means input was processed or | |
4457 output was generated; if neither, we break out of the loop. Other | |
4458 return values are: | |
4459 | |
4460 Z_STREAM_END EOF from compressor | |
4461 Z_BUF_ERROR No progress possible (should never happen) | |
4462 Z_STREAM_ERROR (should never happen) | |
4463 */ | |
4464 while (data->stream.avail_in > 0 || str->eof) | |
4465 { | |
4466 /* Reserve an output buffer of the same size as the input buffer; | |
4467 if that's not enough, we keep reserving the same size. */ | |
4468 Bytecount reserved = n; | |
4469 Dynarr_add_many (dst, 0, reserved); | |
4470 /* Careful here! Don't retrieve the pointer until after | |
4471 reserving the space, or it might be bogus */ | |
4472 data->stream.next_out = | |
4473 Dynarr_atp (dst, Dynarr_length (dst) - reserved); | |
4474 data->stream.avail_out = reserved; | |
4475 zerr = | |
4476 deflate (&data->stream, | |
4477 str->eof ? Z_FINISH : Z_NO_FLUSH); | |
4478 /* Lop off the unused portion */ | |
5038 | 4479 Dynarr_set_lengthr (dst, Dynarr_length (dst) - data->stream.avail_out); |
771 | 4480 if (zerr != Z_OK) |
4481 break; | |
4482 } | |
4483 | |
4484 if ((Bytecount) data->stream.avail_in < n) | |
4485 return n - data->stream.avail_in; | |
4486 | |
4487 if (zerr == Z_OK || zerr == Z_STREAM_END) | |
4488 return 0; | |
4489 | |
4490 return LSTREAM_ERROR; | |
428 | 4491 } |
4492 } | |
4493 | |
771 | 4494 #endif /* HAVE_ZLIB */ |
428 | 4495 |
4496 | |
4497 /************************************************************************/ | |
4498 /* Initialization */ | |
4499 /************************************************************************/ | |
4500 | |
4501 void | |
4502 syms_of_file_coding (void) | |
4503 { | |
5117
3742ea8250b5
Checking in final CVS version of workspace 'ben-lisp-object'
Ben Wing <ben@xemacs.org>
parents:
3025
diff
changeset
|
4504 INIT_LISP_OBJECT (coding_system); |
442 | 4505 |
771 | 4506 DEFSUBR (Fvalid_coding_system_type_p); |
4507 DEFSUBR (Fcoding_system_type_list); | |
428 | 4508 DEFSUBR (Fcoding_system_p); |
4303 | 4509 DEFSUBR (Fautoload_coding_system); |
428 | 4510 DEFSUBR (Ffind_coding_system); |
4511 DEFSUBR (Fget_coding_system); | |
4512 DEFSUBR (Fcoding_system_list); | |
4513 DEFSUBR (Fcoding_system_name); | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4514 DEFSUBR (Fmake_coding_system_internal); |
428 | 4515 DEFSUBR (Fcopy_coding_system); |
440 | 4516 DEFSUBR (Fcoding_system_canonical_name_p); |
4517 DEFSUBR (Fcoding_system_alias_p); | |
4518 DEFSUBR (Fcoding_system_aliasee); | |
428 | 4519 DEFSUBR (Fdefine_coding_system_alias); |
4520 DEFSUBR (Fsubsidiary_coding_system); | |
771 | 4521 DEFSUBR (Fcoding_system_base); |
4522 DEFSUBR (Fcoding_system_used_for_io); | |
428 | 4523 |
4524 DEFSUBR (Fcoding_system_type); | |
771 | 4525 DEFSUBR (Fcoding_system_description); |
428 | 4526 DEFSUBR (Fcoding_system_property); |
4527 | |
4528 DEFSUBR (Fcoding_category_list); | |
4529 DEFSUBR (Fset_coding_priority_list); | |
4530 DEFSUBR (Fcoding_priority_list); | |
4531 DEFSUBR (Fset_coding_category_system); | |
4532 DEFSUBR (Fcoding_category_system); | |
4533 | |
4534 DEFSUBR (Fdetect_coding_region); | |
4535 DEFSUBR (Fdecode_coding_region); | |
4536 DEFSUBR (Fencode_coding_region); | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4537 DEFSUBR (Fquery_coding_region); |
563 | 4538 DEFSYMBOL_MULTIWORD_PREDICATE (Qcoding_systemp); |
4539 DEFSYMBOL (Qno_conversion); | |
771 | 4540 DEFSYMBOL (Qconvert_eol); |
4541 DEFSYMBOL (Qconvert_eol_autodetect); | |
4542 DEFSYMBOL (Qconvert_eol_lf); | |
4543 DEFSYMBOL (Qconvert_eol_cr); | |
4544 DEFSYMBOL (Qconvert_eol_crlf); | |
563 | 4545 DEFSYMBOL (Qraw_text); |
771 | 4546 |
563 | 4547 DEFSYMBOL (Qmnemonic); |
4548 DEFSYMBOL (Qeol_type); | |
4549 DEFSYMBOL (Qpost_read_conversion); | |
4550 DEFSYMBOL (Qpre_write_conversion); | |
4551 | |
771 | 4552 DEFSYMBOL (Qtranslation_table_for_decode); |
4553 DEFSYMBOL (Qtranslation_table_for_encode); | |
4554 DEFSYMBOL (Qsafe_chars); | |
4555 DEFSYMBOL (Qsafe_charsets); | |
4556 DEFSYMBOL (Qmime_charset); | |
4557 DEFSYMBOL (Qvalid_codes); | |
4558 | |
563 | 4559 DEFSYMBOL (Qcr); |
4560 DEFSYMBOL (Qlf); | |
4561 DEFSYMBOL (Qcrlf); | |
4562 DEFSYMBOL (Qeol_cr); | |
4563 DEFSYMBOL (Qeol_lf); | |
4564 DEFSYMBOL (Qeol_crlf); | |
4565 DEFSYMBOL (Qencode); | |
4566 DEFSYMBOL (Qdecode); | |
428 | 4567 |
771 | 4568 DEFSYMBOL (Qnear_certainty); |
4569 DEFSYMBOL (Qquite_probable); | |
4570 DEFSYMBOL (Qsomewhat_likely); | |
1494 | 4571 DEFSYMBOL (Qslightly_likely); |
771 | 4572 DEFSYMBOL (Qas_likely_as_unlikely); |
4573 DEFSYMBOL (Qsomewhat_unlikely); | |
4574 DEFSYMBOL (Qquite_improbable); | |
4575 DEFSYMBOL (Qnearly_impossible); | |
4576 | |
4577 DEFSYMBOL (Qdo_eol); | |
4578 DEFSYMBOL (Qdo_coding); | |
4579 | |
4580 DEFSYMBOL (Qcanonicalize_after_coding); | |
4581 | |
4303 | 4582 DEFSYMBOL (Qposix_charset_to_coding_system_hash); |
4583 | |
771 | 4584 DEFSYMBOL (Qescape_quoted); |
4585 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4586 DEFSYMBOL (Qquery_coding_warning_face); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4587 DEFSYMBOL (Qaliases); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4588 DEFSYMBOL (Qcharset_skip_chars_string); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4589 |
771 | 4590 #ifdef HAVE_ZLIB |
4591 DEFSYMBOL (Qgzip); | |
4592 #endif | |
4593 | |
428 | 4594 } |
4595 | |
4596 void | |
4597 lstream_type_create_file_coding (void) | |
4598 { | |
771 | 4599 LSTREAM_HAS_METHOD (coding, reader); |
4600 LSTREAM_HAS_METHOD (coding, writer); | |
4601 LSTREAM_HAS_METHOD (coding, rewinder); | |
4602 LSTREAM_HAS_METHOD (coding, seekable_p); | |
4603 LSTREAM_HAS_METHOD (coding, marker); | |
4604 LSTREAM_HAS_METHOD (coding, flusher); | |
4605 LSTREAM_HAS_METHOD (coding, closer); | |
4606 LSTREAM_HAS_METHOD (coding, finalizer); | |
4607 } | |
4608 | |
4609 void | |
4610 coding_system_type_create (void) | |
4611 { | |
4612 int i; | |
4613 | |
4614 staticpro (&Vcoding_system_hash_table); | |
4615 Vcoding_system_hash_table = | |
4616 make_lisp_hash_table (50, HASH_TABLE_NON_WEAK, HASH_TABLE_EQ); | |
4617 | |
4618 the_coding_system_type_entry_dynarr = Dynarr_new (coding_system_type_entry); | |
2367 | 4619 dump_add_root_block_ptr (&the_coding_system_type_entry_dynarr, |
771 | 4620 &csted_description); |
4621 | |
4622 Vcoding_system_type_list = Qnil; | |
4623 staticpro (&Vcoding_system_type_list); | |
4624 | |
4625 /* Initialize to something reasonable ... */ | |
4626 for (i = 0; i < MAX_DETECTOR_CATEGORIES; i++) | |
4627 { | |
4628 coding_category_system[i] = Qnil; | |
1204 | 4629 dump_add_root_lisp_object (&coding_category_system[i]); |
771 | 4630 coding_category_by_priority[i] = i; |
4631 } | |
4632 | |
4633 dump_add_opaque (coding_category_by_priority, | |
4634 sizeof (coding_category_by_priority)); | |
4635 | |
4636 all_coding_detectors = Dynarr_new2 (detector_dynarr, struct detector); | |
2367 | 4637 dump_add_root_block_ptr (&all_coding_detectors, |
771 | 4638 &detector_dynarr_description); |
4639 | |
4640 dump_add_opaque_int (&coding_system_tick); | |
4641 dump_add_opaque_int (&coding_detector_count); | |
4642 dump_add_opaque_int (&coding_detector_category_count); | |
4643 | |
4644 INITIALIZE_CODING_SYSTEM_TYPE (no_conversion, | |
4645 "no-conversion-coding-system-p"); | |
4646 CODING_SYSTEM_HAS_METHOD (no_conversion, convert); | |
4647 | |
4648 INITIALIZE_DETECTOR (no_conversion); | |
4649 DETECTOR_HAS_METHOD (no_conversion, detect); | |
4650 INITIALIZE_DETECTOR_CATEGORY (no_conversion, no_conversion); | |
4651 | |
4652 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA (convert_eol, | |
4653 "convert-eol-coding-system-p"); | |
4654 CODING_SYSTEM_HAS_METHOD (convert_eol, print); | |
4655 CODING_SYSTEM_HAS_METHOD (convert_eol, convert); | |
4656 CODING_SYSTEM_HAS_METHOD (convert_eol, getprop); | |
4657 CODING_SYSTEM_HAS_METHOD (convert_eol, putprop); | |
4658 CODING_SYSTEM_HAS_METHOD (convert_eol, conversion_end_type); | |
4659 CODING_SYSTEM_HAS_METHOD (convert_eol, canonicalize_after_coding); | |
4660 CODING_SYSTEM_HAS_METHOD (convert_eol, init_coding_stream); | |
4661 | |
4662 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA (undecided, | |
4663 "undecided-coding-system-p"); | |
4664 CODING_SYSTEM_HAS_METHOD (undecided, init); | |
4665 CODING_SYSTEM_HAS_METHOD (undecided, mark); | |
4666 CODING_SYSTEM_HAS_METHOD (undecided, print); | |
4667 CODING_SYSTEM_HAS_METHOD (undecided, convert); | |
4668 CODING_SYSTEM_HAS_METHOD (undecided, putprop); | |
4669 CODING_SYSTEM_HAS_METHOD (undecided, getprop); | |
4670 CODING_SYSTEM_HAS_METHOD (undecided, init_coding_stream); | |
4671 CODING_SYSTEM_HAS_METHOD (undecided, rewind_coding_stream); | |
4672 CODING_SYSTEM_HAS_METHOD (undecided, finalize_coding_stream); | |
4673 CODING_SYSTEM_HAS_METHOD (undecided, mark_coding_stream); | |
4674 CODING_SYSTEM_HAS_METHOD (undecided, canonicalize); | |
4675 CODING_SYSTEM_HAS_METHOD (undecided, canonicalize_after_coding); | |
4676 | |
4677 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA (chain, "chain-coding-system-p"); | |
4678 | |
4679 CODING_SYSTEM_HAS_METHOD (chain, print); | |
4680 CODING_SYSTEM_HAS_METHOD (chain, canonicalize); | |
4681 CODING_SYSTEM_HAS_METHOD (chain, init); | |
4682 CODING_SYSTEM_HAS_METHOD (chain, mark); | |
4683 CODING_SYSTEM_HAS_METHOD (chain, mark_coding_stream); | |
4684 CODING_SYSTEM_HAS_METHOD (chain, convert); | |
4685 CODING_SYSTEM_HAS_METHOD (chain, rewind_coding_stream); | |
4686 CODING_SYSTEM_HAS_METHOD (chain, finalize_coding_stream); | |
4687 CODING_SYSTEM_HAS_METHOD (chain, finalize); | |
4688 CODING_SYSTEM_HAS_METHOD (chain, putprop); | |
4689 CODING_SYSTEM_HAS_METHOD (chain, getprop); | |
4690 CODING_SYSTEM_HAS_METHOD (chain, conversion_end_type); | |
4691 CODING_SYSTEM_HAS_METHOD (chain, canonicalize_after_coding); | |
4692 | |
4693 #ifdef DEBUG_XEMACS | |
4694 INITIALIZE_CODING_SYSTEM_TYPE (internal, "internal-coding-system-p"); | |
4695 CODING_SYSTEM_HAS_METHOD (internal, convert); | |
4696 #endif | |
4697 | |
4698 #ifdef HAVE_ZLIB | |
4699 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA (gzip, "gzip-coding-system-p"); | |
4700 CODING_SYSTEM_HAS_METHOD (gzip, conversion_end_type); | |
4701 CODING_SYSTEM_HAS_METHOD (gzip, convert); | |
4702 CODING_SYSTEM_HAS_METHOD (gzip, init); | |
4703 CODING_SYSTEM_HAS_METHOD (gzip, print); | |
4704 CODING_SYSTEM_HAS_METHOD (gzip, init_coding_stream); | |
4705 CODING_SYSTEM_HAS_METHOD (gzip, rewind_coding_stream); | |
4706 CODING_SYSTEM_HAS_METHOD (gzip, putprop); | |
4707 CODING_SYSTEM_HAS_METHOD (gzip, getprop); | |
4708 #endif | |
4709 } | |
4710 | |
4711 void | |
4712 reinit_coding_system_type_create (void) | |
4713 { | |
4714 REINITIALIZE_CODING_SYSTEM_TYPE (no_conversion); | |
4715 REINITIALIZE_CODING_SYSTEM_TYPE (convert_eol); | |
4716 REINITIALIZE_CODING_SYSTEM_TYPE (undecided); | |
4717 REINITIALIZE_CODING_SYSTEM_TYPE (chain); | |
4718 #if 0 | |
4719 REINITIALIZE_CODING_SYSTEM_TYPE (text_file_wrapper); | |
4720 #endif /* 0 */ | |
4721 #ifdef DEBUG_XEMACS | |
4722 REINITIALIZE_CODING_SYSTEM_TYPE (internal); | |
4723 #endif | |
4724 #ifdef HAVE_ZLIB | |
4725 REINITIALIZE_CODING_SYSTEM_TYPE (gzip); | |
4726 #endif | |
4727 } | |
4728 | |
4729 void | |
4730 reinit_vars_of_file_coding (void) | |
4731 { | |
428 | 4732 } |
4733 | |
4734 void | |
4735 vars_of_file_coding (void) | |
4736 { | |
771 | 4737 /* We always have file-coding support */ |
428 | 4738 Fprovide (intern ("file-coding")); |
4739 | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4740 QScoding_system_cookie = build_ascstring (";;;###coding system: "); |
1347 | 4741 staticpro (&QScoding_system_cookie); |
4742 | |
1242 | 4743 #ifdef HAVE_DEFAULT_EOL_DETECTION |
2297 | 4744 /* #### Find a more appropriate place for this comment. |
4745 WARNING: The existing categories are intimately tied to the function | |
1242 | 4746 `coding-system-category' in coding.el. If you change a category, or |
4747 change the layout of any coding system associated with a category, you | |
4748 need to check that function and make sure it's written properly. */ | |
4749 | |
4750 Fprovide (intern ("unix-default-eol-detection")); | |
4751 #endif | |
4752 | |
428 | 4753 DEFVAR_LISP ("keyboard-coding-system", &Vkeyboard_coding_system /* |
3142 | 4754 Default coding system used for TTY and X11 keyboard input. |
4755 Under X11, used only to interpet the character for a key event when that | |
4756 event has a KeySym of NoSymbol but does have an associated string keysym, | |
4757 something that's seen with input methods. | |
4758 | |
4759 If you need to set these things to different coding systems, call the | |
4760 function `set-console-tty-coding-system' for the TTY and use this variable | |
4761 for X11. | |
428 | 4762 */ ); |
4763 Vkeyboard_coding_system = Qnil; | |
4764 | |
4765 DEFVAR_LISP ("terminal-coding-system", &Vterminal_coding_system /* | |
4766 Coding system used for TTY display output. | |
4767 Not used under a windowing system. | |
4768 */ ); | |
4769 Vterminal_coding_system = Qnil; | |
4770 | |
4771 DEFVAR_LISP ("coding-system-for-read", &Vcoding_system_for_read /* | |
440 | 4772 Overriding coding system used when reading from a file or process. |
4773 You should bind this variable with `let', but do not set it globally. | |
4774 If this is non-nil, it specifies the coding system that will be used | |
4775 to decode input on read operations, such as from a file or process. | |
4776 It overrides `buffer-file-coding-system-for-read', | |
428 | 4777 `insert-file-contents-pre-hook', etc. Use those variables instead of |
440 | 4778 this one for permanent changes to the environment. */ ); |
428 | 4779 Vcoding_system_for_read = Qnil; |
4780 | |
4781 DEFVAR_LISP ("coding-system-for-write", | |
4782 &Vcoding_system_for_write /* | |
440 | 4783 Overriding coding system used when writing to a file or process. |
4784 You should bind this variable with `let', but do not set it globally. | |
4785 If this is non-nil, it specifies the coding system that will be used | |
4786 to encode output for write operations, such as to a file or process. | |
4787 It overrides `buffer-file-coding-system', `write-region-pre-hook', etc. | |
4788 Use those variables instead of this one for permanent changes to the | |
4789 environment. */ ); | |
428 | 4790 Vcoding_system_for_write = Qnil; |
4791 | |
4792 DEFVAR_LISP ("file-name-coding-system", &Vfile_name_coding_system /* | |
4793 Coding system used to convert pathnames when accessing files. | |
4794 */ ); | |
4795 Vfile_name_coding_system = Qnil; | |
4796 | |
4797 DEFVAR_BOOL ("enable-multibyte-characters", &enable_multibyte_characters /* | |
771 | 4798 Setting this has no effect. It is purely for FSF compatibility. |
428 | 4799 */ ); |
4800 enable_multibyte_characters = 1; | |
771 | 4801 |
4802 Vchain_canonicalize_hash_table = | |
4803 make_lisp_hash_table (50, HASH_TABLE_NON_WEAK, HASH_TABLE_EQUAL); | |
4804 staticpro (&Vchain_canonicalize_hash_table); | |
4805 | |
4806 #ifdef DEBUG_XEMACS | |
4807 DEFVAR_LISP ("debug-coding-detection", &Vdebug_coding_detection /* | |
4808 If non-nil, display debug information about detection operations in progress. | |
4809 Information is displayed on stderr. | |
4810 */ ); | |
4811 Vdebug_coding_detection = Qnil; | |
4812 #endif | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4813 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4814 #ifdef MULE |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4815 Vdefault_query_coding_region_chartab_cache |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4816 = make_lisp_hash_table (25, HASH_TABLE_NON_WEAK, HASH_TABLE_EQUAL); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4817 staticpro (&Vdefault_query_coding_region_chartab_cache); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4818 #endif |
428 | 4819 } |
4820 | |
2297 | 4821 /* #### reformat this for consistent appearance? */ |
4822 | |
428 | 4823 void |
4824 complex_vars_of_file_coding (void) | |
4825 { | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4826 Fmake_coding_system_internal |
771 | 4827 (Qconvert_eol_cr, Qconvert_eol, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4828 build_defer_string ("Convert CR to LF"), |
771 | 4829 nconc2 (list6 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4830 build_defer_string ( |
771 | 4831 "Converts CR (used to mark the end of a line on Macintosh systems) to LF\n" |
4832 "(used internally and under Unix to mark the end of a line)."), | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4833 Qmnemonic, build_ascstring ("CR->LF"), |
771 | 4834 Qsubtype, Qcr), |
4835 /* VERY IMPORTANT! Tell make-coding-system not to generate | |
4836 subsidiaries -- it needs the coding systems we're creating | |
4837 to do so! */ | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4838 list4 (Qeol_type, Qlf, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4839 Qsafe_charsets, Qt))); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4840 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4841 Fmake_coding_system_internal |
771 | 4842 (Qconvert_eol_lf, Qconvert_eol, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4843 build_defer_string ("Convert LF to LF (do nothing)"), |
771 | 4844 nconc2 (list6 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4845 build_defer_string ( |
771 | 4846 "Do nothing."), |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4847 Qmnemonic, build_ascstring ("LF->LF"), |
771 | 4848 Qsubtype, Qlf), |
4849 /* VERY IMPORTANT! Tell make-coding-system not to generate | |
4850 subsidiaries -- it needs the coding systems we're creating | |
4851 to do so! */ | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4852 list4 (Qeol_type, Qlf, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4853 Qsafe_charsets, Qt))); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4854 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4855 Fmake_coding_system_internal |
771 | 4856 (Qconvert_eol_crlf, Qconvert_eol, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4857 build_defer_string ("Convert CRLF to LF"), |
771 | 4858 nconc2 (list6 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4859 build_defer_string ( |
771 | 4860 "Converts CR+LF (used to mark the end of a line on Macintosh systems) to LF\n" |
4861 "(used internally and under Unix to mark the end of a line)."), | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4862 Qmnemonic, build_ascstring ("CRLF->LF"), |
771 | 4863 Qsubtype, Qcrlf), |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4864 |
771 | 4865 /* VERY IMPORTANT! Tell make-coding-system not to generate |
4866 subsidiaries -- it needs the coding systems we're creating | |
4867 to do so! */ | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4868 list4 (Qeol_type, Qlf, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4869 Qsafe_charsets, Qt))); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4870 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4871 Fmake_coding_system_internal |
771 | 4872 (Qconvert_eol_autodetect, Qconvert_eol, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4873 build_defer_string ("Autodetect EOL type"), |
771 | 4874 nconc2 (list6 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4875 build_defer_string ( |
771 | 4876 "Autodetect the end-of-line type."), |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4877 Qmnemonic, build_ascstring ("Auto-EOL"), |
793 | 4878 Qsubtype, Qnil), |
771 | 4879 /* VERY IMPORTANT! Tell make-coding-system not to generate |
4880 subsidiaries -- it needs the coding systems we're creating | |
4881 to do so! */ | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4882 list4 (Qeol_type, Qlf, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4883 Qsafe_charsets, Qt))); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4884 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4885 Fmake_coding_system_internal |
771 | 4886 (Qundecided, Qundecided, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4887 build_defer_string ("Undecided (auto-detect)"), |
771 | 4888 nconc2 (list4 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4889 build_defer_string |
771 | 4890 ("Automatically detects the correct encoding."), |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4891 Qmnemonic, build_ascstring ("Auto")), |
771 | 4892 list6 (Qdo_eol, Qt, Qdo_coding, Qt, |
4893 /* We do EOL detection ourselves so we don't need to be | |
4894 wrapped in an EOL detector. (It doesn't actually hurt, | |
4895 though, I don't think.) */ | |
4896 Qeol_type, Qlf))); | |
4897 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4898 Fmake_coding_system_internal |
771 | 4899 (intern ("undecided-dos"), Qundecided, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4900 build_defer_string ("Undecided (auto-detect) (CRLF)"), |
771 | 4901 nconc2 (list4 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4902 build_defer_string |
771 | 4903 ("Automatically detects the correct encoding; EOL type of CRLF forced."), |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4904 Qmnemonic, build_ascstring ("Auto")), |
771 | 4905 list4 (Qdo_coding, Qt, |
4906 Qeol_type, Qcrlf))); | |
4907 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4908 Fmake_coding_system_internal |
771 | 4909 (intern ("undecided-unix"), Qundecided, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4910 build_defer_string ("Undecided (auto-detect) (LF)"), |
771 | 4911 nconc2 (list4 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4912 build_defer_string |
771 | 4913 ("Automatically detects the correct encoding; EOL type of LF forced."), |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4914 Qmnemonic, build_ascstring ("Auto")), |
771 | 4915 list4 (Qdo_coding, Qt, |
4916 Qeol_type, Qlf))); | |
4917 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4918 Fmake_coding_system_internal |
771 | 4919 (intern ("undecided-mac"), Qundecided, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4920 build_defer_string ("Undecided (auto-detect) (CR)"), |
771 | 4921 nconc2 (list4 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4922 build_defer_string |
771 | 4923 ("Automatically detects the correct encoding; EOL type of CR forced."), |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4924 Qmnemonic, build_ascstring ("Auto")), |
771 | 4925 list4 (Qdo_coding, Qt, |
4926 Qeol_type, Qcr))); | |
4927 | |
428 | 4928 /* Need to create this here or we're really screwed. */ |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4929 Fmake_coding_system_internal |
428 | 4930 (Qraw_text, Qno_conversion, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4931 build_defer_string ("Raw Text"), |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4932 nconc2 (list4 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4933 build_defer_string ("Raw text converts only line-break " |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4934 "codes, and acts otherwise like " |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4935 "`binary'."), |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4936 Qmnemonic, build_ascstring ("Raw")), |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4937 #ifdef MULE |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4938 list2 (Qsafe_charsets, list3 (Vcharset_ascii, Vcharset_control_1, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4939 Vcharset_latin_iso8859_1)))); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4940 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4941 #else |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4942 Qnil)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4943 #endif |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4944 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4945 Fmake_coding_system_internal |
428 | 4946 (Qbinary, Qno_conversion, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4947 build_defer_string ("Binary"), |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4948 nconc2 (list6 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4949 build_defer_string ( |
771 | 4950 "This coding system is as close as it comes to doing no conversion.\n" |
4951 "On input, each byte is converted directly into the character\n" | |
4952 "with the corresponding code -- i.e. from the `ascii', `control-1',\n" | |
4953 "or `latin-1' character sets. On output, these characters are\n" | |
4954 "converted back to the corresponding bytes, and other characters\n" | |
4955 "are converted to the default character, i.e. `~'."), | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4956 Qeol_type, Qlf, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4957 Qmnemonic, build_ascstring ("Binary")), |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4958 #ifdef MULE |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4959 list2 (Qsafe_charsets, list3 (Vcharset_ascii, Vcharset_control_1, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4960 Vcharset_latin_iso8859_1)))); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4961 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4962 #else |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4963 Qnil)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4964 #endif |
428 | 4965 |
771 | 4966 /* Formerly aliased to raw-text! Completely bogus and not even the same |
4967 as FSF Emacs. */ | |
4968 Fdefine_coding_system_alias (Qno_conversion, Qbinary); | |
4969 Fdefine_coding_system_alias (intern ("no-conversion-unix"), | |
4970 intern ("raw-text-unix")); | |
4971 Fdefine_coding_system_alias (intern ("no-conversion-dos"), | |
4972 intern ("raw-text-dos")); | |
4973 Fdefine_coding_system_alias (intern ("no-conversion-mac"), | |
4974 intern ("raw-text-mac")); | |
4975 | |
1318 | 4976 /* These three below will get their defaults set correctly |
4977 in code-init.el. We init them now so we can handle stuff at dump | |
771 | 4978 time before we get to code-init.el. */ |
1318 | 4979 Fdefine_coding_system_alias (Qnative, Qbinary); |
440 | 4980 Fdefine_coding_system_alias (Qterminal, Qbinary); |
4981 Fdefine_coding_system_alias (Qkeyboard, Qbinary); | |
4982 | |
1318 | 4983 Fdefine_coding_system_alias (Qfile_name, Qnative); |
771 | 4984 Fdefine_coding_system_alias (Qidentity, Qconvert_eol_lf); |
4985 | |
428 | 4986 /* Need this for bootstrapping */ |
771 | 4987 coding_category_system[detector_category_no_conversion] = |
428 | 4988 Fget_coding_system (Qraw_text); |
4989 } |