Mercurial > hg > xemacs-beta
annotate src/file-coding.c @ 5043:d0c14ea98592
various frame-geometry fixes
-------------------- ChangeLog entries follow: --------------------
src/ChangeLog addition:
2010-02-15 Ben Wing <ben@xemacs.org>
* EmacsFrame.c:
* EmacsFrame.c (EmacsFrameResize):
* console-msw-impl.h:
* console-msw-impl.h (struct mswindows_frame):
* console-msw-impl.h (FRAME_MSWINDOWS_TARGET_RECT):
* device-tty.c:
* device-tty.c (tty_asynch_device_change):
* event-msw.c:
* event-msw.c (mswindows_wnd_proc):
* faces.c (Fface_list):
* faces.h:
* frame-gtk.c:
* frame-gtk.c (gtk_set_initial_frame_size):
* frame-gtk.c (gtk_set_frame_size):
* frame-msw.c:
* frame-msw.c (mswindows_init_frame_1):
* frame-msw.c (mswindows_set_frame_size):
* frame-msw.c (mswindows_size_frame_internal):
* frame-msw.c (msprinter_init_frame_3):
* frame.c:
* frame.c (enum):
* frame.c (Fmake_frame):
* frame.c (adjust_frame_size):
* frame.c (store_minibuf_frame_prop):
* frame.c (Fframe_property):
* frame.c (Fframe_properties):
* frame.c (Fframe_displayable_pixel_height):
* frame.c (Fframe_displayable_pixel_width):
* frame.c (internal_set_frame_size):
* frame.c (Fset_frame_height):
* frame.c (Fset_frame_pixel_height):
* frame.c (Fset_frame_displayable_pixel_height):
* frame.c (Fset_frame_width):
* frame.c (Fset_frame_pixel_width):
* frame.c (Fset_frame_displayable_pixel_width):
* frame.c (Fset_frame_size):
* frame.c (Fset_frame_pixel_size):
* frame.c (Fset_frame_displayable_pixel_size):
* frame.c (frame_conversion_internal_1):
* frame.c (get_frame_displayable_pixel_size):
* frame.c (change_frame_size_1):
* frame.c (change_frame_size):
* frame.c (generate_title_string):
* frame.h:
* gtk-xemacs.c:
* gtk-xemacs.c (gtk_xemacs_size_request):
* gtk-xemacs.c (gtk_xemacs_size_allocate):
* gtk-xemacs.c (gtk_xemacs_paint):
* gutter.c:
* gutter.c (update_gutter_geometry):
* redisplay.c (end_hold_frame_size_changes):
* redisplay.c (redisplay_frame):
* toolbar.c:
* toolbar.c (update_frame_toolbars_geometry):
* window.c:
* window.c (frame_pixsize_valid_p):
* window.c (check_frame_size):
Various fixes to frame geometry to make it a bit easier to understand
and fix some bugs.
1. IMPORTANT: Some renamings. Will need to be applied carefully to
the carbon repository, in the following order:
-- pixel_to_char_size -> pixel_to_frame_unit_size
-- char_to_pixel_size -> frame_unit_to_pixel_size
-- pixel_to_real_char_size -> pixel_to_char_size
-- char_to_real_pixel_size -> char_to_pixel_size
-- Reverse second and third arguments of change_frame_size() and
change_frame_size_1() to try to make functions consistent in
putting width before height.
-- Eliminate old round_size_to_char, because it didn't really
do anything differently from round_size_to_real_char()
-- round_size_to_real_char -> round_size_to_char; any places that
called the old round_size_to_char should just call the new one.
2. IMPORTANT FOR CARBON: The set_frame_size() method is now passed
sizes in "frame units", like all other frame-sizing functions,
rather than some hacked-up combination of char-cell units and
total pixel size. This only affects window systems that use
"pixelated geometry", and I'm not sure if Carbon is one of them.
MS Windows is pixelated, X and GTK are not. For pixelated-geometry
systems, the size in set_frame_size() is in displayable pixels
rather than total pixels and needs to be converted appropriately;
take a look at the changes made to mswindows_set_frame_size()
method if necessary.
3. Add a big long comment in frame.c describing how frame geometry
works.
4. Remove MS Windows-specific character height and width fields,
duplicative and unused.
5. frame-displayable-pixel-* and set-frame-displayable-pixel-*
didn't use to work on MS Windows, but they do now.
6. In general, clean up the handling of "pixelated geometry" so
that fewer functions have to worry about this. This is really
an abomination that should be removed entirely but that will
have to happen later. Fix some buggy code in
frame_conversion_internal() that happened to "work" because it
was countered by oppositely buggy code in change_frame_size().
7. Clean up some frame-size code in toolbar.c and use functions
already provided in frame.c instead of rolling its own.
8. Fix check_frame_size() in window.c, which formerly didn't take
pixelated geometry into account.
author | Ben Wing <ben@xemacs.org> |
---|---|
date | Mon, 15 Feb 2010 22:14:11 -0600 |
parents | 16112448d484 |
children | 9410323e4b0d b5df3737028a |
rev | line source |
---|---|
771 | 1 /* Text encoding conversion functions; coding-system object. |
2 #### rename me to coding-system.c or coding.c | |
428 | 3 Copyright (C) 1991, 1995 Free Software Foundation, Inc. |
4 Copyright (C) 1995 Sun Microsystems, Inc. | |
3025 | 5 Copyright (C) 2000, 2001, 2002, 2003, 2005 Ben Wing. |
428 | 6 |
7 This file is part of XEmacs. | |
8 | |
9 XEmacs is free software; you can redistribute it and/or modify it | |
10 under the terms of the GNU General Public License as published by the | |
11 Free Software Foundation; either version 2, or (at your option) any | |
12 later version. | |
13 | |
14 XEmacs is distributed in the hope that it will be useful, but WITHOUT | |
15 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or | |
16 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License | |
17 for more details. | |
18 | |
19 You should have received a copy of the GNU General Public License | |
20 along with XEmacs; see the file COPYING. If not, write to | |
21 the Free Software Foundation, Inc., 59 Temple Place - Suite 330, | |
22 Boston, MA 02111-1307, USA. */ | |
23 | |
771 | 24 /* Synched up with: Not in FSF. */ |
25 | |
26 /* Authorship: | |
27 | |
28 Current primary author: Ben Wing <ben@xemacs.org> | |
29 | |
30 Rewritten by Ben Wing <ben@xemacs.org>, based originally on coding.c | |
31 from Mule 2.? but probably does not share one line of code with that | |
32 original source. Rewriting work started around Dec. 1994. or Jan. 1995. | |
33 Proceeded in earnest till Nov. 1995. | |
34 | |
35 Around Feb. 17, 1998, Andy Piper renamed what was then mule-coding.c to | |
36 file-coding.c, with the intention of using it to do end-of-line conversion | |
37 on non-MULE machines (specifically, on Windows machines). He separated | |
38 out the MULE stuff from non-MULE using ifdef's, and searched throughout | |
39 the rest of the source tree looking for coding-system-related code that | |
40 was ifdef MULE but should be ifdef HAVE_CODING_SYSTEMS. | |
41 | |
42 Sept. 4 - 8, 1998, Tomohiko Morioka added the UCS_4 and UTF_8 coding system | |
43 types, providing a primitive means of decoding and encoding externally- | |
44 formatted Unicode/UCS_4 and Unicode/UTF_8 data. | |
45 | |
46 January 25, 2000, Martin Buchholz redid and fleshed out the coding | |
47 system alias handling that was first added in prototype form by | |
48 Hrjove Niksic, April 15, 1999. | |
49 | |
50 April to May 2000, Ben Wing: More major reorganization. Adding features | |
51 needed for MS Windows (multibyte, unicode, unicode-to-multibyte), the | |
52 "chain" coding system for chaining two together, and doing a lot of | |
53 reorganization in preparation for properly abstracting out the different | |
54 coding system types. | |
55 | |
56 June 2001, Ben Wing: Added Unicode support. Eliminated previous | |
57 junky Unicode translation support. | |
58 | |
59 August 2001, Ben Wing: Moved Unicode support to unicode.c. Finished | |
60 abstracting everything except detection, which is hard to abstract (see | |
61 just below). | |
62 | |
63 September 2001, Ben Wing: Moved Mule code to mule-coding.c, Windows code | |
64 to intl-win32.c. Lots more rewriting; very little code is untouched | |
65 from before April 2000. Abstracted the detection code, added multiple | |
66 levels of likelihood to increase the reliability of the algorithm. | |
67 | |
68 October 2001, Ben Wing: HAVE_CODING_SYSTEMS is always now defined. | |
69 Removed the conditionals. | |
70 */ | |
71 | |
428 | 72 #include <config.h> |
73 #include "lisp.h" | |
74 | |
75 #include "buffer.h" | |
76 #include "elhash.h" | |
77 #include "insdel.h" | |
78 #include "lstream.h" | |
440 | 79 #include "opaque.h" |
771 | 80 #include "file-coding.h" |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
81 #include "extents.h" |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
82 #include "rangetab.h" |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
83 #include "chartab.h" |
771 | 84 |
85 #ifdef HAVE_ZLIB | |
86 #include "zlib.h" | |
428 | 87 #endif |
88 | |
89 Lisp_Object Vkeyboard_coding_system; | |
90 Lisp_Object Vterminal_coding_system; | |
91 Lisp_Object Vcoding_system_for_read; | |
92 Lisp_Object Vcoding_system_for_write; | |
93 Lisp_Object Vfile_name_coding_system; | |
94 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
95 Lisp_Object Qaliases, Qcharset_skip_chars_string; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
96 |
771 | 97 #ifdef DEBUG_XEMACS |
98 Lisp_Object Vdebug_coding_detection; | |
440 | 99 #endif |
771 | 100 |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
101 #ifdef MULE |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
102 extern Lisp_Object Vcharset_ascii, Vcharset_control_1, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
103 Vcharset_latin_iso8859_1; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
104 #endif |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
105 |
771 | 106 typedef struct coding_system_type_entry |
107 { | |
108 struct coding_system_methods *meths; | |
109 } coding_system_type_entry; | |
110 | |
111 typedef struct | |
112 { | |
113 Dynarr_declare (coding_system_type_entry); | |
114 } coding_system_type_entry_dynarr; | |
115 | |
116 static coding_system_type_entry_dynarr *the_coding_system_type_entry_dynarr; | |
117 | |
1204 | 118 static const struct memory_description cste_description_1[] = { |
2551 | 119 { XD_BLOCK_PTR, offsetof (coding_system_type_entry, meths), 1, |
120 { &coding_system_methods_description } }, | |
771 | 121 { XD_END } |
122 }; | |
123 | |
1204 | 124 static const struct sized_memory_description cste_description = { |
771 | 125 sizeof (coding_system_type_entry), |
126 cste_description_1 | |
127 }; | |
128 | |
1204 | 129 static const struct memory_description csted_description_1[] = { |
771 | 130 XD_DYNARR_DESC (coding_system_type_entry_dynarr, &cste_description), |
428 | 131 { XD_END } |
132 }; | |
133 | |
1204 | 134 static const struct sized_memory_description csted_description = { |
771 | 135 sizeof (coding_system_type_entry_dynarr), |
136 csted_description_1 | |
137 }; | |
138 | |
139 static Lisp_Object Vcoding_system_type_list; | |
140 | |
141 /* Coding system currently associated with each coding category. */ | |
142 Lisp_Object coding_category_system[MAX_DETECTOR_CATEGORIES]; | |
143 | |
144 /* Table of all coding categories in decreasing order of priority. | |
145 This describes a permutation of the possible coding categories. */ | |
146 int coding_category_by_priority[MAX_DETECTOR_CATEGORIES]; | |
147 | |
148 /* Value used with to give a unique name to nameless coding systems */ | |
149 int coding_system_tick; | |
150 | |
151 int coding_detector_count; | |
152 int coding_detector_category_count; | |
153 | |
154 detector_dynarr *all_coding_detectors; | |
155 | |
1204 | 156 static const struct memory_description struct_detector_category_description_1[] |
771 | 157 = |
158 { | |
159 { XD_LISP_OBJECT, offsetof (struct detector_category, sym) }, | |
160 { XD_END } | |
161 }; | |
162 | |
1204 | 163 static const struct sized_memory_description struct_detector_category_description = |
771 | 164 { |
165 sizeof (struct detector_category), | |
166 struct_detector_category_description_1 | |
428 | 167 }; |
168 | |
1204 | 169 static const struct memory_description detector_category_dynarr_description_1[] = |
771 | 170 { |
171 XD_DYNARR_DESC (detector_category_dynarr, | |
172 &struct_detector_category_description), | |
173 { XD_END } | |
174 }; | |
175 | |
1204 | 176 static const struct sized_memory_description detector_category_dynarr_description = { |
771 | 177 sizeof (detector_category_dynarr), |
178 detector_category_dynarr_description_1 | |
179 }; | |
180 | |
1204 | 181 static const struct memory_description struct_detector_description_1[] |
771 | 182 = |
183 { | |
2367 | 184 { XD_BLOCK_PTR, offsetof (struct detector, cats), 1, |
2551 | 185 { &detector_category_dynarr_description } }, |
771 | 186 { XD_END } |
187 }; | |
188 | |
1204 | 189 static const struct sized_memory_description struct_detector_description = |
771 | 190 { |
191 sizeof (struct detector), | |
192 struct_detector_description_1 | |
193 }; | |
194 | |
1204 | 195 static const struct memory_description detector_dynarr_description_1[] = |
771 | 196 { |
197 XD_DYNARR_DESC (detector_dynarr, &struct_detector_description), | |
198 { XD_END } | |
199 }; | |
200 | |
1204 | 201 static const struct sized_memory_description detector_dynarr_description = { |
771 | 202 sizeof (detector_dynarr), |
203 detector_dynarr_description_1 | |
204 }; | |
428 | 205 |
206 Lisp_Object Qcoding_systemp; | |
207 | |
771 | 208 Lisp_Object Qraw_text; |
428 | 209 |
210 Lisp_Object Qmnemonic, Qeol_type; | |
211 Lisp_Object Qcr, Qcrlf, Qlf; | |
212 Lisp_Object Qeol_cr, Qeol_crlf, Qeol_lf; | |
213 Lisp_Object Qpost_read_conversion; | |
214 Lisp_Object Qpre_write_conversion; | |
215 | |
771 | 216 Lisp_Object Qtranslation_table_for_decode; |
217 Lisp_Object Qtranslation_table_for_encode; | |
218 Lisp_Object Qsafe_chars; | |
219 Lisp_Object Qsafe_charsets; | |
220 Lisp_Object Qmime_charset; | |
221 Lisp_Object Qvalid_codes; | |
222 | |
223 Lisp_Object Qno_conversion; | |
224 Lisp_Object Qconvert_eol; | |
440 | 225 Lisp_Object Qescape_quoted; |
771 | 226 Lisp_Object Qencode, Qdecode; |
227 | |
228 Lisp_Object Qconvert_eol_lf, Qconvert_eol_cr, Qconvert_eol_crlf; | |
229 Lisp_Object Qconvert_eol_autodetect; | |
230 | |
231 Lisp_Object Qnear_certainty, Qquite_probable, Qsomewhat_likely; | |
1494 | 232 Lisp_Object Qslightly_likely; |
771 | 233 Lisp_Object Qas_likely_as_unlikely, Qsomewhat_unlikely, Qquite_improbable; |
234 Lisp_Object Qnearly_impossible; | |
235 | |
236 Lisp_Object Qdo_eol, Qdo_coding; | |
237 | |
238 Lisp_Object Qcanonicalize_after_coding; | |
239 | |
1347 | 240 Lisp_Object QScoding_system_cookie; |
241 | |
4303 | 242 Lisp_Object Qposix_charset_to_coding_system_hash; |
243 | |
771 | 244 /* This is used to convert autodetected coding systems into existing |
245 systems. For example, the chain undecided->convert-eol-autodetect may | |
246 have its separate parts detected as mswindows-multibyte and | |
247 convert-eol-crlf, and the result needs to be mapped to | |
248 mswindows-multibyte-dos. */ | |
249 /* #### It's not clear we need this whole chain-canonicalize mechanism | |
250 any more. */ | |
251 static Lisp_Object Vchain_canonicalize_hash_table; | |
252 | |
253 #ifdef HAVE_ZLIB | |
254 Lisp_Object Qgzip; | |
428 | 255 #endif |
771 | 256 |
2297 | 257 /* Maps symbols (coding system names) to either coding system objects or |
258 (for aliases) other names. */ | |
771 | 259 static Lisp_Object Vcoding_system_hash_table; |
428 | 260 |
261 int enable_multibyte_characters; | |
262 | |
263 EXFUN (Fcopy_coding_system, 2); | |
264 | |
265 | |
266 /************************************************************************/ | |
771 | 267 /* Coding system object methods */ |
428 | 268 /************************************************************************/ |
269 | |
270 static Lisp_Object | |
271 mark_coding_system (Lisp_Object obj) | |
272 { | |
273 Lisp_Coding_System *codesys = XCODING_SYSTEM (obj); | |
274 | |
1204 | 275 #define MARKED_SLOT(x) mark_object (codesys->x); |
276 #include "coding-system-slots.h" | |
771 | 277 |
278 MAYBE_CODESYSMETH (codesys, mark, (obj)); | |
428 | 279 |
1204 | 280 return Qnil; |
428 | 281 } |
282 | |
283 static void | |
771 | 284 print_coding_system_properties (Lisp_Object obj, Lisp_Object printcharfun) |
285 { | |
286 Lisp_Coding_System *c = XCODING_SYSTEM (obj); | |
287 print_internal (c->methods->type, printcharfun, 1); | |
288 MAYBE_CODESYSMETH (c, print, (obj, printcharfun, 1)); | |
289 if (CODING_SYSTEM_EOL_TYPE (c) != EOL_AUTODETECT) | |
290 write_fmt_string_lisp (printcharfun, " eol-type=%s", | |
291 1, Fcoding_system_property (obj, Qeol_type)); | |
292 } | |
293 | |
294 static void | |
428 | 295 print_coding_system (Lisp_Object obj, Lisp_Object printcharfun, |
2286 | 296 int UNUSED (escapeflag)) |
428 | 297 { |
298 Lisp_Coding_System *c = XCODING_SYSTEM (obj); | |
299 if (print_readably) | |
4846 | 300 printing_unreadable_lcrecord (obj, 0); |
771 | 301 |
302 write_fmt_string_lisp (printcharfun, "#<coding-system %s ", 1, c->name); | |
303 print_coding_system_properties (obj, printcharfun); | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
304 write_ascstring (printcharfun, ">"); |
428 | 305 } |
306 | |
771 | 307 /* Print an abbreviated version of a coding system (but still containing |
308 all the information), for use within a coding system print method. */ | |
309 | |
310 static void | |
311 print_coding_system_in_print_method (Lisp_Object cs, Lisp_Object printcharfun, | |
2286 | 312 int UNUSED (escapeflag)) |
771 | 313 { |
800 | 314 write_fmt_string_lisp (printcharfun, "%s[", 1, XCODING_SYSTEM_NAME (cs)); |
771 | 315 print_coding_system_properties (cs, printcharfun); |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
316 write_ascstring (printcharfun, "]"); |
771 | 317 } |
318 | |
3263 | 319 #ifndef NEW_GC |
428 | 320 static void |
321 finalize_coding_system (void *header, int for_disksave) | |
322 { | |
771 | 323 Lisp_Object cs = wrap_coding_system ((Lisp_Coding_System *) header); |
428 | 324 /* Since coding systems never go away, this function is not |
325 necessary. But it would be necessary if we changed things | |
326 so that coding systems could go away. */ | |
327 if (!for_disksave) /* see comment in lstream.c */ | |
771 | 328 MAYBE_XCODESYSMETH (cs, finalize, (cs)); |
329 } | |
3263 | 330 #endif /* not NEW_GC */ |
771 | 331 |
332 static Bytecount | |
333 sizeof_coding_system (const void *header) | |
334 { | |
335 const Lisp_Coding_System *p = (const Lisp_Coding_System *) header; | |
336 return offsetof (Lisp_Coding_System, data) + p->methods->extra_data_size; | |
428 | 337 } |
338 | |
1204 | 339 static const struct memory_description coding_system_methods_description_1[] |
771 | 340 = { |
341 { XD_LISP_OBJECT, | |
342 offsetof (struct coding_system_methods, type) }, | |
343 { XD_LISP_OBJECT, | |
344 offsetof (struct coding_system_methods, predicate_symbol) }, | |
345 { XD_END } | |
346 }; | |
347 | |
1204 | 348 const struct sized_memory_description coding_system_methods_description = { |
771 | 349 sizeof (struct coding_system_methods), |
350 coding_system_methods_description_1 | |
351 }; | |
352 | |
1204 | 353 static const struct sized_memory_description coding_system_extra_description_map[] = |
354 { | |
355 { offsetof (Lisp_Coding_System, methods) }, | |
356 { offsetof (struct coding_system_methods, extra_description) }, | |
357 { -1 }, | |
771 | 358 }; |
359 | |
1204 | 360 static const struct memory_description coding_system_description[] = |
428 | 361 { |
2367 | 362 { XD_BLOCK_PTR, offsetof (Lisp_Coding_System, methods), 1, |
2551 | 363 { &coding_system_methods_description } }, |
1204 | 364 #define MARKED_SLOT(x) { XD_LISP_OBJECT, offsetof (Lisp_Coding_System, x) }, |
365 #define MARKED_SLOT_ARRAY(slot, size) \ | |
366 { XD_LISP_OBJECT_ARRAY, offsetof (Lisp_Coding_System, slot), size }, | |
367 #include "coding-system-slots.h" | |
2367 | 368 { XD_BLOCK_ARRAY, offsetof (Lisp_Coding_System, data), 1, |
2551 | 369 { coding_system_extra_description_map } }, |
1204 | 370 { XD_END } |
771 | 371 }; |
372 | |
1204 | 373 static const struct memory_description coding_system_empty_extra_description_1[] = |
374 { | |
375 { XD_END } | |
376 }; | |
377 | |
378 const struct sized_memory_description coding_system_empty_extra_description = { | |
379 0, coding_system_empty_extra_description_1 | |
380 }; | |
381 | |
3263 | 382 #ifdef NEW_GC |
383 DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION ("coding-system", coding_system, | |
384 1, /*dumpable-flag*/ | |
385 mark_coding_system, | |
386 print_coding_system, | |
387 0, 0, 0, coding_system_description, | |
388 sizeof_coding_system, | |
389 Lisp_Coding_System); | |
390 #else /* not NEW_GC */ | |
934 | 391 DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION ("coding-system", coding_system, |
392 1, /*dumpable-flag*/ | |
393 mark_coding_system, | |
394 print_coding_system, | |
395 finalize_coding_system, | |
396 0, 0, coding_system_description, | |
397 sizeof_coding_system, | |
398 Lisp_Coding_System); | |
3263 | 399 #endif /* not NEW_GC */ |
771 | 400 |
401 /************************************************************************/ | |
402 /* Creating coding systems */ | |
403 /************************************************************************/ | |
404 | |
405 static struct coding_system_methods * | |
406 decode_coding_system_type (Lisp_Object type, Error_Behavior errb) | |
428 | 407 { |
771 | 408 int i; |
409 | |
410 for (i = 0; i < Dynarr_length (the_coding_system_type_entry_dynarr); i++) | |
428 | 411 { |
771 | 412 if (EQ (type, |
413 Dynarr_at (the_coding_system_type_entry_dynarr, i).meths->type)) | |
414 return Dynarr_at (the_coding_system_type_entry_dynarr, i).meths; | |
428 | 415 } |
771 | 416 |
417 maybe_invalid_constant ("Invalid coding system type", type, | |
418 Qcoding_system, errb); | |
419 | |
420 return 0; | |
428 | 421 } |
422 | |
771 | 423 static int |
424 valid_coding_system_type_p (Lisp_Object type) | |
428 | 425 { |
771 | 426 return decode_coding_system_type (type, ERROR_ME_NOT) != 0; |
427 } | |
428 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
429 #ifdef MULE |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
430 static Lisp_Object Vdefault_query_coding_region_chartab_cache; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
431 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
432 /* Non-static because it's used in INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA. */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
433 Lisp_Object |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
434 default_query_method (Lisp_Object codesys, struct buffer *buf, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
435 Charbpos end, int flags) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
436 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
437 Charbpos pos = BUF_PT (buf), fail_range_start, fail_range_end; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
438 Charbpos pos_byte = BYTE_BUF_PT (buf); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
439 Lisp_Object safe_charsets = XCODING_SYSTEM_SAFE_CHARSETS (codesys); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
440 Lisp_Object safe_chars = XCODING_SYSTEM_SAFE_CHARS (codesys), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
441 result = Qnil; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
442 enum query_coding_failure_reasons failed_reason, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
443 previous_failed_reason = query_coding_succeeded; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
444 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
445 /* safe-charsets of t means the coding system can encode everything. */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
446 if (EQ (Qnil, safe_chars)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
447 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
448 if (EQ (Qt, safe_charsets)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
449 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
450 return Qnil; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
451 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
452 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
453 /* If we've no information on what characters the coding system can |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
454 encode, give up. */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
455 if (EQ (Qnil, safe_charsets) && EQ (Qnil, safe_chars)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
456 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
457 return Qunbound; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
458 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
459 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
460 safe_chars = Fgethash (safe_charsets, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
461 Vdefault_query_coding_region_chartab_cache, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
462 Qnil); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
463 if (NILP (safe_chars)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
464 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
465 safe_chars = Fmake_char_table (Qgeneric); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
466 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
467 EXTERNAL_LIST_LOOP_2 (safe_charset, safe_charsets) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
468 Fput_char_table (safe_charset, Qt, safe_chars); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
469 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
470 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
471 Fputhash (safe_charsets, safe_chars, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
472 Vdefault_query_coding_region_chartab_cache); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
473 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
474 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
475 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
476 if (flags & QUERY_METHOD_HIGHLIGHT && |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
477 /* If we're being called really early, live without highlights getting |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
478 cleared properly: */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
479 !(UNBOUNDP (XSYMBOL (Qquery_coding_clear_highlights)->function))) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
480 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
481 /* It's okay to call Lisp here, the only non-stack object we may have |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
482 allocated up to this point is safe_chars, and that's |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
483 reachable from its entry in |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
484 Vdefault_query_coding_region_chartab_cache */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
485 call3 (Qquery_coding_clear_highlights, make_int (pos), make_int (end), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
486 wrap_buffer (buf)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
487 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
488 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
489 while (pos < end) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
490 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
491 Ichar ch = BYTE_BUF_FETCH_CHAR (buf, pos_byte); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
492 if (!EQ (Qnil, get_char_table (ch, safe_chars))) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
493 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
494 pos++; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
495 INC_BYTEBPOS (buf, pos_byte); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
496 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
497 else |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
498 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
499 fail_range_start = pos; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
500 while ((pos < end) && |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
501 (EQ (Qnil, get_char_table (ch, safe_chars)) |
4839
5f1c6ca4d05e
fix bad change to default_query_method, rewrite conditional for more correctness
Ben Wing <ben@xemacs.org>
parents:
4830
diff
changeset
|
502 && (failed_reason = query_coding_unencodable, |
5f1c6ca4d05e
fix bad change to default_query_method, rewrite conditional for more correctness
Ben Wing <ben@xemacs.org>
parents:
4830
diff
changeset
|
503 (previous_failed_reason == query_coding_succeeded |
5f1c6ca4d05e
fix bad change to default_query_method, rewrite conditional for more correctness
Ben Wing <ben@xemacs.org>
parents:
4830
diff
changeset
|
504 || previous_failed_reason == failed_reason)))) |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
505 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
506 pos++; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
507 INC_BYTEBPOS (buf, pos_byte); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
508 ch = BYTE_BUF_FETCH_CHAR (buf, pos_byte); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
509 previous_failed_reason = failed_reason; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
510 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
511 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
512 if (fail_range_start == pos) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
513 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
514 /* The character can actually be encoded; move on. */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
515 pos++; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
516 INC_BYTEBPOS (buf, pos_byte); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
517 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
518 else |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
519 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
520 assert (previous_failed_reason == query_coding_unencodable); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
521 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
522 if (flags & QUERY_METHOD_ERRORP) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
523 { |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
524 signal_error_2 |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
525 (Qtext_conversion_error, |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
526 "Cannot encode using coding system", |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
527 make_string_from_buffer (buf, fail_range_start, |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
528 pos - fail_range_start), |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
529 XCODING_SYSTEM_NAME (codesys)); |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
530 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
531 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
532 if (NILP (result)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
533 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
534 result = Fmake_range_table (Qstart_closed_end_open); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
535 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
536 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
537 fail_range_end = pos; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
538 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
539 Fput_range_table (make_int (fail_range_start), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
540 make_int (fail_range_end), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
541 Qunencodable, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
542 result); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
543 previous_failed_reason = query_coding_succeeded; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
544 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
545 if (flags & QUERY_METHOD_HIGHLIGHT) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
546 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
547 Lisp_Object extent |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
548 = Fmake_extent (make_int (fail_range_start), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
549 make_int (fail_range_end), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
550 wrap_buffer (buf)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
551 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
552 Fset_extent_priority |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
553 (extent, make_int (2 + mouse_highlight_priority)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
554 Fset_extent_face (extent, Qquery_coding_warning_face); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
555 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
556 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
557 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
558 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
559 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
560 return result; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
561 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
562 #else |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
563 Lisp_Object |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
564 default_query_method (Lisp_Object UNUSED (codesys), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
565 struct buffer * UNUSED (buf), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
566 Charbpos UNUSED (end), int UNUSED (flags)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
567 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
568 return Qnil; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
569 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
570 #endif /* defined MULE */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
571 |
771 | 572 DEFUN ("valid-coding-system-type-p", Fvalid_coding_system_type_p, 1, 1, 0, /* |
573 Given a CODING-SYSTEM-TYPE, return non-nil if it is valid. | |
574 Valid types depend on how XEmacs was compiled but may include | |
3025 | 575 `undecided', `chain', `integer', `ccl', `iso2022', `big5', `shift-jis', |
576 `utf-16', `ucs-4', `utf-8', etc. | |
771 | 577 */ |
578 (coding_system_type)) | |
579 { | |
580 return valid_coding_system_type_p (coding_system_type) ? Qt : Qnil; | |
581 } | |
582 | |
583 DEFUN ("coding-system-type-list", Fcoding_system_type_list, 0, 0, 0, /* | |
584 Return a list of valid coding system types. | |
585 */ | |
586 ()) | |
587 { | |
588 return Fcopy_sequence (Vcoding_system_type_list); | |
589 } | |
590 | |
591 void | |
592 add_entry_to_coding_system_type_list (struct coding_system_methods *meths) | |
593 { | |
594 struct coding_system_type_entry entry; | |
595 | |
596 entry.meths = meths; | |
597 Dynarr_add (the_coding_system_type_entry_dynarr, entry); | |
598 Vcoding_system_type_list = Fcons (meths->type, Vcoding_system_type_list); | |
428 | 599 } |
600 | |
601 DEFUN ("coding-system-p", Fcoding_system_p, 1, 1, 0, /* | |
602 Return t if OBJECT is a coding system. | |
603 A coding system is an object that defines how text containing multiple | |
604 character sets is encoded into a stream of (typically 8-bit) bytes. | |
605 The coding system is used to decode the stream into a series of | |
606 characters (which may be from multiple charsets) when the text is read | |
607 from a file or process, and is used to encode the text back into the | |
608 same format when it is written out to a file or process. | |
609 | |
610 For example, many ISO2022-compliant coding systems (such as Compound | |
611 Text, which is used for inter-client data under the X Window System) | |
612 use escape sequences to switch between different charsets -- Japanese | |
613 Kanji, for example, is invoked with "ESC $ ( B"; ASCII is invoked | |
614 with "ESC ( B"; and Cyrillic is invoked with "ESC - L". See | |
615 `make-coding-system' for more information. | |
616 | |
617 Coding systems are normally identified using a symbol, and the | |
618 symbol is accepted in place of the actual coding system object whenever | |
619 a coding system is called for. (This is similar to how faces work.) | |
620 */ | |
621 (object)) | |
622 { | |
623 return CODING_SYSTEMP (object) ? Qt : Qnil; | |
624 } | |
625 | |
4303 | 626 static Lisp_Object |
627 find_coding_system (Lisp_Object coding_system_or_name, | |
628 int do_autoloads) | |
629 { | |
630 Lisp_Object lookup; | |
631 | |
632 if (NILP (coding_system_or_name)) | |
633 coding_system_or_name = Qbinary; | |
634 else if (CODING_SYSTEMP (coding_system_or_name)) | |
635 return coding_system_or_name; | |
636 else | |
637 CHECK_SYMBOL (coding_system_or_name); | |
638 | |
639 while (1) | |
640 { | |
641 lookup = | |
642 Fgethash (coding_system_or_name, Vcoding_system_hash_table, Qnil); | |
643 | |
644 if (CONSP (lookup) && do_autoloads) | |
645 { | |
646 struct gcpro gcpro1; | |
647 int length; | |
648 DECLARE_EISTRING (desired_base); | |
649 DECLARE_EISTRING (warning_info); | |
650 | |
651 eicpy_lstr (desired_base, XSYMBOL_NAME (coding_system_or_name)); | |
652 | |
653 /* Work out the name of the base coding system. */ | |
654 length = eilen (desired_base); | |
655 if (length > (int)(sizeof ("-unix") - 1)) | |
656 { | |
657 if (0 == qxestrcmp ((UAscbyte *)"-unix", (eidata (desired_base)) | |
658 + (length - (sizeof ("-unix") - 1)))) | |
659 { | |
660 eidel (desired_base, length - (sizeof ("-unix") - 1), | |
661 -1, 5, 5); | |
662 } | |
663 } | |
664 else if (length > (int)(sizeof ("-dos") - 1)) | |
665 { | |
666 if ((0 == qxestrcmp ((UAscbyte *)"-dos", (eidata (desired_base)) | |
667 + (length - (sizeof ("-dos") - 1)))) || | |
668 (0 == qxestrcmp ((UAscbyte *)"-mac", (eidata (desired_base)) | |
669 + (length - (sizeof ("-mac") - 1))))) | |
670 { | |
671 eidel (desired_base, length - (sizeof ("-dos") - 1), -1, | |
672 4, 4); | |
673 } | |
674 } | |
675 | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
676 coding_system_or_name = intern_istring (eidata (desired_base)); |
4303 | 677 |
678 /* Remove this coding system and its subsidiary coding | |
679 systems from the hash, to avoid calling this code recursively. */ | |
680 Fremhash (coding_system_or_name, Vcoding_system_hash_table); | |
681 Fremhash (add_suffix_to_symbol(coding_system_or_name, "-unix"), | |
682 Vcoding_system_hash_table); | |
683 Fremhash (add_suffix_to_symbol(coding_system_or_name, "-dos"), | |
684 Vcoding_system_hash_table); | |
685 Fremhash (add_suffix_to_symbol(coding_system_or_name, "-mac"), | |
686 Vcoding_system_hash_table); | |
687 | |
688 eicpy_ascii (warning_info, "Error autoloading coding system "); | |
689 eicat_lstr (warning_info, XSYMBOL_NAME (coding_system_or_name)); | |
690 | |
691 /* Keep around the form so it doesn't disappear from under | |
692 #'eval's feet. */ | |
693 GCPRO1 (lookup); | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
694 call1_trapping_problems ((const CIbyte *) eidata (warning_info), |
4303 | 695 Qeval, lookup, 0); |
696 UNGCPRO; | |
697 | |
698 lookup = | |
699 Fgethash (coding_system_or_name, Vcoding_system_hash_table, Qnil); | |
700 } | |
701 | |
702 if (CODING_SYSTEMP (lookup) || NILP (lookup)) | |
703 return lookup; | |
704 | |
705 coding_system_or_name = lookup; | |
706 } | |
707 } | |
708 | |
428 | 709 DEFUN ("find-coding-system", Ffind_coding_system, 1, 1, 0, /* |
710 Retrieve the coding system of the given name. | |
711 | |
712 If CODING-SYSTEM-OR-NAME is a coding-system object, it is simply | |
713 returned. Otherwise, CODING-SYSTEM-OR-NAME should be a symbol. | |
714 If there is no such coding system, nil is returned. Otherwise the | |
715 associated coding system object is returned. | |
716 */ | |
717 (coding_system_or_name)) | |
718 { | |
4303 | 719 return find_coding_system(coding_system_or_name, 1); |
720 } | |
721 | |
722 DEFUN ("autoload-coding-system", Fautoload_coding_system, 2, 2, 0, /* | |
723 Define SYMBOL as a coding-system that is loaded on demand. | |
724 | |
725 FORM is a form to evaluate to define the coding-system. | |
726 */ | |
727 (symbol, form)) | |
728 { | |
729 Lisp_Object lookup; | |
730 | |
731 CHECK_SYMBOL (symbol); | |
732 CHECK_CONS (form); | |
733 | |
734 lookup = find_coding_system (symbol, 0); | |
735 | |
736 if (!NILP (lookup) && | |
737 /* Allow autoloads to be redefined. */ | |
738 !CONSP (lookup)) | |
440 | 739 { |
4303 | 740 invalid_operation ("Cannot redefine existing coding system", |
741 symbol); | |
440 | 742 } |
4303 | 743 |
744 Fputhash (symbol, form, Vcoding_system_hash_table); | |
745 Fputhash (add_suffix_to_symbol(symbol, "-unix"), form, | |
746 Vcoding_system_hash_table); | |
747 Fputhash (add_suffix_to_symbol(symbol, "-dos"), form, | |
748 Vcoding_system_hash_table); | |
749 Fputhash (add_suffix_to_symbol(symbol, "-mac"), form, | |
750 Vcoding_system_hash_table); | |
751 | |
752 /* Tell the POSIX locale infrastructure about this coding system (though | |
753 unfortunately it'll be too late for the startup locale sniffing. */ | |
754 if (!UNBOUNDP (Qposix_charset_to_coding_system_hash)) | |
755 { | |
756 Lisp_Object val = Fsymbol_value (Qposix_charset_to_coding_system_hash); | |
757 DECLARE_EISTRING (minimal_name); | |
758 Ibyte *full_name; | |
759 int len = XSTRING_LENGTH (XSYMBOL_NAME (symbol)), i; | |
760 | |
761 if (!NILP (val)) | |
762 { | |
763 full_name = XSTRING_DATA (XSYMBOL_NAME (symbol)); | |
764 for (i = 0; i < len; ++i) | |
765 { | |
766 if (full_name[i] >= '0' && full_name[i] <= '9') | |
767 { | |
768 eicat_ch (minimal_name, full_name[i]); | |
769 } | |
770 else if (full_name[i] >= 'a' && full_name[i] <= 'z') | |
771 { | |
772 eicat_ch (minimal_name, full_name[i]); | |
773 } | |
774 else if (full_name[i] >= 'A' && full_name[i] <= 'Z') | |
775 { | |
776 eicat_ch (minimal_name, full_name[i] + | |
777 ('a' - 'A')); | |
778 } | |
779 } | |
780 | |
781 if (eilen (minimal_name)) | |
782 { | |
783 CHECK_HASH_TABLE (val); | |
784 Fputhash (eimake_string(minimal_name), symbol, val); | |
785 } | |
786 } | |
787 } | |
788 | |
789 return Qt; | |
428 | 790 } |
791 | |
792 DEFUN ("get-coding-system", Fget_coding_system, 1, 1, 0, /* | |
793 Retrieve the coding system of the given name. | |
794 Same as `find-coding-system' except that if there is no such | |
795 coding system, an error is signaled instead of returning nil. | |
796 */ | |
797 (name)) | |
798 { | |
799 Lisp_Object coding_system = Ffind_coding_system (name); | |
800 | |
801 if (NILP (coding_system)) | |
563 | 802 invalid_argument ("No such coding system", name); |
428 | 803 return coding_system; |
804 } | |
805 | |
771 | 806 int |
807 coding_system_is_binary (Lisp_Object coding_system) | |
808 { | |
809 Lisp_Coding_System *cs = XCODING_SYSTEM (coding_system); | |
810 return | |
811 (EQ (CODING_SYSTEM_TYPE (cs), Qno_conversion) && | |
812 CODING_SYSTEM_EOL_TYPE (cs) == EOL_LF && | |
813 EQ (CODING_SYSTEM_POST_READ_CONVERSION (cs), Qnil) && | |
814 EQ (CODING_SYSTEM_PRE_WRITE_CONVERSION (cs), Qnil)); | |
815 } | |
816 | |
817 static Lisp_Object | |
818 coding_system_real_canonical (Lisp_Object cs) | |
819 { | |
820 if (!NILP (XCODING_SYSTEM_CANONICAL (cs))) | |
821 return XCODING_SYSTEM_CANONICAL (cs); | |
822 return cs; | |
823 } | |
824 | |
825 /* Return true if coding system is of the "standard" type that decodes | |
826 bytes into characters (suitable for decoding a text file). */ | |
827 int | |
828 coding_system_is_for_text_file (Lisp_Object coding_system) | |
829 { | |
830 return (XCODESYSMETH_OR_GIVEN | |
831 (coding_system, conversion_end_type, | |
832 (coding_system_real_canonical (coding_system)), | |
833 DECODES_BYTE_TO_CHARACTER) == | |
834 DECODES_BYTE_TO_CHARACTER); | |
835 } | |
836 | |
837 static int | |
838 decoding_source_sink_type_is_char (Lisp_Object cs, enum source_or_sink sex) | |
839 { | |
840 enum source_sink_type type = | |
841 XCODESYSMETH_OR_GIVEN (cs, conversion_end_type, | |
842 (coding_system_real_canonical (cs)), | |
843 DECODES_BYTE_TO_CHARACTER); | |
844 if (sex == CODING_SOURCE) | |
845 return (type == DECODES_CHARACTER_TO_CHARACTER || | |
846 type == DECODES_CHARACTER_TO_BYTE); | |
847 else | |
848 return (type == DECODES_CHARACTER_TO_CHARACTER || | |
849 type == DECODES_BYTE_TO_CHARACTER); | |
850 } | |
851 | |
852 static int | |
853 encoding_source_sink_type_is_char (Lisp_Object cs, enum source_or_sink sex) | |
854 { | |
855 return decoding_source_sink_type_is_char (cs, | |
856 /* Sex change */ | |
857 sex == CODING_SOURCE ? | |
858 CODING_SINK : CODING_SOURCE); | |
859 } | |
860 | |
861 /* Like Ffind_coding_system() but check that the coding system is of the | |
862 "standard" type that decodes bytes into characters (suitable for | |
863 decoding a text file), and if not, returns an appropriate wrapper that | |
864 does. Also, if EOL_WRAP is non-zero, check whether this coding system | |
865 wants EOL auto-detection, and if so, wrap with a convert-eol coding | |
866 system to do this. */ | |
867 | |
868 Lisp_Object | |
869 find_coding_system_for_text_file (Lisp_Object name, int eol_wrap) | |
870 { | |
871 Lisp_Object coding_system = Ffind_coding_system (name); | |
872 Lisp_Object wrapper = coding_system; | |
873 | |
874 if (NILP (coding_system)) | |
875 return Qnil; | |
876 if (!coding_system_is_for_text_file (coding_system)) | |
877 { | |
878 wrapper = XCODING_SYSTEM_TEXT_FILE_WRAPPER (coding_system); | |
879 if (NILP (wrapper)) | |
880 { | |
881 Lisp_Object chain; | |
882 if (!decoding_source_sink_type_is_char (coding_system, CODING_SINK)) | |
883 chain = list2 (coding_system, Qbinary); | |
884 else | |
885 chain = list1 (coding_system); | |
886 if (decoding_source_sink_type_is_char (coding_system, CODING_SOURCE)) | |
887 chain = Fcons (Qbinary, chain); | |
888 wrapper = | |
889 make_internal_coding_system | |
890 (coding_system, | |
891 "internal-text-file-wrapper", | |
892 Qchain, | |
893 Qunbound, list4 (Qchain, chain, | |
894 Qcanonicalize_after_coding, coding_system)); | |
895 XCODING_SYSTEM_TEXT_FILE_WRAPPER (coding_system) = wrapper; | |
896 } | |
897 } | |
898 | |
899 if (!eol_wrap || XCODING_SYSTEM_EOL_TYPE (coding_system) != EOL_AUTODETECT) | |
900 return wrapper; | |
901 | |
902 coding_system = wrapper; | |
903 wrapper = XCODING_SYSTEM_AUTO_EOL_WRAPPER (coding_system); | |
904 if (!NILP (wrapper)) | |
905 return wrapper; | |
906 wrapper = | |
907 make_internal_coding_system | |
908 (coding_system, | |
909 "internal-auto-eol-wrapper", | |
910 Qundecided, Qunbound, | |
911 list4 (Qcoding_system, coding_system, | |
912 Qdo_eol, Qt)); | |
913 XCODING_SYSTEM_AUTO_EOL_WRAPPER (coding_system) = wrapper; | |
914 return wrapper; | |
915 } | |
916 | |
917 /* Like Fget_coding_system() but verify that the coding system is of the | |
918 "standard" type that decodes bytes into characters (suitable for | |
919 decoding a text file), and if not, returns an appropriate wrapper that | |
920 does. Also, if EOL_WRAP is non-zero, check whether this coding system | |
921 wants EOL auto-detection, and if so, wrap with a convert-eol coding | |
922 system to do this. */ | |
923 | |
924 Lisp_Object | |
925 get_coding_system_for_text_file (Lisp_Object name, int eol_wrap) | |
926 { | |
927 Lisp_Object coding_system = find_coding_system_for_text_file (name, | |
928 eol_wrap); | |
929 if (NILP (coding_system)) | |
930 invalid_argument ("No such coding system", name); | |
931 return coding_system; | |
932 } | |
933 | |
934 /* We store the coding systems in hash tables with the names as the | |
935 key and the actual coding system object as the value. Occasionally | |
936 we need to use them in a list format. These routines provide us | |
937 with that. */ | |
428 | 938 struct coding_system_list_closure |
939 { | |
940 Lisp_Object *coding_system_list; | |
771 | 941 int normal; |
942 int internal; | |
428 | 943 }; |
944 | |
945 static int | |
4303 | 946 add_coding_system_to_list_mapper (Lisp_Object key, Lisp_Object value, |
428 | 947 void *coding_system_list_closure) |
948 { | |
949 /* This function can GC */ | |
950 struct coding_system_list_closure *cscl = | |
951 (struct coding_system_list_closure *) coding_system_list_closure; | |
952 Lisp_Object *coding_system_list = cscl->coding_system_list; | |
953 | |
771 | 954 /* We can't just use VALUE because KEY might be an alias, and we need |
4303 | 955 the real coding system object. |
956 | |
957 Autoloaded coding systems have conses for their values, and can't be | |
958 internal coding systems, or coding system aliases. */ | |
959 if (CONSP (value) || | |
960 (XCODING_SYSTEM (Ffind_coding_system (key))->internal_p ? | |
961 cscl->internal : cscl->normal)) | |
771 | 962 *coding_system_list = Fcons (key, *coding_system_list); |
428 | 963 return 0; |
964 } | |
965 | |
2297 | 966 /* #### should we specify a conventional for "all coding systems"? */ |
771 | 967 DEFUN ("coding-system-list", Fcoding_system_list, 0, 1, 0, /* |
428 | 968 Return a list of the names of all defined coding systems. |
771 | 969 If INTERNAL is nil, only the normal (non-internal) coding systems are |
970 included. (Internal coding systems are created for various internal | |
971 purposes, such as implementing EOL types of CRLF and CR; generally, you do | |
972 not want to see these.) If it is t, only the internal coding systems are | |
973 included. If it is any other non-nil value both normal and internal are | |
974 included. | |
428 | 975 */ |
771 | 976 (internal)) |
428 | 977 { |
978 Lisp_Object coding_system_list = Qnil; | |
979 struct gcpro gcpro1; | |
980 struct coding_system_list_closure coding_system_list_closure; | |
981 | |
982 GCPRO1 (coding_system_list); | |
983 coding_system_list_closure.coding_system_list = &coding_system_list; | |
771 | 984 coding_system_list_closure.normal = !EQ (internal, Qt); |
985 coding_system_list_closure.internal = !NILP (internal); | |
428 | 986 elisp_maphash (add_coding_system_to_list_mapper, Vcoding_system_hash_table, |
987 &coding_system_list_closure); | |
988 UNGCPRO; | |
989 | |
990 return coding_system_list; | |
991 } | |
992 | |
993 DEFUN ("coding-system-name", Fcoding_system_name, 1, 1, 0, /* | |
994 Return the name of the given coding system. | |
995 */ | |
996 (coding_system)) | |
997 { | |
998 coding_system = Fget_coding_system (coding_system); | |
999 return XCODING_SYSTEM_NAME (coding_system); | |
1000 } | |
1001 | |
1002 static Lisp_Coding_System * | |
771 | 1003 allocate_coding_system (struct coding_system_methods *codesys_meths, |
1004 Bytecount data_size, | |
1005 Lisp_Object name) | |
428 | 1006 { |
771 | 1007 Bytecount total_size = offsetof (Lisp_Coding_System, data) + data_size; |
428 | 1008 Lisp_Coding_System *codesys = |
3024 | 1009 (Lisp_Coding_System *) BASIC_ALLOC_LCRECORD (total_size, |
1204 | 1010 &lrecord_coding_system); |
1011 | |
771 | 1012 codesys->methods = codesys_meths; |
1204 | 1013 #define MARKED_SLOT(x) codesys->x = Qnil; |
1014 #include "coding-system-slots.h" | |
1015 | |
771 | 1016 CODING_SYSTEM_EOL_TYPE (codesys) = EOL_LF; |
1017 CODING_SYSTEM_NAME (codesys) = name; | |
1018 | |
1019 MAYBE_CODESYSMETH (codesys, init, (wrap_coding_system (codesys))); | |
428 | 1020 |
1021 return codesys; | |
1022 } | |
1023 | |
771 | 1024 static enum eol_type |
1025 symbol_to_eol_type (Lisp_Object symbol) | |
1026 { | |
1027 CHECK_SYMBOL (symbol); | |
1028 if (NILP (symbol)) return EOL_AUTODETECT; | |
1029 if (EQ (symbol, Qlf)) return EOL_LF; | |
1030 if (EQ (symbol, Qcrlf)) return EOL_CRLF; | |
1031 if (EQ (symbol, Qcr)) return EOL_CR; | |
1032 | |
1033 invalid_constant ("Unrecognized eol type", symbol); | |
1204 | 1034 RETURN_NOT_REACHED (EOL_AUTODETECT); |
771 | 1035 } |
1036 | |
1037 static Lisp_Object | |
1038 eol_type_to_symbol (enum eol_type type) | |
1039 { | |
1040 switch (type) | |
1041 { | |
2500 | 1042 default: ABORT (); |
771 | 1043 case EOL_LF: return Qlf; |
1044 case EOL_CRLF: return Qcrlf; | |
1045 case EOL_CR: return Qcr; | |
1046 case EOL_AUTODETECT: return Qnil; | |
1047 } | |
1048 } | |
1049 | |
1050 struct subsidiary_type | |
1051 { | |
4528
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1052 const Ascbyte *extension; |
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1053 const Ascbyte *mnemonic_ext; |
771 | 1054 enum eol_type eol; |
1055 }; | |
1056 | |
1057 static struct subsidiary_type coding_subsidiary_list[] = | |
1058 { { "-unix", "", EOL_LF }, | |
1059 { "-dos", ":T", EOL_CRLF }, | |
1060 { "-mac", ":t", EOL_CR } }; | |
1061 | |
1062 /* kludge */ | |
428 | 1063 static void |
771 | 1064 setup_eol_coding_systems (Lisp_Object codesys) |
428 | 1065 { |
793 | 1066 int len = XSTRING_LENGTH (XSYMBOL (XCODING_SYSTEM_NAME (codesys))->name); |
2367 | 1067 Ibyte *codesys_name = alloca_ibytes (len + 7); |
771 | 1068 int mlen = -1; |
867 | 1069 Ibyte *codesys_mnemonic = 0; |
771 | 1070 Lisp_Object codesys_name_sym, sub_codesys; |
1071 int i; | |
1072 | |
1073 memcpy (codesys_name, | |
793 | 1074 XSTRING_DATA (XSYMBOL (XCODING_SYSTEM_NAME (codesys))->name), len); |
771 | 1075 |
1076 if (STRINGP (XCODING_SYSTEM_MNEMONIC (codesys))) | |
428 | 1077 { |
771 | 1078 mlen = XSTRING_LENGTH (XCODING_SYSTEM_MNEMONIC (codesys)); |
2367 | 1079 codesys_mnemonic = alloca_ibytes (mlen + 7); |
771 | 1080 memcpy (codesys_mnemonic, |
1081 XSTRING_DATA (XCODING_SYSTEM_MNEMONIC (codesys)), mlen); | |
1082 } | |
1083 | |
1084 /* Create three "subsidiary" coding systems, decoding data encoded using | |
1085 each of the three EOL types. We do this for each subsidiary by | |
1086 copying the original coding system, setting the EOL type | |
1087 appropriately, and setting the CANONICAL member of the new coding | |
1088 system to be a chain consisting of the original coding system followed | |
1089 by a convert-eol coding system to do the EOL decoding. For EOL type | |
1090 LF, however, we don't need any decoding, so we skip creating a | |
1091 CANONICAL. | |
1092 | |
1093 If the original coding system is not a text-type coding system | |
1094 (decodes byte->char), we need to coerce it to one by the appropriate | |
1095 wrapping in CANONICAL. */ | |
1096 | |
1097 for (i = 0; i < countof (coding_subsidiary_list); i++) | |
1098 { | |
4528
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1099 const Ascbyte *extension = coding_subsidiary_list[i].extension; |
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1100 const Ascbyte *mnemonic_ext = coding_subsidiary_list[i].mnemonic_ext; |
771 | 1101 enum eol_type eol = coding_subsidiary_list[i].eol; |
1102 | |
2367 | 1103 qxestrcpy_ascii (codesys_name + len, extension); |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1104 codesys_name_sym = intern_istring (codesys_name); |
771 | 1105 if (mlen != -1) |
2367 | 1106 qxestrcpy_ascii (codesys_mnemonic + mlen, mnemonic_ext); |
771 | 1107 |
1108 sub_codesys = Fcopy_coding_system (codesys, codesys_name_sym); | |
1109 if (mlen != -1) | |
1110 XCODING_SYSTEM_MNEMONIC (sub_codesys) = | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1111 build_istring (codesys_mnemonic); |
771 | 1112 |
1113 if (eol != EOL_LF) | |
1114 { | |
1115 Lisp_Object chain = list2 (get_coding_system_for_text_file | |
1116 (codesys, 0), | |
1117 eol == EOL_CR ? Qconvert_eol_cr : | |
1118 Qconvert_eol_crlf); | |
1119 Lisp_Object canon = | |
1120 make_internal_coding_system | |
1121 (sub_codesys, "internal-subsidiary-eol-wrapper", | |
1122 Qchain, Qunbound, | |
1123 mlen != -1 ? | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1124 list6 (Qmnemonic, build_istring (codesys_mnemonic), |
771 | 1125 Qchain, chain, |
1126 Qcanonicalize_after_coding, sub_codesys) : | |
1127 list4 (Qchain, chain, | |
1128 Qcanonicalize_after_coding, sub_codesys)); | |
1129 XCODING_SYSTEM_CANONICAL (sub_codesys) = canon; | |
1130 } | |
1131 XCODING_SYSTEM_EOL_TYPE (sub_codesys) = eol; | |
1132 XCODING_SYSTEM_SUBSIDIARY_PARENT (sub_codesys) = codesys; | |
1133 XCODING_SYSTEM (codesys)->eol[eol] = sub_codesys; | |
428 | 1134 } |
1135 } | |
1136 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1137 DEFUN ("coding-system-canonical-name-p", Fcoding_system_canonical_name_p, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1138 1, 1, 0, /* |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1139 Return t if OBJECT names a coding system, and is not a coding system alias. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1140 */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1141 (object)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1142 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1143 return CODING_SYSTEMP (Fgethash (object, Vcoding_system_hash_table, Qnil)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1144 ? Qt : Qnil; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1145 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1146 |
771 | 1147 /* Basic function to create new coding systems. For `make-coding-system', |
1148 NAME-OR-EXISTING is the NAME argument, PREFIX is null, and TYPE, | |
1149 DESCRIPTION, and PROPS are the same. All created coding systems are put | |
1150 in a hash table indexed by NAME. | |
1151 | |
1152 If PREFIX is a string, NAME-OR-EXISTING should specify an existing | |
1153 coding system (or nil), and an internal coding system will be created. | |
1154 The name of the coding system will be constructed by combining PREFIX | |
1155 with the name of the existing coding system (if given), and a number | |
1156 will be appended to insure uniqueness. In such a case, if Qunbound is | |
1157 given for DESCRIPTION, the description gets created based on the | |
1158 generated name. Also, if no mnemonic is given in the properties list, a | |
1159 mnemonic is created based on the generated name. | |
1160 | |
1161 For internal coding systems, the coding system is marked as internal | |
1162 (see `coding-system-list'), and no subsidiaries will be created or | |
1163 eol-wrapping will happen. Otherwise: | |
1164 | |
1165 -- if the eol-type property is `lf' or t, the coding system is merely | |
1166 created and returned. (For t, the coding system will be wrapped with | |
1167 an EOL autodetector when it's used to read a file.) | |
1168 | |
1169 -- if eol-type is `crlf' or `cr', after the coding system object is | |
1170 created, it will be wrapped in a chain with the appropriate | |
1171 convert-eol coding system (either `convert-eol-crlf' or | |
1172 `convert-eol-cr'), so that CRLF->LF or CR->LF conversion is done at | |
1173 decoding time, and the opposite at encoding time. The resulting | |
1174 chain becomes the CANONICAL field of the coding system object. | |
1175 | |
1176 -- if eol-type is nil or omitted, "subsidiaries" are generated: Three | |
1177 coding systems where the original coding system (before wrapping with | |
1178 convert-eol-autodetect) is either unwrapped or wrapped with | |
1179 convert-eol-crlf or convert-eol-cr, respectively, so that coding systems | |
1180 to handle LF, CRLF, and CR end-of-line indicators are created. (This | |
1181 crazy crap is based on existing behavior in other Mule versions, | |
1182 including FSF Emacs.) | |
1183 */ | |
428 | 1184 |
1185 static Lisp_Object | |
4528
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1186 make_coding_system_1 (Lisp_Object name_or_existing, const Ascbyte *prefix, |
771 | 1187 Lisp_Object type, Lisp_Object description, |
1188 Lisp_Object props) | |
428 | 1189 { |
771 | 1190 Lisp_Coding_System *cs; |
1191 int need_to_setup_eol_systems = 1; | |
1192 enum eol_type eol_wrapper = EOL_AUTODETECT; | |
1193 struct coding_system_methods *meths; | |
1194 Lisp_Object csobj; | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1195 Lisp_Object defmnem = Qnil, aliases = Qnil; |
771 | 1196 |
1197 if (NILP (type)) | |
1198 type = Qundecided; | |
1199 meths = decode_coding_system_type (type, ERROR_ME); | |
1200 | |
1201 if (prefix) | |
428 | 1202 { |
867 | 1203 Ibyte *newname = |
771 | 1204 emacs_sprintf_malloc (NULL, "%s-%s-%d", |
1205 prefix, | |
867 | 1206 NILP (name_or_existing) ? (Ibyte *) "nil" : |
771 | 1207 XSTRING_DATA (Fsymbol_name (XCODING_SYSTEM_NAME |
1208 (name_or_existing))), | |
1209 ++coding_system_tick); | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1210 name_or_existing = intern_istring (newname); |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
1211 xfree (newname); |
771 | 1212 |
1213 if (UNBOUNDP (description)) | |
1214 { | |
1215 newname = | |
1216 emacs_sprintf_malloc | |
1217 (NULL, "For Internal Use (%s)", | |
1218 XSTRING_DATA (Fsymbol_name (name_or_existing))); | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1219 description = build_istring (newname); |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
1220 xfree (newname); |
771 | 1221 } |
1222 | |
1223 newname = emacs_sprintf_malloc (NULL, "Int%d", coding_system_tick); | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1224 defmnem = build_istring (newname); |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
1225 xfree (newname); |
428 | 1226 } |
771 | 1227 else |
1228 CHECK_SYMBOL (name_or_existing); | |
1229 | |
4303 | 1230 /* See is there an entry for name_or_existing in the defined coding system |
1231 hash table. */ | |
1232 csobj = find_coding_system (name_or_existing, 0); | |
1233 /* Error if it's there and not an autoload form. */ | |
1234 if (!NILP (csobj) && !CONSP (csobj)) | |
771 | 1235 invalid_operation ("Cannot redefine existing coding system", |
4303 | 1236 name_or_existing); |
771 | 1237 |
1238 cs = allocate_coding_system (meths, meths->extra_data_size, | |
1239 name_or_existing); | |
793 | 1240 csobj = wrap_coding_system (cs); |
771 | 1241 |
1242 cs->internal_p = !!prefix; | |
1243 | |
1244 if (NILP (description)) | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
1245 description = build_ascstring (""); |
771 | 1246 else |
1247 CHECK_STRING (description); | |
1248 CODING_SYSTEM_DESCRIPTION (cs) = description; | |
1249 | |
1250 if (!NILP (defmnem)) | |
1251 CODING_SYSTEM_MNEMONIC (cs) = defmnem; | |
1252 | |
1253 { | |
1254 EXTERNAL_PROPERTY_LIST_LOOP_3 (key, value, props) | |
1255 { | |
1256 int recognized = 1; | |
1257 | |
1258 if (EQ (key, Qmnemonic)) | |
1259 { | |
1260 if (!NILP (value)) | |
1261 CHECK_STRING (value); | |
1262 CODING_SYSTEM_MNEMONIC (cs) = value; | |
1263 } | |
1264 | |
1265 else if (EQ (key, Qdocumentation)) | |
1266 { | |
1267 if (!NILP (value)) | |
1268 CHECK_STRING (value); | |
1269 CODING_SYSTEM_DOCUMENTATION (cs) = value; | |
1270 } | |
1271 | |
1272 else if (EQ (key, Qeol_type)) | |
1273 { | |
1274 need_to_setup_eol_systems = NILP (value); | |
1275 if (EQ (value, Qt)) | |
1276 value = Qnil; | |
1277 eol_wrapper = symbol_to_eol_type (value); | |
1278 } | |
1279 | |
1280 else if (EQ (key, Qpost_read_conversion)) | |
1281 CODING_SYSTEM_POST_READ_CONVERSION (cs) = value; | |
1282 else if (EQ (key, Qpre_write_conversion)) | |
1283 CODING_SYSTEM_PRE_WRITE_CONVERSION (cs) = value; | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1284 else if (EQ (key, Qaliases)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1285 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1286 EXTERNAL_LIST_LOOP_2 (alias, value) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1287 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1288 CHECK_SYMBOL (alias); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1289 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1290 if (!NILP (Fcoding_system_canonical_name_p (alias))) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1291 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1292 invalid_change ("Symbol is the canonical name of a " |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1293 "coding system and cannot be redefined", |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1294 alias); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1295 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1296 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1297 aliases = value; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1298 } |
771 | 1299 /* FSF compatibility */ |
1300 else if (EQ (key, Qtranslation_table_for_decode)) | |
1301 ; | |
1302 else if (EQ (key, Qtranslation_table_for_encode)) | |
1303 ; | |
1304 else if (EQ (key, Qsafe_chars)) | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1305 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1306 CHECK_CHAR_TABLE (value); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1307 CODING_SYSTEM_SAFE_CHARS (cs) = value; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1308 } |
771 | 1309 else if (EQ (key, Qsafe_charsets)) |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1310 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1311 if (!EQ (Qt, value) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1312 /* Would be nice to actually do this check, but there are |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1313 some order conflicts with japanese.el and |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1314 mule-coding.el */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1315 && 0) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1316 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1317 #ifdef MULE |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1318 EXTERNAL_LIST_LOOP_2 (safe_charset, value) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1319 CHECK_CHARSET (Ffind_charset (safe_charset)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1320 #endif |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1321 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1322 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1323 CODING_SYSTEM_SAFE_CHARSETS (cs) = value; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1324 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1325 else if (EQ (key, Qcategory)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1326 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1327 Fput (name_or_existing, intern ("coding-system-property"), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1328 Fplist_put (Fget (name_or_existing, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1329 intern ("coding-system-property"), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1330 Qnil), |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1331 Qcategory, value)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1332 } |
771 | 1333 else if (EQ (key, Qmime_charset)) |
1334 ; | |
1335 else if (EQ (key, Qvalid_codes)) | |
1336 ; | |
1337 else | |
1338 recognized = CODESYSMETH_OR_GIVEN (cs, putprop, | |
1339 (csobj, key, value), 0); | |
1340 | |
1341 if (!recognized) | |
1342 invalid_constant ("Unrecognized property", key); | |
1343 } | |
1344 } | |
1345 | |
1346 { | |
1347 XCODING_SYSTEM_CANONICAL (csobj) = | |
1348 CODESYSMETH_OR_GIVEN (cs, canonicalize, (csobj), Qnil); | |
1349 XCODING_SYSTEM_EOL_TYPE (csobj) = EOL_AUTODETECT; /* for copy-coding-system | |
1350 below */ | |
1351 | |
4303 | 1352 Fputhash (name_or_existing, csobj, Vcoding_system_hash_table); |
1353 | |
771 | 1354 if (need_to_setup_eol_systems && !cs->internal_p) |
1355 setup_eol_coding_systems (csobj); | |
1356 else if (eol_wrapper == EOL_CR || eol_wrapper == EOL_CRLF) | |
1357 { | |
1358 /* If a specific eol-type (other than LF) was specified, we handle | |
1359 this by converting the coding system into a chain that wraps the | |
1360 coding system along with a convert-eol system after it, in | |
1361 exactly that same switcheroo fashion that the normal | |
1362 canonicalize method works -- BUT we will run into a problem if | |
1363 we do it the obvious way, because when `chain' creates its | |
1364 substreams, the substream containing the coding system we're | |
1365 creating will have canonicalization expansion done on it, | |
1366 leading to infinite recursion. So we have to generate a new, | |
1367 internal coding system with the previous value of CANONICAL. */ | |
867 | 1368 Ibyte *newname = |
771 | 1369 emacs_sprintf_malloc |
1370 (NULL, "internal-eol-copy-%s-%d", | |
1371 XSTRING_DATA (Fsymbol_name (name_or_existing)), | |
1372 ++coding_system_tick); | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
1373 Lisp_Object newnamesym = intern_istring (newname); |
771 | 1374 Lisp_Object copied = Fcopy_coding_system (csobj, newnamesym); |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
1375 xfree (newname); |
771 | 1376 |
1377 XCODING_SYSTEM_CANONICAL (csobj) = | |
1378 make_internal_coding_system | |
1379 (csobj, | |
1380 "internal-eol-wrapper", | |
1381 Qchain, Qunbound, | |
1382 list4 (Qchain, | |
1383 list2 (copied, | |
1384 eol_wrapper == EOL_CR ? | |
1385 Qconvert_eol_cr : | |
1386 Qconvert_eol_crlf), | |
1387 Qcanonicalize_after_coding, | |
1388 csobj)); | |
1389 } | |
1390 XCODING_SYSTEM_EOL_TYPE (csobj) = eol_wrapper; | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1391 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1392 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1393 EXTERNAL_LIST_LOOP_2 (alias, aliases) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1394 Fdefine_coding_system_alias (alias, csobj); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1395 } |
771 | 1396 } |
1397 | |
1398 return csobj; | |
428 | 1399 } |
1400 | |
771 | 1401 Lisp_Object |
4528
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1402 make_internal_coding_system (Lisp_Object existing, const Ascbyte *prefix, |
771 | 1403 Lisp_Object type, Lisp_Object description, |
1404 Lisp_Object props) | |
1405 { | |
1406 return make_coding_system_1 (existing, prefix, type, description, props); | |
1407 } | |
428 | 1408 |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1409 DEFUN ("make-coding-system-internal", Fmake_coding_system_internal, 2, 4, 0, /* |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1410 See `make-coding-system'. This does much of the work of that function. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1411 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1412 Without Mule support, it does all the work of that function, and an alias |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1413 exists, mapping `make-coding-system' to |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1414 `make-coding-system-internal'. You'll need a non-Mule XEmacs to read the |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1415 complete docstring. Or you can just read it in make-coding-system.el; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1416 something like the following should work: |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1417 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1418 \\[find-function-other-window] find-file RET \\[find-file] mule/make-coding-system.el RET |
771 | 1419 |
428 | 1420 */ |
771 | 1421 (name, type, description, props)) |
428 | 1422 { |
771 | 1423 return make_coding_system_1 (name, 0, type, description, props); |
428 | 1424 } |
1425 | |
1426 DEFUN ("copy-coding-system", Fcopy_coding_system, 2, 2, 0, /* | |
1427 Copy OLD-CODING-SYSTEM to NEW-NAME. | |
1428 If NEW-NAME does not name an existing coding system, a new one will | |
1429 be created. | |
771 | 1430 If you are using this function to create an alias, think again: |
1431 Use `define-coding-system-alias' instead. | |
428 | 1432 */ |
1433 (old_coding_system, new_name)) | |
1434 { | |
1435 Lisp_Object new_coding_system; | |
1436 old_coding_system = Fget_coding_system (old_coding_system); | |
771 | 1437 new_coding_system = |
4303 | 1438 UNBOUNDP (new_name) ? Qnil : find_coding_system (new_name, 0); |
428 | 1439 if (NILP (new_coding_system)) |
1440 { | |
793 | 1441 new_coding_system = |
1442 wrap_coding_system | |
1443 (allocate_coding_system | |
1444 (XCODING_SYSTEM (old_coding_system)->methods, | |
1445 XCODING_SYSTEM (old_coding_system)->methods->extra_data_size, | |
1446 new_name)); | |
771 | 1447 if (!UNBOUNDP (new_name)) |
1448 Fputhash (new_name, new_coding_system, Vcoding_system_hash_table); | |
428 | 1449 } |
771 | 1450 else if (XCODING_SYSTEM (old_coding_system)->methods != |
1451 XCODING_SYSTEM (new_coding_system)->methods) | |
1452 invalid_operation_2 ("Coding systems not same type", | |
1453 old_coding_system, new_coding_system); | |
428 | 1454 |
1455 { | |
1456 Lisp_Coding_System *to = XCODING_SYSTEM (new_coding_system); | |
1457 Lisp_Coding_System *from = XCODING_SYSTEM (old_coding_system); | |
3017 | 1458 COPY_SIZED_LCRECORD (to, from, sizeof_coding_system (from)); |
428 | 1459 to->name = new_name; |
1460 } | |
1461 return new_coding_system; | |
1462 } | |
1463 | |
2297 | 1464 /* #### Shouldn't this really be a find/get pair? */ |
1465 | |
440 | 1466 DEFUN ("coding-system-alias-p", Fcoding_system_alias_p, 1, 1, 0, /* |
1467 Return t if OBJECT is a coding system alias. | |
1468 All coding system aliases are created by `define-coding-system-alias'. | |
1469 */ | |
1470 (object)) | |
428 | 1471 { |
440 | 1472 return SYMBOLP (Fgethash (object, Vcoding_system_hash_table, Qzero)) |
1473 ? Qt : Qnil; | |
1474 } | |
1475 | |
1476 DEFUN ("coding-system-aliasee", Fcoding_system_aliasee, 1, 1, 0, /* | |
1477 Return the coding-system symbol for which symbol ALIAS is an alias. | |
1478 */ | |
1479 (alias)) | |
1480 { | |
1481 Lisp_Object aliasee = Fgethash (alias, Vcoding_system_hash_table, Qnil); | |
1482 if (SYMBOLP (aliasee)) | |
1483 return aliasee; | |
1484 else | |
563 | 1485 invalid_argument ("Symbol is not a coding system alias", alias); |
1204 | 1486 RETURN_NOT_REACHED (Qnil); |
440 | 1487 } |
1488 | |
1489 /* A maphash function, for removing dangling coding system aliases. */ | |
1490 static int | |
2286 | 1491 dangling_coding_system_alias_p (Lisp_Object UNUSED (alias), |
440 | 1492 Lisp_Object aliasee, |
1493 void *dangling_aliases) | |
1494 { | |
1495 if (SYMBOLP (aliasee) | |
1496 && NILP (Fgethash (aliasee, Vcoding_system_hash_table, Qnil))) | |
428 | 1497 { |
440 | 1498 (*(int *) dangling_aliases)++; |
1499 return 1; | |
428 | 1500 } |
440 | 1501 else |
1502 return 0; | |
1503 } | |
1504 | |
1505 DEFUN ("define-coding-system-alias", Fdefine_coding_system_alias, 2, 2, 0, /* | |
1506 Define symbol ALIAS as an alias for coding system ALIASEE. | |
1507 | |
1508 You can use this function to redefine an alias that has already been defined, | |
1509 but you cannot redefine a name which is the canonical name for a coding system. | |
1510 \(a canonical name of a coding system is what is returned when you call | |
1511 `coding-system-name' on a coding system). | |
1512 | |
1513 ALIASEE itself can be an alias, which allows you to define nested aliases. | |
1514 | |
1515 You are forbidden, however, from creating alias loops or `dangling' aliases. | |
1516 These will be detected, and an error will be signaled if you attempt to do so. | |
1517 | |
1518 If ALIASEE is nil, then ALIAS will simply be undefined. | |
1519 | |
1520 See also `coding-system-alias-p', `coding-system-aliasee', | |
1521 and `coding-system-canonical-name-p'. | |
1522 */ | |
1523 (alias, aliasee)) | |
1524 { | |
2286 | 1525 Lisp_Object probe; |
440 | 1526 |
1527 CHECK_SYMBOL (alias); | |
1528 | |
1529 if (!NILP (Fcoding_system_canonical_name_p (alias))) | |
563 | 1530 invalid_change |
440 | 1531 ("Symbol is the canonical name of a coding system and cannot be redefined", |
1532 alias); | |
1533 | |
1534 if (NILP (aliasee)) | |
1535 { | |
771 | 1536 Lisp_Object subsidiary_unix = add_suffix_to_symbol (alias, "-unix"); |
1537 Lisp_Object subsidiary_dos = add_suffix_to_symbol (alias, "-dos"); | |
1538 Lisp_Object subsidiary_mac = add_suffix_to_symbol (alias, "-mac"); | |
440 | 1539 |
1540 Fremhash (alias, Vcoding_system_hash_table); | |
1541 | |
1542 /* Undefine subsidiary aliases, | |
1543 presumably created by a previous call to this function */ | |
1544 if (! NILP (Fcoding_system_alias_p (subsidiary_unix)) && | |
1545 ! NILP (Fcoding_system_alias_p (subsidiary_dos)) && | |
1546 ! NILP (Fcoding_system_alias_p (subsidiary_mac))) | |
1547 { | |
1548 Fdefine_coding_system_alias (subsidiary_unix, Qnil); | |
1549 Fdefine_coding_system_alias (subsidiary_dos, Qnil); | |
1550 Fdefine_coding_system_alias (subsidiary_mac, Qnil); | |
1551 } | |
1552 | |
1553 /* Undefine dangling coding system aliases. */ | |
1554 { | |
1555 int dangling_aliases; | |
1556 | |
1557 do { | |
1558 dangling_aliases = 0; | |
1559 elisp_map_remhash (dangling_coding_system_alias_p, | |
1560 Vcoding_system_hash_table, | |
1561 &dangling_aliases); | |
1562 } while (dangling_aliases > 0); | |
1563 } | |
1564 | |
1565 return Qnil; | |
1566 } | |
1567 | |
1568 if (CODING_SYSTEMP (aliasee)) | |
1569 aliasee = XCODING_SYSTEM_NAME (aliasee); | |
1570 | |
1571 /* Checks that aliasee names a coding-system */ | |
2286 | 1572 (void) Fget_coding_system (aliasee); |
440 | 1573 |
1574 /* Check for coding system alias loops */ | |
1575 if (EQ (alias, aliasee)) | |
563 | 1576 alias_loop: invalid_operation_2 |
440 | 1577 ("Attempt to create a coding system alias loop", alias, aliasee); |
1578 | |
1579 for (probe = aliasee; | |
1580 SYMBOLP (probe); | |
1581 probe = Fgethash (probe, Vcoding_system_hash_table, Qzero)) | |
1582 { | |
1583 if (EQ (probe, alias)) | |
1584 goto alias_loop; | |
1585 } | |
1586 | |
1587 Fputhash (alias, aliasee, Vcoding_system_hash_table); | |
1588 | |
1589 /* Set up aliases for subsidiaries. | |
2297 | 1590 #### There must be a better way to handle subsidiary coding systems. |
1591 Inquiring Minds Want To Know: shouldn't they always be chains? */ | |
440 | 1592 { |
1593 static const char *suffixes[] = { "-unix", "-dos", "-mac" }; | |
1594 int i; | |
1595 for (i = 0; i < countof (suffixes); i++) | |
1596 { | |
1597 Lisp_Object alias_subsidiary = | |
771 | 1598 add_suffix_to_symbol (alias, suffixes[i]); |
440 | 1599 Lisp_Object aliasee_subsidiary = |
771 | 1600 add_suffix_to_symbol (aliasee, suffixes[i]); |
440 | 1601 |
1602 if (! NILP (Ffind_coding_system (aliasee_subsidiary))) | |
1603 Fdefine_coding_system_alias (alias_subsidiary, aliasee_subsidiary); | |
1604 } | |
1605 } | |
428 | 1606 /* FSF return value is a vector of [ALIAS-unix ALIAS-dos ALIAS-mac], |
1607 but it doesn't look intentional, so I'd rather return something | |
1608 meaningful or nothing at all. */ | |
1609 return Qnil; | |
1610 } | |
1611 | |
1612 static Lisp_Object | |
771 | 1613 subsidiary_coding_system (Lisp_Object coding_system, enum eol_type type) |
428 | 1614 { |
1615 Lisp_Coding_System *cs = XCODING_SYSTEM (coding_system); | |
1616 Lisp_Object new_coding_system; | |
1617 | |
1618 switch (type) | |
1619 { | |
1620 case EOL_AUTODETECT: return coding_system; | |
1621 case EOL_LF: new_coding_system = CODING_SYSTEM_EOL_LF (cs); break; | |
1622 case EOL_CR: new_coding_system = CODING_SYSTEM_EOL_CR (cs); break; | |
1623 case EOL_CRLF: new_coding_system = CODING_SYSTEM_EOL_CRLF (cs); break; | |
2500 | 1624 default: ABORT (); return Qnil; |
428 | 1625 } |
1626 | |
1627 return NILP (new_coding_system) ? coding_system : new_coding_system; | |
1628 } | |
1629 | |
1630 DEFUN ("subsidiary-coding-system", Fsubsidiary_coding_system, 2, 2, 0, /* | |
1631 Return the subsidiary coding system of CODING-SYSTEM with eol type EOL-TYPE. | |
771 | 1632 The logically opposite operation is `coding-system-base'. |
428 | 1633 */ |
1634 (coding_system, eol_type)) | |
1635 { | |
771 | 1636 coding_system = get_coding_system_for_text_file (coding_system, 0); |
428 | 1637 |
1638 return subsidiary_coding_system (coding_system, | |
1639 symbol_to_eol_type (eol_type)); | |
1640 } | |
1641 | |
771 | 1642 DEFUN ("coding-system-base", Fcoding_system_base, |
1643 1, 1, 0, /* | |
1644 Return the base coding system of CODING-SYSTEM. | |
1645 If CODING-SYSTEM is a subsidiary, this returns its parent; otherwise, it | |
1646 returns CODING-SYSTEM. | |
1647 The logically opposite operation is `subsidiary-coding-system'. | |
1648 */ | |
1649 (coding_system)) | |
1650 { | |
1651 Lisp_Object base; | |
1652 | |
1653 coding_system = Fget_coding_system (coding_system); | |
1654 if (EQ (XCODING_SYSTEM_NAME (coding_system), Qbinary)) | |
1655 return Fget_coding_system (Qraw_text); /* hack! */ | |
1656 base = XCODING_SYSTEM_SUBSIDIARY_PARENT (coding_system); | |
1657 if (!NILP (base)) | |
1658 return base; | |
1659 return coding_system; | |
1660 } | |
1661 | |
1662 DEFUN ("coding-system-used-for-io", Fcoding_system_used_for_io, | |
1663 1, 1, 0, /* | |
1664 Return the coding system actually used for I/O. | |
1665 In some cases (e.g. when a particular EOL type is specified) this won't be | |
2297 | 1666 the coding system itself. This can be useful when trying to determine |
1667 precisely how data was decoded. | |
771 | 1668 */ |
1669 (coding_system)) | |
1670 { | |
1671 Lisp_Object canon; | |
1672 | |
1673 coding_system = Fget_coding_system (coding_system); | |
1674 canon = XCODING_SYSTEM_CANONICAL (coding_system); | |
1675 if (!NILP (canon)) | |
1676 return canon; | |
1677 return coding_system; | |
1678 } | |
1679 | |
428 | 1680 |
1681 /************************************************************************/ | |
1682 /* Coding system accessors */ | |
1683 /************************************************************************/ | |
1684 | |
771 | 1685 DEFUN ("coding-system-description", Fcoding_system_description, 1, 1, 0, /* |
1686 Return the description for CODING-SYSTEM. | |
1687 The `description' of a coding system is a short English phrase giving the | |
1688 name rendered according to English punctuation rules, plus possibly some | |
1689 explanatory text (typically in the form of a parenthetical phrase). The | |
1690 description is intended to be short enough that it can appear as a menu item, | |
1691 and clear enough to be recognizable even to someone who is assumed to have | |
1692 some basic familiarity with different encodings but may not know all the | |
1693 technical names; thus, for `cn-gb-2312' is described as "Chinese EUC" and | |
1694 `hz-gb-2312' is described as "Hz/ZW (Chinese)", where the actual name of | |
1695 the encoding is given, followed by a note that this is a Chinese encoding, | |
1696 because the great majority of people encountering this would have no idea | |
1697 what it is, and giving the language indicates whether the encoding should | |
1698 just be ignored or (conceivably) investigated more thoroughly. | |
428 | 1699 */ |
1700 (coding_system)) | |
1701 { | |
1702 coding_system = Fget_coding_system (coding_system); | |
771 | 1703 return XCODING_SYSTEM_DESCRIPTION (coding_system); |
428 | 1704 } |
1705 | |
1706 DEFUN ("coding-system-type", Fcoding_system_type, 1, 1, 0, /* | |
1707 Return the type of CODING-SYSTEM. | |
1708 */ | |
1709 (coding_system)) | |
1710 { | |
771 | 1711 coding_system = Fget_coding_system (coding_system); |
1712 return XCODING_SYSTEM_TYPE (coding_system); | |
428 | 1713 } |
1714 | |
1715 DEFUN ("coding-system-property", Fcoding_system_property, 2, 2, 0, /* | |
1716 Return the PROP property of CODING-SYSTEM. | |
1717 */ | |
1718 (coding_system, prop)) | |
1719 { | |
1720 coding_system = Fget_coding_system (coding_system); | |
1721 CHECK_SYMBOL (prop); | |
1722 | |
1723 if (EQ (prop, Qname)) | |
1724 return XCODING_SYSTEM_NAME (coding_system); | |
1725 else if (EQ (prop, Qtype)) | |
1726 return Fcoding_system_type (coding_system); | |
771 | 1727 else if (EQ (prop, Qdescription)) |
1728 return XCODING_SYSTEM_DESCRIPTION (coding_system); | |
428 | 1729 else if (EQ (prop, Qmnemonic)) |
1730 return XCODING_SYSTEM_MNEMONIC (coding_system); | |
771 | 1731 else if (EQ (prop, Qdocumentation)) |
1732 return XCODING_SYSTEM_DOCUMENTATION (coding_system); | |
428 | 1733 else if (EQ (prop, Qeol_type)) |
771 | 1734 return eol_type_to_symbol (XCODING_SYSTEM_EOL_TYPE |
1735 (coding_system)); | |
428 | 1736 else if (EQ (prop, Qeol_lf)) |
1737 return XCODING_SYSTEM_EOL_LF (coding_system); | |
1738 else if (EQ (prop, Qeol_crlf)) | |
1739 return XCODING_SYSTEM_EOL_CRLF (coding_system); | |
1740 else if (EQ (prop, Qeol_cr)) | |
1741 return XCODING_SYSTEM_EOL_CR (coding_system); | |
1742 else if (EQ (prop, Qpost_read_conversion)) | |
1743 return XCODING_SYSTEM_POST_READ_CONVERSION (coding_system); | |
1744 else if (EQ (prop, Qpre_write_conversion)) | |
1745 return XCODING_SYSTEM_PRE_WRITE_CONVERSION (coding_system); | |
4568
1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4303
diff
changeset
|
1746 else if (EQ (prop, Qsafe_charsets)) |
1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4303
diff
changeset
|
1747 return XCODING_SYSTEM_SAFE_CHARSETS (coding_system); |
1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4303
diff
changeset
|
1748 else if (EQ (prop, Qsafe_chars)) |
1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4303
diff
changeset
|
1749 return XCODING_SYSTEM_SAFE_CHARS (coding_system); |
771 | 1750 else |
1751 { | |
1752 Lisp_Object value = CODESYSMETH_OR_GIVEN (XCODING_SYSTEM (coding_system), | |
1753 getprop, | |
1754 (coding_system, prop), | |
1755 Qunbound); | |
1756 if (UNBOUNDP (value)) | |
1757 invalid_constant ("Unrecognized property", prop); | |
1758 return value; | |
1759 } | |
1760 } | |
1761 | |
1762 | |
1763 /************************************************************************/ | |
1764 /* Coding stream functions */ | |
1765 /************************************************************************/ | |
1766 | |
1767 /* A coding stream is a stream used for encoding or decoding text. The | |
1768 coding-stream object keeps track of the actual coding system, the stream | |
1769 that is at the other end, and data that needs to be persistent across | |
1770 the lifetime of the stream. */ | |
1771 | |
1204 | 1772 extern const struct sized_memory_description chain_coding_stream_description; |
1773 extern const struct sized_memory_description undecided_coding_stream_description; | |
1774 | |
1775 static const struct memory_description coding_stream_data_description_1 []= { | |
2551 | 1776 { XD_BLOCK_PTR, chain_coding_system, 1, |
1777 { &chain_coding_stream_description } }, | |
1778 { XD_BLOCK_PTR, undecided_coding_system, 1, | |
1779 { &undecided_coding_stream_description } }, | |
1204 | 1780 { XD_END } |
1781 }; | |
1782 | |
1783 static const struct sized_memory_description coding_stream_data_description = { | |
1784 sizeof (void *), coding_stream_data_description_1 | |
1785 }; | |
1786 | |
1787 static const struct memory_description coding_lstream_description[] = { | |
1788 { XD_INT, offsetof (struct coding_stream, type) }, | |
1789 { XD_LISP_OBJECT, offsetof (struct coding_stream, orig_codesys) }, | |
1790 { XD_LISP_OBJECT, offsetof (struct coding_stream, codesys) }, | |
1791 { XD_LISP_OBJECT, offsetof (struct coding_stream, other_end) }, | |
1792 { XD_UNION, offsetof (struct coding_stream, data), | |
2551 | 1793 XD_INDIRECT (0, 0), { &coding_stream_data_description } }, |
1204 | 1794 { XD_END } |
1795 }; | |
1796 | |
1797 DEFINE_LSTREAM_IMPLEMENTATION_WITH_DATA ("coding", coding); | |
771 | 1798 |
1799 /* Encoding and decoding are parallel operations, so we create just one | |
1800 stream for both. "Decoding" may involve the extra step of autodetection | |
1801 of the data format, but that's only because of the conventional | |
1802 definition of decoding as converting from external- to | |
1803 internal-formatted data. | |
1804 | |
2297 | 1805 [[ REWRITE ME! ]] |
1806 | |
771 | 1807 #### We really need to abstract out the concept of "data formats" and |
1808 define "converters" that convert from and to specified formats, | |
1809 eliminating the idea of decoding and encoding. When specifying a | |
1810 conversion process, we need to give the data formats themselves, not the | |
1811 conversion processes -- e.g. a coding system called "Unicode->multibyte" | |
1812 converts in both directions, and we could auto-detect the format of data | |
1813 at either end. */ | |
1814 | |
1815 static Bytecount | |
1816 coding_reader (Lstream *stream, unsigned char *data, Bytecount size) | |
1817 { | |
1818 unsigned char *orig_data = data; | |
1819 Bytecount read_size; | |
1820 int error_occurred = 0; | |
1821 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
1822 | |
1823 /* We need to interface to coding_{de,en}code_1(), which expects to take | |
1824 some amount of data and store the result into a Dynarr. We have | |
1825 coding_{de,en}code_1() store into c->runoff, and take data from there | |
1826 as necessary. */ | |
1827 | |
1828 /* We loop until we have enough data, reading chunks from the other | |
1829 end and converting it. */ | |
1830 while (1) | |
1831 { | |
1832 /* Take data from convert_to if we can. Make sure to take at | |
1833 most SIZE bytes, and delete the data from convert_to. */ | |
1834 if (Dynarr_length (str->convert_to) > 0) | |
1835 { | |
1836 Bytecount chunk = | |
1837 min (size, (Bytecount) Dynarr_length (str->convert_to)); | |
4967 | 1838 memcpy (data, Dynarr_begin (str->convert_to), chunk); |
771 | 1839 Dynarr_delete_many (str->convert_to, 0, chunk); |
1840 data += chunk; | |
1841 size -= chunk; | |
1842 } | |
1843 | |
1844 if (size == 0) | |
1845 break; /* No more room for data */ | |
1846 | |
1847 if (str->eof) | |
1848 break; | |
1849 | |
1850 { | |
1851 /* Exhausted convert_to, so get some more. Read into convert_from, | |
1852 after existing "rejected" data from the last conversion. */ | |
1853 Bytecount rejected = Dynarr_length (str->convert_from); | |
1854 /* #### 1024 is arbitrary; we really need to separate 0 from EOF, | |
1855 and when we get 0, keep taking more data until we don't get 0 -- | |
1856 we don't know how much data the conversion routine might need | |
2297 | 1857 before it can generate any data of its own (eg, bzip2). */ |
814 | 1858 Bytecount readmore = |
1859 str->one_byte_at_a_time ? (Bytecount) 1 : | |
1860 max (size, (Bytecount) 1024); | |
771 | 1861 |
1862 Dynarr_add_many (str->convert_from, 0, readmore); | |
1863 read_size = Lstream_read (str->other_end, | |
1864 Dynarr_atp (str->convert_from, rejected), | |
1865 readmore); | |
1866 /* Trim size down to how much we actually got */ | |
4967 | 1867 Dynarr_set_length (str->convert_from, rejected + max (0, read_size)); |
771 | 1868 } |
1869 | |
1870 if (read_size < 0) /* LSTREAM_ERROR */ | |
1871 { | |
1872 error_occurred = 1; | |
1873 break; | |
1874 } | |
1875 if (read_size == 0) /* LSTREAM_EOF */ | |
1876 /* There might be some more end data produced in the translation, | |
1877 so we set a flag and call the conversion method once more to | |
1878 output any final stuff it may be holding, any "go back to a sane | |
1879 state" escape sequences, etc. The conversion method is free to | |
1880 look at this flag, and we use it above to stop looping. */ | |
1881 str->eof = 1; | |
1882 { | |
1883 Bytecount processed; | |
1884 Bytecount to_process = Dynarr_length (str->convert_from); | |
1885 | |
1886 /* Convert the data, and save any rejected data in convert_from */ | |
1887 processed = | |
1888 XCODESYSMETH (str->codesys, convert, | |
4967 | 1889 (str, Dynarr_begin (str->convert_from), |
771 | 1890 str->convert_to, to_process)); |
1891 if (processed < 0) | |
1892 { | |
1893 error_occurred = 1; | |
1894 break; | |
1895 } | |
1896 assert (processed <= to_process); | |
1897 if (processed < to_process) | |
4967 | 1898 memmove (Dynarr_begin (str->convert_from), |
771 | 1899 Dynarr_atp (str->convert_from, processed), |
1900 to_process - processed); | |
4967 | 1901 Dynarr_set_length (str->convert_from, to_process - processed); |
771 | 1902 } |
1903 } | |
1904 | |
1905 if (data - orig_data == 0) | |
1906 return error_occurred ? -1 : 0; | |
1907 else | |
1908 return data - orig_data; | |
1909 } | |
1910 | |
1911 static Bytecount | |
1912 coding_writer (Lstream *stream, const unsigned char *data, Bytecount size) | |
1913 { | |
1914 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
1915 | |
1916 /* Convert all our data into convert_to, and then attempt to write | |
1917 it all out to the other end. */ | |
1918 Dynarr_reset (str->convert_to); | |
1919 size = XCODESYSMETH (str->codesys, convert, | |
1920 (str, data, str->convert_to, size)); | |
4967 | 1921 if (Lstream_write (str->other_end, Dynarr_begin (str->convert_to), |
771 | 1922 Dynarr_length (str->convert_to)) < 0) |
1923 return -1; | |
1924 else | |
1925 /* The return value indicates how much of the incoming data was | |
1926 processed, not how many bytes were written. */ | |
1927 return size; | |
1928 } | |
1929 | |
1930 static int | |
1931 encode_decode_source_sink_type_is_char (Lisp_Object cs, | |
1932 enum source_or_sink sex, | |
1933 enum encode_decode direction) | |
1934 { | |
1935 return (direction == CODING_DECODE ? | |
1936 decoding_source_sink_type_is_char (cs, sex) : | |
1937 encoding_source_sink_type_is_char (cs, sex)); | |
1938 } | |
1939 | |
1940 /* Ensure that the convert methods only get full characters sent to them to | |
1941 convert if the source of that conversion is characters; and that no such | |
1942 full-character checking happens when the source is bytes. Keep in mind | |
1943 that (1) the conversion_end_type return values take the perspective of | |
1944 encoding; (2) the source for decoding is the same as the sink for | |
1945 encoding; (3) when writing, the data is given to us, and we set our own | |
1946 stream to be character mode or not; (4) when reading, the data comes | |
1947 from the other_end stream, and we set that one to be character mode or | |
1948 not. This is consistent with the comment above the prototype for | |
1949 Lstream_set_character_mode(), which lays out rules for who is allowed to | |
1950 modify the character type mode on a stream. | |
1951 | |
814 | 1952 If we're a read stream, we're always setting character mode on the |
1953 source, but we also set it on ourselves consistent with the flag that | |
1954 can disable this (see again the comment above | |
1955 Lstream_set_character_mode()). | |
1956 */ | |
771 | 1957 |
1958 static void | |
1959 set_coding_character_mode (Lstream *stream) | |
1960 { | |
1961 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
1962 Lstream *stream_to_set = | |
1963 stream->flags & LSTREAM_FL_WRITE ? stream : str->other_end; | |
1964 if (encode_decode_source_sink_type_is_char | |
1965 (str->codesys, CODING_SOURCE, str->direction)) | |
1966 Lstream_set_character_mode (stream_to_set); | |
1967 else | |
1968 Lstream_unset_character_mode (stream_to_set); | |
814 | 1969 if (str->set_char_mode_on_us_when_reading && |
1970 (stream->flags & LSTREAM_FL_READ)) | |
1971 { | |
1972 if (encode_decode_source_sink_type_is_char | |
1973 (str->codesys, CODING_SINK, str->direction)) | |
1974 Lstream_set_character_mode (stream); | |
1975 else | |
1976 Lstream_unset_character_mode (stream); | |
1977 } | |
771 | 1978 } |
1979 | |
1980 static Lisp_Object | |
1981 coding_marker (Lisp_Object stream) | |
1982 { | |
1983 struct coding_stream *str = CODING_STREAM_DATA (XLSTREAM (stream)); | |
1984 | |
1985 mark_object (str->orig_codesys); | |
1986 mark_object (str->codesys); | |
1987 MAYBE_XCODESYSMETH (str->codesys, mark_coding_stream, (str)); | |
1988 return wrap_lstream (str->other_end); | |
1989 } | |
1990 | |
1991 static int | |
1992 coding_rewinder (Lstream *stream) | |
1993 { | |
1994 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
1995 MAYBE_XCODESYSMETH (str->codesys, rewind_coding_stream, (str)); | |
1996 | |
1997 str->ch = 0; | |
1998 Dynarr_reset (str->convert_to); | |
1999 Dynarr_reset (str->convert_from); | |
2000 return Lstream_rewind (str->other_end); | |
2001 } | |
2002 | |
2003 static int | |
2004 coding_seekable_p (Lstream *stream) | |
2005 { | |
2006 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
2007 return Lstream_seekable_p (str->other_end); | |
2008 } | |
2009 | |
2010 static int | |
2011 coding_flusher (Lstream *stream) | |
2012 { | |
2013 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
2014 return Lstream_flush (str->other_end); | |
2015 } | |
2016 | |
2017 static int | |
2018 coding_closer (Lstream *stream) | |
2019 { | |
2020 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
2021 if (stream->flags & LSTREAM_FL_WRITE) | |
2022 { | |
2023 str->eof = 1; | |
2024 coding_writer (stream, 0, 0); | |
2025 str->eof = 0; | |
2026 } | |
2027 /* It's safe to free the runoff dynarrs now because they are used only | |
2028 during conversion. We need to keep the type-specific data around, | |
2029 though, because of canonicalize_after_coding. */ | |
2030 if (str->convert_to) | |
2031 { | |
2032 Dynarr_free (str->convert_to); | |
2033 str->convert_to = 0; | |
2034 } | |
2035 if (str->convert_from) | |
428 | 2036 { |
771 | 2037 Dynarr_free (str->convert_from); |
2038 str->convert_from = 0; | |
2039 } | |
2040 | |
800 | 2041 if (str->no_close_other) |
2042 return Lstream_flush (str->other_end); | |
2043 else | |
2044 return Lstream_close (str->other_end); | |
771 | 2045 } |
2046 | |
2047 static void | |
2048 coding_finalizer (Lstream *stream) | |
2049 { | |
2050 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
2051 | |
2052 assert (!str->finalized); | |
2053 MAYBE_XCODESYSMETH (str->codesys, finalize_coding_stream, (str)); | |
2054 if (str->data) | |
2055 { | |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
2056 xfree (str->data); |
771 | 2057 str->data = 0; |
2058 } | |
2059 str->finalized = 1; | |
2060 } | |
2061 | |
2062 static Lisp_Object | |
2063 coding_stream_canonicalize_after_coding (Lstream *stream) | |
2064 { | |
2065 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
2066 | |
2067 return XCODESYSMETH_OR_GIVEN (str->codesys, canonicalize_after_coding, | |
2068 (str), str->codesys); | |
2069 } | |
2070 | |
2071 Lisp_Object | |
2072 coding_stream_detected_coding_system (Lstream *stream) | |
2073 { | |
2074 Lisp_Object codesys = | |
2075 coding_stream_canonicalize_after_coding (stream); | |
2076 if (NILP (codesys)) | |
2077 return Fget_coding_system (Qidentity); | |
2078 return codesys; | |
2079 } | |
2080 | |
2081 Lisp_Object | |
2082 coding_stream_coding_system (Lstream *stream) | |
2083 { | |
2084 return CODING_STREAM_DATA (stream)->codesys; | |
2085 } | |
2086 | |
2087 /* Change the coding system associated with a stream. */ | |
2088 | |
2089 void | |
2090 set_coding_stream_coding_system (Lstream *lstr, Lisp_Object codesys) | |
2091 { | |
2092 struct coding_stream *str = CODING_STREAM_DATA (lstr); | |
2093 if (EQ (str->orig_codesys, codesys)) | |
2094 return; | |
2095 /* We do the equivalent of closing the stream, destroying it, and | |
2096 reinitializing it. This includes flushing out the data and signalling | |
2097 EOF, if we're a writing stream; we also replace the type-specific data | |
2098 with the data appropriate for the new coding system. */ | |
2099 if (!NILP (str->codesys)) | |
2100 { | |
2101 if (lstr->flags & LSTREAM_FL_WRITE) | |
2102 { | |
2103 Lstream_flush (lstr); | |
2104 str->eof = 1; | |
2105 coding_writer (lstr, 0, 0); | |
2106 str->eof = 0; | |
2107 } | |
2108 MAYBE_XCODESYSMETH (str->codesys, finalize_coding_stream, (str)); | |
2109 } | |
2110 str->orig_codesys = codesys; | |
2111 str->codesys = coding_system_real_canonical (codesys); | |
2112 | |
2113 if (str->data) | |
2114 { | |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
2115 xfree (str->data); |
771 | 2116 str->data = 0; |
428 | 2117 } |
771 | 2118 if (XCODING_SYSTEM_METHODS (str->codesys)->coding_data_size) |
1204 | 2119 { |
2120 str->data = | |
2121 xmalloc_and_zero (XCODING_SYSTEM_METHODS (str->codesys)-> | |
2122 coding_data_size); | |
2123 str->type = XCODING_SYSTEM_METHODS (str->codesys)->enumtype; | |
2124 } | |
771 | 2125 MAYBE_XCODESYSMETH (str->codesys, init_coding_stream, (str)); |
2126 /* The new coding system may have different ideas regarding whether its | |
2127 ends are characters or bytes. */ | |
2128 set_coding_character_mode (lstr); | |
2129 } | |
2130 | |
2131 /* WARNING WARNING WARNING WARNING!!!!! If you open up a coding | |
2132 stream for writing, no automatic code detection will be performed. | |
2133 The reason for this is that automatic code detection requires a | |
2134 seekable input. Things will also fail if you open a coding | |
2135 stream for reading using a non-fully-specified coding system and | |
2136 a non-seekable input stream. */ | |
2137 | |
2138 static Lisp_Object | |
2139 make_coding_stream_1 (Lstream *stream, Lisp_Object codesys, | |
800 | 2140 const char *mode, enum encode_decode direction, |
802 | 2141 int flags) |
771 | 2142 { |
2143 Lstream *lstr = Lstream_new (lstream_coding, mode); | |
2144 struct coding_stream *str = CODING_STREAM_DATA (lstr); | |
2145 | |
2146 codesys = Fget_coding_system (codesys); | |
2147 xzero (*str); | |
2148 str->codesys = Qnil; | |
2149 str->orig_codesys = Qnil; | |
2150 str->us = lstr; | |
2151 str->other_end = stream; | |
2152 str->convert_to = Dynarr_new (unsigned_char); | |
2153 str->convert_from = Dynarr_new (unsigned_char); | |
2154 str->direction = direction; | |
814 | 2155 if (flags & LSTREAM_FL_NO_CLOSE_OTHER) |
802 | 2156 str->no_close_other = 1; |
814 | 2157 if (flags & LSTREAM_FL_READ_ONE_BYTE_AT_A_TIME) |
802 | 2158 str->one_byte_at_a_time = 1; |
814 | 2159 if (!(flags & LSTREAM_FL_NO_INIT_CHAR_MODE_WHEN_READING)) |
2160 str->set_char_mode_on_us_when_reading = 1; | |
802 | 2161 |
771 | 2162 set_coding_stream_coding_system (lstr, codesys); |
793 | 2163 return wrap_lstream (lstr); |
771 | 2164 } |
2165 | |
814 | 2166 /* FLAGS: |
2167 | |
2168 LSTREAM_FL_NO_CLOSE_OTHER | |
2169 Don't close STREAM (the stream at the other end) when this stream is | |
2170 closed. | |
2171 | |
2172 LSTREAM_FL_READ_ONE_BYTE_AT_A_TIME | |
2173 When reading from STREAM, read and process one byte at a time rather | |
2174 than in large chunks. This is for reading from TTY's, so we don't | |
2175 block. #### We should instead create a non-blocking filedesc stream | |
2176 that emulates the behavior as necessary using select(), when the | |
2177 fcntls don't work. (As seems to be the case on Cygwin.) | |
2178 | |
2179 LSTREAM_FL_NO_INIT_CHAR_MODE_WHEN_READING | |
2180 When reading from STREAM, read and process one byte at a time rather | |
2181 than in large chunks. This is for reading from TTY's, so we don't | |
2182 block. #### We should instead create a non-blocking filedesc stream | |
2183 that emulates the behavior as necessary using select(), when the | |
2184 fcntls don't work. (As seems to be the case on Cygwin.) | |
2185 */ | |
771 | 2186 Lisp_Object |
2187 make_coding_input_stream (Lstream *stream, Lisp_Object codesys, | |
802 | 2188 enum encode_decode direction, int flags) |
771 | 2189 { |
800 | 2190 return make_coding_stream_1 (stream, codesys, "r", direction, |
802 | 2191 flags); |
771 | 2192 } |
2193 | |
814 | 2194 /* FLAGS: |
2195 | |
2196 LSTREAM_FL_NO_CLOSE_OTHER | |
2197 Don't close STREAM (the stream at the other end) when this stream is | |
2198 closed. | |
2199 */ | |
771 | 2200 Lisp_Object |
2201 make_coding_output_stream (Lstream *stream, Lisp_Object codesys, | |
802 | 2202 enum encode_decode direction, int flags) |
771 | 2203 { |
800 | 2204 return make_coding_stream_1 (stream, codesys, "w", direction, |
802 | 2205 flags); |
771 | 2206 } |
2207 | |
2208 static Lisp_Object | |
2209 encode_decode_coding_region (Lisp_Object start, Lisp_Object end, | |
2210 Lisp_Object coding_system, Lisp_Object buffer, | |
2211 enum encode_decode direction) | |
2212 { | |
2213 Charbpos b, e; | |
2214 struct buffer *buf = decode_buffer (buffer, 0); | |
2215 Lisp_Object instream = Qnil, to_outstream = Qnil, outstream = Qnil; | |
2216 Lisp_Object from_outstream = Qnil, auto_outstream = Qnil; | |
2217 Lisp_Object lb_outstream = Qnil; | |
2218 Lisp_Object next; | |
2219 Lstream *istr, *ostr; | |
2220 struct gcpro gcpro1, gcpro2, gcpro3, gcpro4, gcpro5; | |
2221 struct gcpro ngcpro1; | |
2222 int source_char, sink_char; | |
2223 | |
2224 get_buffer_range_char (buf, start, end, &b, &e, 0); | |
2225 barf_if_buffer_read_only (buf, b, e); | |
2226 | |
2227 GCPRO5 (instream, to_outstream, outstream, from_outstream, lb_outstream); | |
2228 NGCPRO1 (auto_outstream); | |
2229 | |
2230 coding_system = Fget_coding_system (coding_system); | |
2231 source_char = encode_decode_source_sink_type_is_char (coding_system, | |
2232 CODING_SOURCE, | |
2233 direction); | |
2234 sink_char = encode_decode_source_sink_type_is_char (coding_system, | |
2235 CODING_SINK, | |
2236 direction); | |
2237 | |
2238 /* Order is IN <---> [TO] -> OUT -> [FROM] -> [AUTODETECT-EOL] -> LB */ | |
2239 instream = make_lisp_buffer_input_stream (buf, b, e, 0); | |
2240 next = lb_outstream = make_lisp_buffer_output_stream (buf, b, 0); | |
2241 | |
2242 if (direction == CODING_DECODE && | |
2243 XCODING_SYSTEM_EOL_TYPE (coding_system) == EOL_AUTODETECT) | |
2244 next = auto_outstream = | |
2245 make_coding_output_stream | |
800 | 2246 (XLSTREAM (next), Fget_coding_system (Qconvert_eol_autodetect), |
2247 CODING_DECODE, 0); | |
771 | 2248 |
2249 if (!sink_char) | |
2250 next = from_outstream = | |
800 | 2251 make_coding_output_stream (XLSTREAM (next), Qbinary, CODING_DECODE, 0); |
771 | 2252 outstream = make_coding_output_stream (XLSTREAM (next), coding_system, |
800 | 2253 direction, 0); |
771 | 2254 if (!source_char) |
428 | 2255 { |
771 | 2256 to_outstream = |
2257 make_coding_output_stream (XLSTREAM (outstream), | |
800 | 2258 Qbinary, CODING_ENCODE, 0); |
771 | 2259 ostr = XLSTREAM (to_outstream); |
2260 } | |
2261 else | |
2262 ostr = XLSTREAM (outstream); | |
2263 istr = XLSTREAM (instream); | |
2264 | |
2265 /* The chain of streams looks like this: | |
2266 | |
2297 | 2267 [BUFFER] <----- (( read from/send to loop )) |
771 | 2268 ------> [CHAR->BYTE i.e. ENCODE AS BINARY if source is |
2269 in bytes] | |
2270 ------> [ENCODE/DECODE AS SPECIFIED] | |
2271 ------> [BYTE->CHAR i.e. DECODE AS BINARY | |
2272 if sink is in bytes] | |
2273 ------> [AUTODETECT EOL if | |
2274 we're decoding and | |
2275 coding system calls | |
2276 for this] | |
2277 ------> [BUFFER] | |
2278 */ | |
2367 | 2279 |
2280 /* #### See comment | |
2281 | |
2282 EFFICIENCY OF CODING CONVERSION WITH MULTIPLE COPIES/CHAINS | |
2283 | |
2284 in text.c. | |
2285 */ | |
2286 | |
771 | 2287 while (1) |
2288 { | |
2289 char tempbuf[1024]; /* some random amount */ | |
2290 Charbpos newpos, even_newer_pos; | |
2291 Charbpos oldpos = lisp_buffer_stream_startpos (istr); | |
2292 Bytecount size_in_bytes = | |
2293 Lstream_read (istr, tempbuf, sizeof (tempbuf)); | |
2294 | |
2295 if (!size_in_bytes) | |
2296 break; | |
2297 newpos = lisp_buffer_stream_startpos (istr); | |
2298 Lstream_write (ostr, tempbuf, size_in_bytes); | |
2299 even_newer_pos = lisp_buffer_stream_startpos (istr); | |
2300 buffer_delete_range (buf, even_newer_pos - (newpos - oldpos), | |
2301 even_newer_pos, 0); | |
428 | 2302 } |
771 | 2303 |
2304 { | |
2305 Charcount retlen = | |
2306 lisp_buffer_stream_startpos (XLSTREAM (instream)) - b; | |
2307 Lstream_close (istr); | |
2308 Lstream_close (ostr); | |
2309 NUNGCPRO; | |
2310 UNGCPRO; | |
2311 Lstream_delete (istr); | |
2312 if (!NILP (from_outstream)) | |
2313 Lstream_delete (XLSTREAM (from_outstream)); | |
2314 Lstream_delete (XLSTREAM (outstream)); | |
2315 if (!NILP (to_outstream)) | |
2316 Lstream_delete (XLSTREAM (to_outstream)); | |
2317 if (!NILP (auto_outstream)) | |
2318 Lstream_delete (XLSTREAM (auto_outstream)); | |
2319 Lstream_delete (XLSTREAM (lb_outstream)); | |
2320 return make_int (retlen); | |
2321 } | |
2322 } | |
2323 | |
3302 | 2324 DEFUN ("decode-coding-region", Fdecode_coding_region, 3, 4, |
2325 "*r\nzDecode from coding system: \ni", /* | |
771 | 2326 Decode the text between START and END which is encoded in CODING-SYSTEM. |
2327 This is useful if you've read in encoded text from a file without decoding | |
2328 it (e.g. you read in a JIS-formatted file but used the `binary' or | |
2329 `no-conversion' coding system, so that it shows up as "^[$B!<!+^[(B"). | |
2330 Return length of decoded text. | |
3302 | 2331 BUFFER defaults to the current buffer if unspecified, and when interactive. |
771 | 2332 */ |
2333 (start, end, coding_system, buffer)) | |
2334 { | |
2335 return encode_decode_coding_region (start, end, coding_system, buffer, | |
2336 CODING_DECODE); | |
2337 } | |
2338 | |
3302 | 2339 DEFUN ("encode-coding-region", Fencode_coding_region, 3, 4, |
2340 "*r\nzEncode to coding system: \ni", /* | |
771 | 2341 Encode the text between START and END using CODING-SYSTEM. |
2342 This will, for example, convert Japanese characters into stuff such as | |
3302 | 2343 "^[$B!<!+^[(B" if you use the JIS encoding. Return length of encoded text. |
2344 BUFFER defaults to the current buffer if unspecified, and when interactive. | |
771 | 2345 */ |
2346 (start, end, coding_system, buffer)) | |
2347 { | |
2348 return encode_decode_coding_region (start, end, coding_system, buffer, | |
2349 CODING_ENCODE); | |
428 | 2350 } |
2351 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2352 DEFUN ("query-coding-region", Fquery_coding_region, 3, 7, 0, /* |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2353 Work out whether CODING-SYSTEM can losslessly encode a region. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2354 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2355 START and END are the beginning and end of the region to check. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2356 CODING-SYSTEM is the coding system to try. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2357 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2358 Optional argument BUFFER is the buffer to check, and defaults to the current |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2359 buffer. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2360 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2361 IGNORE-INVALID-SEQUENCESP, also an optional argument, says to treat XEmacs |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2362 characters which have an unambiguous encoded representation, despite being |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2363 undefined in what they represent, as encodable. These chiefly arise with |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2364 variable-length encodings like UTF-8 and UTF-16, where an invalid sequence |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2365 is passed through to XEmacs as a sequence of characters with a defined |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2366 correspondence to the octets on disk, but no non-error semantics; see the |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2367 `invalid-sequence-coding-system' argument to `set-language-info'. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2368 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2369 They can also arise with fixed-length encodings like ISO 8859-7, where |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2370 certain octets on disk have undefined values, and treating them as |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2371 corresponding to the ISO 8859-1 characters with the same numerical values |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2372 may lead to data that is not understood by other applications. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2373 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2374 Optional argument ERRORP says to signal a `text-conversion-error' if some |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2375 character in the region cannot be encoded, and defaults to nil. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2376 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2377 Optional argument HIGHLIGHT says to display unencodable characters in the |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2378 region using `query-coding-warning-face'. It defaults to nil. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2379 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2380 This function can return multiple values; the intention is that callers use |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2381 `multiple-value-bind' or the related CL multiple value functions to deal |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2382 with it. The first result is `t' if the region can be encoded using |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2383 CODING-SYSTEM, or `nil' if not. If the region cannot be encoded using |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2384 CODING-SYSTEM, the second result is a range table describing the positions |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2385 of the unencodable characters. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2386 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2387 Ranges that describe characters that would be ignored were |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2388 IGNORE-INVALID-SEQUENCESP non-nil map to the symbol `invalid-sequence'; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2389 other ranges map to the symbol `unencodable'. If IGNORE-INVALID-SEQUENCESP |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2390 is non-nil, all ranges will map to the symbol `unencodable'. See |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2391 `make-range-table' for more details of range tables. |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2392 */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2393 (start, end, coding_system, buffer, ignore_invalid_sequencesp, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2394 errorp, highlight)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2395 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2396 Charbpos b, e; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2397 struct buffer *buf = decode_buffer (buffer, 1); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2398 Lisp_Object result; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2399 int flags = 0, speccount = specpdl_depth (); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2400 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2401 coding_system = Fget_coding_system (coding_system); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2402 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2403 get_buffer_range_char (buf, start, end, &b, &e, 0); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2404 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2405 if (buf != current_buffer) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2406 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2407 record_unwind_protect (save_current_buffer_restore, Fcurrent_buffer ()); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2408 set_buffer_internal (buf); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2409 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2410 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2411 record_unwind_protect (save_excursion_restore, save_excursion_save ()); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2412 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2413 BUF_SET_PT (buf, b); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2414 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2415 if (!NILP (ignore_invalid_sequencesp)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2416 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2417 flags |= QUERY_METHOD_IGNORE_INVALID_SEQUENCES; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2418 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2419 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2420 if (!NILP (errorp)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2421 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2422 flags |= QUERY_METHOD_ERRORP; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2423 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2424 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2425 if (!NILP (highlight)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2426 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2427 flags |= QUERY_METHOD_HIGHLIGHT; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2428 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2429 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2430 result = XCODESYSMETH_OR_GIVEN (coding_system, query, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2431 (coding_system, buf, e, flags), Qunbound); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2432 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2433 if (UNBOUNDP (result)) |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2434 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2435 signal_error (Qtext_conversion_error, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2436 "Coding system doesn't say what it can encode", |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2437 XCODING_SYSTEM_NAME (coding_system)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2438 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2439 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2440 result = (NILP (result)) ? Qt : values2 (Qnil, result); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2441 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2442 return unbind_to_1 (speccount, result); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2443 } |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2444 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2445 |
428 | 2446 |
2447 /************************************************************************/ | |
771 | 2448 /* Chain methods */ |
428 | 2449 /************************************************************************/ |
2450 | |
771 | 2451 /* #### Need a way to create "opposite-direction" coding systems. */ |
2452 | |
2453 /* Chain two or more coding systems together to make a combination coding | |
2454 system. */ | |
2455 | |
2456 struct chain_coding_system | |
2457 { | |
2458 /* List of coding systems, in decode order */ | |
2459 Lisp_Object *chain; | |
2460 /* Number of coding systems in list */ | |
2461 int count; | |
2462 /* Coding system to return as a result of canonicalize-after-coding */ | |
2463 Lisp_Object canonicalize_after_coding; | |
2464 }; | |
2465 | |
2466 struct chain_coding_stream | |
2467 { | |
2468 int initted; | |
2469 /* Lstreams for chain coding system */ | |
2470 Lisp_Object *lstreams; | |
2471 int lstream_count; | |
2472 }; | |
2473 | |
1204 | 2474 static const struct memory_description chain_coding_system_description[] = { |
2475 { XD_INT, offsetof (struct chain_coding_system, count) }, | |
2367 | 2476 { XD_BLOCK_PTR, offsetof (struct chain_coding_system, chain), |
2551 | 2477 XD_INDIRECT (0, 0), { &lisp_object_description } }, |
1204 | 2478 { XD_LISP_OBJECT, offsetof (struct chain_coding_system, |
2479 canonicalize_after_coding) }, | |
771 | 2480 { XD_END } |
2481 }; | |
2482 | |
1204 | 2483 static const struct memory_description chain_coding_stream_description_1 [] = { |
2484 { XD_INT, offsetof (struct chain_coding_stream, lstream_count) }, | |
2367 | 2485 { XD_BLOCK_PTR, offsetof (struct chain_coding_stream, lstreams), |
2551 | 2486 XD_INDIRECT (0, 0), { &lisp_object_description } }, |
771 | 2487 { XD_END } |
2488 }; | |
2489 | |
1204 | 2490 const struct sized_memory_description chain_coding_stream_description = { |
2491 sizeof (struct chain_coding_stream), chain_coding_stream_description_1 | |
2492 }; | |
2493 | |
2494 DEFINE_CODING_SYSTEM_TYPE_WITH_DATA (chain); | |
2495 | |
771 | 2496 static Lisp_Object |
2497 chain_canonicalize (Lisp_Object codesys) | |
2498 { | |
2499 /* We make use of the fact that this method is called at init time, after | |
2500 properties have been parsed. init_method is called too early. */ | |
2501 /* #### It's not clear we need this whole chain-canonicalize mechanism | |
2502 any more. */ | |
2503 Lisp_Object chain = Flist (XCODING_SYSTEM_CHAIN_COUNT (codesys), | |
2504 XCODING_SYSTEM_CHAIN_CHAIN (codesys)); | |
2505 chain = Fcons (XCODING_SYSTEM_PRE_WRITE_CONVERSION (codesys), | |
2506 Fcons (XCODING_SYSTEM_POST_READ_CONVERSION (codesys), | |
2507 chain)); | |
2508 Fputhash (chain, codesys, Vchain_canonicalize_hash_table); | |
2509 return codesys; | |
2510 } | |
2511 | |
2512 static Lisp_Object | |
2513 chain_canonicalize_after_coding (struct coding_stream *str) | |
2514 { | |
2515 Lisp_Object cac = | |
2516 XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (str->codesys); | |
2517 if (!NILP (cac)) | |
2518 return cac; | |
2519 return str->codesys; | |
2520 #if 0 | |
2521 struct chain_coding_stream *data = CODING_STREAM_TYPE_DATA (str, chain); | |
2522 Lisp_Object us = str->codesys, codesys; | |
2523 int i; | |
2524 Lisp_Object chain; | |
2525 Lisp_Object tail; | |
2526 int changed = 0; | |
2527 | |
2528 /* #### It's not clear we need this whole chain-canonicalize mechanism | |
2529 any more. */ | |
2530 if (str->direction == CODING_ENCODE || !data->initted) | |
2531 return us; | |
2532 | |
2533 chain = Flist (XCODING_SYSTEM_CHAIN_COUNT (us), | |
2534 XCODING_SYSTEM_CHAIN_CHAIN (us)); | |
2535 | |
2536 tail = chain; | |
2537 for (i = 0; i < XCODING_SYSTEM_CHAIN_COUNT (us); i++) | |
2538 { | |
2539 codesys = (coding_stream_canonicalize_after_coding | |
2540 (XLSTREAM (data->lstreams[i]))); | |
2541 if (!EQ (codesys, XCAR (tail))) | |
2542 changed = 1; | |
2543 XCAR (tail) = codesys; | |
2544 tail = XCDR (tail); | |
2545 } | |
2546 | |
2547 if (!changed) | |
2548 return us; | |
2549 | |
2550 chain = delq_no_quit (Qnil, chain); | |
2551 | |
2552 if (NILP (XCODING_SYSTEM_PRE_WRITE_CONVERSION (us)) && | |
2553 NILP (XCODING_SYSTEM_POST_READ_CONVERSION (us))) | |
2554 { | |
2555 if (NILP (chain)) | |
2556 return Qnil; | |
2557 if (NILP (XCDR (chain))) | |
2558 return XCAR (chain); | |
2559 } | |
2560 | |
2561 codesys = Fgethash (Fcons (XCODING_SYSTEM_PRE_WRITE_CONVERSION (us), | |
2562 Fcons (XCODING_SYSTEM_POST_READ_CONVERSION (us), | |
2563 chain)), Vchain_canonicalize_hash_table, | |
2564 Qnil); | |
2565 if (!NILP (codesys)) | |
2566 return codesys; | |
2567 return make_internal_coding_system | |
2568 (us, "internal-chain-canonicalizer-wrapper", | |
2569 Qchain, Qunbound, list2 (Qchain, chain)); | |
2570 #endif /* 0 */ | |
2571 } | |
2572 | |
2573 static void | |
2574 chain_init (Lisp_Object codesys) | |
2575 { | |
2576 XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (codesys) = Qnil; | |
2577 } | |
2578 | |
2579 static void | |
2580 chain_mark (Lisp_Object codesys) | |
2581 { | |
2582 int i; | |
2583 | |
2584 for (i = 0; i < XCODING_SYSTEM_CHAIN_COUNT (codesys); i++) | |
2585 mark_object (XCODING_SYSTEM_CHAIN_CHAIN (codesys)[i]); | |
2586 mark_object (XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (codesys)); | |
2587 } | |
2588 | |
2589 static void | |
2590 chain_mark_coding_stream_1 (struct chain_coding_stream *data) | |
2591 { | |
2592 int i; | |
2593 | |
2594 for (i = 0; i < data->lstream_count; i++) | |
2595 mark_object (data->lstreams[i]); | |
2596 } | |
2597 | |
2598 static void | |
2599 chain_mark_coding_stream (struct coding_stream *str) | |
2600 { | |
2601 chain_mark_coding_stream_1 (CODING_STREAM_TYPE_DATA (str, chain)); | |
2602 } | |
2603 | |
2604 static void | |
2605 chain_print (Lisp_Object cs, Lisp_Object printcharfun, int escapeflag) | |
2606 { | |
2607 int i; | |
2608 | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
2609 write_ascstring (printcharfun, "("); |
771 | 2610 for (i = 0; i < XCODING_SYSTEM_CHAIN_COUNT (cs); i++) |
2611 { | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
2612 write_ascstring (printcharfun, i == 0 ? "" : "->"); |
771 | 2613 print_coding_system_in_print_method (XCODING_SYSTEM_CHAIN_CHAIN (cs)[i], |
2614 printcharfun, escapeflag); | |
2615 } | |
2616 { | |
2617 Lisp_Object cac = XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (cs); | |
2618 if (!NILP (cac)) | |
2619 { | |
2620 if (i > 0) | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
2621 write_ascstring (printcharfun, " "); |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
2622 write_ascstring (printcharfun, "canonicalize-after-coding="); |
771 | 2623 print_coding_system_in_print_method (cac, printcharfun, escapeflag); |
2624 } | |
2625 } | |
2626 | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
2627 write_ascstring (printcharfun, ")"); |
771 | 2628 } |
2629 | |
2630 static void | |
2631 chain_rewind_coding_stream_1 (struct chain_coding_stream *data) | |
2632 { | |
2633 /* Each will rewind the next; there is always at least one stream (the | |
2634 dynarr stream at the end) if we're initted */ | |
2635 if (data->initted) | |
2636 Lstream_rewind (XLSTREAM (data->lstreams[0])); | |
2637 } | |
2638 | |
2639 static void | |
2640 chain_rewind_coding_stream (struct coding_stream *str) | |
2641 { | |
2642 chain_rewind_coding_stream_1 (CODING_STREAM_TYPE_DATA (str, chain)); | |
2643 } | |
2644 | |
2645 static void | |
2646 chain_init_coding_streams_1 (struct chain_coding_stream *data, | |
2647 unsigned_char_dynarr *dst, | |
2648 int ncodesys, Lisp_Object *codesys, | |
2649 enum encode_decode direction) | |
2650 { | |
2651 int i; | |
2652 Lisp_Object lstream_out; | |
2653 | |
2654 data->lstream_count = ncodesys + 1; | |
2655 data->lstreams = xnew_array (Lisp_Object, data->lstream_count); | |
2656 | |
2657 lstream_out = make_dynarr_output_stream (dst); | |
2658 Lstream_set_buffering (XLSTREAM (lstream_out), LSTREAM_UNBUFFERED, 0); | |
2659 data->lstreams[data->lstream_count - 1] = lstream_out; | |
2660 | |
2661 for (i = ncodesys - 1; i >= 0; i--) | |
2662 { | |
2663 data->lstreams[i] = | |
2664 make_coding_output_stream | |
2665 (XLSTREAM (lstream_out), | |
2666 codesys[direction == CODING_ENCODE ? ncodesys - (i + 1) : i], | |
800 | 2667 direction, 0); |
771 | 2668 lstream_out = data->lstreams[i]; |
2669 Lstream_set_buffering (XLSTREAM (lstream_out), LSTREAM_UNBUFFERED, | |
2670 0); | |
2671 } | |
2672 data->initted = 1; | |
2673 } | |
2674 | |
2675 static Bytecount | |
2676 chain_convert (struct coding_stream *str, const UExtbyte *src, | |
2677 unsigned_char_dynarr *dst, Bytecount n) | |
2678 { | |
2679 struct chain_coding_stream *data = CODING_STREAM_TYPE_DATA (str, chain); | |
2680 | |
2681 if (str->eof) | |
2682 { | |
2683 /* Each will close the next; there is always at least one stream (the | |
2684 dynarr stream at the end) if we're initted. We need to close now | |
2685 because more data may be generated. */ | |
2686 if (data->initted) | |
2687 Lstream_close (XLSTREAM (data->lstreams[0])); | |
2688 return n; | |
2689 } | |
2690 | |
2691 if (!data->initted) | |
2692 chain_init_coding_streams_1 | |
2693 (data, dst, XCODING_SYSTEM_CHAIN_COUNT (str->codesys), | |
2694 XCODING_SYSTEM_CHAIN_CHAIN (str->codesys), str->direction); | |
2695 | |
2696 if (Lstream_write (XLSTREAM (data->lstreams[0]), src, n) < 0) | |
2697 return -1; | |
2698 return n; | |
2699 } | |
2700 | |
2701 static void | |
2702 chain_finalize_coding_stream_1 (struct chain_coding_stream *data) | |
2703 { | |
2704 if (data->lstreams) | |
2705 { | |
2297 | 2706 /* During GC, these objects are unmarked, and are about to be freed. |
2707 We do NOT want them on the free list, and that will cause lots of | |
2708 nastiness including crashes. Just let them be freed normally. */ | |
771 | 2709 if (!gc_in_progress) |
2710 { | |
2711 int i; | |
2297 | 2712 /* Order of deletion is important here! Delete from the head of |
2713 the chain and work your way towards the tail. In general, | |
2714 when you delete an object, there should be *NO* pointers to it | |
2715 anywhere. Deleting back-to-front would be a problem because | |
2716 there are pointers going forward. If there were pointers in | |
2717 both directions, you'd have to disconnect the pointers to a | |
2718 particular object before deleting it. */ | |
771 | 2719 for (i = 0; i < data->lstream_count; i++) |
2720 Lstream_delete (XLSTREAM ((data->lstreams)[i])); | |
2721 } | |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
2722 xfree (data->lstreams); |
771 | 2723 } |
2724 } | |
2725 | |
2726 static void | |
2727 chain_finalize_coding_stream (struct coding_stream *str) | |
2728 { | |
2729 chain_finalize_coding_stream_1 (CODING_STREAM_TYPE_DATA (str, chain)); | |
2730 } | |
2731 | |
2732 static void | |
2733 chain_finalize (Lisp_Object c) | |
2734 { | |
2735 if (XCODING_SYSTEM_CHAIN_CHAIN (c)) | |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
2736 xfree (XCODING_SYSTEM_CHAIN_CHAIN (c)); |
771 | 2737 } |
2738 | |
428 | 2739 static int |
771 | 2740 chain_putprop (Lisp_Object codesys, Lisp_Object key, Lisp_Object value) |
2741 { | |
2742 if (EQ (key, Qchain)) | |
2743 { | |
2744 Lisp_Object *cslist; | |
2745 int count = 0; | |
2746 int i; | |
2747 | |
2367 | 2748 { |
2749 EXTERNAL_LIST_LOOP_2 (elt, value) | |
2750 { | |
2751 Fget_coding_system (elt); | |
2752 count++; | |
2753 } | |
2754 } | |
771 | 2755 |
2756 cslist = xnew_array (Lisp_Object, count); | |
2757 XCODING_SYSTEM_CHAIN_CHAIN (codesys) = cslist; | |
2758 | |
2759 count = 0; | |
2367 | 2760 { |
2761 EXTERNAL_LIST_LOOP_2 (elt, value) | |
2762 { | |
2763 cslist[count] = Fget_coding_system (elt); | |
2764 count++; | |
2765 } | |
2766 } | |
771 | 2767 |
2768 XCODING_SYSTEM_CHAIN_COUNT (codesys) = count; | |
2769 | |
2770 for (i = 0; i < count - 1; i++) | |
2771 { | |
2772 if (decoding_source_sink_type_is_char (cslist[i], CODING_SINK) != | |
2773 decoding_source_sink_type_is_char (cslist[i + 1], CODING_SOURCE)) | |
2774 invalid_argument_2 ("Sink of first must match source of second", | |
2775 cslist[i], cslist[i + 1]); | |
2776 } | |
2777 } | |
2778 else if (EQ (key, Qcanonicalize_after_coding)) | |
2779 XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (codesys) = | |
2780 Fget_coding_system (value); | |
2781 else | |
2782 return 0; | |
2783 return 1; | |
2784 } | |
2785 | |
2786 static Lisp_Object | |
2787 chain_getprop (Lisp_Object coding_system, Lisp_Object prop) | |
2788 { | |
2789 if (EQ (prop, Qchain)) | |
2790 { | |
2791 Lisp_Object result = Qnil; | |
2792 int i; | |
2793 | |
2794 for (i = 0; i < XCODING_SYSTEM_CHAIN_COUNT (coding_system); i++) | |
2795 result = Fcons (XCODING_SYSTEM_CHAIN_CHAIN (coding_system)[i], | |
2796 result); | |
2797 | |
2798 return Fnreverse (result); | |
2799 } | |
2800 else if (EQ (prop, Qcanonicalize_after_coding)) | |
2801 return XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (coding_system); | |
2802 else | |
2803 return Qunbound; | |
2804 } | |
2805 | |
2806 static enum source_sink_type | |
2807 chain_conversion_end_type (Lisp_Object codesys) | |
2808 { | |
2809 Lisp_Object *cslist = XCODING_SYSTEM_CHAIN_CHAIN (codesys); | |
2810 int n = XCODING_SYSTEM_CHAIN_COUNT (codesys); | |
2811 int charp_source, charp_sink; | |
2812 | |
2813 if (n == 0) | |
2814 return DECODES_BYTE_TO_BYTE; /* arbitrary */ | |
2815 charp_source = decoding_source_sink_type_is_char (cslist[0], CODING_SOURCE); | |
2816 charp_sink = decoding_source_sink_type_is_char (cslist[n - 1], CODING_SINK); | |
2817 | |
2818 switch (charp_source * 2 + charp_sink) | |
2819 { | |
2820 case 0: return DECODES_BYTE_TO_BYTE; | |
2821 case 1: return DECODES_BYTE_TO_CHARACTER; | |
2822 case 2: return DECODES_CHARACTER_TO_BYTE; | |
2823 case 3: return DECODES_CHARACTER_TO_CHARACTER; | |
2824 } | |
2825 | |
2500 | 2826 ABORT (); |
771 | 2827 return DECODES_BYTE_TO_BYTE; |
2828 } | |
2829 | |
2830 | |
2831 /************************************************************************/ | |
2832 /* No-conversion methods */ | |
2833 /************************************************************************/ | |
2834 | |
2835 /* "No conversion"; used for binary files. We use quotes because there | |
2836 really is some conversion being applied (it does byte<->char | |
2837 conversion), but it appears to the user as if the text is read in | |
2297 | 2838 without conversion. |
2839 | |
2840 #### Shouldn't we _call_ it that, then? And while we're at it, | |
2841 separate it into "to_internal" and "to_external"? */ | |
771 | 2842 DEFINE_CODING_SYSTEM_TYPE (no_conversion); |
2843 | |
2844 /* This is used when reading in "binary" files -- i.e. files that may | |
2845 contain all 256 possible byte values and that are not to be | |
2846 interpreted as being in any particular encoding. */ | |
2847 static Bytecount | |
2848 no_conversion_convert (struct coding_stream *str, | |
2849 const UExtbyte *src, | |
2850 unsigned_char_dynarr *dst, Bytecount n) | |
2851 { | |
2852 UExtbyte c; | |
2853 unsigned int ch = str->ch; | |
2854 Bytecount orign = n; | |
2855 | |
2856 if (str->direction == CODING_DECODE) | |
2857 { | |
2858 while (n--) | |
2859 { | |
2860 c = *src++; | |
2861 | |
2862 DECODE_ADD_BINARY_CHAR (c, dst); | |
2863 } | |
2864 | |
2865 if (str->eof) | |
2866 DECODE_OUTPUT_PARTIAL_CHAR (ch, dst); | |
2867 } | |
2868 else | |
2869 { | |
2870 | |
2871 while (n--) | |
2872 { | |
2873 c = *src++; | |
826 | 2874 if (byte_ascii_p (c)) |
771 | 2875 { |
2876 assert (ch == 0); | |
2877 Dynarr_add (dst, c); | |
2878 } | |
2879 #ifdef MULE | |
867 | 2880 else if (ibyte_leading_byte_p (c)) |
771 | 2881 { |
2882 assert (ch == 0); | |
2883 if (c == LEADING_BYTE_LATIN_ISO8859_1 || | |
2884 c == LEADING_BYTE_CONTROL_1) | |
2885 ch = c; | |
2886 else | |
2297 | 2887 /* #### This is just plain unacceptable. */ |
771 | 2888 Dynarr_add (dst, '~'); /* untranslatable character */ |
2889 } | |
2890 else | |
2891 { | |
2892 if (ch == LEADING_BYTE_LATIN_ISO8859_1) | |
2893 Dynarr_add (dst, c); | |
2894 else if (ch == LEADING_BYTE_CONTROL_1) | |
2895 { | |
2896 assert (c < 0xC0); | |
2897 Dynarr_add (dst, c - 0x20); | |
2898 } | |
2899 /* else it should be the second or third byte of an | |
2900 untranslatable character, so ignore it */ | |
2901 ch = 0; | |
2902 } | |
2903 #endif /* MULE */ | |
2904 | |
2905 } | |
2906 } | |
2907 | |
2908 str->ch = ch; | |
2909 return orign; | |
2910 } | |
2911 | |
2912 DEFINE_DETECTOR (no_conversion); | |
2913 DEFINE_DETECTOR_CATEGORY (no_conversion, no_conversion); | |
2914 | |
2915 struct no_conversion_detector | |
2916 { | |
2917 int dummy; | |
2918 }; | |
2919 | |
2920 static void | |
2286 | 2921 no_conversion_detect (struct detection_state *st, const UExtbyte *UNUSED (src), |
2922 Bytecount UNUSED (n)) | |
771 | 2923 { |
2924 /* Hack until we get better handling of this stuff! */ | |
2925 DET_RESULT (st, no_conversion) = DET_SLIGHTLY_LIKELY; | |
2926 } | |
2927 | |
2928 | |
2929 /************************************************************************/ | |
2930 /* Convert-eol methods */ | |
2931 /************************************************************************/ | |
2932 | |
2933 /* This is used to handle end-of-line (EOL) differences. It is | |
2819 | 2934 character-to-character, and works (when encoding) *BEFORE* sending data to |
2935 the main encoding routine -- thus, that routine must handle different EOL | |
2936 types itself if it does line-oriented type processing. This is unavoidable | |
2937 because we don't know whether the output of the main encoding routine is | |
2938 ASCII compatible (UTF-16 is definitely not, for example). [[ sjt sez this | |
2939 is bogus. There should be _no_ EOL processing (or processing of any kind) | |
2940 after conversion to external. ]] | |
771 | 2941 |
793 | 2942 There is one parameter: `subtype', either `cr', `lf', `crlf', or nil. |
771 | 2943 */ |
2944 | |
2945 struct convert_eol_coding_system | |
2946 { | |
2947 enum eol_type subtype; | |
2132 | 2948 int dummy; /* On some architectures (eg ia64) the portable dumper can |
2949 produce unaligned access errors without this field. Probably | |
2950 because the combined structure of this structure and | |
2951 Lisp_Coding_System is not properly aligned. */ | |
771 | 2952 }; |
2953 | |
2954 #define CODING_SYSTEM_CONVERT_EOL_SUBTYPE(codesys) \ | |
2955 (CODING_SYSTEM_TYPE_DATA (codesys, convert_eol)->subtype) | |
2956 #define XCODING_SYSTEM_CONVERT_EOL_SUBTYPE(codesys) \ | |
2957 (XCODING_SYSTEM_TYPE_DATA (codesys, convert_eol)->subtype) | |
2958 | |
2959 struct convert_eol_coding_stream | |
2960 { | |
2961 enum eol_type actual; | |
2962 }; | |
2963 | |
1204 | 2964 static const struct memory_description |
771 | 2965 convert_eol_coding_system_description[] = { |
2966 { XD_END } | |
2967 }; | |
2968 | |
1204 | 2969 DEFINE_CODING_SYSTEM_TYPE_WITH_DATA (convert_eol); |
2970 | |
771 | 2971 static void |
2286 | 2972 convert_eol_print (Lisp_Object cs, Lisp_Object printcharfun, |
2973 int UNUSED (escapeflag)) | |
771 | 2974 { |
2975 struct convert_eol_coding_system *data = | |
2976 XCODING_SYSTEM_TYPE_DATA (cs, convert_eol); | |
2977 | |
2978 write_fmt_string (printcharfun, "(%s)", | |
2979 data->subtype == EOL_LF ? "lf" : | |
2980 data->subtype == EOL_CRLF ? "crlf" : | |
2981 data->subtype == EOL_CR ? "cr" : | |
793 | 2982 data->subtype == EOL_AUTODETECT ? "nil" : |
2500 | 2983 (ABORT(), "")); |
771 | 2984 } |
2985 | |
2986 static enum source_sink_type | |
2286 | 2987 convert_eol_conversion_end_type (Lisp_Object UNUSED (codesys)) |
771 | 2988 { |
2989 return DECODES_CHARACTER_TO_CHARACTER; | |
2990 } | |
2991 | |
2992 static int | |
2993 convert_eol_putprop (Lisp_Object codesys, | |
2994 Lisp_Object key, | |
2995 Lisp_Object value) | |
2996 { | |
2997 struct convert_eol_coding_system *data = | |
2998 XCODING_SYSTEM_TYPE_DATA (codesys, convert_eol); | |
2999 | |
3000 if (EQ (key, Qsubtype)) | |
3001 { | |
3002 if (EQ (value, Qlf) /* || EQ (value, Qunix) */) | |
3003 data->subtype = EOL_LF; | |
3004 else if (EQ (value, Qcrlf) /* || EQ (value, Qdos) */) | |
3005 data->subtype = EOL_CRLF; | |
3006 else if (EQ (value, Qcr) /* || EQ (value, Qmac) */) | |
3007 data->subtype = EOL_CR; | |
793 | 3008 else if (EQ (value, Qnil)) |
771 | 3009 data->subtype = EOL_AUTODETECT; |
3010 else invalid_constant ("Unrecognized eol type", value); | |
3011 } | |
3012 else | |
3013 return 0; | |
3014 return 1; | |
3015 } | |
3016 | |
3017 static Lisp_Object | |
3018 convert_eol_getprop (Lisp_Object coding_system, Lisp_Object prop) | |
3019 { | |
3020 struct convert_eol_coding_system *data = | |
3021 XCODING_SYSTEM_TYPE_DATA (coding_system, convert_eol); | |
3022 | |
3023 if (EQ (prop, Qsubtype)) | |
3024 { | |
3025 switch (data->subtype) | |
3026 { | |
3027 case EOL_LF: return Qlf; | |
3028 case EOL_CRLF: return Qcrlf; | |
3029 case EOL_CR: return Qcr; | |
793 | 3030 case EOL_AUTODETECT: return Qnil; |
2500 | 3031 default: ABORT (); |
771 | 3032 } |
3033 } | |
3034 | |
3035 return Qunbound; | |
3036 } | |
3037 | |
3038 static void | |
3039 convert_eol_init_coding_stream (struct coding_stream *str) | |
3040 { | |
3041 struct convert_eol_coding_stream *data = | |
3042 CODING_STREAM_TYPE_DATA (str, convert_eol); | |
3043 data->actual = XCODING_SYSTEM_CONVERT_EOL_SUBTYPE (str->codesys); | |
3044 } | |
3045 | |
3046 static Bytecount | |
867 | 3047 convert_eol_convert (struct coding_stream *str, const Ibyte *src, |
771 | 3048 unsigned_char_dynarr *dst, Bytecount n) |
3049 { | |
3050 if (str->direction == CODING_DECODE) | |
3051 { | |
3052 struct convert_eol_coding_stream *data = | |
3053 CODING_STREAM_TYPE_DATA (str, convert_eol); | |
3054 | |
3055 if (data->actual == EOL_AUTODETECT) | |
3056 { | |
3057 Bytecount n2 = n; | |
867 | 3058 const Ibyte *src2 = src; |
771 | 3059 |
3060 for (; n2; n2--) | |
3061 { | |
867 | 3062 Ibyte c = *src2++; |
771 | 3063 if (c == '\n') |
3064 { | |
3065 data->actual = EOL_LF; | |
3066 break; | |
3067 } | |
3068 else if (c == '\r') | |
3069 { | |
3070 if (n2 == 1) | |
3071 { | |
3072 /* If we're seeing a '\r' at the end of the data, then | |
3073 reject the '\r' right now so it doesn't become an | |
3074 issue in the code below -- unless we're at the end of | |
3075 the stream, in which case we can't do that (because | |
3076 then the '\r' will never get written out), and in any | |
3077 case we should be recognizing it at EOL_CR format. */ | |
3078 if (str->eof) | |
3079 data->actual = EOL_CR; | |
3080 else | |
3081 n--; | |
3082 break; | |
3083 } | |
3084 else if (*src2 == '\n') | |
3085 data->actual = EOL_CRLF; | |
3086 else | |
3087 data->actual = EOL_CR; | |
3088 break; | |
3089 } | |
3090 } | |
3091 } | |
3092 | |
3093 /* str->eof is set, the caller reached EOF on the other end and has | |
3094 no new data to give us. The only data we get is the data we | |
3095 rejected from last time. */ | |
3096 if (data->actual == EOL_LF || data->actual == EOL_AUTODETECT || | |
3097 (str->eof)) | |
3098 Dynarr_add_many (dst, src, n); | |
3099 else | |
3100 { | |
867 | 3101 const Ibyte *end = src + n; |
771 | 3102 while (1) |
3103 { | |
3104 /* Find the next section with no \r and add it. */ | |
867 | 3105 const Ibyte *runstart = src; |
3106 src = (Ibyte *) memchr (src, '\r', end - src); | |
771 | 3107 if (!src) |
3108 src = end; | |
3109 Dynarr_add_many (dst, runstart, src - runstart); | |
3110 /* Stop if at end ... */ | |
3111 if (src == end) | |
3112 break; | |
3113 /* ... else, translate as necessary. */ | |
3114 src++; | |
3115 if (data->actual == EOL_CR) | |
3116 Dynarr_add (dst, '\n'); | |
3117 /* We need to be careful here with CRLF. If we see a CR at the | |
3118 end of the data, we don't know if it's part of a CRLF, so we | |
3119 reject it. Otherwise: If it's part of a CRLF, eat it and | |
3120 loop; the following LF gets added next time around. If it's | |
3121 not part of a CRLF, add the CR and loop. The following | |
3122 character will be processed in the next loop iteration. This | |
3123 correctly handles a sequence like CR+CR+LF. */ | |
3124 else if (src == end) | |
3125 return n - 1; /* reject the CR at the end; we'll get it again | |
3126 next time the convert method is called */ | |
3127 else if (*src != '\n') | |
3128 Dynarr_add (dst, '\r'); | |
3129 } | |
3130 } | |
3131 | |
3132 return n; | |
3133 } | |
3134 else | |
3135 { | |
3136 enum eol_type subtype = | |
3137 XCODING_SYSTEM_CONVERT_EOL_SUBTYPE (str->codesys); | |
867 | 3138 const Ibyte *end = src + n; |
771 | 3139 |
3140 /* We try to be relatively efficient here. */ | |
3141 if (subtype == EOL_LF) | |
3142 Dynarr_add_many (dst, src, n); | |
3143 else | |
3144 { | |
3145 while (1) | |
3146 { | |
3147 /* Find the next section with no \n and add it. */ | |
867 | 3148 const Ibyte *runstart = src; |
3149 src = (Ibyte *) memchr (src, '\n', end - src); | |
771 | 3150 if (!src) |
3151 src = end; | |
3152 Dynarr_add_many (dst, runstart, src - runstart); | |
3153 /* Stop if at end ... */ | |
3154 if (src == end) | |
3155 break; | |
3156 /* ... else, skip over \n and add its translation. */ | |
3157 src++; | |
3158 Dynarr_add (dst, '\r'); | |
3159 if (subtype == EOL_CRLF) | |
3160 Dynarr_add (dst, '\n'); | |
3161 } | |
3162 } | |
3163 | |
3164 return n; | |
3165 } | |
3166 } | |
3167 | |
3168 static Lisp_Object | |
3169 convert_eol_canonicalize_after_coding (struct coding_stream *str) | |
3170 { | |
3171 struct convert_eol_coding_stream *data = | |
3172 CODING_STREAM_TYPE_DATA (str, convert_eol); | |
3173 | |
3174 if (str->direction == CODING_ENCODE) | |
3175 return str->codesys; | |
3176 | |
3177 switch (data->actual) | |
3178 { | |
3179 case EOL_LF: return Fget_coding_system (Qconvert_eol_lf); | |
3180 case EOL_CRLF: return Fget_coding_system (Qconvert_eol_crlf); | |
3181 case EOL_CR: return Fget_coding_system (Qconvert_eol_cr); | |
3182 case EOL_AUTODETECT: return str->codesys; | |
2500 | 3183 default: ABORT (); return Qnil; |
771 | 3184 } |
3185 } | |
3186 | |
3187 | |
3188 /************************************************************************/ | |
3189 /* Undecided methods */ | |
3190 /************************************************************************/ | |
3191 | |
3192 /* Do autodetection. We can autodetect the EOL type only, the coding | |
3193 system only, or both. We only do autodetection when decoding; when | |
3194 encoding, we just pass the data through. | |
3195 | |
3196 When doing just EOL detection, a coding system can be specified; if so, | |
3197 we will decode this data through the coding system before doing EOL | |
3198 detection. The reason for specifying this is so that | |
3199 canonicalize-after-coding works: We will canonicalize the specified | |
3200 coding system into the appropriate EOL type. When doing both coding and | |
3201 EOL detection, we do similar canonicalization, and also catch situations | |
3202 where the EOL type is overspecified, i.e. the detected coding system | |
3203 specifies an EOL type, and either switch to the equivalent | |
3204 non-EOL-processing coding system (if possible), or terminate EOL | |
3205 detection and use the specified EOL type. This prevents data from being | |
3206 EOL-processed twice. | |
3207 */ | |
3208 | |
3209 struct undecided_coding_system | |
3210 { | |
3211 int do_eol, do_coding; | |
3212 Lisp_Object cs; | |
3213 }; | |
3214 | |
3215 struct undecided_coding_stream | |
3216 { | |
3217 Lisp_Object actual; | |
3218 /* Either 2 or 3 lstreams here; see undecided_convert */ | |
3219 struct chain_coding_stream c; | |
3220 | |
3221 struct detection_state *st; | |
3222 }; | |
3223 | |
1204 | 3224 static const struct memory_description undecided_coding_system_description[] = { |
3225 { XD_LISP_OBJECT, offsetof (struct undecided_coding_system, cs) }, | |
771 | 3226 { XD_END } |
3227 }; | |
3228 | |
1204 | 3229 static const struct memory_description undecided_coding_stream_description_1 [] = { |
3230 { XD_LISP_OBJECT, offsetof (struct undecided_coding_stream, actual) }, | |
2367 | 3231 { XD_BLOCK_ARRAY, offsetof (struct undecided_coding_stream, c), |
2551 | 3232 1, { &chain_coding_stream_description } }, |
1204 | 3233 { XD_END } |
3234 }; | |
3235 | |
3236 const struct sized_memory_description undecided_coding_stream_description = { | |
3237 sizeof (struct undecided_coding_stream), undecided_coding_stream_description_1 | |
3238 }; | |
3239 | |
3240 DEFINE_CODING_SYSTEM_TYPE_WITH_DATA (undecided); | |
3241 | |
771 | 3242 static void |
3243 undecided_init (Lisp_Object codesys) | |
3244 { | |
3245 struct undecided_coding_system *data = | |
3246 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
3247 | |
3248 data->cs = Qnil; | |
3249 } | |
3250 | |
3251 static void | |
3252 undecided_mark (Lisp_Object codesys) | |
3253 { | |
3254 struct undecided_coding_system *data = | |
3255 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
3256 | |
3257 mark_object (data->cs); | |
3258 } | |
3259 | |
3260 static void | |
3261 undecided_print (Lisp_Object cs, Lisp_Object printcharfun, int escapeflag) | |
3262 { | |
3263 struct undecided_coding_system *data = | |
3264 XCODING_SYSTEM_TYPE_DATA (cs, undecided); | |
3265 int need_space = 0; | |
3266 | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3267 write_ascstring (printcharfun, "("); |
771 | 3268 if (data->do_eol) |
3269 { | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3270 write_ascstring (printcharfun, "do-eol"); |
771 | 3271 need_space = 1; |
3272 } | |
3273 if (data->do_coding) | |
3274 { | |
3275 if (need_space) | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3276 write_ascstring (printcharfun, " "); |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3277 write_ascstring (printcharfun, "do-coding"); |
771 | 3278 need_space = 1; |
3279 } | |
3280 if (!NILP (data->cs)) | |
3281 { | |
3282 if (need_space) | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3283 write_ascstring (printcharfun, " "); |
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3284 write_ascstring (printcharfun, "coding-system="); |
771 | 3285 print_coding_system_in_print_method (data->cs, printcharfun, escapeflag); |
3286 } | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
3287 write_ascstring (printcharfun, ")"); |
771 | 3288 } |
3289 | |
3290 static void | |
3291 undecided_mark_coding_stream (struct coding_stream *str) | |
3292 { | |
1204 | 3293 mark_object (CODING_STREAM_TYPE_DATA (str, undecided)->actual); |
771 | 3294 chain_mark_coding_stream_1 (&CODING_STREAM_TYPE_DATA (str, undecided)->c); |
3295 } | |
3296 | |
3297 static int | |
3298 undecided_putprop (Lisp_Object codesys, Lisp_Object key, Lisp_Object value) | |
3299 { | |
3300 struct undecided_coding_system *data = | |
3301 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
3302 | |
3303 if (EQ (key, Qdo_eol)) | |
3304 data->do_eol = 1; | |
3305 else if (EQ (key, Qdo_coding)) | |
3306 data->do_coding = 1; | |
3307 else if (EQ (key, Qcoding_system)) | |
3308 data->cs = get_coding_system_for_text_file (value, 0); | |
3309 else | |
3310 return 0; | |
3311 return 1; | |
3312 } | |
3313 | |
3314 static Lisp_Object | |
3315 undecided_getprop (Lisp_Object codesys, Lisp_Object prop) | |
3316 { | |
3317 struct undecided_coding_system *data = | |
3318 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
3319 | |
3320 if (EQ (prop, Qdo_eol)) | |
3321 return data->do_eol ? Qt : Qnil; | |
3322 if (EQ (prop, Qdo_coding)) | |
3323 return data->do_coding ? Qt : Qnil; | |
3324 if (EQ (prop, Qcoding_system)) | |
3325 return data->cs; | |
3326 return Qunbound; | |
3327 } | |
3328 | |
3329 static struct detection_state * | |
3330 allocate_detection_state (void) | |
3331 { | |
3332 int i; | |
3333 Bytecount size = MAX_ALIGN_SIZE (sizeof (struct detection_state)); | |
3334 struct detection_state *block; | |
3335 | |
3336 for (i = 0; i < coding_detector_count; i++) | |
3337 size += MAX_ALIGN_SIZE (Dynarr_at (all_coding_detectors, i).data_size); | |
3338 | |
3339 block = (struct detection_state *) xmalloc_and_zero (size); | |
3340 | |
3341 size = MAX_ALIGN_SIZE (sizeof (struct detection_state)); | |
3342 for (i = 0; i < coding_detector_count; i++) | |
3343 { | |
3344 block->data_offset[i] = size; | |
3345 size += MAX_ALIGN_SIZE (Dynarr_at (all_coding_detectors, i).data_size); | |
3346 } | |
3347 | |
3348 return block; | |
3349 } | |
3350 | |
3351 static void | |
3352 free_detection_state (struct detection_state *st) | |
3353 { | |
3354 int i; | |
3355 | |
3356 for (i = 0; i < coding_detector_count; i++) | |
3357 { | |
3358 if (Dynarr_at (all_coding_detectors, i).finalize_detection_state_method) | |
3359 Dynarr_at (all_coding_detectors, i).finalize_detection_state_method | |
3360 (st); | |
3361 } | |
3362 | |
4976
16112448d484
Rename xfree(FOO, TYPE) -> xfree(FOO)
Ben Wing <ben@xemacs.org>
parents:
4967
diff
changeset
|
3363 xfree (st); |
771 | 3364 } |
3365 | |
3366 static int | |
3367 coding_category_symbol_to_id (Lisp_Object symbol) | |
428 | 3368 { |
3369 int i; | |
3370 | |
3371 CHECK_SYMBOL (symbol); | |
771 | 3372 for (i = 0; i < coding_detector_count; i++) |
3373 { | |
3374 detector_category_dynarr *cats = | |
3375 Dynarr_at (all_coding_detectors, i).cats; | |
3376 int j; | |
3377 | |
3378 for (j = 0; j < Dynarr_length (cats); j++) | |
3379 if (EQ (Dynarr_at (cats, j).sym, symbol)) | |
3380 return Dynarr_at (cats, j).id; | |
3381 } | |
3382 | |
563 | 3383 invalid_constant ("Unrecognized coding category", symbol); |
1204 | 3384 RETURN_NOT_REACHED (0); |
428 | 3385 } |
3386 | |
771 | 3387 static Lisp_Object |
3388 coding_category_id_to_symbol (int id) | |
428 | 3389 { |
3390 int i; | |
771 | 3391 |
3392 for (i = 0; i < coding_detector_count; i++) | |
3393 { | |
3394 detector_category_dynarr *cats = | |
3395 Dynarr_at (all_coding_detectors, i).cats; | |
3396 int j; | |
3397 | |
3398 for (j = 0; j < Dynarr_length (cats); j++) | |
3399 if (id == Dynarr_at (cats, j).id) | |
3400 return Dynarr_at (cats, j).sym; | |
3401 } | |
3402 | |
2500 | 3403 ABORT (); |
771 | 3404 return Qnil; /* (usually) not reached */ |
428 | 3405 } |
3406 | |
771 | 3407 static Lisp_Object |
3408 detection_result_number_to_symbol (enum detection_result result) | |
428 | 3409 { |
1494 | 3410 /* let compiler warn if not all enumerators are handled */ |
3411 switch (result) { | |
3412 #define FROB(sym, num) case num: return (sym) | |
771 | 3413 FROB (Qnear_certainty, DET_NEAR_CERTAINTY); |
3414 FROB (Qquite_probable, DET_QUITE_PROBABLE); | |
3415 FROB (Qsomewhat_likely, DET_SOMEWHAT_LIKELY); | |
1494 | 3416 FROB (Qslightly_likely, DET_SLIGHTLY_LIKELY); |
771 | 3417 FROB (Qas_likely_as_unlikely, DET_AS_LIKELY_AS_UNLIKELY); |
3418 FROB (Qsomewhat_unlikely, DET_SOMEWHAT_UNLIKELY); | |
3419 FROB (Qquite_improbable, DET_QUITE_IMPROBABLE); | |
3420 FROB (Qnearly_impossible, DET_NEARLY_IMPOSSIBLE); | |
3421 #undef FROB | |
1494 | 3422 } |
771 | 3423 |
2500 | 3424 ABORT (); |
771 | 3425 return Qnil; /* (usually) not reached */ |
3426 } | |
3427 | |
778 | 3428 #if 0 /* not used */ |
771 | 3429 static enum detection_result |
3430 detection_result_symbol_to_number (Lisp_Object symbol) | |
3431 { | |
1494 | 3432 /* using switch here would be bad style, and doesn't help */ |
771 | 3433 #define FROB(sym, num) if (EQ (symbol, sym)) return (num) |
3434 FROB (Qnear_certainty, DET_NEAR_CERTAINTY); | |
3435 FROB (Qquite_probable, DET_QUITE_PROBABLE); | |
3436 FROB (Qsomewhat_likely, DET_SOMEWHAT_LIKELY); | |
1494 | 3437 FROB (Qslightly_likely, DET_SLIGHTLY_LIKELY); |
771 | 3438 FROB (Qas_likely_as_unlikely, DET_AS_LIKELY_AS_UNLIKELY); |
3439 FROB (Qsomewhat_unlikely, DET_SOMEWHAT_UNLIKELY); | |
3440 FROB (Qquite_improbable, DET_QUITE_IMPROBABLE); | |
3441 FROB (Qnearly_impossible, DET_NEARLY_IMPOSSIBLE); | |
3442 #undef FROB | |
3443 | |
3444 invalid_constant ("Unrecognized detection result", symbol); | |
3445 return ((enum detection_result) 0); /* not reached */ | |
3446 } | |
778 | 3447 #endif /* 0 */ |
771 | 3448 |
3449 /* Set all detection results for a given detector to a specified value. */ | |
3450 void | |
3451 set_detection_results (struct detection_state *st, int detector, int given) | |
3452 { | |
3453 detector_category_dynarr *cats = | |
3454 Dynarr_at (all_coding_detectors, detector).cats; | |
3455 int i; | |
3456 | |
3457 for (i = 0; i < Dynarr_length (cats); i++) | |
3458 st->categories[Dynarr_at (cats, i).id] = given; | |
3459 } | |
428 | 3460 |
3461 static int | |
3462 acceptable_control_char_p (int c) | |
3463 { | |
3464 switch (c) | |
3465 { | |
3466 /* Allow and ignore control characters that you might | |
3467 reasonably see in a text file */ | |
3468 case '\r': | |
3469 case '\n': | |
3470 case '\t': | |
3471 case 7: /* bell */ | |
3472 case 8: /* backspace */ | |
3473 case 11: /* vertical tab */ | |
3474 case 12: /* form feed */ | |
3475 case 26: /* MS-DOS C-z junk */ | |
3476 case 31: /* '^_' -- for info */ | |
3477 return 1; | |
3478 default: | |
3479 return 0; | |
3480 } | |
3481 } | |
3482 | |
771 | 3483 #ifdef DEBUG_XEMACS |
3484 | |
3485 static UExtbyte | |
3486 hex_digit_to_char (int digit) | |
428 | 3487 { |
771 | 3488 if (digit < 10) |
3489 return digit + '0'; | |
3490 else | |
3491 return digit - 10 + 'A'; | |
428 | 3492 } |
3493 | |
771 | 3494 static void |
3495 output_bytes_in_ascii_and_hex (const UExtbyte *src, Bytecount n) | |
428 | 3496 { |
3425 | 3497 Extbyte *ascii = alloca_array (Extbyte, n + 1); |
3498 Extbyte *hex = alloca_array (Extbyte, 3 * n + 1); | |
771 | 3499 int i; |
3413 | 3500 DECLARE_EISTRING (eistr_ascii); |
3501 DECLARE_EISTRING (eistr_hex); | |
771 | 3502 |
3503 for (i = 0; i < n; i++) | |
428 | 3504 { |
3425 | 3505 Extbyte c = src[i]; |
771 | 3506 if (c < 0x20) |
3507 ascii[i] = '.'; | |
428 | 3508 else |
771 | 3509 ascii[i] = c; |
3510 hex[3 * i] = hex_digit_to_char (c >> 4); | |
3511 hex[3 * i + 1] = hex_digit_to_char (c & 0xF); | |
3512 hex[3 * i + 2] = ' '; | |
428 | 3513 } |
771 | 3514 ascii[i] = '\0'; |
3515 hex[3 * i - 1] = '\0'; | |
3413 | 3516 |
3517 eicpy_ext(eistr_hex, hex, Qbinary); | |
3518 eicpy_ext(eistr_ascii, ascii, Qbinary); | |
3519 | |
3425 | 3520 stderr_out ("%s %s", eidata(eistr_ascii), eidata(eistr_hex)); |
428 | 3521 } |
3522 | |
771 | 3523 #endif /* DEBUG_XEMACS */ |
3524 | |
3525 /* Attempt to determine the encoding of the given text. Before calling | |
3526 this function for the first time, you must zero out the detection state. | |
428 | 3527 |
3528 Returns: | |
3529 | |
771 | 3530 0 == keep going |
3531 1 == stop | |
428 | 3532 */ |
3533 | |
3534 static int | |
771 | 3535 detect_coding_type (struct detection_state *st, const UExtbyte *src, |
3536 Bytecount n) | |
428 | 3537 { |
771 | 3538 Bytecount n2 = n; |
3539 const UExtbyte *src2 = src; | |
3540 int i; | |
3541 | |
3542 #ifdef DEBUG_XEMACS | |
3543 if (!NILP (Vdebug_coding_detection)) | |
3544 { | |
3545 int bytes = min (16, n); | |
3546 stderr_out ("detect_coding_type: processing %ld bytes\n", n); | |
3547 stderr_out ("First %d: ", bytes); | |
3548 output_bytes_in_ascii_and_hex (src, bytes); | |
3549 stderr_out ("\nLast %d: ", bytes); | |
3550 output_bytes_in_ascii_and_hex (src + n - bytes, bytes); | |
3551 stderr_out ("\n"); | |
3552 } | |
3553 #endif /* DEBUG_XEMACS */ | |
428 | 3554 if (!st->seen_non_ascii) |
3555 { | |
771 | 3556 for (; n2; n2--, src2++) |
428 | 3557 { |
771 | 3558 UExtbyte c = *src2; |
428 | 3559 if ((c < 0x20 && !acceptable_control_char_p (c)) || c >= 0x80) |
3560 { | |
3561 st->seen_non_ascii = 1; | |
3562 break; | |
3563 } | |
3564 } | |
3565 } | |
3566 | |
771 | 3567 for (i = 0; i < coding_detector_count; i++) |
3568 Dynarr_at (all_coding_detectors, i).detect_method (st, src, n); | |
3569 | |
3570 st->bytes_seen += n; | |
3571 | |
3572 #ifdef DEBUG_XEMACS | |
3573 if (!NILP (Vdebug_coding_detection)) | |
3574 { | |
3575 stderr_out ("seen_non_ascii: %d\n", st->seen_non_ascii); | |
1494 | 3576 if (coding_detector_category_count <= 0) |
3577 stderr_out ("found %d detector categories\n", | |
3578 coding_detector_category_count); | |
771 | 3579 for (i = 0; i < coding_detector_category_count; i++) |
3580 stderr_out_lisp | |
3581 ("%s: %s\n", | |
3582 2, | |
3583 coding_category_id_to_symbol (i), | |
3584 detection_result_number_to_symbol ((enum detection_result) | |
3585 st->categories[i])); | |
3586 } | |
3587 #endif /* DEBUG_XEMACS */ | |
3588 | |
3589 { | |
3590 int not_unlikely = 0; | |
3591 int retval; | |
3592 | |
3593 for (i = 0; i < coding_detector_category_count; i++) | |
3594 if (st->categories[i] >= 0) | |
3595 not_unlikely++; | |
3596 | |
3597 retval = (not_unlikely <= 1 | |
3598 #if 0 /* this is bogus */ | |
3599 || st->bytes_seen >= MAX_BYTES_PROCESSED_FOR_DETECTION | |
428 | 3600 #endif |
771 | 3601 ); |
3602 | |
3603 #ifdef DEBUG_XEMACS | |
3604 if (!NILP (Vdebug_coding_detection)) | |
3605 stderr_out ("detect_coding_type: returning %d (%s)\n", | |
3606 retval, retval ? "stop" : "keep going"); | |
3607 #endif /* DEBUG_XEMACS */ | |
3608 | |
3609 return retval; | |
428 | 3610 } |
3611 } | |
3612 | |
3613 static Lisp_Object | |
771 | 3614 detected_coding_system (struct detection_state *st) |
428 | 3615 { |
771 | 3616 int i; |
3617 int even = 1; | |
3618 | |
3619 if (st->seen_non_ascii) | |
3620 { | |
3621 for (i = 0; i < coding_detector_category_count; i++) | |
3622 if (st->categories[i] != DET_AS_LIKELY_AS_UNLIKELY) | |
3623 { | |
3624 even = 0; | |
3625 break; | |
3626 } | |
3627 } | |
3628 | |
3629 /* #### Here we are ignoring the results of detection when it's all | |
3630 ASCII. This is obviously a bad thing. But we need to fix up the | |
3631 existing detection methods somewhat before we can switch. */ | |
3632 if (even) | |
428 | 3633 { |
3634 /* If the file was entirely or basically ASCII, use the | |
3635 default value of `buffer-file-coding-system'. */ | |
3636 Lisp_Object retval = | |
3637 XBUFFER (Vbuffer_defaults)->buffer_file_coding_system; | |
3638 if (!NILP (retval)) | |
3639 { | |
771 | 3640 retval = find_coding_system_for_text_file (retval, 0); |
428 | 3641 if (NILP (retval)) |
3642 { | |
3643 warn_when_safe | |
3644 (Qbad_variable, Qwarning, | |
3645 "Invalid `default-buffer-file-coding-system', set to nil"); | |
3646 XBUFFER (Vbuffer_defaults)->buffer_file_coding_system = Qnil; | |
3647 } | |
3648 } | |
3649 if (NILP (retval)) | |
4100 | 3650 retval = Fget_coding_system (Qbinary); |
428 | 3651 return retval; |
3652 } | |
3653 else | |
3654 { | |
771 | 3655 int likelihood; |
3656 Lisp_Object retval = Qnil; | |
3657 | |
3658 /* Look through the coding categories first by likelihood and then by | |
3659 priority and find the first one that is allowed. */ | |
3660 | |
3661 for (likelihood = DET_HIGHEST; likelihood >= DET_LOWEST; likelihood--) | |
428 | 3662 { |
771 | 3663 for (i = 0; i < coding_detector_category_count; i++) |
3664 { | |
3665 int cat = coding_category_by_priority[i]; | |
3666 if (st->categories[cat] == likelihood && | |
3667 !NILP (coding_category_system[cat])) | |
3668 { | |
3669 retval = (get_coding_system_for_text_file | |
3670 (coding_category_system[cat], 0)); | |
3671 if (likelihood < DET_AS_LIKELY_AS_UNLIKELY) | |
3672 warn_when_safe_lispobj | |
3673 (intern ("detection"), | |
793 | 3674 Qwarning, |
771 | 3675 emacs_sprintf_string_lisp |
3676 ( | |
3677 "Detected coding %s is unlikely to be correct (likelihood == `%s')", | |
3678 Qnil, 2, XCODING_SYSTEM_NAME (retval), | |
3679 detection_result_number_to_symbol | |
3680 ((enum detection_result) likelihood))); | |
3681 return retval; | |
3682 } | |
3683 } | |
428 | 3684 } |
771 | 3685 |
3686 return Fget_coding_system (Qraw_text); | |
428 | 3687 } |
3688 } | |
3689 | |
1347 | 3690 /* Look for a coding system in the string (skipping over leading |
3691 blanks). If found, return it, otherwise nil. */ | |
3692 | |
3693 static Lisp_Object | |
2531 | 3694 snarf_coding_system (const UExtbyte *p, Bytecount len) |
1347 | 3695 { |
3696 Bytecount n; | |
2531 | 3697 UExtbyte *name; |
1347 | 3698 |
3699 while (*p == ' ' || *p == '\t') p++, len--; | |
3700 len = min (len, 1000); | |
3701 name = alloca_ibytes (len + 1); | |
3702 memcpy (name, p, len); | |
3703 name[len] = '\0'; | |
3704 | |
3705 /* Get coding system name */ | |
3706 /* Characters valid in a MIME charset name (rfc 1521), | |
3707 and in a Lisp symbol name. */ | |
3708 n = qxestrspn (name, | |
3709 "ABCDEFGHIJKLMNOPQRSTUVWXYZ" | |
3710 "abcdefghijklmnopqrstuvwxyz" | |
3711 "0123456789" | |
3712 "!$%&*+-.^_{|}~"); | |
3713 if (n > 0) | |
3714 { | |
3715 name[n] = '\0'; | |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
3716 /* This call to intern_istring() is OK because we already verified that |
2531 | 3717 there are only ASCII characters in the string */ |
4953
304aebb79cd3
function renamings to track names of char typedefs
Ben Wing <ben@xemacs.org>
parents:
4952
diff
changeset
|
3718 return find_coding_system_for_text_file (intern_istring ((Ibyte *) name), 0); |
1347 | 3719 } |
3720 | |
3721 return Qnil; | |
3722 } | |
3723 | |
428 | 3724 /* Given a seekable read stream and potential coding system and EOL type |
3725 as specified, do any autodetection that is called for. If the | |
3726 coding system and/or EOL type are not `autodetect', they will be left | |
3727 alone; but this function will never return an autodetect coding system | |
3728 or EOL type. | |
3729 | |
3730 This function does not automatically fetch subsidiary coding systems; | |
3731 that should be unnecessary with the explicit eol-type argument. */ | |
3732 | |
3733 #define LENGTH(string_constant) (sizeof (string_constant) - 1) | |
3734 | |
771 | 3735 static Lisp_Object |
3736 unwind_free_detection_state (Lisp_Object opaque) | |
3737 { | |
3738 struct detection_state *st = | |
3739 (struct detection_state *) get_opaque_ptr (opaque); | |
3740 free_detection_state (st); | |
3741 free_opaque_ptr (opaque); | |
3742 return Qnil; | |
3743 } | |
3744 | |
1347 | 3745 /* #### This duplicates code in `find-coding-system-magic-cookie-in-file' |
3746 in files.el. Look into combining them. */ | |
3747 | |
771 | 3748 static Lisp_Object |
3749 look_for_coding_system_magic_cookie (const UExtbyte *data, Bytecount len) | |
428 | 3750 { |
771 | 3751 const UExtbyte *p; |
3752 const UExtbyte *scan_end; | |
2531 | 3753 Bytecount cookie_len; |
771 | 3754 |
3755 /* Look for initial "-*-"; mode line prefix */ | |
3756 for (p = data, | |
3757 scan_end = data + len - LENGTH ("-*-coding:?-*-"); | |
3758 p <= scan_end | |
3759 && *p != '\n' | |
3760 && *p != '\r'; | |
3761 p++) | |
3762 if (*p == '-' && *(p+1) == '*' && *(p+2) == '-') | |
3763 { | |
3764 const UExtbyte *local_vars_beg = p + 3; | |
3765 /* Look for final "-*-"; mode line suffix */ | |
3766 for (p = local_vars_beg, | |
3767 scan_end = data + len - LENGTH ("-*-"); | |
3768 p <= scan_end | |
428 | 3769 && *p != '\n' |
3770 && *p != '\r'; | |
771 | 3771 p++) |
3772 if (*p == '-' && *(p+1) == '*' && *(p+2) == '-') | |
3773 { | |
3774 const UExtbyte *suffix = p; | |
3775 /* Look for "coding:" */ | |
3776 for (p = local_vars_beg, | |
3777 scan_end = suffix - LENGTH ("coding:?"); | |
3778 p <= scan_end; | |
3779 p++) | |
3780 if (memcmp ("coding:", p, LENGTH ("coding:")) == 0 | |
3781 && (p == local_vars_beg | |
3782 || (*(p-1) == ' ' || | |
3783 *(p-1) == '\t' || | |
3784 *(p-1) == ';'))) | |
3785 { | |
3786 p += LENGTH ("coding:"); | |
1347 | 3787 return snarf_coding_system (p, suffix - p); |
771 | 3788 break; |
3789 } | |
3790 break; | |
3791 } | |
3792 break; | |
3793 } | |
3794 | |
2531 | 3795 /* Look for ;;;###coding system */ |
3796 | |
3797 cookie_len = LENGTH (";;;###coding system: "); | |
3798 | |
3799 for (p = data, | |
3800 scan_end = data + len - cookie_len; | |
3801 p <= scan_end; | |
3802 p++) | |
1347 | 3803 { |
2531 | 3804 if (*p == ';' && !memcmp (p, ";;;###coding system: ", cookie_len)) |
3805 { | |
3806 const UExtbyte *suffix; | |
3807 | |
3808 p += cookie_len; | |
3809 suffix = p; | |
3810 while (suffix < scan_end && !isspace (*suffix)) | |
3811 suffix++; | |
3812 return snarf_coding_system (p, suffix - p); | |
3813 } | |
1347 | 3814 } |
3815 | |
3816 return Qnil; | |
771 | 3817 } |
3818 | |
3819 static Lisp_Object | |
3820 determine_real_coding_system (Lstream *stream) | |
3821 { | |
3822 struct detection_state *st = allocate_detection_state (); | |
3823 int depth = record_unwind_protect (unwind_free_detection_state, | |
3824 make_opaque_ptr (st)); | |
3825 UExtbyte buf[4096]; | |
3826 Bytecount nread = Lstream_read (stream, buf, sizeof (buf)); | |
3827 Lisp_Object coding_system = look_for_coding_system_magic_cookie (buf, nread); | |
3828 | |
3829 if (NILP (coding_system)) | |
3830 { | |
3831 while (1) | |
3832 { | |
3833 if (detect_coding_type (st, buf, nread)) | |
428 | 3834 break; |
771 | 3835 nread = Lstream_read (stream, buf, sizeof (buf)); |
3836 if (nread == 0) | |
3837 break; | |
428 | 3838 } |
771 | 3839 |
3840 coding_system = detected_coding_system (st); | |
428 | 3841 } |
3842 | |
3843 Lstream_rewind (stream); | |
771 | 3844 |
3845 unbind_to (depth); | |
3846 return coding_system; | |
3847 } | |
3848 | |
3849 static void | |
3850 undecided_init_coding_stream (struct coding_stream *str) | |
3851 { | |
3852 struct undecided_coding_stream *data = | |
3853 CODING_STREAM_TYPE_DATA (str, undecided); | |
3854 struct undecided_coding_system *csdata = | |
3855 XCODING_SYSTEM_TYPE_DATA (str->codesys, undecided); | |
3856 | |
3857 data->actual = Qnil; | |
3858 | |
3859 if (str->direction == CODING_DECODE) | |
3860 { | |
3861 Lstream *lst = str->other_end; | |
3862 | |
3863 if ((lst->flags & LSTREAM_FL_READ) && | |
3864 Lstream_seekable_p (lst) && | |
3865 csdata->do_coding) | |
3866 /* We can determine the coding system now. */ | |
3867 data->actual = determine_real_coding_system (lst); | |
3868 } | |
1494 | 3869 |
3870 #ifdef DEBUG_XEMACS | |
3871 if (!NILP (Vdebug_coding_detection)) | |
3872 stderr_out_lisp ("detected coding system: %s\n", 1, data->actual); | |
3873 #endif /* DEBUG_XEMACS */ | |
771 | 3874 } |
3875 | |
3876 static void | |
3877 undecided_rewind_coding_stream (struct coding_stream *str) | |
3878 { | |
3879 chain_rewind_coding_stream_1 (&CODING_STREAM_TYPE_DATA (str, undecided)->c); | |
3880 } | |
3881 | |
3882 static void | |
3883 undecided_finalize_coding_stream (struct coding_stream *str) | |
3884 { | |
3885 struct undecided_coding_stream *data = | |
3886 CODING_STREAM_TYPE_DATA (str, undecided); | |
3887 | |
3888 chain_finalize_coding_stream_1 | |
3889 (&CODING_STREAM_TYPE_DATA (str, undecided)->c); | |
3890 if (data->st) | |
3891 free_detection_state (data->st); | |
3892 } | |
3893 | |
3894 static Lisp_Object | |
3895 undecided_canonicalize (Lisp_Object codesys) | |
3896 { | |
3897 struct undecided_coding_system *csdata = | |
3898 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
3899 if (!csdata->do_eol && !csdata->do_coding) | |
3900 return NILP (csdata->cs) ? Fget_coding_system (Qbinary) : csdata->cs; | |
3901 if (csdata->do_eol && !csdata->do_coding && NILP (csdata->cs)) | |
3902 return Fget_coding_system (Qconvert_eol_autodetect); | |
3903 return codesys; | |
3904 } | |
3905 | |
3906 static Bytecount | |
3907 undecided_convert (struct coding_stream *str, const UExtbyte *src, | |
3908 unsigned_char_dynarr *dst, Bytecount n) | |
3909 { | |
3910 int first_time = 0; | |
3911 | |
3912 if (str->direction == CODING_DECODE) | |
3913 { | |
3914 /* At this point, we have only the following possibilities: | |
3915 | |
3916 do_eol && do_coding | |
3917 do_coding only | |
3918 do_eol only and a coding system was specified | |
3919 | |
3920 Other possibilities are removed during undecided_canonicalize. | |
3921 | |
3922 Therefore, our substreams are either | |
3923 | |
3924 lstream_coding -> lstream_dynarr, or | |
3925 lstream_coding -> lstream_eol -> lstream_dynarr. | |
3926 */ | |
3927 struct undecided_coding_system *csdata = | |
3928 XCODING_SYSTEM_TYPE_DATA (str->codesys, undecided); | |
3929 struct undecided_coding_stream *data = | |
3930 CODING_STREAM_TYPE_DATA (str, undecided); | |
3931 | |
3932 if (str->eof) | |
3933 { | |
3934 /* Each will close the next. We need to close now because more | |
3935 data may be generated. */ | |
3936 if (data->c.initted) | |
3937 Lstream_close (XLSTREAM (data->c.lstreams[0])); | |
3938 return n; | |
3939 } | |
3940 | |
3941 if (!data->c.initted) | |
3942 { | |
3943 data->c.lstream_count = csdata->do_eol ? 3 : 2; | |
3944 data->c.lstreams = xnew_array (Lisp_Object, data->c.lstream_count); | |
3945 | |
3946 data->c.lstreams[data->c.lstream_count - 1] = | |
3947 make_dynarr_output_stream (dst); | |
3948 Lstream_set_buffering | |
3949 (XLSTREAM (data->c.lstreams[data->c.lstream_count - 1]), | |
3950 LSTREAM_UNBUFFERED, 0); | |
3951 if (csdata->do_eol) | |
3952 { | |
3953 data->c.lstreams[1] = | |
3954 make_coding_output_stream | |
3955 (XLSTREAM (data->c.lstreams[data->c.lstream_count - 1]), | |
3956 Fget_coding_system (Qconvert_eol_autodetect), | |
800 | 3957 CODING_DECODE, 0); |
771 | 3958 Lstream_set_buffering |
3959 (XLSTREAM (data->c.lstreams[1]), | |
3960 LSTREAM_UNBUFFERED, 0); | |
3961 } | |
3962 | |
3963 data->c.lstreams[0] = | |
3964 make_coding_output_stream | |
3965 (XLSTREAM (data->c.lstreams[1]), | |
3966 /* Substitute binary if we need to detect the encoding */ | |
3967 csdata->do_coding ? Qbinary : csdata->cs, | |
800 | 3968 CODING_DECODE, 0); |
771 | 3969 Lstream_set_buffering (XLSTREAM (data->c.lstreams[0]), |
3970 LSTREAM_UNBUFFERED, 0); | |
3971 | |
3972 first_time = 1; | |
3973 data->c.initted = 1; | |
3974 } | |
3975 | |
3976 /* If necessary, do encoding-detection now. We do this when we're a | |
3977 writing stream or a non-seekable reading stream, meaning that we | |
3978 can't just process the whole input, rewind, and start over. */ | |
3979 | |
3980 if (csdata->do_coding) | |
3981 { | |
3982 int actual_was_nil = NILP (data->actual); | |
3983 if (NILP (data->actual)) | |
3984 { | |
3985 if (!data->st) | |
3986 data->st = allocate_detection_state (); | |
3987 if (first_time) | |
3988 /* #### This is cheesy. What we really ought to do is buffer | |
3989 up a certain minimum amount of data to get a better result. | |
3990 */ | |
3991 data->actual = look_for_coding_system_magic_cookie (src, n); | |
3992 if (NILP (data->actual)) | |
3993 { | |
3994 /* #### This is cheesy. What we really ought to do is buffer | |
3995 up a certain minimum amount of data so as to get a less | |
3996 random result when doing subprocess detection. */ | |
3997 detect_coding_type (data->st, src, n); | |
3998 data->actual = detected_coding_system (data->st); | |
4100 | 3999 /* kludge to prevent infinite recursion */ |
4000 if (XCODING_SYSTEM(data->actual)->methods->enumtype == undecided_coding_system) | |
4001 data->actual = Fget_coding_system (Qbinary); | |
771 | 4002 } |
4003 } | |
4004 /* We need to set the detected coding system if we actually have | |
4005 such a coding system but didn't before. That is the case | |
4006 either when we just detected it in the previous code or when | |
4007 it was detected during undecided_init_coding_stream(). We | |
4008 can check for that using first_time. */ | |
4009 if (!NILP (data->actual) && (actual_was_nil || first_time)) | |
4010 { | |
4011 /* If the detected coding system doesn't allow for EOL | |
4012 autodetection, try to get the equivalent that does; | |
4013 otherwise, disable EOL detection (overriding whatever | |
4014 may already have been detected). */ | |
4015 if (XCODING_SYSTEM_EOL_TYPE (data->actual) != EOL_AUTODETECT) | |
4016 { | |
4017 if (!NILP (XCODING_SYSTEM_SUBSIDIARY_PARENT (data->actual))) | |
4018 data->actual = | |
4019 XCODING_SYSTEM_SUBSIDIARY_PARENT (data->actual); | |
4020 else if (data->c.lstream_count == 3) | |
4021 set_coding_stream_coding_system | |
4022 (XLSTREAM (data->c.lstreams[1]), | |
4023 Fget_coding_system (Qidentity)); | |
4024 } | |
4025 set_coding_stream_coding_system | |
4026 (XLSTREAM (data->c.lstreams[0]), data->actual); | |
4027 } | |
4028 } | |
4029 | |
4030 if (Lstream_write (XLSTREAM (data->c.lstreams[0]), src, n) < 0) | |
4031 return -1; | |
4032 return n; | |
4033 } | |
4034 else | |
4035 return no_conversion_convert (str, src, dst, n); | |
4036 } | |
4037 | |
4038 static Lisp_Object | |
4039 undecided_canonicalize_after_coding (struct coding_stream *str) | |
4040 { | |
4041 struct undecided_coding_stream *data = | |
4042 CODING_STREAM_TYPE_DATA (str, undecided); | |
4043 Lisp_Object ret, eolret; | |
4044 | |
4045 if (str->direction == CODING_ENCODE) | |
4046 return str->codesys; | |
4047 | |
4048 if (!data->c.initted) | |
4647
e4ed58cb0e5b
Fix bugs with #'find-file, 0-length files, & coding-system-for-read specified.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
4049 return str->codesys; |
771 | 4050 |
4051 ret = coding_stream_canonicalize_after_coding | |
4052 (XLSTREAM (data->c.lstreams[0])); | |
4053 if (NILP (ret)) | |
4647
e4ed58cb0e5b
Fix bugs with #'find-file, 0-length files, & coding-system-for-read specified.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
4054 ret = str->codesys; |
771 | 4055 if (XCODING_SYSTEM_EOL_TYPE (ret) != EOL_AUTODETECT) |
4056 return ret; | |
4057 eolret = coding_stream_canonicalize_after_coding | |
4058 (XLSTREAM (data->c.lstreams[1])); | |
4059 if (!EQ (XCODING_SYSTEM_TYPE (eolret), Qconvert_eol)) | |
4060 return ret; | |
4061 return | |
4062 Fsubsidiary_coding_system (ret, Fcoding_system_property (eolret, | |
4063 Qsubtype)); | |
4064 } | |
4065 | |
4066 | |
4067 /************************************************************************/ | |
4068 /* Lisp interface: Coding category functions and detection */ | |
4069 /************************************************************************/ | |
4070 | |
4071 DEFUN ("coding-category-list", Fcoding_category_list, 0, 0, 0, /* | |
4072 Return a list of all recognized coding categories. | |
4073 */ | |
4074 ()) | |
4075 { | |
4076 int i; | |
4077 Lisp_Object list = Qnil; | |
4078 | |
4079 for (i = 0; i < coding_detector_count; i++) | |
4080 { | |
4081 detector_category_dynarr *cats = | |
4082 Dynarr_at (all_coding_detectors, i).cats; | |
4083 int j; | |
4084 | |
4085 for (j = 0; j < Dynarr_length (cats); j++) | |
4086 list = Fcons (Dynarr_at (cats, j).sym, list); | |
4087 } | |
4088 | |
4089 return Fnreverse (list); | |
4090 } | |
4091 | |
4092 DEFUN ("set-coding-priority-list", Fset_coding_priority_list, 1, 1, 0, /* | |
4093 Change the priority order of the coding categories. | |
4094 LIST should be list of coding categories, in descending order of | |
4095 priority. Unspecified coding categories will be lower in priority | |
4096 than all specified ones, in the same relative order they were in | |
4097 previously. | |
4098 */ | |
4099 (list)) | |
4100 { | |
4101 int *category_to_priority = | |
4102 alloca_array (int, coding_detector_category_count); | |
4103 int i, j; | |
4104 | |
4105 /* First generate a list that maps coding categories to priorities. */ | |
4106 | |
4107 for (i = 0; i < coding_detector_category_count; i++) | |
4108 category_to_priority[i] = -1; | |
4109 | |
4110 /* Highest priority comes from the specified list. */ | |
4111 i = 0; | |
2367 | 4112 { |
4113 EXTERNAL_LIST_LOOP_2 (elt, list) | |
4114 { | |
4115 int cat = coding_category_symbol_to_id (elt); | |
4116 | |
4117 if (category_to_priority[cat] >= 0) | |
4118 sferror ("Duplicate coding category in list", elt); | |
4119 category_to_priority[cat] = i++; | |
4120 } | |
4121 } | |
771 | 4122 |
4123 /* Now go through the existing categories by priority to retrieve | |
4124 the categories not yet specified and preserve their priority | |
4125 order. */ | |
4126 for (j = 0; j < coding_detector_category_count; j++) | |
4127 { | |
4128 int cat = coding_category_by_priority[j]; | |
4129 if (category_to_priority[cat] < 0) | |
4130 category_to_priority[cat] = i++; | |
4131 } | |
4132 | |
4133 /* Now we need to construct the inverse of the mapping we just | |
4134 constructed. */ | |
4135 | |
4136 for (i = 0; i < coding_detector_category_count; i++) | |
4137 coding_category_by_priority[category_to_priority[i]] = i; | |
4138 | |
4139 /* Phew! That was confusing. */ | |
4140 return Qnil; | |
4141 } | |
4142 | |
4143 DEFUN ("coding-priority-list", Fcoding_priority_list, 0, 0, 0, /* | |
4144 Return a list of coding categories in descending order of priority. | |
4145 */ | |
4146 ()) | |
4147 { | |
4148 int i; | |
4149 Lisp_Object list = Qnil; | |
4150 | |
4151 for (i = 0; i < coding_detector_category_count; i++) | |
4152 list = | |
4153 Fcons (coding_category_id_to_symbol (coding_category_by_priority[i]), | |
4154 list); | |
4155 return Fnreverse (list); | |
4156 } | |
4157 | |
4158 DEFUN ("set-coding-category-system", Fset_coding_category_system, 2, 2, 0, /* | |
4159 Change the coding system associated with a coding category. | |
4160 */ | |
4161 (coding_category, coding_system)) | |
4162 { | |
4163 coding_category_system[coding_category_symbol_to_id (coding_category)] = | |
4164 Fget_coding_system (coding_system); | |
4165 return Qnil; | |
4166 } | |
4167 | |
4168 DEFUN ("coding-category-system", Fcoding_category_system, 1, 1, 0, /* | |
4169 Return the coding system associated with a coding category. | |
4170 */ | |
4171 (coding_category)) | |
4172 { | |
4173 Lisp_Object sys = | |
4174 coding_category_system[coding_category_symbol_to_id (coding_category)]; | |
4175 | |
4176 if (!NILP (sys)) | |
4177 return XCODING_SYSTEM_NAME (sys); | |
4178 return Qnil; | |
4179 } | |
4180 | |
800 | 4181 /* Detect the encoding of STREAM. Assumes stream is at the begnning and will |
4182 read through to the end of STREAM, leaving it there but open. */ | |
4183 | |
771 | 4184 Lisp_Object |
4185 detect_coding_stream (Lisp_Object stream) | |
4186 { | |
4187 Lisp_Object val = Qnil; | |
4188 struct gcpro gcpro1, gcpro2, gcpro3; | |
4189 UExtbyte random_buffer[65536]; | |
4190 Lisp_Object binary_instream = | |
4191 make_coding_input_stream | |
4192 (XLSTREAM (stream), Qbinary, | |
814 | 4193 CODING_ENCODE, LSTREAM_FL_NO_CLOSE_OTHER); |
771 | 4194 Lisp_Object decstream = |
4195 make_coding_input_stream | |
4196 (XLSTREAM (binary_instream), | |
800 | 4197 Qundecided, CODING_DECODE, 0); |
771 | 4198 Lstream *decstr = XLSTREAM (decstream); |
4199 | |
4200 GCPRO3 (decstream, stream, binary_instream); | |
4201 /* Read and discard all data; detection happens as a side effect of this, | |
4202 and we examine what was detected afterwards. */ | |
4203 while (Lstream_read (decstr, random_buffer, sizeof (random_buffer)) > 0) | |
4204 ; | |
4205 | |
4206 val = coding_stream_detected_coding_system (decstr); | |
4207 Lstream_close (decstr); | |
4208 Lstream_delete (decstr); | |
4209 Lstream_delete (XLSTREAM (binary_instream)); | |
4210 UNGCPRO; | |
4211 return val; | |
428 | 4212 } |
4213 | |
4214 DEFUN ("detect-coding-region", Fdetect_coding_region, 2, 3, 0, /* | |
4215 Detect coding system of the text in the region between START and END. | |
444 | 4216 Return a list of possible coding systems ordered by priority. |
3025 | 4217 If only ASCII characters are found, return `undecided' or one of |
428 | 4218 its subsidiary coding systems according to a detected end-of-line |
4219 type. Optional arg BUFFER defaults to the current buffer. | |
4220 */ | |
4221 (start, end, buffer)) | |
4222 { | |
4223 Lisp_Object val = Qnil; | |
4224 struct buffer *buf = decode_buffer (buffer, 0); | |
665 | 4225 Charbpos b, e; |
771 | 4226 Lisp_Object lb_instream; |
428 | 4227 |
4228 get_buffer_range_char (buf, start, end, &b, &e, 0); | |
4229 lb_instream = make_lisp_buffer_input_stream (buf, b, e, 0); | |
771 | 4230 |
4231 val = detect_coding_stream (lb_instream); | |
4232 Lstream_delete (XLSTREAM (lb_instream)); | |
428 | 4233 return val; |
4234 } | |
4235 | |
4236 | |
771 | 4237 |
4238 #ifdef DEBUG_XEMACS | |
4239 | |
428 | 4240 /************************************************************************/ |
771 | 4241 /* Internal methods */ |
4242 /************************************************************************/ | |
4243 | |
4244 /* Raw (internally-formatted) data. */ | |
4245 DEFINE_CODING_SYSTEM_TYPE (internal); | |
428 | 4246 |
665 | 4247 static Bytecount |
2286 | 4248 internal_convert (struct coding_stream *UNUSED (str), const UExtbyte *src, |
771 | 4249 unsigned_char_dynarr *dst, Bytecount n) |
4250 { | |
4251 Bytecount orign = n; | |
4252 Dynarr_add_many (dst, src, n); | |
4253 return orign; | |
4254 } | |
4255 | |
4256 #endif /* DEBUG_XEMACS */ | |
4257 | |
4258 | |
4259 | |
4260 #ifdef HAVE_ZLIB | |
4261 | |
4262 /************************************************************************/ | |
4263 /* Gzip methods */ | |
4264 /************************************************************************/ | |
4265 | |
4266 struct gzip_coding_system | |
428 | 4267 { |
771 | 4268 int level; /* 0 through 9, or -1 for default */ |
4269 }; | |
4270 | |
4271 #define CODING_SYSTEM_GZIP_LEVEL(codesys) \ | |
4272 (CODING_SYSTEM_TYPE_DATA (codesys, gzip)->level) | |
4273 #define XCODING_SYSTEM_GZIP_LEVEL(codesys) \ | |
4274 (XCODING_SYSTEM_TYPE_DATA (codesys, gzip)->level) | |
4275 | |
4276 struct gzip_coding_stream | |
428 | 4277 { |
771 | 4278 z_stream stream; |
4279 int stream_initted; | |
4280 int reached_eof; /* #### this should be handled by the caller, once we | |
4281 return LSTREAM_EOF */ | |
4282 }; | |
4283 | |
1204 | 4284 static const struct memory_description |
771 | 4285 gzip_coding_system_description[] = { |
4286 { XD_END } | |
4287 }; | |
4288 | |
1204 | 4289 DEFINE_CODING_SYSTEM_TYPE_WITH_DATA (gzip); |
4290 | |
771 | 4291 enum source_sink_type |
4292 gzip_conversion_end_type (Lisp_Object codesys) | |
4293 { | |
4294 return DECODES_BYTE_TO_BYTE; | |
428 | 4295 } |
4296 | |
4297 static void | |
771 | 4298 gzip_init (Lisp_Object codesys) |
4299 { | |
4300 struct gzip_coding_system *data = XCODING_SYSTEM_TYPE_DATA (codesys, gzip); | |
4301 data->level = -1; | |
4302 } | |
4303 | |
4304 static void | |
4305 gzip_print (Lisp_Object cs, Lisp_Object printcharfun, int escapeflag) | |
428 | 4306 { |
771 | 4307 struct gzip_coding_system *data = XCODING_SYSTEM_TYPE_DATA (cs, gzip); |
4308 | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4309 write_ascstring (printcharfun, "("); |
771 | 4310 if (data->level == -1) |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4311 write_ascstring (printcharfun, "default"); |
771 | 4312 else |
4313 print_internal (make_int (data->level), printcharfun, 0); | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4314 write_ascstring (printcharfun, ")"); |
428 | 4315 } |
4316 | |
4317 static int | |
771 | 4318 gzip_putprop (Lisp_Object codesys, Lisp_Object key, Lisp_Object value) |
428 | 4319 { |
771 | 4320 struct gzip_coding_system *data = XCODING_SYSTEM_TYPE_DATA (codesys, gzip); |
4321 | |
4322 if (EQ (key, Qlevel)) | |
428 | 4323 { |
771 | 4324 if (EQ (value, Qdefault)) |
4325 data->level = -1; | |
4326 else | |
428 | 4327 { |
771 | 4328 CHECK_INT (value); |
4329 check_int_range (XINT (value), 0, 9); | |
4330 data->level = XINT (value); | |
428 | 4331 } |
4332 } | |
4333 else | |
771 | 4334 return 0; |
4335 return 1; | |
428 | 4336 } |
4337 | |
4338 static Lisp_Object | |
771 | 4339 gzip_getprop (Lisp_Object coding_system, Lisp_Object prop) |
428 | 4340 { |
771 | 4341 struct gzip_coding_system *data = |
4342 XCODING_SYSTEM_TYPE_DATA (coding_system, gzip); | |
4343 | |
4344 if (EQ (prop, Qlevel)) | |
428 | 4345 { |
771 | 4346 if (data->level == -1) |
4347 return Qdefault; | |
4348 return make_int (data->level); | |
428 | 4349 } |
771 | 4350 |
4351 return Qunbound; | |
428 | 4352 } |
4353 | |
4354 static void | |
771 | 4355 gzip_init_coding_stream (struct coding_stream *str) |
428 | 4356 { |
771 | 4357 struct gzip_coding_stream *data = CODING_STREAM_TYPE_DATA (str, gzip); |
4358 if (data->stream_initted) | |
428 | 4359 { |
771 | 4360 if (str->direction == CODING_DECODE) |
4361 inflateEnd (&data->stream); | |
4362 else | |
4363 deflateEnd (&data->stream); | |
4364 data->stream_initted = 0; | |
428 | 4365 } |
771 | 4366 data->reached_eof = 0; |
428 | 4367 } |
4368 | |
4369 static void | |
771 | 4370 gzip_rewind_coding_stream (struct coding_stream *str) |
428 | 4371 { |
771 | 4372 gzip_init_coding_stream (str); |
428 | 4373 } |
4374 | |
771 | 4375 static Bytecount |
4376 gzip_convert (struct coding_stream *str, | |
4377 const UExtbyte *src, | |
4378 unsigned_char_dynarr *dst, Bytecount n) | |
428 | 4379 { |
771 | 4380 struct gzip_coding_stream *data = CODING_STREAM_TYPE_DATA (str, gzip); |
4381 int zerr; | |
4382 if (str->direction == CODING_DECODE) | |
428 | 4383 { |
771 | 4384 if (data->reached_eof) |
4385 return n; /* eat the data */ | |
4386 | |
4387 if (!data->stream_initted) | |
428 | 4388 { |
771 | 4389 xzero (data->stream); |
4390 if (inflateInit (&data->stream) != Z_OK) | |
4391 return LSTREAM_ERROR; | |
4392 data->stream_initted = 1; | |
428 | 4393 } |
771 | 4394 |
4395 data->stream.next_in = (Bytef *) src; | |
4396 data->stream.avail_in = n; | |
4397 | |
4398 /* Normally we stop when we've fed all data to the decompressor; but | |
4399 if we're at the end of the input, and the decompressor hasn't | |
4400 reported EOF, we need to keep going, as there might be more output | |
4401 to generate. Z_OK from the decompressor means input was processed | |
4402 or output was generated; if neither, we break out of the loop. | |
4403 Other return values are: | |
4404 | |
4405 Z_STREAM_END EOF from decompressor | |
4406 Z_DATA_ERROR Corrupted data | |
4407 Z_BUF_ERROR No progress possible (this should happen if | |
4408 we try to feed it an incomplete file) | |
4409 Z_MEM_ERROR Out of memory | |
4410 Z_STREAM_ERROR (should never happen) | |
4411 Z_NEED_DICT (#### when will this happen?) | |
4412 */ | |
4413 while (data->stream.avail_in > 0 || str->eof) | |
4414 { | |
4415 /* Reserve an output buffer of the same size as the input buffer; | |
4416 if that's not enough, we keep reserving the same size. */ | |
4417 Bytecount reserved = n; | |
4418 Dynarr_add_many (dst, 0, reserved); | |
4419 /* Careful here! Don't retrieve the pointer until after | |
4420 reserving the space, or it might be bogus */ | |
4421 data->stream.next_out = | |
4422 Dynarr_atp (dst, Dynarr_length (dst) - reserved); | |
4423 data->stream.avail_out = reserved; | |
4424 zerr = inflate (&data->stream, Z_NO_FLUSH); | |
4425 /* Lop off the unused portion */ | |
4967 | 4426 Dynarr_set_length (dst, Dynarr_length (dst) - data->stream.avail_out); |
771 | 4427 if (zerr != Z_OK) |
4428 break; | |
4429 } | |
4430 | |
4431 if (zerr == Z_STREAM_END) | |
4432 data->reached_eof = 1; | |
4433 | |
4434 if ((Bytecount) data->stream.avail_in < n) | |
4435 return n - data->stream.avail_in; | |
4436 | |
4437 if (zerr == Z_OK || zerr == Z_STREAM_END) | |
4438 return 0; | |
4439 | |
4440 return LSTREAM_ERROR; | |
428 | 4441 } |
4442 else | |
4443 { | |
771 | 4444 if (!data->stream_initted) |
4445 { | |
4446 int level = XCODING_SYSTEM_GZIP_LEVEL (str->codesys); | |
4447 xzero (data->stream); | |
4448 if (deflateInit (&data->stream, | |
4449 level == -1 ? Z_DEFAULT_COMPRESSION : level) != | |
4450 Z_OK) | |
4451 return LSTREAM_ERROR; | |
4452 data->stream_initted = 1; | |
428 | 4453 } |
771 | 4454 |
4455 data->stream.next_in = (Bytef *) src; | |
4456 data->stream.avail_in = n; | |
4457 | |
4458 /* Normally we stop when we've fed all data to the compressor; but if | |
4459 we're at the end of the input, and the compressor hasn't reported | |
4460 EOF, we need to keep going, as there might be more output to | |
4461 generate. (To signal EOF on our end, we set the FLUSH parameter | |
4462 to Z_FINISH; when all data is output, Z_STREAM_END will be | |
4463 returned.) Z_OK from the compressor means input was processed or | |
4464 output was generated; if neither, we break out of the loop. Other | |
4465 return values are: | |
4466 | |
4467 Z_STREAM_END EOF from compressor | |
4468 Z_BUF_ERROR No progress possible (should never happen) | |
4469 Z_STREAM_ERROR (should never happen) | |
4470 */ | |
4471 while (data->stream.avail_in > 0 || str->eof) | |
4472 { | |
4473 /* Reserve an output buffer of the same size as the input buffer; | |
4474 if that's not enough, we keep reserving the same size. */ | |
4475 Bytecount reserved = n; | |
4476 Dynarr_add_many (dst, 0, reserved); | |
4477 /* Careful here! Don't retrieve the pointer until after | |
4478 reserving the space, or it might be bogus */ | |
4479 data->stream.next_out = | |
4480 Dynarr_atp (dst, Dynarr_length (dst) - reserved); | |
4481 data->stream.avail_out = reserved; | |
4482 zerr = | |
4483 deflate (&data->stream, | |
4484 str->eof ? Z_FINISH : Z_NO_FLUSH); | |
4485 /* Lop off the unused portion */ | |
4967 | 4486 Dynarr_set_length (dst, Dynarr_length (dst) - data->stream.avail_out); |
771 | 4487 if (zerr != Z_OK) |
4488 break; | |
4489 } | |
4490 | |
4491 if ((Bytecount) data->stream.avail_in < n) | |
4492 return n - data->stream.avail_in; | |
4493 | |
4494 if (zerr == Z_OK || zerr == Z_STREAM_END) | |
4495 return 0; | |
4496 | |
4497 return LSTREAM_ERROR; | |
428 | 4498 } |
4499 } | |
4500 | |
771 | 4501 #endif /* HAVE_ZLIB */ |
428 | 4502 |
4503 | |
4504 /************************************************************************/ | |
4505 /* Initialization */ | |
4506 /************************************************************************/ | |
4507 | |
4508 void | |
4509 syms_of_file_coding (void) | |
4510 { | |
442 | 4511 INIT_LRECORD_IMPLEMENTATION (coding_system); |
4512 | |
771 | 4513 DEFSUBR (Fvalid_coding_system_type_p); |
4514 DEFSUBR (Fcoding_system_type_list); | |
428 | 4515 DEFSUBR (Fcoding_system_p); |
4303 | 4516 DEFSUBR (Fautoload_coding_system); |
428 | 4517 DEFSUBR (Ffind_coding_system); |
4518 DEFSUBR (Fget_coding_system); | |
4519 DEFSUBR (Fcoding_system_list); | |
4520 DEFSUBR (Fcoding_system_name); | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4521 DEFSUBR (Fmake_coding_system_internal); |
428 | 4522 DEFSUBR (Fcopy_coding_system); |
440 | 4523 DEFSUBR (Fcoding_system_canonical_name_p); |
4524 DEFSUBR (Fcoding_system_alias_p); | |
4525 DEFSUBR (Fcoding_system_aliasee); | |
428 | 4526 DEFSUBR (Fdefine_coding_system_alias); |
4527 DEFSUBR (Fsubsidiary_coding_system); | |
771 | 4528 DEFSUBR (Fcoding_system_base); |
4529 DEFSUBR (Fcoding_system_used_for_io); | |
428 | 4530 |
4531 DEFSUBR (Fcoding_system_type); | |
771 | 4532 DEFSUBR (Fcoding_system_description); |
428 | 4533 DEFSUBR (Fcoding_system_property); |
4534 | |
4535 DEFSUBR (Fcoding_category_list); | |
4536 DEFSUBR (Fset_coding_priority_list); | |
4537 DEFSUBR (Fcoding_priority_list); | |
4538 DEFSUBR (Fset_coding_category_system); | |
4539 DEFSUBR (Fcoding_category_system); | |
4540 | |
4541 DEFSUBR (Fdetect_coding_region); | |
4542 DEFSUBR (Fdecode_coding_region); | |
4543 DEFSUBR (Fencode_coding_region); | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4544 DEFSUBR (Fquery_coding_region); |
563 | 4545 DEFSYMBOL_MULTIWORD_PREDICATE (Qcoding_systemp); |
4546 DEFSYMBOL (Qno_conversion); | |
771 | 4547 DEFSYMBOL (Qconvert_eol); |
4548 DEFSYMBOL (Qconvert_eol_autodetect); | |
4549 DEFSYMBOL (Qconvert_eol_lf); | |
4550 DEFSYMBOL (Qconvert_eol_cr); | |
4551 DEFSYMBOL (Qconvert_eol_crlf); | |
563 | 4552 DEFSYMBOL (Qraw_text); |
771 | 4553 |
563 | 4554 DEFSYMBOL (Qmnemonic); |
4555 DEFSYMBOL (Qeol_type); | |
4556 DEFSYMBOL (Qpost_read_conversion); | |
4557 DEFSYMBOL (Qpre_write_conversion); | |
4558 | |
771 | 4559 DEFSYMBOL (Qtranslation_table_for_decode); |
4560 DEFSYMBOL (Qtranslation_table_for_encode); | |
4561 DEFSYMBOL (Qsafe_chars); | |
4562 DEFSYMBOL (Qsafe_charsets); | |
4563 DEFSYMBOL (Qmime_charset); | |
4564 DEFSYMBOL (Qvalid_codes); | |
4565 | |
563 | 4566 DEFSYMBOL (Qcr); |
4567 DEFSYMBOL (Qlf); | |
4568 DEFSYMBOL (Qcrlf); | |
4569 DEFSYMBOL (Qeol_cr); | |
4570 DEFSYMBOL (Qeol_lf); | |
4571 DEFSYMBOL (Qeol_crlf); | |
4572 DEFSYMBOL (Qencode); | |
4573 DEFSYMBOL (Qdecode); | |
428 | 4574 |
771 | 4575 DEFSYMBOL (Qnear_certainty); |
4576 DEFSYMBOL (Qquite_probable); | |
4577 DEFSYMBOL (Qsomewhat_likely); | |
1494 | 4578 DEFSYMBOL (Qslightly_likely); |
771 | 4579 DEFSYMBOL (Qas_likely_as_unlikely); |
4580 DEFSYMBOL (Qsomewhat_unlikely); | |
4581 DEFSYMBOL (Qquite_improbable); | |
4582 DEFSYMBOL (Qnearly_impossible); | |
4583 | |
4584 DEFSYMBOL (Qdo_eol); | |
4585 DEFSYMBOL (Qdo_coding); | |
4586 | |
4587 DEFSYMBOL (Qcanonicalize_after_coding); | |
4588 | |
4303 | 4589 DEFSYMBOL (Qposix_charset_to_coding_system_hash); |
4590 | |
771 | 4591 DEFSYMBOL (Qescape_quoted); |
4592 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4593 DEFSYMBOL (Qquery_coding_warning_face); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4594 DEFSYMBOL (Qaliases); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4595 DEFSYMBOL (Qcharset_skip_chars_string); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4596 |
771 | 4597 #ifdef HAVE_ZLIB |
4598 DEFSYMBOL (Qgzip); | |
4599 #endif | |
4600 | |
428 | 4601 } |
4602 | |
4603 void | |
4604 lstream_type_create_file_coding (void) | |
4605 { | |
771 | 4606 LSTREAM_HAS_METHOD (coding, reader); |
4607 LSTREAM_HAS_METHOD (coding, writer); | |
4608 LSTREAM_HAS_METHOD (coding, rewinder); | |
4609 LSTREAM_HAS_METHOD (coding, seekable_p); | |
4610 LSTREAM_HAS_METHOD (coding, marker); | |
4611 LSTREAM_HAS_METHOD (coding, flusher); | |
4612 LSTREAM_HAS_METHOD (coding, closer); | |
4613 LSTREAM_HAS_METHOD (coding, finalizer); | |
4614 } | |
4615 | |
4616 void | |
4617 coding_system_type_create (void) | |
4618 { | |
4619 int i; | |
4620 | |
4621 staticpro (&Vcoding_system_hash_table); | |
4622 Vcoding_system_hash_table = | |
4623 make_lisp_hash_table (50, HASH_TABLE_NON_WEAK, HASH_TABLE_EQ); | |
4624 | |
4625 the_coding_system_type_entry_dynarr = Dynarr_new (coding_system_type_entry); | |
2367 | 4626 dump_add_root_block_ptr (&the_coding_system_type_entry_dynarr, |
771 | 4627 &csted_description); |
4628 | |
4629 Vcoding_system_type_list = Qnil; | |
4630 staticpro (&Vcoding_system_type_list); | |
4631 | |
4632 /* Initialize to something reasonable ... */ | |
4633 for (i = 0; i < MAX_DETECTOR_CATEGORIES; i++) | |
4634 { | |
4635 coding_category_system[i] = Qnil; | |
1204 | 4636 dump_add_root_lisp_object (&coding_category_system[i]); |
771 | 4637 coding_category_by_priority[i] = i; |
4638 } | |
4639 | |
4640 dump_add_opaque (coding_category_by_priority, | |
4641 sizeof (coding_category_by_priority)); | |
4642 | |
4643 all_coding_detectors = Dynarr_new2 (detector_dynarr, struct detector); | |
2367 | 4644 dump_add_root_block_ptr (&all_coding_detectors, |
771 | 4645 &detector_dynarr_description); |
4646 | |
4647 dump_add_opaque_int (&coding_system_tick); | |
4648 dump_add_opaque_int (&coding_detector_count); | |
4649 dump_add_opaque_int (&coding_detector_category_count); | |
4650 | |
4651 INITIALIZE_CODING_SYSTEM_TYPE (no_conversion, | |
4652 "no-conversion-coding-system-p"); | |
4653 CODING_SYSTEM_HAS_METHOD (no_conversion, convert); | |
4654 | |
4655 INITIALIZE_DETECTOR (no_conversion); | |
4656 DETECTOR_HAS_METHOD (no_conversion, detect); | |
4657 INITIALIZE_DETECTOR_CATEGORY (no_conversion, no_conversion); | |
4658 | |
4659 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA (convert_eol, | |
4660 "convert-eol-coding-system-p"); | |
4661 CODING_SYSTEM_HAS_METHOD (convert_eol, print); | |
4662 CODING_SYSTEM_HAS_METHOD (convert_eol, convert); | |
4663 CODING_SYSTEM_HAS_METHOD (convert_eol, getprop); | |
4664 CODING_SYSTEM_HAS_METHOD (convert_eol, putprop); | |
4665 CODING_SYSTEM_HAS_METHOD (convert_eol, conversion_end_type); | |
4666 CODING_SYSTEM_HAS_METHOD (convert_eol, canonicalize_after_coding); | |
4667 CODING_SYSTEM_HAS_METHOD (convert_eol, init_coding_stream); | |
4668 | |
4669 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA (undecided, | |
4670 "undecided-coding-system-p"); | |
4671 CODING_SYSTEM_HAS_METHOD (undecided, init); | |
4672 CODING_SYSTEM_HAS_METHOD (undecided, mark); | |
4673 CODING_SYSTEM_HAS_METHOD (undecided, print); | |
4674 CODING_SYSTEM_HAS_METHOD (undecided, convert); | |
4675 CODING_SYSTEM_HAS_METHOD (undecided, putprop); | |
4676 CODING_SYSTEM_HAS_METHOD (undecided, getprop); | |
4677 CODING_SYSTEM_HAS_METHOD (undecided, init_coding_stream); | |
4678 CODING_SYSTEM_HAS_METHOD (undecided, rewind_coding_stream); | |
4679 CODING_SYSTEM_HAS_METHOD (undecided, finalize_coding_stream); | |
4680 CODING_SYSTEM_HAS_METHOD (undecided, mark_coding_stream); | |
4681 CODING_SYSTEM_HAS_METHOD (undecided, canonicalize); | |
4682 CODING_SYSTEM_HAS_METHOD (undecided, canonicalize_after_coding); | |
4683 | |
4684 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA (chain, "chain-coding-system-p"); | |
4685 | |
4686 CODING_SYSTEM_HAS_METHOD (chain, print); | |
4687 CODING_SYSTEM_HAS_METHOD (chain, canonicalize); | |
4688 CODING_SYSTEM_HAS_METHOD (chain, init); | |
4689 CODING_SYSTEM_HAS_METHOD (chain, mark); | |
4690 CODING_SYSTEM_HAS_METHOD (chain, mark_coding_stream); | |
4691 CODING_SYSTEM_HAS_METHOD (chain, convert); | |
4692 CODING_SYSTEM_HAS_METHOD (chain, rewind_coding_stream); | |
4693 CODING_SYSTEM_HAS_METHOD (chain, finalize_coding_stream); | |
4694 CODING_SYSTEM_HAS_METHOD (chain, finalize); | |
4695 CODING_SYSTEM_HAS_METHOD (chain, putprop); | |
4696 CODING_SYSTEM_HAS_METHOD (chain, getprop); | |
4697 CODING_SYSTEM_HAS_METHOD (chain, conversion_end_type); | |
4698 CODING_SYSTEM_HAS_METHOD (chain, canonicalize_after_coding); | |
4699 | |
4700 #ifdef DEBUG_XEMACS | |
4701 INITIALIZE_CODING_SYSTEM_TYPE (internal, "internal-coding-system-p"); | |
4702 CODING_SYSTEM_HAS_METHOD (internal, convert); | |
4703 #endif | |
4704 | |
4705 #ifdef HAVE_ZLIB | |
4706 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA (gzip, "gzip-coding-system-p"); | |
4707 CODING_SYSTEM_HAS_METHOD (gzip, conversion_end_type); | |
4708 CODING_SYSTEM_HAS_METHOD (gzip, convert); | |
4709 CODING_SYSTEM_HAS_METHOD (gzip, init); | |
4710 CODING_SYSTEM_HAS_METHOD (gzip, print); | |
4711 CODING_SYSTEM_HAS_METHOD (gzip, init_coding_stream); | |
4712 CODING_SYSTEM_HAS_METHOD (gzip, rewind_coding_stream); | |
4713 CODING_SYSTEM_HAS_METHOD (gzip, putprop); | |
4714 CODING_SYSTEM_HAS_METHOD (gzip, getprop); | |
4715 #endif | |
4716 } | |
4717 | |
4718 void | |
4719 reinit_coding_system_type_create (void) | |
4720 { | |
4721 REINITIALIZE_CODING_SYSTEM_TYPE (no_conversion); | |
4722 REINITIALIZE_CODING_SYSTEM_TYPE (convert_eol); | |
4723 REINITIALIZE_CODING_SYSTEM_TYPE (undecided); | |
4724 REINITIALIZE_CODING_SYSTEM_TYPE (chain); | |
4725 #if 0 | |
4726 REINITIALIZE_CODING_SYSTEM_TYPE (text_file_wrapper); | |
4727 #endif /* 0 */ | |
4728 #ifdef DEBUG_XEMACS | |
4729 REINITIALIZE_CODING_SYSTEM_TYPE (internal); | |
4730 #endif | |
4731 #ifdef HAVE_ZLIB | |
4732 REINITIALIZE_CODING_SYSTEM_TYPE (gzip); | |
4733 #endif | |
4734 } | |
4735 | |
4736 void | |
4737 reinit_vars_of_file_coding (void) | |
4738 { | |
428 | 4739 } |
4740 | |
4741 void | |
4742 vars_of_file_coding (void) | |
4743 { | |
771 | 4744 /* We always have file-coding support */ |
428 | 4745 Fprovide (intern ("file-coding")); |
4746 | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4747 QScoding_system_cookie = build_ascstring (";;;###coding system: "); |
1347 | 4748 staticpro (&QScoding_system_cookie); |
4749 | |
1242 | 4750 #ifdef HAVE_DEFAULT_EOL_DETECTION |
2297 | 4751 /* #### Find a more appropriate place for this comment. |
4752 WARNING: The existing categories are intimately tied to the function | |
1242 | 4753 `coding-system-category' in coding.el. If you change a category, or |
4754 change the layout of any coding system associated with a category, you | |
4755 need to check that function and make sure it's written properly. */ | |
4756 | |
4757 Fprovide (intern ("unix-default-eol-detection")); | |
4758 #endif | |
4759 | |
428 | 4760 DEFVAR_LISP ("keyboard-coding-system", &Vkeyboard_coding_system /* |
3142 | 4761 Default coding system used for TTY and X11 keyboard input. |
4762 Under X11, used only to interpet the character for a key event when that | |
4763 event has a KeySym of NoSymbol but does have an associated string keysym, | |
4764 something that's seen with input methods. | |
4765 | |
4766 If you need to set these things to different coding systems, call the | |
4767 function `set-console-tty-coding-system' for the TTY and use this variable | |
4768 for X11. | |
428 | 4769 */ ); |
4770 Vkeyboard_coding_system = Qnil; | |
4771 | |
4772 DEFVAR_LISP ("terminal-coding-system", &Vterminal_coding_system /* | |
4773 Coding system used for TTY display output. | |
4774 Not used under a windowing system. | |
4775 */ ); | |
4776 Vterminal_coding_system = Qnil; | |
4777 | |
4778 DEFVAR_LISP ("coding-system-for-read", &Vcoding_system_for_read /* | |
440 | 4779 Overriding coding system used when reading from a file or process. |
4780 You should bind this variable with `let', but do not set it globally. | |
4781 If this is non-nil, it specifies the coding system that will be used | |
4782 to decode input on read operations, such as from a file or process. | |
4783 It overrides `buffer-file-coding-system-for-read', | |
428 | 4784 `insert-file-contents-pre-hook', etc. Use those variables instead of |
440 | 4785 this one for permanent changes to the environment. */ ); |
428 | 4786 Vcoding_system_for_read = Qnil; |
4787 | |
4788 DEFVAR_LISP ("coding-system-for-write", | |
4789 &Vcoding_system_for_write /* | |
440 | 4790 Overriding coding system used when writing to a file or process. |
4791 You should bind this variable with `let', but do not set it globally. | |
4792 If this is non-nil, it specifies the coding system that will be used | |
4793 to encode output for write operations, such as to a file or process. | |
4794 It overrides `buffer-file-coding-system', `write-region-pre-hook', etc. | |
4795 Use those variables instead of this one for permanent changes to the | |
4796 environment. */ ); | |
428 | 4797 Vcoding_system_for_write = Qnil; |
4798 | |
4799 DEFVAR_LISP ("file-name-coding-system", &Vfile_name_coding_system /* | |
4800 Coding system used to convert pathnames when accessing files. | |
4801 */ ); | |
4802 Vfile_name_coding_system = Qnil; | |
4803 | |
4804 DEFVAR_BOOL ("enable-multibyte-characters", &enable_multibyte_characters /* | |
771 | 4805 Setting this has no effect. It is purely for FSF compatibility. |
428 | 4806 */ ); |
4807 enable_multibyte_characters = 1; | |
771 | 4808 |
4809 Vchain_canonicalize_hash_table = | |
4810 make_lisp_hash_table (50, HASH_TABLE_NON_WEAK, HASH_TABLE_EQUAL); | |
4811 staticpro (&Vchain_canonicalize_hash_table); | |
4812 | |
4813 #ifdef DEBUG_XEMACS | |
4814 DEFVAR_LISP ("debug-coding-detection", &Vdebug_coding_detection /* | |
4815 If non-nil, display debug information about detection operations in progress. | |
4816 Information is displayed on stderr. | |
4817 */ ); | |
4818 Vdebug_coding_detection = Qnil; | |
4819 #endif | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4820 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4821 #ifdef MULE |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4822 Vdefault_query_coding_region_chartab_cache |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4823 = make_lisp_hash_table (25, HASH_TABLE_NON_WEAK, HASH_TABLE_EQUAL); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4824 staticpro (&Vdefault_query_coding_region_chartab_cache); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4825 #endif |
428 | 4826 } |
4827 | |
2297 | 4828 /* #### reformat this for consistent appearance? */ |
4829 | |
428 | 4830 void |
4831 complex_vars_of_file_coding (void) | |
4832 { | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4833 Fmake_coding_system_internal |
771 | 4834 (Qconvert_eol_cr, Qconvert_eol, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4835 build_defer_string ("Convert CR to LF"), |
771 | 4836 nconc2 (list6 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4837 build_defer_string ( |
771 | 4838 "Converts CR (used to mark the end of a line on Macintosh systems) to LF\n" |
4839 "(used internally and under Unix to mark the end of a line)."), | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4840 Qmnemonic, build_ascstring ("CR->LF"), |
771 | 4841 Qsubtype, Qcr), |
4842 /* VERY IMPORTANT! Tell make-coding-system not to generate | |
4843 subsidiaries -- it needs the coding systems we're creating | |
4844 to do so! */ | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4845 list4 (Qeol_type, Qlf, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4846 Qsafe_charsets, Qt))); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4847 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4848 Fmake_coding_system_internal |
771 | 4849 (Qconvert_eol_lf, Qconvert_eol, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4850 build_defer_string ("Convert LF to LF (do nothing)"), |
771 | 4851 nconc2 (list6 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4852 build_defer_string ( |
771 | 4853 "Do nothing."), |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4854 Qmnemonic, build_ascstring ("LF->LF"), |
771 | 4855 Qsubtype, Qlf), |
4856 /* VERY IMPORTANT! Tell make-coding-system not to generate | |
4857 subsidiaries -- it needs the coding systems we're creating | |
4858 to do so! */ | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4859 list4 (Qeol_type, Qlf, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4860 Qsafe_charsets, Qt))); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4861 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4862 Fmake_coding_system_internal |
771 | 4863 (Qconvert_eol_crlf, Qconvert_eol, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4864 build_defer_string ("Convert CRLF to LF"), |
771 | 4865 nconc2 (list6 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4866 build_defer_string ( |
771 | 4867 "Converts CR+LF (used to mark the end of a line on Macintosh systems) to LF\n" |
4868 "(used internally and under Unix to mark the end of a line)."), | |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4869 Qmnemonic, build_ascstring ("CRLF->LF"), |
771 | 4870 Qsubtype, Qcrlf), |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4871 |
771 | 4872 /* VERY IMPORTANT! Tell make-coding-system not to generate |
4873 subsidiaries -- it needs the coding systems we're creating | |
4874 to do so! */ | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4875 list4 (Qeol_type, Qlf, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4876 Qsafe_charsets, Qt))); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4877 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4878 Fmake_coding_system_internal |
771 | 4879 (Qconvert_eol_autodetect, Qconvert_eol, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4880 build_defer_string ("Autodetect EOL type"), |
771 | 4881 nconc2 (list6 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4882 build_defer_string ( |
771 | 4883 "Autodetect the end-of-line type."), |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4884 Qmnemonic, build_ascstring ("Auto-EOL"), |
793 | 4885 Qsubtype, Qnil), |
771 | 4886 /* VERY IMPORTANT! Tell make-coding-system not to generate |
4887 subsidiaries -- it needs the coding systems we're creating | |
4888 to do so! */ | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4889 list4 (Qeol_type, Qlf, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4890 Qsafe_charsets, Qt))); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4891 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4892 Fmake_coding_system_internal |
771 | 4893 (Qundecided, Qundecided, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4894 build_defer_string ("Undecided (auto-detect)"), |
771 | 4895 nconc2 (list4 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4896 build_defer_string |
771 | 4897 ("Automatically detects the correct encoding."), |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4898 Qmnemonic, build_ascstring ("Auto")), |
771 | 4899 list6 (Qdo_eol, Qt, Qdo_coding, Qt, |
4900 /* We do EOL detection ourselves so we don't need to be | |
4901 wrapped in an EOL detector. (It doesn't actually hurt, | |
4902 though, I don't think.) */ | |
4903 Qeol_type, Qlf))); | |
4904 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4905 Fmake_coding_system_internal |
771 | 4906 (intern ("undecided-dos"), Qundecided, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4907 build_defer_string ("Undecided (auto-detect) (CRLF)"), |
771 | 4908 nconc2 (list4 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4909 build_defer_string |
771 | 4910 ("Automatically detects the correct encoding; EOL type of CRLF forced."), |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4911 Qmnemonic, build_ascstring ("Auto")), |
771 | 4912 list4 (Qdo_coding, Qt, |
4913 Qeol_type, Qcrlf))); | |
4914 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4915 Fmake_coding_system_internal |
771 | 4916 (intern ("undecided-unix"), Qundecided, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4917 build_defer_string ("Undecided (auto-detect) (LF)"), |
771 | 4918 nconc2 (list4 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4919 build_defer_string |
771 | 4920 ("Automatically detects the correct encoding; EOL type of LF forced."), |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4921 Qmnemonic, build_ascstring ("Auto")), |
771 | 4922 list4 (Qdo_coding, Qt, |
4923 Qeol_type, Qlf))); | |
4924 | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4925 Fmake_coding_system_internal |
771 | 4926 (intern ("undecided-mac"), Qundecided, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4927 build_defer_string ("Undecided (auto-detect) (CR)"), |
771 | 4928 nconc2 (list4 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4929 build_defer_string |
771 | 4930 ("Automatically detects the correct encoding; EOL type of CR forced."), |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4931 Qmnemonic, build_ascstring ("Auto")), |
771 | 4932 list4 (Qdo_coding, Qt, |
4933 Qeol_type, Qcr))); | |
4934 | |
428 | 4935 /* Need to create this here or we're really screwed. */ |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4936 Fmake_coding_system_internal |
428 | 4937 (Qraw_text, Qno_conversion, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4938 build_defer_string ("Raw Text"), |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4939 nconc2 (list4 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4940 build_defer_string ("Raw text converts only line-break " |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4941 "codes, and acts otherwise like " |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4942 "`binary'."), |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4943 Qmnemonic, build_ascstring ("Raw")), |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4944 #ifdef MULE |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4945 list2 (Qsafe_charsets, list3 (Vcharset_ascii, Vcharset_control_1, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4946 Vcharset_latin_iso8859_1)))); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4947 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4948 #else |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4949 Qnil)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4950 #endif |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4951 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4952 Fmake_coding_system_internal |
428 | 4953 (Qbinary, Qno_conversion, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4954 build_defer_string ("Binary"), |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4955 nconc2 (list6 (Qdocumentation, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4956 build_defer_string ( |
771 | 4957 "This coding system is as close as it comes to doing no conversion.\n" |
4958 "On input, each byte is converted directly into the character\n" | |
4959 "with the corresponding code -- i.e. from the `ascii', `control-1',\n" | |
4960 "or `latin-1' character sets. On output, these characters are\n" | |
4961 "converted back to the corresponding bytes, and other characters\n" | |
4962 "are converted to the default character, i.e. `~'."), | |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4963 Qeol_type, Qlf, |
4952
19a72041c5ed
Mule-izing, various fixes related to char * arguments
Ben Wing <ben@xemacs.org>
parents:
4846
diff
changeset
|
4964 Qmnemonic, build_ascstring ("Binary")), |
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4965 #ifdef MULE |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4966 list2 (Qsafe_charsets, list3 (Vcharset_ascii, Vcharset_control_1, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4967 Vcharset_latin_iso8859_1)))); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4968 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4969 #else |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4970 Qnil)); |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4971 #endif |
428 | 4972 |
771 | 4973 /* Formerly aliased to raw-text! Completely bogus and not even the same |
4974 as FSF Emacs. */ | |
4975 Fdefine_coding_system_alias (Qno_conversion, Qbinary); | |
4976 Fdefine_coding_system_alias (intern ("no-conversion-unix"), | |
4977 intern ("raw-text-unix")); | |
4978 Fdefine_coding_system_alias (intern ("no-conversion-dos"), | |
4979 intern ("raw-text-dos")); | |
4980 Fdefine_coding_system_alias (intern ("no-conversion-mac"), | |
4981 intern ("raw-text-mac")); | |
4982 | |
1318 | 4983 /* These three below will get their defaults set correctly |
4984 in code-init.el. We init them now so we can handle stuff at dump | |
771 | 4985 time before we get to code-init.el. */ |
1318 | 4986 Fdefine_coding_system_alias (Qnative, Qbinary); |
440 | 4987 Fdefine_coding_system_alias (Qterminal, Qbinary); |
4988 Fdefine_coding_system_alias (Qkeyboard, Qbinary); | |
4989 | |
1318 | 4990 Fdefine_coding_system_alias (Qfile_name, Qnative); |
771 | 4991 Fdefine_coding_system_alias (Qidentity, Qconvert_eol_lf); |
4992 | |
428 | 4993 /* Need this for bootstrapping */ |
771 | 4994 coding_category_system[detector_category_no_conversion] = |
428 | 4995 Fget_coding_system (Qraw_text); |
4996 } |