Mercurial > hg > xemacs-beta
annotate src/file-coding.c @ 4885:6772ce4d982b
Fix hash tables, #'member*, #'assoc*, #'eql compiler macros if bignums
lisp/ChangeLog addition:
2010-01-24 Aidan Kehoe <kehoea@parhasard.net>
Correct the semantics of #'member*, #'eql, #'assoc* in the
presence of bignums; change the integerp byte code to fixnump
semantics.
* bytecomp.el (fixnump, integerp, byte-compile-integerp):
Change the integerp byte code to fixnump; add a byte-compile
method to integerp using fixnump and numberp and avoiding a
funcall most of the time, since in the non-core contexts where
integerp is used, it's mostly distinguishing between fixnums and
things that are not numbers at all.
* byte-optimize.el (side-effect-free-fns, byte-after-unbind-ops)
(byte-compile-side-effect-and-error-free-ops):
Replace the integerp bytecode with fixnump; add fixnump to the
side-effect-free-fns. Add the other extended number type
predicates to the list in passing.
* obsolete.el (floatp-safe): Mark this as obsolete.
* cl.el (eql): Go into more detail in the docstring here. Don't
bother checking whether both arguments are numbers; one is enough,
#'equal will fail correctly if they have distinct types.
(subst): Replace a call to #'integerp (deciding whether to use
#'memq or not) with one to #'fixnump.
Delete most-positive-fixnum, most-negative-fixnum from this file;
they're now always in C, so they can't be modified from Lisp.
* cl-seq.el (member*, assoc*, rassoc*):
Correct these functions in the presence of bignums.
* cl-macs.el (cl-make-type-test): The type test for a fixnum is
now fixnump. Ditch floatp-safe, use floatp instead.
(eql): Correct this compiler macro in the presence of bignums.
(assoc*): Correct this compiler macro in the presence of bignums.
* simple.el (undo):
Change #'integerp to #'fixnump here, since we use #'delq with the
same value as ELT a few lines down.
src/ChangeLog addition:
2010-01-24 Aidan Kehoe <kehoea@parhasard.net>
Fix problems with #'eql, extended number types, and the hash table
implementation; change the Bintegerp bytecode to fixnump semantics
even on bignum builds, since #'integerp can have a fast
implementation in terms of #'fixnump for most of its extant uses,
but not vice-versa.
* lisp.h: Always #include number.h; we want the macros provided in
it, even if the various number types are not available.
* number.h (NON_FIXNUM_NUMBER_P): New macro, giving 1 when its
argument is of non-immediate number type. Equivalent to FLOATP if
WITH_NUMBER_TYPES is not defined.
* elhash.c (lisp_object_eql_equal, lisp_object_eql_hash):
Use NON_FIXNUM_NUMBER_P in these functions, instead of FLOATP,
giving more correct behaviour in the presence of the extended
number types.
* bytecode.c (Bfixnump, execute_optimized_program):
Rename Bintegerp to Bfixnump; change its semantics to reflect the
new name on builds with bignum support.
* data.c (Ffixnump, Fintegerp, syms_of_data, vars_of_data):
Always make #'fixnump available, even on non-BIGNUM builds;
always implement #'integerp in this file, even on BIGNUM builds.
Move most-positive-fixnum, most-negative-fixnum here from
number.c, so they are Lisp constants even on builds without number
types, and attempts to change or bind them error.
Use the NUMBERP and INTEGERP macros even on builds without
extended number types.
* data.c (fixnum_char_or_marker_to_int):
Rename this function from integer_char_or_marker_to_int, to better
reflect the arguments it accepts.
* number.c (Fevenp, Foddp, syms_of_number):
Never provide #'integerp in this file. Remove #'oddp,
#'evenp; their implementations are overridden by those in cl.el.
* number.c (vars_of_number):
most-positive-fixnum, most-negative-fixnum are no longer here.
man/ChangeLog addition:
2010-01-23 Aidan Kehoe <kehoea@parhasard.net>
Generally: be careful to say fixnum, not integer, when talking
about fixed-precision integral types. I'm sure I've missed
instances, both here and in the docstrings, but this is a decent
start.
* lispref/text.texi (Columns):
Document where only fixnums, not integers generally, are accepted.
(Registers):
Remove some ancient char-int confoundance here.
* lispref/strings.texi (Creating Strings, Creating Strings):
Be more exact in describing where fixnums but not integers in
general are accepted.
(Creating Strings): Use a more contemporary example to illustrate
how concat deals with lists including integers about #xFF. Delete
some obsolete documentation on same.
(Char Table Types): Document that only fixnums are accepted as
values in syntax tables.
* lispref/searching.texi (String Search, Search and Replace):
Be exact in describing where fixnums but not integers in general
are accepted.
* lispref/range-tables.texi (Range Tables): Be exact in describing
them; only fixnums are accepted to describe ranges.
* lispref/os.texi (Killing XEmacs, User Identification)
(Time of Day, Time Conversion):
Be more exact about using fixnum where only fixed-precision
integers are accepted.
* lispref/objects.texi (Integer Type): Be more exact (and
up-to-date) about the possible values for
integers. Cross-reference to documentation of the bignum extension.
(Equality Predicates):
(Range Table Type):
(Array Type): Use fixnum, not integer, to describe a
fixed-precision integer.
(Syntax Table Type): Correct some English syntax here.
* lispref/numbers.texi (Numbers): Change the phrasing here to use
fixnum to mean the fixed-precision integers normal in emacs.
Document that our terminology deviates from that of Common Lisp,
and that we're working on it.
(Compatibility Issues): Reiterate the Common Lisp versus Emacs
Lisp compatibility issues.
(Comparison of Numbers, Arithmetic Operations):
* lispref/commands.texi (Command Loop Info, Working With Events):
* lispref/buffers.texi (Modification Time):
Be more exact in describing where fixnums but not integers in
general are accepted.
| author | Aidan Kehoe <kehoea@parhasard.net> |
|---|---|
| date | Sun, 24 Jan 2010 15:21:27 +0000 |
| parents | a98ca4640147 |
| children | 19a72041c5ed |
| rev | line source |
|---|---|
| 771 | 1 /* Text encoding conversion functions; coding-system object. |
| 2 #### rename me to coding-system.c or coding.c | |
| 428 | 3 Copyright (C) 1991, 1995 Free Software Foundation, Inc. |
| 4 Copyright (C) 1995 Sun Microsystems, Inc. | |
| 3025 | 5 Copyright (C) 2000, 2001, 2002, 2003, 2005 Ben Wing. |
| 428 | 6 |
| 7 This file is part of XEmacs. | |
| 8 | |
| 9 XEmacs is free software; you can redistribute it and/or modify it | |
| 10 under the terms of the GNU General Public License as published by the | |
| 11 Free Software Foundation; either version 2, or (at your option) any | |
| 12 later version. | |
| 13 | |
| 14 XEmacs is distributed in the hope that it will be useful, but WITHOUT | |
| 15 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or | |
| 16 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License | |
| 17 for more details. | |
| 18 | |
| 19 You should have received a copy of the GNU General Public License | |
| 20 along with XEmacs; see the file COPYING. If not, write to | |
| 21 the Free Software Foundation, Inc., 59 Temple Place - Suite 330, | |
| 22 Boston, MA 02111-1307, USA. */ | |
| 23 | |
| 771 | 24 /* Synched up with: Not in FSF. */ |
| 25 | |
| 26 /* Authorship: | |
| 27 | |
| 28 Current primary author: Ben Wing <ben@xemacs.org> | |
| 29 | |
| 30 Rewritten by Ben Wing <ben@xemacs.org>, based originally on coding.c | |
| 31 from Mule 2.? but probably does not share one line of code with that | |
| 32 original source. Rewriting work started around Dec. 1994. or Jan. 1995. | |
| 33 Proceeded in earnest till Nov. 1995. | |
| 34 | |
| 35 Around Feb. 17, 1998, Andy Piper renamed what was then mule-coding.c to | |
| 36 file-coding.c, with the intention of using it to do end-of-line conversion | |
| 37 on non-MULE machines (specifically, on Windows machines). He separated | |
| 38 out the MULE stuff from non-MULE using ifdef's, and searched throughout | |
| 39 the rest of the source tree looking for coding-system-related code that | |
| 40 was ifdef MULE but should be ifdef HAVE_CODING_SYSTEMS. | |
| 41 | |
| 42 Sept. 4 - 8, 1998, Tomohiko Morioka added the UCS_4 and UTF_8 coding system | |
| 43 types, providing a primitive means of decoding and encoding externally- | |
| 44 formatted Unicode/UCS_4 and Unicode/UTF_8 data. | |
| 45 | |
| 46 January 25, 2000, Martin Buchholz redid and fleshed out the coding | |
| 47 system alias handling that was first added in prototype form by | |
| 48 Hrjove Niksic, April 15, 1999. | |
| 49 | |
| 50 April to May 2000, Ben Wing: More major reorganization. Adding features | |
| 51 needed for MS Windows (multibyte, unicode, unicode-to-multibyte), the | |
| 52 "chain" coding system for chaining two together, and doing a lot of | |
| 53 reorganization in preparation for properly abstracting out the different | |
| 54 coding system types. | |
| 55 | |
| 56 June 2001, Ben Wing: Added Unicode support. Eliminated previous | |
| 57 junky Unicode translation support. | |
| 58 | |
| 59 August 2001, Ben Wing: Moved Unicode support to unicode.c. Finished | |
| 60 abstracting everything except detection, which is hard to abstract (see | |
| 61 just below). | |
| 62 | |
| 63 September 2001, Ben Wing: Moved Mule code to mule-coding.c, Windows code | |
| 64 to intl-win32.c. Lots more rewriting; very little code is untouched | |
| 65 from before April 2000. Abstracted the detection code, added multiple | |
| 66 levels of likelihood to increase the reliability of the algorithm. | |
| 67 | |
| 68 October 2001, Ben Wing: HAVE_CODING_SYSTEMS is always now defined. | |
| 69 Removed the conditionals. | |
| 70 */ | |
| 71 | |
| 428 | 72 #include <config.h> |
| 73 #include "lisp.h" | |
| 74 | |
| 75 #include "buffer.h" | |
| 76 #include "elhash.h" | |
| 77 #include "insdel.h" | |
| 78 #include "lstream.h" | |
| 440 | 79 #include "opaque.h" |
| 771 | 80 #include "file-coding.h" |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
81 #include "extents.h" |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
82 #include "rangetab.h" |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
83 #include "chartab.h" |
| 771 | 84 |
| 85 #ifdef HAVE_ZLIB | |
| 86 #include "zlib.h" | |
| 428 | 87 #endif |
| 88 | |
| 89 Lisp_Object Vkeyboard_coding_system; | |
| 90 Lisp_Object Vterminal_coding_system; | |
| 91 Lisp_Object Vcoding_system_for_read; | |
| 92 Lisp_Object Vcoding_system_for_write; | |
| 93 Lisp_Object Vfile_name_coding_system; | |
| 94 | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
95 Lisp_Object Qaliases, Qcharset_skip_chars_string; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
96 |
| 771 | 97 #ifdef DEBUG_XEMACS |
| 98 Lisp_Object Vdebug_coding_detection; | |
| 440 | 99 #endif |
| 771 | 100 |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
101 #ifdef MULE |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
102 extern Lisp_Object Vcharset_ascii, Vcharset_control_1, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
103 Vcharset_latin_iso8859_1; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
104 #endif |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
105 |
| 771 | 106 typedef struct coding_system_type_entry |
| 107 { | |
| 108 struct coding_system_methods *meths; | |
| 109 } coding_system_type_entry; | |
| 110 | |
| 111 typedef struct | |
| 112 { | |
| 113 Dynarr_declare (coding_system_type_entry); | |
| 114 } coding_system_type_entry_dynarr; | |
| 115 | |
| 116 static coding_system_type_entry_dynarr *the_coding_system_type_entry_dynarr; | |
| 117 | |
| 1204 | 118 static const struct memory_description cste_description_1[] = { |
| 2551 | 119 { XD_BLOCK_PTR, offsetof (coding_system_type_entry, meths), 1, |
| 120 { &coding_system_methods_description } }, | |
| 771 | 121 { XD_END } |
| 122 }; | |
| 123 | |
| 1204 | 124 static const struct sized_memory_description cste_description = { |
| 771 | 125 sizeof (coding_system_type_entry), |
| 126 cste_description_1 | |
| 127 }; | |
| 128 | |
| 1204 | 129 static const struct memory_description csted_description_1[] = { |
| 771 | 130 XD_DYNARR_DESC (coding_system_type_entry_dynarr, &cste_description), |
| 428 | 131 { XD_END } |
| 132 }; | |
| 133 | |
| 1204 | 134 static const struct sized_memory_description csted_description = { |
| 771 | 135 sizeof (coding_system_type_entry_dynarr), |
| 136 csted_description_1 | |
| 137 }; | |
| 138 | |
| 139 static Lisp_Object Vcoding_system_type_list; | |
| 140 | |
| 141 /* Coding system currently associated with each coding category. */ | |
| 142 Lisp_Object coding_category_system[MAX_DETECTOR_CATEGORIES]; | |
| 143 | |
| 144 /* Table of all coding categories in decreasing order of priority. | |
| 145 This describes a permutation of the possible coding categories. */ | |
| 146 int coding_category_by_priority[MAX_DETECTOR_CATEGORIES]; | |
| 147 | |
| 148 /* Value used with to give a unique name to nameless coding systems */ | |
| 149 int coding_system_tick; | |
| 150 | |
| 151 int coding_detector_count; | |
| 152 int coding_detector_category_count; | |
| 153 | |
| 154 detector_dynarr *all_coding_detectors; | |
| 155 | |
| 1204 | 156 static const struct memory_description struct_detector_category_description_1[] |
| 771 | 157 = |
| 158 { | |
| 159 { XD_LISP_OBJECT, offsetof (struct detector_category, sym) }, | |
| 160 { XD_END } | |
| 161 }; | |
| 162 | |
| 1204 | 163 static const struct sized_memory_description struct_detector_category_description = |
| 771 | 164 { |
| 165 sizeof (struct detector_category), | |
| 166 struct_detector_category_description_1 | |
| 428 | 167 }; |
| 168 | |
| 1204 | 169 static const struct memory_description detector_category_dynarr_description_1[] = |
| 771 | 170 { |
| 171 XD_DYNARR_DESC (detector_category_dynarr, | |
| 172 &struct_detector_category_description), | |
| 173 { XD_END } | |
| 174 }; | |
| 175 | |
| 1204 | 176 static const struct sized_memory_description detector_category_dynarr_description = { |
| 771 | 177 sizeof (detector_category_dynarr), |
| 178 detector_category_dynarr_description_1 | |
| 179 }; | |
| 180 | |
| 1204 | 181 static const struct memory_description struct_detector_description_1[] |
| 771 | 182 = |
| 183 { | |
| 2367 | 184 { XD_BLOCK_PTR, offsetof (struct detector, cats), 1, |
| 2551 | 185 { &detector_category_dynarr_description } }, |
| 771 | 186 { XD_END } |
| 187 }; | |
| 188 | |
| 1204 | 189 static const struct sized_memory_description struct_detector_description = |
| 771 | 190 { |
| 191 sizeof (struct detector), | |
| 192 struct_detector_description_1 | |
| 193 }; | |
| 194 | |
| 1204 | 195 static const struct memory_description detector_dynarr_description_1[] = |
| 771 | 196 { |
| 197 XD_DYNARR_DESC (detector_dynarr, &struct_detector_description), | |
| 198 { XD_END } | |
| 199 }; | |
| 200 | |
| 1204 | 201 static const struct sized_memory_description detector_dynarr_description = { |
| 771 | 202 sizeof (detector_dynarr), |
| 203 detector_dynarr_description_1 | |
| 204 }; | |
| 428 | 205 |
| 206 Lisp_Object Qcoding_systemp; | |
| 207 | |
| 771 | 208 Lisp_Object Qraw_text; |
| 428 | 209 |
| 210 Lisp_Object Qmnemonic, Qeol_type; | |
| 211 Lisp_Object Qcr, Qcrlf, Qlf; | |
| 212 Lisp_Object Qeol_cr, Qeol_crlf, Qeol_lf; | |
| 213 Lisp_Object Qpost_read_conversion; | |
| 214 Lisp_Object Qpre_write_conversion; | |
| 215 | |
| 771 | 216 Lisp_Object Qtranslation_table_for_decode; |
| 217 Lisp_Object Qtranslation_table_for_encode; | |
| 218 Lisp_Object Qsafe_chars; | |
| 219 Lisp_Object Qsafe_charsets; | |
| 220 Lisp_Object Qmime_charset; | |
| 221 Lisp_Object Qvalid_codes; | |
| 222 | |
| 223 Lisp_Object Qno_conversion; | |
| 224 Lisp_Object Qconvert_eol; | |
| 440 | 225 Lisp_Object Qescape_quoted; |
| 771 | 226 Lisp_Object Qencode, Qdecode; |
| 227 | |
| 228 Lisp_Object Qconvert_eol_lf, Qconvert_eol_cr, Qconvert_eol_crlf; | |
| 229 Lisp_Object Qconvert_eol_autodetect; | |
| 230 | |
| 231 Lisp_Object Qnear_certainty, Qquite_probable, Qsomewhat_likely; | |
| 1494 | 232 Lisp_Object Qslightly_likely; |
| 771 | 233 Lisp_Object Qas_likely_as_unlikely, Qsomewhat_unlikely, Qquite_improbable; |
| 234 Lisp_Object Qnearly_impossible; | |
| 235 | |
| 236 Lisp_Object Qdo_eol, Qdo_coding; | |
| 237 | |
| 238 Lisp_Object Qcanonicalize_after_coding; | |
| 239 | |
| 1347 | 240 Lisp_Object QScoding_system_cookie; |
| 241 | |
| 4303 | 242 Lisp_Object Qposix_charset_to_coding_system_hash; |
| 243 | |
| 771 | 244 /* This is used to convert autodetected coding systems into existing |
| 245 systems. For example, the chain undecided->convert-eol-autodetect may | |
| 246 have its separate parts detected as mswindows-multibyte and | |
| 247 convert-eol-crlf, and the result needs to be mapped to | |
| 248 mswindows-multibyte-dos. */ | |
| 249 /* #### It's not clear we need this whole chain-canonicalize mechanism | |
| 250 any more. */ | |
| 251 static Lisp_Object Vchain_canonicalize_hash_table; | |
| 252 | |
| 253 #ifdef HAVE_ZLIB | |
| 254 Lisp_Object Qgzip; | |
| 428 | 255 #endif |
| 771 | 256 |
| 2297 | 257 /* Maps symbols (coding system names) to either coding system objects or |
| 258 (for aliases) other names. */ | |
| 771 | 259 static Lisp_Object Vcoding_system_hash_table; |
| 428 | 260 |
| 261 int enable_multibyte_characters; | |
| 262 | |
| 263 EXFUN (Fcopy_coding_system, 2); | |
| 264 | |
| 265 | |
| 266 /************************************************************************/ | |
| 771 | 267 /* Coding system object methods */ |
| 428 | 268 /************************************************************************/ |
| 269 | |
| 270 static Lisp_Object | |
| 271 mark_coding_system (Lisp_Object obj) | |
| 272 { | |
| 273 Lisp_Coding_System *codesys = XCODING_SYSTEM (obj); | |
| 274 | |
| 1204 | 275 #define MARKED_SLOT(x) mark_object (codesys->x); |
| 276 #include "coding-system-slots.h" | |
| 771 | 277 |
| 278 MAYBE_CODESYSMETH (codesys, mark, (obj)); | |
| 428 | 279 |
| 1204 | 280 return Qnil; |
| 428 | 281 } |
| 282 | |
| 283 static void | |
| 771 | 284 print_coding_system_properties (Lisp_Object obj, Lisp_Object printcharfun) |
| 285 { | |
| 286 Lisp_Coding_System *c = XCODING_SYSTEM (obj); | |
| 287 print_internal (c->methods->type, printcharfun, 1); | |
| 288 MAYBE_CODESYSMETH (c, print, (obj, printcharfun, 1)); | |
| 289 if (CODING_SYSTEM_EOL_TYPE (c) != EOL_AUTODETECT) | |
| 290 write_fmt_string_lisp (printcharfun, " eol-type=%s", | |
| 291 1, Fcoding_system_property (obj, Qeol_type)); | |
| 292 } | |
| 293 | |
| 294 static void | |
| 428 | 295 print_coding_system (Lisp_Object obj, Lisp_Object printcharfun, |
| 2286 | 296 int UNUSED (escapeflag)) |
| 428 | 297 { |
| 298 Lisp_Coding_System *c = XCODING_SYSTEM (obj); | |
| 299 if (print_readably) | |
| 4846 | 300 printing_unreadable_lcrecord (obj, 0); |
| 771 | 301 |
| 302 write_fmt_string_lisp (printcharfun, "#<coding-system %s ", 1, c->name); | |
| 303 print_coding_system_properties (obj, printcharfun); | |
| 826 | 304 write_c_string (printcharfun, ">"); |
| 428 | 305 } |
| 306 | |
| 771 | 307 /* Print an abbreviated version of a coding system (but still containing |
| 308 all the information), for use within a coding system print method. */ | |
| 309 | |
| 310 static void | |
| 311 print_coding_system_in_print_method (Lisp_Object cs, Lisp_Object printcharfun, | |
| 2286 | 312 int UNUSED (escapeflag)) |
| 771 | 313 { |
| 800 | 314 write_fmt_string_lisp (printcharfun, "%s[", 1, XCODING_SYSTEM_NAME (cs)); |
| 771 | 315 print_coding_system_properties (cs, printcharfun); |
| 826 | 316 write_c_string (printcharfun, "]"); |
| 771 | 317 } |
| 318 | |
| 3263 | 319 #ifndef NEW_GC |
| 428 | 320 static void |
| 321 finalize_coding_system (void *header, int for_disksave) | |
| 322 { | |
| 771 | 323 Lisp_Object cs = wrap_coding_system ((Lisp_Coding_System *) header); |
| 428 | 324 /* Since coding systems never go away, this function is not |
| 325 necessary. But it would be necessary if we changed things | |
| 326 so that coding systems could go away. */ | |
| 327 if (!for_disksave) /* see comment in lstream.c */ | |
| 771 | 328 MAYBE_XCODESYSMETH (cs, finalize, (cs)); |
| 329 } | |
| 3263 | 330 #endif /* not NEW_GC */ |
| 771 | 331 |
| 332 static Bytecount | |
| 333 sizeof_coding_system (const void *header) | |
| 334 { | |
| 335 const Lisp_Coding_System *p = (const Lisp_Coding_System *) header; | |
| 336 return offsetof (Lisp_Coding_System, data) + p->methods->extra_data_size; | |
| 428 | 337 } |
| 338 | |
| 1204 | 339 static const struct memory_description coding_system_methods_description_1[] |
| 771 | 340 = { |
| 341 { XD_LISP_OBJECT, | |
| 342 offsetof (struct coding_system_methods, type) }, | |
| 343 { XD_LISP_OBJECT, | |
| 344 offsetof (struct coding_system_methods, predicate_symbol) }, | |
| 345 { XD_END } | |
| 346 }; | |
| 347 | |
| 1204 | 348 const struct sized_memory_description coding_system_methods_description = { |
| 771 | 349 sizeof (struct coding_system_methods), |
| 350 coding_system_methods_description_1 | |
| 351 }; | |
| 352 | |
| 1204 | 353 static const struct sized_memory_description coding_system_extra_description_map[] = |
| 354 { | |
| 355 { offsetof (Lisp_Coding_System, methods) }, | |
| 356 { offsetof (struct coding_system_methods, extra_description) }, | |
| 357 { -1 }, | |
| 771 | 358 }; |
| 359 | |
| 1204 | 360 static const struct memory_description coding_system_description[] = |
| 428 | 361 { |
| 2367 | 362 { XD_BLOCK_PTR, offsetof (Lisp_Coding_System, methods), 1, |
| 2551 | 363 { &coding_system_methods_description } }, |
| 1204 | 364 #define MARKED_SLOT(x) { XD_LISP_OBJECT, offsetof (Lisp_Coding_System, x) }, |
| 365 #define MARKED_SLOT_ARRAY(slot, size) \ | |
| 366 { XD_LISP_OBJECT_ARRAY, offsetof (Lisp_Coding_System, slot), size }, | |
| 367 #include "coding-system-slots.h" | |
| 2367 | 368 { XD_BLOCK_ARRAY, offsetof (Lisp_Coding_System, data), 1, |
| 2551 | 369 { coding_system_extra_description_map } }, |
| 1204 | 370 { XD_END } |
| 771 | 371 }; |
| 372 | |
| 1204 | 373 static const struct memory_description coding_system_empty_extra_description_1[] = |
| 374 { | |
| 375 { XD_END } | |
| 376 }; | |
| 377 | |
| 378 const struct sized_memory_description coding_system_empty_extra_description = { | |
| 379 0, coding_system_empty_extra_description_1 | |
| 380 }; | |
| 381 | |
| 3263 | 382 #ifdef NEW_GC |
| 383 DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION ("coding-system", coding_system, | |
| 384 1, /*dumpable-flag*/ | |
| 385 mark_coding_system, | |
| 386 print_coding_system, | |
| 387 0, 0, 0, coding_system_description, | |
| 388 sizeof_coding_system, | |
| 389 Lisp_Coding_System); | |
| 390 #else /* not NEW_GC */ | |
| 934 | 391 DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION ("coding-system", coding_system, |
| 392 1, /*dumpable-flag*/ | |
| 393 mark_coding_system, | |
| 394 print_coding_system, | |
| 395 finalize_coding_system, | |
| 396 0, 0, coding_system_description, | |
| 397 sizeof_coding_system, | |
| 398 Lisp_Coding_System); | |
| 3263 | 399 #endif /* not NEW_GC */ |
| 771 | 400 |
| 401 /************************************************************************/ | |
| 402 /* Creating coding systems */ | |
| 403 /************************************************************************/ | |
| 404 | |
| 405 static struct coding_system_methods * | |
| 406 decode_coding_system_type (Lisp_Object type, Error_Behavior errb) | |
| 428 | 407 { |
| 771 | 408 int i; |
| 409 | |
| 410 for (i = 0; i < Dynarr_length (the_coding_system_type_entry_dynarr); i++) | |
| 428 | 411 { |
| 771 | 412 if (EQ (type, |
| 413 Dynarr_at (the_coding_system_type_entry_dynarr, i).meths->type)) | |
| 414 return Dynarr_at (the_coding_system_type_entry_dynarr, i).meths; | |
| 428 | 415 } |
| 771 | 416 |
| 417 maybe_invalid_constant ("Invalid coding system type", type, | |
| 418 Qcoding_system, errb); | |
| 419 | |
| 420 return 0; | |
| 428 | 421 } |
| 422 | |
| 771 | 423 static int |
| 424 valid_coding_system_type_p (Lisp_Object type) | |
| 428 | 425 { |
| 771 | 426 return decode_coding_system_type (type, ERROR_ME_NOT) != 0; |
| 427 } | |
| 428 | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
429 #ifdef MULE |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
430 static Lisp_Object Vdefault_query_coding_region_chartab_cache; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
431 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
432 /* Non-static because it's used in INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA. */ |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
433 Lisp_Object |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
434 default_query_method (Lisp_Object codesys, struct buffer *buf, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
435 Charbpos end, int flags) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
436 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
437 Charbpos pos = BUF_PT (buf), fail_range_start, fail_range_end; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
438 Charbpos pos_byte = BYTE_BUF_PT (buf); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
439 Lisp_Object safe_charsets = XCODING_SYSTEM_SAFE_CHARSETS (codesys); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
440 Lisp_Object safe_chars = XCODING_SYSTEM_SAFE_CHARS (codesys), |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
441 result = Qnil; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
442 enum query_coding_failure_reasons failed_reason, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
443 previous_failed_reason = query_coding_succeeded; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
444 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
445 /* safe-charsets of t means the coding system can encode everything. */ |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
446 if (EQ (Qnil, safe_chars)) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
447 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
448 if (EQ (Qt, safe_charsets)) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
449 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
450 return Qnil; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
451 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
452 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
453 /* If we've no information on what characters the coding system can |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
454 encode, give up. */ |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
455 if (EQ (Qnil, safe_charsets) && EQ (Qnil, safe_chars)) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
456 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
457 return Qunbound; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
458 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
459 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
460 safe_chars = Fgethash (safe_charsets, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
461 Vdefault_query_coding_region_chartab_cache, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
462 Qnil); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
463 if (NILP (safe_chars)) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
464 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
465 safe_chars = Fmake_char_table (Qgeneric); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
466 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
467 EXTERNAL_LIST_LOOP_2 (safe_charset, safe_charsets) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
468 Fput_char_table (safe_charset, Qt, safe_chars); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
469 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
470 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
471 Fputhash (safe_charsets, safe_chars, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
472 Vdefault_query_coding_region_chartab_cache); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
473 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
474 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
475 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
476 if (flags & QUERY_METHOD_HIGHLIGHT && |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
477 /* If we're being called really early, live without highlights getting |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
478 cleared properly: */ |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
479 !(UNBOUNDP (XSYMBOL (Qquery_coding_clear_highlights)->function))) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
480 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
481 /* It's okay to call Lisp here, the only non-stack object we may have |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
482 allocated up to this point is safe_chars, and that's |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
483 reachable from its entry in |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
484 Vdefault_query_coding_region_chartab_cache */ |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
485 call3 (Qquery_coding_clear_highlights, make_int (pos), make_int (end), |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
486 wrap_buffer (buf)); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
487 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
488 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
489 while (pos < end) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
490 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
491 Ichar ch = BYTE_BUF_FETCH_CHAR (buf, pos_byte); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
492 if (!EQ (Qnil, get_char_table (ch, safe_chars))) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
493 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
494 pos++; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
495 INC_BYTEBPOS (buf, pos_byte); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
496 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
497 else |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
498 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
499 fail_range_start = pos; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
500 while ((pos < end) && |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
501 (EQ (Qnil, get_char_table (ch, safe_chars)) |
|
4839
5f1c6ca4d05e
fix bad change to default_query_method, rewrite conditional for more correctness
Ben Wing <ben@xemacs.org>
parents:
4830
diff
changeset
|
502 && (failed_reason = query_coding_unencodable, |
|
5f1c6ca4d05e
fix bad change to default_query_method, rewrite conditional for more correctness
Ben Wing <ben@xemacs.org>
parents:
4830
diff
changeset
|
503 (previous_failed_reason == query_coding_succeeded |
|
5f1c6ca4d05e
fix bad change to default_query_method, rewrite conditional for more correctness
Ben Wing <ben@xemacs.org>
parents:
4830
diff
changeset
|
504 || previous_failed_reason == failed_reason)))) |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
505 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
506 pos++; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
507 INC_BYTEBPOS (buf, pos_byte); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
508 ch = BYTE_BUF_FETCH_CHAR (buf, pos_byte); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
509 previous_failed_reason = failed_reason; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
510 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
511 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
512 if (fail_range_start == pos) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
513 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
514 /* The character can actually be encoded; move on. */ |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
515 pos++; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
516 INC_BYTEBPOS (buf, pos_byte); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
517 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
518 else |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
519 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
520 assert (previous_failed_reason == query_coding_unencodable); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
521 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
522 if (flags & QUERY_METHOD_ERRORP) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
523 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
524 DECLARE_EISTRING (error_details); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
525 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
526 eicpy_ascii (error_details, "Cannot encode "); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
527 eicat_lstr (error_details, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
528 make_string_from_buffer (buf, fail_range_start, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
529 pos - |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
530 fail_range_start)); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
531 eicat_ascii (error_details, " using coding system"); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
532 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
533 signal_error (Qtext_conversion_error, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
534 (const CIbyte *)(eidata (error_details)), |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
535 XCODING_SYSTEM_NAME (codesys)); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
536 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
537 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
538 if (NILP (result)) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
539 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
540 result = Fmake_range_table (Qstart_closed_end_open); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
541 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
542 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
543 fail_range_end = pos; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
544 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
545 Fput_range_table (make_int (fail_range_start), |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
546 make_int (fail_range_end), |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
547 Qunencodable, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
548 result); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
549 previous_failed_reason = query_coding_succeeded; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
550 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
551 if (flags & QUERY_METHOD_HIGHLIGHT) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
552 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
553 Lisp_Object extent |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
554 = Fmake_extent (make_int (fail_range_start), |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
555 make_int (fail_range_end), |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
556 wrap_buffer (buf)); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
557 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
558 Fset_extent_priority |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
559 (extent, make_int (2 + mouse_highlight_priority)); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
560 Fset_extent_face (extent, Qquery_coding_warning_face); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
561 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
562 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
563 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
564 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
565 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
566 return result; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
567 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
568 #else |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
569 Lisp_Object |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
570 default_query_method (Lisp_Object UNUSED (codesys), |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
571 struct buffer * UNUSED (buf), |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
572 Charbpos UNUSED (end), int UNUSED (flags)) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
573 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
574 return Qnil; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
575 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
576 #endif /* defined MULE */ |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
577 |
| 771 | 578 DEFUN ("valid-coding-system-type-p", Fvalid_coding_system_type_p, 1, 1, 0, /* |
| 579 Given a CODING-SYSTEM-TYPE, return non-nil if it is valid. | |
| 580 Valid types depend on how XEmacs was compiled but may include | |
| 3025 | 581 `undecided', `chain', `integer', `ccl', `iso2022', `big5', `shift-jis', |
| 582 `utf-16', `ucs-4', `utf-8', etc. | |
| 771 | 583 */ |
| 584 (coding_system_type)) | |
| 585 { | |
| 586 return valid_coding_system_type_p (coding_system_type) ? Qt : Qnil; | |
| 587 } | |
| 588 | |
| 589 DEFUN ("coding-system-type-list", Fcoding_system_type_list, 0, 0, 0, /* | |
| 590 Return a list of valid coding system types. | |
| 591 */ | |
| 592 ()) | |
| 593 { | |
| 594 return Fcopy_sequence (Vcoding_system_type_list); | |
| 595 } | |
| 596 | |
| 597 void | |
| 598 add_entry_to_coding_system_type_list (struct coding_system_methods *meths) | |
| 599 { | |
| 600 struct coding_system_type_entry entry; | |
| 601 | |
| 602 entry.meths = meths; | |
| 603 Dynarr_add (the_coding_system_type_entry_dynarr, entry); | |
| 604 Vcoding_system_type_list = Fcons (meths->type, Vcoding_system_type_list); | |
| 428 | 605 } |
| 606 | |
| 607 DEFUN ("coding-system-p", Fcoding_system_p, 1, 1, 0, /* | |
| 608 Return t if OBJECT is a coding system. | |
| 609 A coding system is an object that defines how text containing multiple | |
| 610 character sets is encoded into a stream of (typically 8-bit) bytes. | |
| 611 The coding system is used to decode the stream into a series of | |
| 612 characters (which may be from multiple charsets) when the text is read | |
| 613 from a file or process, and is used to encode the text back into the | |
| 614 same format when it is written out to a file or process. | |
| 615 | |
| 616 For example, many ISO2022-compliant coding systems (such as Compound | |
| 617 Text, which is used for inter-client data under the X Window System) | |
| 618 use escape sequences to switch between different charsets -- Japanese | |
| 619 Kanji, for example, is invoked with "ESC $ ( B"; ASCII is invoked | |
| 620 with "ESC ( B"; and Cyrillic is invoked with "ESC - L". See | |
| 621 `make-coding-system' for more information. | |
| 622 | |
| 623 Coding systems are normally identified using a symbol, and the | |
| 624 symbol is accepted in place of the actual coding system object whenever | |
| 625 a coding system is called for. (This is similar to how faces work.) | |
| 626 */ | |
| 627 (object)) | |
| 628 { | |
| 629 return CODING_SYSTEMP (object) ? Qt : Qnil; | |
| 630 } | |
| 631 | |
| 4303 | 632 static Lisp_Object |
| 633 find_coding_system (Lisp_Object coding_system_or_name, | |
| 634 int do_autoloads) | |
| 635 { | |
| 636 Lisp_Object lookup; | |
| 637 | |
| 638 if (NILP (coding_system_or_name)) | |
| 639 coding_system_or_name = Qbinary; | |
| 640 else if (CODING_SYSTEMP (coding_system_or_name)) | |
| 641 return coding_system_or_name; | |
| 642 else | |
| 643 CHECK_SYMBOL (coding_system_or_name); | |
| 644 | |
| 645 while (1) | |
| 646 { | |
| 647 lookup = | |
| 648 Fgethash (coding_system_or_name, Vcoding_system_hash_table, Qnil); | |
| 649 | |
| 650 if (CONSP (lookup) && do_autoloads) | |
| 651 { | |
| 652 struct gcpro gcpro1; | |
| 653 int length; | |
| 654 DECLARE_EISTRING (desired_base); | |
| 655 DECLARE_EISTRING (warning_info); | |
| 656 | |
| 657 eicpy_lstr (desired_base, XSYMBOL_NAME (coding_system_or_name)); | |
| 658 | |
| 659 /* Work out the name of the base coding system. */ | |
| 660 length = eilen (desired_base); | |
| 661 if (length > (int)(sizeof ("-unix") - 1)) | |
| 662 { | |
| 663 if (0 == qxestrcmp ((UAscbyte *)"-unix", (eidata (desired_base)) | |
| 664 + (length - (sizeof ("-unix") - 1)))) | |
| 665 { | |
| 666 eidel (desired_base, length - (sizeof ("-unix") - 1), | |
| 667 -1, 5, 5); | |
| 668 } | |
| 669 } | |
| 670 else if (length > (int)(sizeof ("-dos") - 1)) | |
| 671 { | |
| 672 if ((0 == qxestrcmp ((UAscbyte *)"-dos", (eidata (desired_base)) | |
| 673 + (length - (sizeof ("-dos") - 1)))) || | |
| 674 (0 == qxestrcmp ((UAscbyte *)"-mac", (eidata (desired_base)) | |
| 675 + (length - (sizeof ("-mac") - 1))))) | |
| 676 { | |
| 677 eidel (desired_base, length - (sizeof ("-dos") - 1), -1, | |
| 678 4, 4); | |
| 679 } | |
| 680 } | |
| 681 | |
| 682 coding_system_or_name = intern_int (eidata (desired_base)); | |
| 683 | |
| 684 /* Remove this coding system and its subsidiary coding | |
| 685 systems from the hash, to avoid calling this code recursively. */ | |
| 686 Fremhash (coding_system_or_name, Vcoding_system_hash_table); | |
| 687 Fremhash (add_suffix_to_symbol(coding_system_or_name, "-unix"), | |
| 688 Vcoding_system_hash_table); | |
| 689 Fremhash (add_suffix_to_symbol(coding_system_or_name, "-dos"), | |
| 690 Vcoding_system_hash_table); | |
| 691 Fremhash (add_suffix_to_symbol(coding_system_or_name, "-mac"), | |
| 692 Vcoding_system_hash_table); | |
| 693 | |
| 694 eicpy_ascii (warning_info, "Error autoloading coding system "); | |
| 695 eicat_lstr (warning_info, XSYMBOL_NAME (coding_system_or_name)); | |
| 696 | |
| 697 /* Keep around the form so it doesn't disappear from under | |
| 698 #'eval's feet. */ | |
| 699 GCPRO1 (lookup); | |
| 700 call1_trapping_problems ((const CIbyte *)eidata (warning_info), | |
| 701 Qeval, lookup, 0); | |
| 702 UNGCPRO; | |
| 703 | |
| 704 lookup = | |
| 705 Fgethash (coding_system_or_name, Vcoding_system_hash_table, Qnil); | |
| 706 } | |
| 707 | |
| 708 if (CODING_SYSTEMP (lookup) || NILP (lookup)) | |
| 709 return lookup; | |
| 710 | |
| 711 coding_system_or_name = lookup; | |
| 712 } | |
| 713 } | |
| 714 | |
| 428 | 715 DEFUN ("find-coding-system", Ffind_coding_system, 1, 1, 0, /* |
| 716 Retrieve the coding system of the given name. | |
| 717 | |
| 718 If CODING-SYSTEM-OR-NAME is a coding-system object, it is simply | |
| 719 returned. Otherwise, CODING-SYSTEM-OR-NAME should be a symbol. | |
| 720 If there is no such coding system, nil is returned. Otherwise the | |
| 721 associated coding system object is returned. | |
| 722 */ | |
| 723 (coding_system_or_name)) | |
| 724 { | |
| 4303 | 725 return find_coding_system(coding_system_or_name, 1); |
| 726 } | |
| 727 | |
| 728 DEFUN ("autoload-coding-system", Fautoload_coding_system, 2, 2, 0, /* | |
| 729 Define SYMBOL as a coding-system that is loaded on demand. | |
| 730 | |
| 731 FORM is a form to evaluate to define the coding-system. | |
| 732 */ | |
| 733 (symbol, form)) | |
| 734 { | |
| 735 Lisp_Object lookup; | |
| 736 | |
| 737 CHECK_SYMBOL (symbol); | |
| 738 CHECK_CONS (form); | |
| 739 | |
| 740 lookup = find_coding_system (symbol, 0); | |
| 741 | |
| 742 if (!NILP (lookup) && | |
| 743 /* Allow autoloads to be redefined. */ | |
| 744 !CONSP (lookup)) | |
| 440 | 745 { |
| 4303 | 746 invalid_operation ("Cannot redefine existing coding system", |
| 747 symbol); | |
| 440 | 748 } |
| 4303 | 749 |
| 750 Fputhash (symbol, form, Vcoding_system_hash_table); | |
| 751 Fputhash (add_suffix_to_symbol(symbol, "-unix"), form, | |
| 752 Vcoding_system_hash_table); | |
| 753 Fputhash (add_suffix_to_symbol(symbol, "-dos"), form, | |
| 754 Vcoding_system_hash_table); | |
| 755 Fputhash (add_suffix_to_symbol(symbol, "-mac"), form, | |
| 756 Vcoding_system_hash_table); | |
| 757 | |
| 758 /* Tell the POSIX locale infrastructure about this coding system (though | |
| 759 unfortunately it'll be too late for the startup locale sniffing. */ | |
| 760 if (!UNBOUNDP (Qposix_charset_to_coding_system_hash)) | |
| 761 { | |
| 762 Lisp_Object val = Fsymbol_value (Qposix_charset_to_coding_system_hash); | |
| 763 DECLARE_EISTRING (minimal_name); | |
| 764 Ibyte *full_name; | |
| 765 int len = XSTRING_LENGTH (XSYMBOL_NAME (symbol)), i; | |
| 766 | |
| 767 if (!NILP (val)) | |
| 768 { | |
| 769 full_name = XSTRING_DATA (XSYMBOL_NAME (symbol)); | |
| 770 for (i = 0; i < len; ++i) | |
| 771 { | |
| 772 if (full_name[i] >= '0' && full_name[i] <= '9') | |
| 773 { | |
| 774 eicat_ch (minimal_name, full_name[i]); | |
| 775 } | |
| 776 else if (full_name[i] >= 'a' && full_name[i] <= 'z') | |
| 777 { | |
| 778 eicat_ch (minimal_name, full_name[i]); | |
| 779 } | |
| 780 else if (full_name[i] >= 'A' && full_name[i] <= 'Z') | |
| 781 { | |
| 782 eicat_ch (minimal_name, full_name[i] + | |
| 783 ('a' - 'A')); | |
| 784 } | |
| 785 } | |
| 786 | |
| 787 if (eilen (minimal_name)) | |
| 788 { | |
| 789 CHECK_HASH_TABLE (val); | |
| 790 Fputhash (eimake_string(minimal_name), symbol, val); | |
| 791 } | |
| 792 } | |
| 793 } | |
| 794 | |
| 795 return Qt; | |
| 428 | 796 } |
| 797 | |
| 798 DEFUN ("get-coding-system", Fget_coding_system, 1, 1, 0, /* | |
| 799 Retrieve the coding system of the given name. | |
| 800 Same as `find-coding-system' except that if there is no such | |
| 801 coding system, an error is signaled instead of returning nil. | |
| 802 */ | |
| 803 (name)) | |
| 804 { | |
| 805 Lisp_Object coding_system = Ffind_coding_system (name); | |
| 806 | |
| 807 if (NILP (coding_system)) | |
| 563 | 808 invalid_argument ("No such coding system", name); |
| 428 | 809 return coding_system; |
| 810 } | |
| 811 | |
| 771 | 812 int |
| 813 coding_system_is_binary (Lisp_Object coding_system) | |
| 814 { | |
| 815 Lisp_Coding_System *cs = XCODING_SYSTEM (coding_system); | |
| 816 return | |
| 817 (EQ (CODING_SYSTEM_TYPE (cs), Qno_conversion) && | |
| 818 CODING_SYSTEM_EOL_TYPE (cs) == EOL_LF && | |
| 819 EQ (CODING_SYSTEM_POST_READ_CONVERSION (cs), Qnil) && | |
| 820 EQ (CODING_SYSTEM_PRE_WRITE_CONVERSION (cs), Qnil)); | |
| 821 } | |
| 822 | |
| 823 static Lisp_Object | |
| 824 coding_system_real_canonical (Lisp_Object cs) | |
| 825 { | |
| 826 if (!NILP (XCODING_SYSTEM_CANONICAL (cs))) | |
| 827 return XCODING_SYSTEM_CANONICAL (cs); | |
| 828 return cs; | |
| 829 } | |
| 830 | |
| 831 /* Return true if coding system is of the "standard" type that decodes | |
| 832 bytes into characters (suitable for decoding a text file). */ | |
| 833 int | |
| 834 coding_system_is_for_text_file (Lisp_Object coding_system) | |
| 835 { | |
| 836 return (XCODESYSMETH_OR_GIVEN | |
| 837 (coding_system, conversion_end_type, | |
| 838 (coding_system_real_canonical (coding_system)), | |
| 839 DECODES_BYTE_TO_CHARACTER) == | |
| 840 DECODES_BYTE_TO_CHARACTER); | |
| 841 } | |
| 842 | |
| 843 static int | |
| 844 decoding_source_sink_type_is_char (Lisp_Object cs, enum source_or_sink sex) | |
| 845 { | |
| 846 enum source_sink_type type = | |
| 847 XCODESYSMETH_OR_GIVEN (cs, conversion_end_type, | |
| 848 (coding_system_real_canonical (cs)), | |
| 849 DECODES_BYTE_TO_CHARACTER); | |
| 850 if (sex == CODING_SOURCE) | |
| 851 return (type == DECODES_CHARACTER_TO_CHARACTER || | |
| 852 type == DECODES_CHARACTER_TO_BYTE); | |
| 853 else | |
| 854 return (type == DECODES_CHARACTER_TO_CHARACTER || | |
| 855 type == DECODES_BYTE_TO_CHARACTER); | |
| 856 } | |
| 857 | |
| 858 static int | |
| 859 encoding_source_sink_type_is_char (Lisp_Object cs, enum source_or_sink sex) | |
| 860 { | |
| 861 return decoding_source_sink_type_is_char (cs, | |
| 862 /* Sex change */ | |
| 863 sex == CODING_SOURCE ? | |
| 864 CODING_SINK : CODING_SOURCE); | |
| 865 } | |
| 866 | |
| 867 /* Like Ffind_coding_system() but check that the coding system is of the | |
| 868 "standard" type that decodes bytes into characters (suitable for | |
| 869 decoding a text file), and if not, returns an appropriate wrapper that | |
| 870 does. Also, if EOL_WRAP is non-zero, check whether this coding system | |
| 871 wants EOL auto-detection, and if so, wrap with a convert-eol coding | |
| 872 system to do this. */ | |
| 873 | |
| 874 Lisp_Object | |
| 875 find_coding_system_for_text_file (Lisp_Object name, int eol_wrap) | |
| 876 { | |
| 877 Lisp_Object coding_system = Ffind_coding_system (name); | |
| 878 Lisp_Object wrapper = coding_system; | |
| 879 | |
| 880 if (NILP (coding_system)) | |
| 881 return Qnil; | |
| 882 if (!coding_system_is_for_text_file (coding_system)) | |
| 883 { | |
| 884 wrapper = XCODING_SYSTEM_TEXT_FILE_WRAPPER (coding_system); | |
| 885 if (NILP (wrapper)) | |
| 886 { | |
| 887 Lisp_Object chain; | |
| 888 if (!decoding_source_sink_type_is_char (coding_system, CODING_SINK)) | |
| 889 chain = list2 (coding_system, Qbinary); | |
| 890 else | |
| 891 chain = list1 (coding_system); | |
| 892 if (decoding_source_sink_type_is_char (coding_system, CODING_SOURCE)) | |
| 893 chain = Fcons (Qbinary, chain); | |
| 894 wrapper = | |
| 895 make_internal_coding_system | |
| 896 (coding_system, | |
| 897 "internal-text-file-wrapper", | |
| 898 Qchain, | |
| 899 Qunbound, list4 (Qchain, chain, | |
| 900 Qcanonicalize_after_coding, coding_system)); | |
| 901 XCODING_SYSTEM_TEXT_FILE_WRAPPER (coding_system) = wrapper; | |
| 902 } | |
| 903 } | |
| 904 | |
| 905 if (!eol_wrap || XCODING_SYSTEM_EOL_TYPE (coding_system) != EOL_AUTODETECT) | |
| 906 return wrapper; | |
| 907 | |
| 908 coding_system = wrapper; | |
| 909 wrapper = XCODING_SYSTEM_AUTO_EOL_WRAPPER (coding_system); | |
| 910 if (!NILP (wrapper)) | |
| 911 return wrapper; | |
| 912 wrapper = | |
| 913 make_internal_coding_system | |
| 914 (coding_system, | |
| 915 "internal-auto-eol-wrapper", | |
| 916 Qundecided, Qunbound, | |
| 917 list4 (Qcoding_system, coding_system, | |
| 918 Qdo_eol, Qt)); | |
| 919 XCODING_SYSTEM_AUTO_EOL_WRAPPER (coding_system) = wrapper; | |
| 920 return wrapper; | |
| 921 } | |
| 922 | |
| 923 /* Like Fget_coding_system() but verify that the coding system is of the | |
| 924 "standard" type that decodes bytes into characters (suitable for | |
| 925 decoding a text file), and if not, returns an appropriate wrapper that | |
| 926 does. Also, if EOL_WRAP is non-zero, check whether this coding system | |
| 927 wants EOL auto-detection, and if so, wrap with a convert-eol coding | |
| 928 system to do this. */ | |
| 929 | |
| 930 Lisp_Object | |
| 931 get_coding_system_for_text_file (Lisp_Object name, int eol_wrap) | |
| 932 { | |
| 933 Lisp_Object coding_system = find_coding_system_for_text_file (name, | |
| 934 eol_wrap); | |
| 935 if (NILP (coding_system)) | |
| 936 invalid_argument ("No such coding system", name); | |
| 937 return coding_system; | |
| 938 } | |
| 939 | |
| 940 /* We store the coding systems in hash tables with the names as the | |
| 941 key and the actual coding system object as the value. Occasionally | |
| 942 we need to use them in a list format. These routines provide us | |
| 943 with that. */ | |
| 428 | 944 struct coding_system_list_closure |
| 945 { | |
| 946 Lisp_Object *coding_system_list; | |
| 771 | 947 int normal; |
| 948 int internal; | |
| 428 | 949 }; |
| 950 | |
| 951 static int | |
| 4303 | 952 add_coding_system_to_list_mapper (Lisp_Object key, Lisp_Object value, |
| 428 | 953 void *coding_system_list_closure) |
| 954 { | |
| 955 /* This function can GC */ | |
| 956 struct coding_system_list_closure *cscl = | |
| 957 (struct coding_system_list_closure *) coding_system_list_closure; | |
| 958 Lisp_Object *coding_system_list = cscl->coding_system_list; | |
| 959 | |
| 771 | 960 /* We can't just use VALUE because KEY might be an alias, and we need |
| 4303 | 961 the real coding system object. |
| 962 | |
| 963 Autoloaded coding systems have conses for their values, and can't be | |
| 964 internal coding systems, or coding system aliases. */ | |
| 965 if (CONSP (value) || | |
| 966 (XCODING_SYSTEM (Ffind_coding_system (key))->internal_p ? | |
| 967 cscl->internal : cscl->normal)) | |
| 771 | 968 *coding_system_list = Fcons (key, *coding_system_list); |
| 428 | 969 return 0; |
| 970 } | |
| 971 | |
| 2297 | 972 /* #### should we specify a conventional for "all coding systems"? */ |
| 771 | 973 DEFUN ("coding-system-list", Fcoding_system_list, 0, 1, 0, /* |
| 428 | 974 Return a list of the names of all defined coding systems. |
| 771 | 975 If INTERNAL is nil, only the normal (non-internal) coding systems are |
| 976 included. (Internal coding systems are created for various internal | |
| 977 purposes, such as implementing EOL types of CRLF and CR; generally, you do | |
| 978 not want to see these.) If it is t, only the internal coding systems are | |
| 979 included. If it is any other non-nil value both normal and internal are | |
| 980 included. | |
| 428 | 981 */ |
| 771 | 982 (internal)) |
| 428 | 983 { |
| 984 Lisp_Object coding_system_list = Qnil; | |
| 985 struct gcpro gcpro1; | |
| 986 struct coding_system_list_closure coding_system_list_closure; | |
| 987 | |
| 988 GCPRO1 (coding_system_list); | |
| 989 coding_system_list_closure.coding_system_list = &coding_system_list; | |
| 771 | 990 coding_system_list_closure.normal = !EQ (internal, Qt); |
| 991 coding_system_list_closure.internal = !NILP (internal); | |
| 428 | 992 elisp_maphash (add_coding_system_to_list_mapper, Vcoding_system_hash_table, |
| 993 &coding_system_list_closure); | |
| 994 UNGCPRO; | |
| 995 | |
| 996 return coding_system_list; | |
| 997 } | |
| 998 | |
| 999 DEFUN ("coding-system-name", Fcoding_system_name, 1, 1, 0, /* | |
| 1000 Return the name of the given coding system. | |
| 1001 */ | |
| 1002 (coding_system)) | |
| 1003 { | |
| 1004 coding_system = Fget_coding_system (coding_system); | |
| 1005 return XCODING_SYSTEM_NAME (coding_system); | |
| 1006 } | |
| 1007 | |
| 1008 static Lisp_Coding_System * | |
| 771 | 1009 allocate_coding_system (struct coding_system_methods *codesys_meths, |
| 1010 Bytecount data_size, | |
| 1011 Lisp_Object name) | |
| 428 | 1012 { |
| 771 | 1013 Bytecount total_size = offsetof (Lisp_Coding_System, data) + data_size; |
| 428 | 1014 Lisp_Coding_System *codesys = |
| 3024 | 1015 (Lisp_Coding_System *) BASIC_ALLOC_LCRECORD (total_size, |
| 1204 | 1016 &lrecord_coding_system); |
| 1017 | |
| 771 | 1018 codesys->methods = codesys_meths; |
| 1204 | 1019 #define MARKED_SLOT(x) codesys->x = Qnil; |
| 1020 #include "coding-system-slots.h" | |
| 1021 | |
| 771 | 1022 CODING_SYSTEM_EOL_TYPE (codesys) = EOL_LF; |
| 1023 CODING_SYSTEM_NAME (codesys) = name; | |
| 1024 | |
| 1025 MAYBE_CODESYSMETH (codesys, init, (wrap_coding_system (codesys))); | |
| 428 | 1026 |
| 1027 return codesys; | |
| 1028 } | |
| 1029 | |
| 771 | 1030 static enum eol_type |
| 1031 symbol_to_eol_type (Lisp_Object symbol) | |
| 1032 { | |
| 1033 CHECK_SYMBOL (symbol); | |
| 1034 if (NILP (symbol)) return EOL_AUTODETECT; | |
| 1035 if (EQ (symbol, Qlf)) return EOL_LF; | |
| 1036 if (EQ (symbol, Qcrlf)) return EOL_CRLF; | |
| 1037 if (EQ (symbol, Qcr)) return EOL_CR; | |
| 1038 | |
| 1039 invalid_constant ("Unrecognized eol type", symbol); | |
| 1204 | 1040 RETURN_NOT_REACHED (EOL_AUTODETECT); |
| 771 | 1041 } |
| 1042 | |
| 1043 static Lisp_Object | |
| 1044 eol_type_to_symbol (enum eol_type type) | |
| 1045 { | |
| 1046 switch (type) | |
| 1047 { | |
| 2500 | 1048 default: ABORT (); |
| 771 | 1049 case EOL_LF: return Qlf; |
| 1050 case EOL_CRLF: return Qcrlf; | |
| 1051 case EOL_CR: return Qcr; | |
| 1052 case EOL_AUTODETECT: return Qnil; | |
| 1053 } | |
| 1054 } | |
| 1055 | |
| 1056 struct subsidiary_type | |
| 1057 { | |
|
4528
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1058 const Ascbyte *extension; |
|
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1059 const Ascbyte *mnemonic_ext; |
| 771 | 1060 enum eol_type eol; |
| 1061 }; | |
| 1062 | |
| 1063 static struct subsidiary_type coding_subsidiary_list[] = | |
| 1064 { { "-unix", "", EOL_LF }, | |
| 1065 { "-dos", ":T", EOL_CRLF }, | |
| 1066 { "-mac", ":t", EOL_CR } }; | |
| 1067 | |
| 1068 /* kludge */ | |
| 428 | 1069 static void |
| 771 | 1070 setup_eol_coding_systems (Lisp_Object codesys) |
| 428 | 1071 { |
| 793 | 1072 int len = XSTRING_LENGTH (XSYMBOL (XCODING_SYSTEM_NAME (codesys))->name); |
| 2367 | 1073 Ibyte *codesys_name = alloca_ibytes (len + 7); |
| 771 | 1074 int mlen = -1; |
| 867 | 1075 Ibyte *codesys_mnemonic = 0; |
| 771 | 1076 Lisp_Object codesys_name_sym, sub_codesys; |
| 1077 int i; | |
| 1078 | |
| 1079 memcpy (codesys_name, | |
| 793 | 1080 XSTRING_DATA (XSYMBOL (XCODING_SYSTEM_NAME (codesys))->name), len); |
| 771 | 1081 |
| 1082 if (STRINGP (XCODING_SYSTEM_MNEMONIC (codesys))) | |
| 428 | 1083 { |
| 771 | 1084 mlen = XSTRING_LENGTH (XCODING_SYSTEM_MNEMONIC (codesys)); |
| 2367 | 1085 codesys_mnemonic = alloca_ibytes (mlen + 7); |
| 771 | 1086 memcpy (codesys_mnemonic, |
| 1087 XSTRING_DATA (XCODING_SYSTEM_MNEMONIC (codesys)), mlen); | |
| 1088 } | |
| 1089 | |
| 1090 /* Create three "subsidiary" coding systems, decoding data encoded using | |
| 1091 each of the three EOL types. We do this for each subsidiary by | |
| 1092 copying the original coding system, setting the EOL type | |
| 1093 appropriately, and setting the CANONICAL member of the new coding | |
| 1094 system to be a chain consisting of the original coding system followed | |
| 1095 by a convert-eol coding system to do the EOL decoding. For EOL type | |
| 1096 LF, however, we don't need any decoding, so we skip creating a | |
| 1097 CANONICAL. | |
| 1098 | |
| 1099 If the original coding system is not a text-type coding system | |
| 1100 (decodes byte->char), we need to coerce it to one by the appropriate | |
| 1101 wrapping in CANONICAL. */ | |
| 1102 | |
| 1103 for (i = 0; i < countof (coding_subsidiary_list); i++) | |
| 1104 { | |
|
4528
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1105 const Ascbyte *extension = coding_subsidiary_list[i].extension; |
|
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1106 const Ascbyte *mnemonic_ext = coding_subsidiary_list[i].mnemonic_ext; |
| 771 | 1107 enum eol_type eol = coding_subsidiary_list[i].eol; |
| 1108 | |
| 2367 | 1109 qxestrcpy_ascii (codesys_name + len, extension); |
| 771 | 1110 codesys_name_sym = intern_int (codesys_name); |
| 1111 if (mlen != -1) | |
| 2367 | 1112 qxestrcpy_ascii (codesys_mnemonic + mlen, mnemonic_ext); |
| 771 | 1113 |
| 1114 sub_codesys = Fcopy_coding_system (codesys, codesys_name_sym); | |
| 1115 if (mlen != -1) | |
| 1116 XCODING_SYSTEM_MNEMONIC (sub_codesys) = | |
| 1117 build_intstring (codesys_mnemonic); | |
| 1118 | |
| 1119 if (eol != EOL_LF) | |
| 1120 { | |
| 1121 Lisp_Object chain = list2 (get_coding_system_for_text_file | |
| 1122 (codesys, 0), | |
| 1123 eol == EOL_CR ? Qconvert_eol_cr : | |
| 1124 Qconvert_eol_crlf); | |
| 1125 Lisp_Object canon = | |
| 1126 make_internal_coding_system | |
| 1127 (sub_codesys, "internal-subsidiary-eol-wrapper", | |
| 1128 Qchain, Qunbound, | |
| 1129 mlen != -1 ? | |
| 1130 list6 (Qmnemonic, build_intstring (codesys_mnemonic), | |
| 1131 Qchain, chain, | |
| 1132 Qcanonicalize_after_coding, sub_codesys) : | |
| 1133 list4 (Qchain, chain, | |
| 1134 Qcanonicalize_after_coding, sub_codesys)); | |
| 1135 XCODING_SYSTEM_CANONICAL (sub_codesys) = canon; | |
| 1136 } | |
| 1137 XCODING_SYSTEM_EOL_TYPE (sub_codesys) = eol; | |
| 1138 XCODING_SYSTEM_SUBSIDIARY_PARENT (sub_codesys) = codesys; | |
| 1139 XCODING_SYSTEM (codesys)->eol[eol] = sub_codesys; | |
| 428 | 1140 } |
| 1141 } | |
| 1142 | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1143 DEFUN ("coding-system-canonical-name-p", Fcoding_system_canonical_name_p, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1144 1, 1, 0, /* |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1145 Return t if OBJECT names a coding system, and is not a coding system alias. |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1146 */ |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1147 (object)) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1148 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1149 return CODING_SYSTEMP (Fgethash (object, Vcoding_system_hash_table, Qnil)) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1150 ? Qt : Qnil; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1151 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1152 |
| 771 | 1153 /* Basic function to create new coding systems. For `make-coding-system', |
| 1154 NAME-OR-EXISTING is the NAME argument, PREFIX is null, and TYPE, | |
| 1155 DESCRIPTION, and PROPS are the same. All created coding systems are put | |
| 1156 in a hash table indexed by NAME. | |
| 1157 | |
| 1158 If PREFIX is a string, NAME-OR-EXISTING should specify an existing | |
| 1159 coding system (or nil), and an internal coding system will be created. | |
| 1160 The name of the coding system will be constructed by combining PREFIX | |
| 1161 with the name of the existing coding system (if given), and a number | |
| 1162 will be appended to insure uniqueness. In such a case, if Qunbound is | |
| 1163 given for DESCRIPTION, the description gets created based on the | |
| 1164 generated name. Also, if no mnemonic is given in the properties list, a | |
| 1165 mnemonic is created based on the generated name. | |
| 1166 | |
| 1167 For internal coding systems, the coding system is marked as internal | |
| 1168 (see `coding-system-list'), and no subsidiaries will be created or | |
| 1169 eol-wrapping will happen. Otherwise: | |
| 1170 | |
| 1171 -- if the eol-type property is `lf' or t, the coding system is merely | |
| 1172 created and returned. (For t, the coding system will be wrapped with | |
| 1173 an EOL autodetector when it's used to read a file.) | |
| 1174 | |
| 1175 -- if eol-type is `crlf' or `cr', after the coding system object is | |
| 1176 created, it will be wrapped in a chain with the appropriate | |
| 1177 convert-eol coding system (either `convert-eol-crlf' or | |
| 1178 `convert-eol-cr'), so that CRLF->LF or CR->LF conversion is done at | |
| 1179 decoding time, and the opposite at encoding time. The resulting | |
| 1180 chain becomes the CANONICAL field of the coding system object. | |
| 1181 | |
| 1182 -- if eol-type is nil or omitted, "subsidiaries" are generated: Three | |
| 1183 coding systems where the original coding system (before wrapping with | |
| 1184 convert-eol-autodetect) is either unwrapped or wrapped with | |
| 1185 convert-eol-crlf or convert-eol-cr, respectively, so that coding systems | |
| 1186 to handle LF, CRLF, and CR end-of-line indicators are created. (This | |
| 1187 crazy crap is based on existing behavior in other Mule versions, | |
| 1188 including FSF Emacs.) | |
| 1189 */ | |
| 428 | 1190 |
| 1191 static Lisp_Object | |
|
4528
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1192 make_coding_system_1 (Lisp_Object name_or_existing, const Ascbyte *prefix, |
| 771 | 1193 Lisp_Object type, Lisp_Object description, |
| 1194 Lisp_Object props) | |
| 428 | 1195 { |
| 771 | 1196 Lisp_Coding_System *cs; |
| 1197 int need_to_setup_eol_systems = 1; | |
| 1198 enum eol_type eol_wrapper = EOL_AUTODETECT; | |
| 1199 struct coding_system_methods *meths; | |
| 1200 Lisp_Object csobj; | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1201 Lisp_Object defmnem = Qnil, aliases = Qnil; |
| 771 | 1202 |
| 1203 if (NILP (type)) | |
| 1204 type = Qundecided; | |
| 1205 meths = decode_coding_system_type (type, ERROR_ME); | |
| 1206 | |
| 1207 if (prefix) | |
| 428 | 1208 { |
| 867 | 1209 Ibyte *newname = |
| 771 | 1210 emacs_sprintf_malloc (NULL, "%s-%s-%d", |
| 1211 prefix, | |
| 867 | 1212 NILP (name_or_existing) ? (Ibyte *) "nil" : |
| 771 | 1213 XSTRING_DATA (Fsymbol_name (XCODING_SYSTEM_NAME |
| 1214 (name_or_existing))), | |
| 1215 ++coding_system_tick); | |
| 1216 name_or_existing = intern_int (newname); | |
| 1726 | 1217 xfree (newname, Ibyte *); |
| 771 | 1218 |
| 1219 if (UNBOUNDP (description)) | |
| 1220 { | |
| 1221 newname = | |
| 1222 emacs_sprintf_malloc | |
| 1223 (NULL, "For Internal Use (%s)", | |
| 1224 XSTRING_DATA (Fsymbol_name (name_or_existing))); | |
| 1225 description = build_intstring (newname); | |
| 1726 | 1226 xfree (newname, Ibyte *); |
| 771 | 1227 } |
| 1228 | |
| 1229 newname = emacs_sprintf_malloc (NULL, "Int%d", coding_system_tick); | |
| 1230 defmnem = build_intstring (newname); | |
| 1726 | 1231 xfree (newname, Ibyte *); |
| 428 | 1232 } |
| 771 | 1233 else |
| 1234 CHECK_SYMBOL (name_or_existing); | |
| 1235 | |
| 4303 | 1236 /* See is there an entry for name_or_existing in the defined coding system |
| 1237 hash table. */ | |
| 1238 csobj = find_coding_system (name_or_existing, 0); | |
| 1239 /* Error if it's there and not an autoload form. */ | |
| 1240 if (!NILP (csobj) && !CONSP (csobj)) | |
| 771 | 1241 invalid_operation ("Cannot redefine existing coding system", |
| 4303 | 1242 name_or_existing); |
| 771 | 1243 |
| 1244 cs = allocate_coding_system (meths, meths->extra_data_size, | |
| 1245 name_or_existing); | |
| 793 | 1246 csobj = wrap_coding_system (cs); |
| 771 | 1247 |
| 1248 cs->internal_p = !!prefix; | |
| 1249 | |
| 1250 if (NILP (description)) | |
| 1251 description = build_string (""); | |
| 1252 else | |
| 1253 CHECK_STRING (description); | |
| 1254 CODING_SYSTEM_DESCRIPTION (cs) = description; | |
| 1255 | |
| 1256 if (!NILP (defmnem)) | |
| 1257 CODING_SYSTEM_MNEMONIC (cs) = defmnem; | |
| 1258 | |
| 1259 { | |
| 1260 EXTERNAL_PROPERTY_LIST_LOOP_3 (key, value, props) | |
| 1261 { | |
| 1262 int recognized = 1; | |
| 1263 | |
| 1264 if (EQ (key, Qmnemonic)) | |
| 1265 { | |
| 1266 if (!NILP (value)) | |
| 1267 CHECK_STRING (value); | |
| 1268 CODING_SYSTEM_MNEMONIC (cs) = value; | |
| 1269 } | |
| 1270 | |
| 1271 else if (EQ (key, Qdocumentation)) | |
| 1272 { | |
| 1273 if (!NILP (value)) | |
| 1274 CHECK_STRING (value); | |
| 1275 CODING_SYSTEM_DOCUMENTATION (cs) = value; | |
| 1276 } | |
| 1277 | |
| 1278 else if (EQ (key, Qeol_type)) | |
| 1279 { | |
| 1280 need_to_setup_eol_systems = NILP (value); | |
| 1281 if (EQ (value, Qt)) | |
| 1282 value = Qnil; | |
| 1283 eol_wrapper = symbol_to_eol_type (value); | |
| 1284 } | |
| 1285 | |
| 1286 else if (EQ (key, Qpost_read_conversion)) | |
| 1287 CODING_SYSTEM_POST_READ_CONVERSION (cs) = value; | |
| 1288 else if (EQ (key, Qpre_write_conversion)) | |
| 1289 CODING_SYSTEM_PRE_WRITE_CONVERSION (cs) = value; | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1290 else if (EQ (key, Qaliases)) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1291 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1292 EXTERNAL_LIST_LOOP_2 (alias, value) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1293 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1294 CHECK_SYMBOL (alias); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1295 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1296 if (!NILP (Fcoding_system_canonical_name_p (alias))) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1297 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1298 invalid_change ("Symbol is the canonical name of a " |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1299 "coding system and cannot be redefined", |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1300 alias); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1301 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1302 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1303 aliases = value; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1304 } |
| 771 | 1305 /* FSF compatibility */ |
| 1306 else if (EQ (key, Qtranslation_table_for_decode)) | |
| 1307 ; | |
| 1308 else if (EQ (key, Qtranslation_table_for_encode)) | |
| 1309 ; | |
| 1310 else if (EQ (key, Qsafe_chars)) | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1311 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1312 CHECK_CHAR_TABLE (value); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1313 CODING_SYSTEM_SAFE_CHARS (cs) = value; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1314 } |
| 771 | 1315 else if (EQ (key, Qsafe_charsets)) |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1316 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1317 if (!EQ (Qt, value) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1318 /* Would be nice to actually do this check, but there are |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1319 some order conflicts with japanese.el and |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1320 mule-coding.el */ |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1321 && 0) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1322 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1323 #ifdef MULE |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1324 EXTERNAL_LIST_LOOP_2 (safe_charset, value) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1325 CHECK_CHARSET (Ffind_charset (safe_charset)); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1326 #endif |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1327 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1328 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1329 CODING_SYSTEM_SAFE_CHARSETS (cs) = value; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1330 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1331 else if (EQ (key, Qcategory)) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1332 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1333 Fput (name_or_existing, intern ("coding-system-property"), |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1334 Fplist_put (Fget (name_or_existing, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1335 intern ("coding-system-property"), |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1336 Qnil), |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1337 Qcategory, value)); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1338 } |
| 771 | 1339 else if (EQ (key, Qmime_charset)) |
| 1340 ; | |
| 1341 else if (EQ (key, Qvalid_codes)) | |
| 1342 ; | |
| 1343 else | |
| 1344 recognized = CODESYSMETH_OR_GIVEN (cs, putprop, | |
| 1345 (csobj, key, value), 0); | |
| 1346 | |
| 1347 if (!recognized) | |
| 1348 invalid_constant ("Unrecognized property", key); | |
| 1349 } | |
| 1350 } | |
| 1351 | |
| 1352 { | |
| 1353 XCODING_SYSTEM_CANONICAL (csobj) = | |
| 1354 CODESYSMETH_OR_GIVEN (cs, canonicalize, (csobj), Qnil); | |
| 1355 XCODING_SYSTEM_EOL_TYPE (csobj) = EOL_AUTODETECT; /* for copy-coding-system | |
| 1356 below */ | |
| 1357 | |
| 4303 | 1358 Fputhash (name_or_existing, csobj, Vcoding_system_hash_table); |
| 1359 | |
| 771 | 1360 if (need_to_setup_eol_systems && !cs->internal_p) |
| 1361 setup_eol_coding_systems (csobj); | |
| 1362 else if (eol_wrapper == EOL_CR || eol_wrapper == EOL_CRLF) | |
| 1363 { | |
| 1364 /* If a specific eol-type (other than LF) was specified, we handle | |
| 1365 this by converting the coding system into a chain that wraps the | |
| 1366 coding system along with a convert-eol system after it, in | |
| 1367 exactly that same switcheroo fashion that the normal | |
| 1368 canonicalize method works -- BUT we will run into a problem if | |
| 1369 we do it the obvious way, because when `chain' creates its | |
| 1370 substreams, the substream containing the coding system we're | |
| 1371 creating will have canonicalization expansion done on it, | |
| 1372 leading to infinite recursion. So we have to generate a new, | |
| 1373 internal coding system with the previous value of CANONICAL. */ | |
| 867 | 1374 Ibyte *newname = |
| 771 | 1375 emacs_sprintf_malloc |
| 1376 (NULL, "internal-eol-copy-%s-%d", | |
| 1377 XSTRING_DATA (Fsymbol_name (name_or_existing)), | |
| 1378 ++coding_system_tick); | |
| 1379 Lisp_Object newnamesym = intern_int (newname); | |
| 1380 Lisp_Object copied = Fcopy_coding_system (csobj, newnamesym); | |
| 1726 | 1381 xfree (newname, Ibyte *); |
| 771 | 1382 |
| 1383 XCODING_SYSTEM_CANONICAL (csobj) = | |
| 1384 make_internal_coding_system | |
| 1385 (csobj, | |
| 1386 "internal-eol-wrapper", | |
| 1387 Qchain, Qunbound, | |
| 1388 list4 (Qchain, | |
| 1389 list2 (copied, | |
| 1390 eol_wrapper == EOL_CR ? | |
| 1391 Qconvert_eol_cr : | |
| 1392 Qconvert_eol_crlf), | |
| 1393 Qcanonicalize_after_coding, | |
| 1394 csobj)); | |
| 1395 } | |
| 1396 XCODING_SYSTEM_EOL_TYPE (csobj) = eol_wrapper; | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1397 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1398 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1399 EXTERNAL_LIST_LOOP_2 (alias, aliases) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1400 Fdefine_coding_system_alias (alias, csobj); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1401 } |
| 771 | 1402 } |
| 1403 | |
| 1404 return csobj; | |
| 428 | 1405 } |
| 1406 | |
| 771 | 1407 Lisp_Object |
|
4528
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1408 make_internal_coding_system (Lisp_Object existing, const Ascbyte *prefix, |
| 771 | 1409 Lisp_Object type, Lisp_Object description, |
| 1410 Lisp_Object props) | |
| 1411 { | |
| 1412 return make_coding_system_1 (existing, prefix, type, description, props); | |
| 1413 } | |
| 428 | 1414 |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1415 DEFUN ("make-coding-system-internal", Fmake_coding_system_internal, 2, 4, 0, /* |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1416 See `make-coding-system'. This does much of the work of that function. |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1417 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1418 Without Mule support, it does all the work of that function, and an alias |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1419 exists, mapping `make-coding-system' to |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1420 `make-coding-system-internal'. You'll need a non-Mule XEmacs to read the |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1421 complete docstring. Or you can just read it in make-coding-system.el; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1422 something like the following should work: |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1423 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
1424 \\[find-function-other-window] find-file RET \\[find-file] mule/make-coding-system.el RET |
| 771 | 1425 |
| 428 | 1426 */ |
| 771 | 1427 (name, type, description, props)) |
| 428 | 1428 { |
| 771 | 1429 return make_coding_system_1 (name, 0, type, description, props); |
| 428 | 1430 } |
| 1431 | |
| 1432 DEFUN ("copy-coding-system", Fcopy_coding_system, 2, 2, 0, /* | |
| 1433 Copy OLD-CODING-SYSTEM to NEW-NAME. | |
| 1434 If NEW-NAME does not name an existing coding system, a new one will | |
| 1435 be created. | |
| 771 | 1436 If you are using this function to create an alias, think again: |
| 1437 Use `define-coding-system-alias' instead. | |
| 428 | 1438 */ |
| 1439 (old_coding_system, new_name)) | |
| 1440 { | |
| 1441 Lisp_Object new_coding_system; | |
| 1442 old_coding_system = Fget_coding_system (old_coding_system); | |
| 771 | 1443 new_coding_system = |
| 4303 | 1444 UNBOUNDP (new_name) ? Qnil : find_coding_system (new_name, 0); |
| 428 | 1445 if (NILP (new_coding_system)) |
| 1446 { | |
| 793 | 1447 new_coding_system = |
| 1448 wrap_coding_system | |
| 1449 (allocate_coding_system | |
| 1450 (XCODING_SYSTEM (old_coding_system)->methods, | |
| 1451 XCODING_SYSTEM (old_coding_system)->methods->extra_data_size, | |
| 1452 new_name)); | |
| 771 | 1453 if (!UNBOUNDP (new_name)) |
| 1454 Fputhash (new_name, new_coding_system, Vcoding_system_hash_table); | |
| 428 | 1455 } |
| 771 | 1456 else if (XCODING_SYSTEM (old_coding_system)->methods != |
| 1457 XCODING_SYSTEM (new_coding_system)->methods) | |
| 1458 invalid_operation_2 ("Coding systems not same type", | |
| 1459 old_coding_system, new_coding_system); | |
| 428 | 1460 |
| 1461 { | |
| 1462 Lisp_Coding_System *to = XCODING_SYSTEM (new_coding_system); | |
| 1463 Lisp_Coding_System *from = XCODING_SYSTEM (old_coding_system); | |
| 3017 | 1464 COPY_SIZED_LCRECORD (to, from, sizeof_coding_system (from)); |
| 428 | 1465 to->name = new_name; |
| 1466 } | |
| 1467 return new_coding_system; | |
| 1468 } | |
| 1469 | |
| 2297 | 1470 /* #### Shouldn't this really be a find/get pair? */ |
| 1471 | |
| 440 | 1472 DEFUN ("coding-system-alias-p", Fcoding_system_alias_p, 1, 1, 0, /* |
| 1473 Return t if OBJECT is a coding system alias. | |
| 1474 All coding system aliases are created by `define-coding-system-alias'. | |
| 1475 */ | |
| 1476 (object)) | |
| 428 | 1477 { |
| 440 | 1478 return SYMBOLP (Fgethash (object, Vcoding_system_hash_table, Qzero)) |
| 1479 ? Qt : Qnil; | |
| 1480 } | |
| 1481 | |
| 1482 DEFUN ("coding-system-aliasee", Fcoding_system_aliasee, 1, 1, 0, /* | |
| 1483 Return the coding-system symbol for which symbol ALIAS is an alias. | |
| 1484 */ | |
| 1485 (alias)) | |
| 1486 { | |
| 1487 Lisp_Object aliasee = Fgethash (alias, Vcoding_system_hash_table, Qnil); | |
| 1488 if (SYMBOLP (aliasee)) | |
| 1489 return aliasee; | |
| 1490 else | |
| 563 | 1491 invalid_argument ("Symbol is not a coding system alias", alias); |
| 1204 | 1492 RETURN_NOT_REACHED (Qnil); |
| 440 | 1493 } |
| 1494 | |
| 1495 /* A maphash function, for removing dangling coding system aliases. */ | |
| 1496 static int | |
| 2286 | 1497 dangling_coding_system_alias_p (Lisp_Object UNUSED (alias), |
| 440 | 1498 Lisp_Object aliasee, |
| 1499 void *dangling_aliases) | |
| 1500 { | |
| 1501 if (SYMBOLP (aliasee) | |
| 1502 && NILP (Fgethash (aliasee, Vcoding_system_hash_table, Qnil))) | |
| 428 | 1503 { |
| 440 | 1504 (*(int *) dangling_aliases)++; |
| 1505 return 1; | |
| 428 | 1506 } |
| 440 | 1507 else |
| 1508 return 0; | |
| 1509 } | |
| 1510 | |
| 1511 DEFUN ("define-coding-system-alias", Fdefine_coding_system_alias, 2, 2, 0, /* | |
| 1512 Define symbol ALIAS as an alias for coding system ALIASEE. | |
| 1513 | |
| 1514 You can use this function to redefine an alias that has already been defined, | |
| 1515 but you cannot redefine a name which is the canonical name for a coding system. | |
| 1516 \(a canonical name of a coding system is what is returned when you call | |
| 1517 `coding-system-name' on a coding system). | |
| 1518 | |
| 1519 ALIASEE itself can be an alias, which allows you to define nested aliases. | |
| 1520 | |
| 1521 You are forbidden, however, from creating alias loops or `dangling' aliases. | |
| 1522 These will be detected, and an error will be signaled if you attempt to do so. | |
| 1523 | |
| 1524 If ALIASEE is nil, then ALIAS will simply be undefined. | |
| 1525 | |
| 1526 See also `coding-system-alias-p', `coding-system-aliasee', | |
| 1527 and `coding-system-canonical-name-p'. | |
| 1528 */ | |
| 1529 (alias, aliasee)) | |
| 1530 { | |
| 2286 | 1531 Lisp_Object probe; |
| 440 | 1532 |
| 1533 CHECK_SYMBOL (alias); | |
| 1534 | |
| 1535 if (!NILP (Fcoding_system_canonical_name_p (alias))) | |
| 563 | 1536 invalid_change |
| 440 | 1537 ("Symbol is the canonical name of a coding system and cannot be redefined", |
| 1538 alias); | |
| 1539 | |
| 1540 if (NILP (aliasee)) | |
| 1541 { | |
| 771 | 1542 Lisp_Object subsidiary_unix = add_suffix_to_symbol (alias, "-unix"); |
| 1543 Lisp_Object subsidiary_dos = add_suffix_to_symbol (alias, "-dos"); | |
| 1544 Lisp_Object subsidiary_mac = add_suffix_to_symbol (alias, "-mac"); | |
| 440 | 1545 |
| 1546 Fremhash (alias, Vcoding_system_hash_table); | |
| 1547 | |
| 1548 /* Undefine subsidiary aliases, | |
| 1549 presumably created by a previous call to this function */ | |
| 1550 if (! NILP (Fcoding_system_alias_p (subsidiary_unix)) && | |
| 1551 ! NILP (Fcoding_system_alias_p (subsidiary_dos)) && | |
| 1552 ! NILP (Fcoding_system_alias_p (subsidiary_mac))) | |
| 1553 { | |
| 1554 Fdefine_coding_system_alias (subsidiary_unix, Qnil); | |
| 1555 Fdefine_coding_system_alias (subsidiary_dos, Qnil); | |
| 1556 Fdefine_coding_system_alias (subsidiary_mac, Qnil); | |
| 1557 } | |
| 1558 | |
| 1559 /* Undefine dangling coding system aliases. */ | |
| 1560 { | |
| 1561 int dangling_aliases; | |
| 1562 | |
| 1563 do { | |
| 1564 dangling_aliases = 0; | |
| 1565 elisp_map_remhash (dangling_coding_system_alias_p, | |
| 1566 Vcoding_system_hash_table, | |
| 1567 &dangling_aliases); | |
| 1568 } while (dangling_aliases > 0); | |
| 1569 } | |
| 1570 | |
| 1571 return Qnil; | |
| 1572 } | |
| 1573 | |
| 1574 if (CODING_SYSTEMP (aliasee)) | |
| 1575 aliasee = XCODING_SYSTEM_NAME (aliasee); | |
| 1576 | |
| 1577 /* Checks that aliasee names a coding-system */ | |
| 2286 | 1578 (void) Fget_coding_system (aliasee); |
| 440 | 1579 |
| 1580 /* Check for coding system alias loops */ | |
| 1581 if (EQ (alias, aliasee)) | |
| 563 | 1582 alias_loop: invalid_operation_2 |
| 440 | 1583 ("Attempt to create a coding system alias loop", alias, aliasee); |
| 1584 | |
| 1585 for (probe = aliasee; | |
| 1586 SYMBOLP (probe); | |
| 1587 probe = Fgethash (probe, Vcoding_system_hash_table, Qzero)) | |
| 1588 { | |
| 1589 if (EQ (probe, alias)) | |
| 1590 goto alias_loop; | |
| 1591 } | |
| 1592 | |
| 1593 Fputhash (alias, aliasee, Vcoding_system_hash_table); | |
| 1594 | |
| 1595 /* Set up aliases for subsidiaries. | |
| 2297 | 1596 #### There must be a better way to handle subsidiary coding systems. |
| 1597 Inquiring Minds Want To Know: shouldn't they always be chains? */ | |
| 440 | 1598 { |
| 1599 static const char *suffixes[] = { "-unix", "-dos", "-mac" }; | |
| 1600 int i; | |
| 1601 for (i = 0; i < countof (suffixes); i++) | |
| 1602 { | |
| 1603 Lisp_Object alias_subsidiary = | |
| 771 | 1604 add_suffix_to_symbol (alias, suffixes[i]); |
| 440 | 1605 Lisp_Object aliasee_subsidiary = |
| 771 | 1606 add_suffix_to_symbol (aliasee, suffixes[i]); |
| 440 | 1607 |
| 1608 if (! NILP (Ffind_coding_system (aliasee_subsidiary))) | |
| 1609 Fdefine_coding_system_alias (alias_subsidiary, aliasee_subsidiary); | |
| 1610 } | |
| 1611 } | |
| 428 | 1612 /* FSF return value is a vector of [ALIAS-unix ALIAS-dos ALIAS-mac], |
| 1613 but it doesn't look intentional, so I'd rather return something | |
| 1614 meaningful or nothing at all. */ | |
| 1615 return Qnil; | |
| 1616 } | |
| 1617 | |
| 1618 static Lisp_Object | |
| 771 | 1619 subsidiary_coding_system (Lisp_Object coding_system, enum eol_type type) |
| 428 | 1620 { |
| 1621 Lisp_Coding_System *cs = XCODING_SYSTEM (coding_system); | |
| 1622 Lisp_Object new_coding_system; | |
| 1623 | |
| 1624 switch (type) | |
| 1625 { | |
| 1626 case EOL_AUTODETECT: return coding_system; | |
| 1627 case EOL_LF: new_coding_system = CODING_SYSTEM_EOL_LF (cs); break; | |
| 1628 case EOL_CR: new_coding_system = CODING_SYSTEM_EOL_CR (cs); break; | |
| 1629 case EOL_CRLF: new_coding_system = CODING_SYSTEM_EOL_CRLF (cs); break; | |
| 2500 | 1630 default: ABORT (); return Qnil; |
| 428 | 1631 } |
| 1632 | |
| 1633 return NILP (new_coding_system) ? coding_system : new_coding_system; | |
| 1634 } | |
| 1635 | |
| 1636 DEFUN ("subsidiary-coding-system", Fsubsidiary_coding_system, 2, 2, 0, /* | |
| 1637 Return the subsidiary coding system of CODING-SYSTEM with eol type EOL-TYPE. | |
| 771 | 1638 The logically opposite operation is `coding-system-base'. |
| 428 | 1639 */ |
| 1640 (coding_system, eol_type)) | |
| 1641 { | |
| 771 | 1642 coding_system = get_coding_system_for_text_file (coding_system, 0); |
| 428 | 1643 |
| 1644 return subsidiary_coding_system (coding_system, | |
| 1645 symbol_to_eol_type (eol_type)); | |
| 1646 } | |
| 1647 | |
| 771 | 1648 DEFUN ("coding-system-base", Fcoding_system_base, |
| 1649 1, 1, 0, /* | |
| 1650 Return the base coding system of CODING-SYSTEM. | |
| 1651 If CODING-SYSTEM is a subsidiary, this returns its parent; otherwise, it | |
| 1652 returns CODING-SYSTEM. | |
| 1653 The logically opposite operation is `subsidiary-coding-system'. | |
| 1654 */ | |
| 1655 (coding_system)) | |
| 1656 { | |
| 1657 Lisp_Object base; | |
| 1658 | |
| 1659 coding_system = Fget_coding_system (coding_system); | |
| 1660 if (EQ (XCODING_SYSTEM_NAME (coding_system), Qbinary)) | |
| 1661 return Fget_coding_system (Qraw_text); /* hack! */ | |
| 1662 base = XCODING_SYSTEM_SUBSIDIARY_PARENT (coding_system); | |
| 1663 if (!NILP (base)) | |
| 1664 return base; | |
| 1665 return coding_system; | |
| 1666 } | |
| 1667 | |
| 1668 DEFUN ("coding-system-used-for-io", Fcoding_system_used_for_io, | |
| 1669 1, 1, 0, /* | |
| 1670 Return the coding system actually used for I/O. | |
| 1671 In some cases (e.g. when a particular EOL type is specified) this won't be | |
| 2297 | 1672 the coding system itself. This can be useful when trying to determine |
| 1673 precisely how data was decoded. | |
| 771 | 1674 */ |
| 1675 (coding_system)) | |
| 1676 { | |
| 1677 Lisp_Object canon; | |
| 1678 | |
| 1679 coding_system = Fget_coding_system (coding_system); | |
| 1680 canon = XCODING_SYSTEM_CANONICAL (coding_system); | |
| 1681 if (!NILP (canon)) | |
| 1682 return canon; | |
| 1683 return coding_system; | |
| 1684 } | |
| 1685 | |
| 428 | 1686 |
| 1687 /************************************************************************/ | |
| 1688 /* Coding system accessors */ | |
| 1689 /************************************************************************/ | |
| 1690 | |
| 771 | 1691 DEFUN ("coding-system-description", Fcoding_system_description, 1, 1, 0, /* |
| 1692 Return the description for CODING-SYSTEM. | |
| 1693 The `description' of a coding system is a short English phrase giving the | |
| 1694 name rendered according to English punctuation rules, plus possibly some | |
| 1695 explanatory text (typically in the form of a parenthetical phrase). The | |
| 1696 description is intended to be short enough that it can appear as a menu item, | |
| 1697 and clear enough to be recognizable even to someone who is assumed to have | |
| 1698 some basic familiarity with different encodings but may not know all the | |
| 1699 technical names; thus, for `cn-gb-2312' is described as "Chinese EUC" and | |
| 1700 `hz-gb-2312' is described as "Hz/ZW (Chinese)", where the actual name of | |
| 1701 the encoding is given, followed by a note that this is a Chinese encoding, | |
| 1702 because the great majority of people encountering this would have no idea | |
| 1703 what it is, and giving the language indicates whether the encoding should | |
| 1704 just be ignored or (conceivably) investigated more thoroughly. | |
| 428 | 1705 */ |
| 1706 (coding_system)) | |
| 1707 { | |
| 1708 coding_system = Fget_coding_system (coding_system); | |
| 771 | 1709 return XCODING_SYSTEM_DESCRIPTION (coding_system); |
| 428 | 1710 } |
| 1711 | |
| 1712 DEFUN ("coding-system-type", Fcoding_system_type, 1, 1, 0, /* | |
| 1713 Return the type of CODING-SYSTEM. | |
| 1714 */ | |
| 1715 (coding_system)) | |
| 1716 { | |
| 771 | 1717 coding_system = Fget_coding_system (coding_system); |
| 1718 return XCODING_SYSTEM_TYPE (coding_system); | |
| 428 | 1719 } |
| 1720 | |
| 1721 DEFUN ("coding-system-property", Fcoding_system_property, 2, 2, 0, /* | |
| 1722 Return the PROP property of CODING-SYSTEM. | |
| 1723 */ | |
| 1724 (coding_system, prop)) | |
| 1725 { | |
| 1726 coding_system = Fget_coding_system (coding_system); | |
| 1727 CHECK_SYMBOL (prop); | |
| 1728 | |
| 1729 if (EQ (prop, Qname)) | |
| 1730 return XCODING_SYSTEM_NAME (coding_system); | |
| 1731 else if (EQ (prop, Qtype)) | |
| 1732 return Fcoding_system_type (coding_system); | |
| 771 | 1733 else if (EQ (prop, Qdescription)) |
| 1734 return XCODING_SYSTEM_DESCRIPTION (coding_system); | |
| 428 | 1735 else if (EQ (prop, Qmnemonic)) |
| 1736 return XCODING_SYSTEM_MNEMONIC (coding_system); | |
| 771 | 1737 else if (EQ (prop, Qdocumentation)) |
| 1738 return XCODING_SYSTEM_DOCUMENTATION (coding_system); | |
| 428 | 1739 else if (EQ (prop, Qeol_type)) |
| 771 | 1740 return eol_type_to_symbol (XCODING_SYSTEM_EOL_TYPE |
| 1741 (coding_system)); | |
| 428 | 1742 else if (EQ (prop, Qeol_lf)) |
| 1743 return XCODING_SYSTEM_EOL_LF (coding_system); | |
| 1744 else if (EQ (prop, Qeol_crlf)) | |
| 1745 return XCODING_SYSTEM_EOL_CRLF (coding_system); | |
| 1746 else if (EQ (prop, Qeol_cr)) | |
| 1747 return XCODING_SYSTEM_EOL_CR (coding_system); | |
| 1748 else if (EQ (prop, Qpost_read_conversion)) | |
| 1749 return XCODING_SYSTEM_POST_READ_CONVERSION (coding_system); | |
| 1750 else if (EQ (prop, Qpre_write_conversion)) | |
| 1751 return XCODING_SYSTEM_PRE_WRITE_CONVERSION (coding_system); | |
|
4568
1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4303
diff
changeset
|
1752 else if (EQ (prop, Qsafe_charsets)) |
|
1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4303
diff
changeset
|
1753 return XCODING_SYSTEM_SAFE_CHARSETS (coding_system); |
|
1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4303
diff
changeset
|
1754 else if (EQ (prop, Qsafe_chars)) |
|
1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4303
diff
changeset
|
1755 return XCODING_SYSTEM_SAFE_CHARS (coding_system); |
| 771 | 1756 else |
| 1757 { | |
| 1758 Lisp_Object value = CODESYSMETH_OR_GIVEN (XCODING_SYSTEM (coding_system), | |
| 1759 getprop, | |
| 1760 (coding_system, prop), | |
| 1761 Qunbound); | |
| 1762 if (UNBOUNDP (value)) | |
| 1763 invalid_constant ("Unrecognized property", prop); | |
| 1764 return value; | |
| 1765 } | |
| 1766 } | |
| 1767 | |
| 1768 | |
| 1769 /************************************************************************/ | |
| 1770 /* Coding stream functions */ | |
| 1771 /************************************************************************/ | |
| 1772 | |
| 1773 /* A coding stream is a stream used for encoding or decoding text. The | |
| 1774 coding-stream object keeps track of the actual coding system, the stream | |
| 1775 that is at the other end, and data that needs to be persistent across | |
| 1776 the lifetime of the stream. */ | |
| 1777 | |
| 1204 | 1778 extern const struct sized_memory_description chain_coding_stream_description; |
| 1779 extern const struct sized_memory_description undecided_coding_stream_description; | |
| 1780 | |
| 1781 static const struct memory_description coding_stream_data_description_1 []= { | |
| 2551 | 1782 { XD_BLOCK_PTR, chain_coding_system, 1, |
| 1783 { &chain_coding_stream_description } }, | |
| 1784 { XD_BLOCK_PTR, undecided_coding_system, 1, | |
| 1785 { &undecided_coding_stream_description } }, | |
| 1204 | 1786 { XD_END } |
| 1787 }; | |
| 1788 | |
| 1789 static const struct sized_memory_description coding_stream_data_description = { | |
| 1790 sizeof (void *), coding_stream_data_description_1 | |
| 1791 }; | |
| 1792 | |
| 1793 static const struct memory_description coding_lstream_description[] = { | |
| 1794 { XD_INT, offsetof (struct coding_stream, type) }, | |
| 1795 { XD_LISP_OBJECT, offsetof (struct coding_stream, orig_codesys) }, | |
| 1796 { XD_LISP_OBJECT, offsetof (struct coding_stream, codesys) }, | |
| 1797 { XD_LISP_OBJECT, offsetof (struct coding_stream, other_end) }, | |
| 1798 { XD_UNION, offsetof (struct coding_stream, data), | |
| 2551 | 1799 XD_INDIRECT (0, 0), { &coding_stream_data_description } }, |
| 1204 | 1800 { XD_END } |
| 1801 }; | |
| 1802 | |
| 1803 DEFINE_LSTREAM_IMPLEMENTATION_WITH_DATA ("coding", coding); | |
| 771 | 1804 |
| 1805 /* Encoding and decoding are parallel operations, so we create just one | |
| 1806 stream for both. "Decoding" may involve the extra step of autodetection | |
| 1807 of the data format, but that's only because of the conventional | |
| 1808 definition of decoding as converting from external- to | |
| 1809 internal-formatted data. | |
| 1810 | |
| 2297 | 1811 [[ REWRITE ME! ]] |
| 1812 | |
| 771 | 1813 #### We really need to abstract out the concept of "data formats" and |
| 1814 define "converters" that convert from and to specified formats, | |
| 1815 eliminating the idea of decoding and encoding. When specifying a | |
| 1816 conversion process, we need to give the data formats themselves, not the | |
| 1817 conversion processes -- e.g. a coding system called "Unicode->multibyte" | |
| 1818 converts in both directions, and we could auto-detect the format of data | |
| 1819 at either end. */ | |
| 1820 | |
| 1821 static Bytecount | |
| 1822 coding_reader (Lstream *stream, unsigned char *data, Bytecount size) | |
| 1823 { | |
| 1824 unsigned char *orig_data = data; | |
| 1825 Bytecount read_size; | |
| 1826 int error_occurred = 0; | |
| 1827 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
| 1828 | |
| 1829 /* We need to interface to coding_{de,en}code_1(), which expects to take | |
| 1830 some amount of data and store the result into a Dynarr. We have | |
| 1831 coding_{de,en}code_1() store into c->runoff, and take data from there | |
| 1832 as necessary. */ | |
| 1833 | |
| 1834 /* We loop until we have enough data, reading chunks from the other | |
| 1835 end and converting it. */ | |
| 1836 while (1) | |
| 1837 { | |
| 1838 /* Take data from convert_to if we can. Make sure to take at | |
| 1839 most SIZE bytes, and delete the data from convert_to. */ | |
| 1840 if (Dynarr_length (str->convert_to) > 0) | |
| 1841 { | |
| 1842 Bytecount chunk = | |
| 1843 min (size, (Bytecount) Dynarr_length (str->convert_to)); | |
| 1844 memcpy (data, Dynarr_atp (str->convert_to, 0), chunk); | |
| 1845 Dynarr_delete_many (str->convert_to, 0, chunk); | |
| 1846 data += chunk; | |
| 1847 size -= chunk; | |
| 1848 } | |
| 1849 | |
| 1850 if (size == 0) | |
| 1851 break; /* No more room for data */ | |
| 1852 | |
| 1853 if (str->eof) | |
| 1854 break; | |
| 1855 | |
| 1856 { | |
| 1857 /* Exhausted convert_to, so get some more. Read into convert_from, | |
| 1858 after existing "rejected" data from the last conversion. */ | |
| 1859 Bytecount rejected = Dynarr_length (str->convert_from); | |
| 1860 /* #### 1024 is arbitrary; we really need to separate 0 from EOF, | |
| 1861 and when we get 0, keep taking more data until we don't get 0 -- | |
| 1862 we don't know how much data the conversion routine might need | |
| 2297 | 1863 before it can generate any data of its own (eg, bzip2). */ |
| 814 | 1864 Bytecount readmore = |
| 1865 str->one_byte_at_a_time ? (Bytecount) 1 : | |
| 1866 max (size, (Bytecount) 1024); | |
| 771 | 1867 |
| 1868 Dynarr_add_many (str->convert_from, 0, readmore); | |
| 1869 read_size = Lstream_read (str->other_end, | |
| 1870 Dynarr_atp (str->convert_from, rejected), | |
| 1871 readmore); | |
| 1872 /* Trim size down to how much we actually got */ | |
| 1873 Dynarr_set_size (str->convert_from, rejected + max (0, read_size)); | |
| 1874 } | |
| 1875 | |
| 1876 if (read_size < 0) /* LSTREAM_ERROR */ | |
| 1877 { | |
| 1878 error_occurred = 1; | |
| 1879 break; | |
| 1880 } | |
| 1881 if (read_size == 0) /* LSTREAM_EOF */ | |
| 1882 /* There might be some more end data produced in the translation, | |
| 1883 so we set a flag and call the conversion method once more to | |
| 1884 output any final stuff it may be holding, any "go back to a sane | |
| 1885 state" escape sequences, etc. The conversion method is free to | |
| 1886 look at this flag, and we use it above to stop looping. */ | |
| 1887 str->eof = 1; | |
| 1888 { | |
| 1889 Bytecount processed; | |
| 1890 Bytecount to_process = Dynarr_length (str->convert_from); | |
| 1891 | |
| 1892 /* Convert the data, and save any rejected data in convert_from */ | |
| 1893 processed = | |
| 1894 XCODESYSMETH (str->codesys, convert, | |
| 1895 (str, Dynarr_atp (str->convert_from, 0), | |
| 1896 str->convert_to, to_process)); | |
| 1897 if (processed < 0) | |
| 1898 { | |
| 1899 error_occurred = 1; | |
| 1900 break; | |
| 1901 } | |
| 1902 assert (processed <= to_process); | |
| 1903 if (processed < to_process) | |
| 1904 memmove (Dynarr_atp (str->convert_from, 0), | |
| 1905 Dynarr_atp (str->convert_from, processed), | |
| 1906 to_process - processed); | |
| 1907 Dynarr_set_size (str->convert_from, to_process - processed); | |
| 1908 } | |
| 1909 } | |
| 1910 | |
| 1911 if (data - orig_data == 0) | |
| 1912 return error_occurred ? -1 : 0; | |
| 1913 else | |
| 1914 return data - orig_data; | |
| 1915 } | |
| 1916 | |
| 1917 static Bytecount | |
| 1918 coding_writer (Lstream *stream, const unsigned char *data, Bytecount size) | |
| 1919 { | |
| 1920 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
| 1921 | |
| 1922 /* Convert all our data into convert_to, and then attempt to write | |
| 1923 it all out to the other end. */ | |
| 1924 Dynarr_reset (str->convert_to); | |
| 1925 size = XCODESYSMETH (str->codesys, convert, | |
| 1926 (str, data, str->convert_to, size)); | |
| 1927 if (Lstream_write (str->other_end, Dynarr_atp (str->convert_to, 0), | |
| 1928 Dynarr_length (str->convert_to)) < 0) | |
| 1929 return -1; | |
| 1930 else | |
| 1931 /* The return value indicates how much of the incoming data was | |
| 1932 processed, not how many bytes were written. */ | |
| 1933 return size; | |
| 1934 } | |
| 1935 | |
| 1936 static int | |
| 1937 encode_decode_source_sink_type_is_char (Lisp_Object cs, | |
| 1938 enum source_or_sink sex, | |
| 1939 enum encode_decode direction) | |
| 1940 { | |
| 1941 return (direction == CODING_DECODE ? | |
| 1942 decoding_source_sink_type_is_char (cs, sex) : | |
| 1943 encoding_source_sink_type_is_char (cs, sex)); | |
| 1944 } | |
| 1945 | |
| 1946 /* Ensure that the convert methods only get full characters sent to them to | |
| 1947 convert if the source of that conversion is characters; and that no such | |
| 1948 full-character checking happens when the source is bytes. Keep in mind | |
| 1949 that (1) the conversion_end_type return values take the perspective of | |
| 1950 encoding; (2) the source for decoding is the same as the sink for | |
| 1951 encoding; (3) when writing, the data is given to us, and we set our own | |
| 1952 stream to be character mode or not; (4) when reading, the data comes | |
| 1953 from the other_end stream, and we set that one to be character mode or | |
| 1954 not. This is consistent with the comment above the prototype for | |
| 1955 Lstream_set_character_mode(), which lays out rules for who is allowed to | |
| 1956 modify the character type mode on a stream. | |
| 1957 | |
| 814 | 1958 If we're a read stream, we're always setting character mode on the |
| 1959 source, but we also set it on ourselves consistent with the flag that | |
| 1960 can disable this (see again the comment above | |
| 1961 Lstream_set_character_mode()). | |
| 1962 */ | |
| 771 | 1963 |
| 1964 static void | |
| 1965 set_coding_character_mode (Lstream *stream) | |
| 1966 { | |
| 1967 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
| 1968 Lstream *stream_to_set = | |
| 1969 stream->flags & LSTREAM_FL_WRITE ? stream : str->other_end; | |
| 1970 if (encode_decode_source_sink_type_is_char | |
| 1971 (str->codesys, CODING_SOURCE, str->direction)) | |
| 1972 Lstream_set_character_mode (stream_to_set); | |
| 1973 else | |
| 1974 Lstream_unset_character_mode (stream_to_set); | |
| 814 | 1975 if (str->set_char_mode_on_us_when_reading && |
| 1976 (stream->flags & LSTREAM_FL_READ)) | |
| 1977 { | |
| 1978 if (encode_decode_source_sink_type_is_char | |
| 1979 (str->codesys, CODING_SINK, str->direction)) | |
| 1980 Lstream_set_character_mode (stream); | |
| 1981 else | |
| 1982 Lstream_unset_character_mode (stream); | |
| 1983 } | |
| 771 | 1984 } |
| 1985 | |
| 1986 static Lisp_Object | |
| 1987 coding_marker (Lisp_Object stream) | |
| 1988 { | |
| 1989 struct coding_stream *str = CODING_STREAM_DATA (XLSTREAM (stream)); | |
| 1990 | |
| 1991 mark_object (str->orig_codesys); | |
| 1992 mark_object (str->codesys); | |
| 1993 MAYBE_XCODESYSMETH (str->codesys, mark_coding_stream, (str)); | |
| 1994 return wrap_lstream (str->other_end); | |
| 1995 } | |
| 1996 | |
| 1997 static int | |
| 1998 coding_rewinder (Lstream *stream) | |
| 1999 { | |
| 2000 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
| 2001 MAYBE_XCODESYSMETH (str->codesys, rewind_coding_stream, (str)); | |
| 2002 | |
| 2003 str->ch = 0; | |
| 2004 Dynarr_reset (str->convert_to); | |
| 2005 Dynarr_reset (str->convert_from); | |
| 2006 return Lstream_rewind (str->other_end); | |
| 2007 } | |
| 2008 | |
| 2009 static int | |
| 2010 coding_seekable_p (Lstream *stream) | |
| 2011 { | |
| 2012 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
| 2013 return Lstream_seekable_p (str->other_end); | |
| 2014 } | |
| 2015 | |
| 2016 static int | |
| 2017 coding_flusher (Lstream *stream) | |
| 2018 { | |
| 2019 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
| 2020 return Lstream_flush (str->other_end); | |
| 2021 } | |
| 2022 | |
| 2023 static int | |
| 2024 coding_closer (Lstream *stream) | |
| 2025 { | |
| 2026 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
| 2027 if (stream->flags & LSTREAM_FL_WRITE) | |
| 2028 { | |
| 2029 str->eof = 1; | |
| 2030 coding_writer (stream, 0, 0); | |
| 2031 str->eof = 0; | |
| 2032 } | |
| 2033 /* It's safe to free the runoff dynarrs now because they are used only | |
| 2034 during conversion. We need to keep the type-specific data around, | |
| 2035 though, because of canonicalize_after_coding. */ | |
| 2036 if (str->convert_to) | |
| 2037 { | |
| 2038 Dynarr_free (str->convert_to); | |
| 2039 str->convert_to = 0; | |
| 2040 } | |
| 2041 if (str->convert_from) | |
| 428 | 2042 { |
| 771 | 2043 Dynarr_free (str->convert_from); |
| 2044 str->convert_from = 0; | |
| 2045 } | |
| 2046 | |
| 800 | 2047 if (str->no_close_other) |
| 2048 return Lstream_flush (str->other_end); | |
| 2049 else | |
| 2050 return Lstream_close (str->other_end); | |
| 771 | 2051 } |
| 2052 | |
| 2053 static void | |
| 2054 coding_finalizer (Lstream *stream) | |
| 2055 { | |
| 2056 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
| 2057 | |
| 2058 assert (!str->finalized); | |
| 2059 MAYBE_XCODESYSMETH (str->codesys, finalize_coding_stream, (str)); | |
| 2060 if (str->data) | |
| 2061 { | |
| 1726 | 2062 xfree (str->data, void *); |
| 771 | 2063 str->data = 0; |
| 2064 } | |
| 2065 str->finalized = 1; | |
| 2066 } | |
| 2067 | |
| 2068 static Lisp_Object | |
| 2069 coding_stream_canonicalize_after_coding (Lstream *stream) | |
| 2070 { | |
| 2071 struct coding_stream *str = CODING_STREAM_DATA (stream); | |
| 2072 | |
| 2073 return XCODESYSMETH_OR_GIVEN (str->codesys, canonicalize_after_coding, | |
| 2074 (str), str->codesys); | |
| 2075 } | |
| 2076 | |
| 2077 Lisp_Object | |
| 2078 coding_stream_detected_coding_system (Lstream *stream) | |
| 2079 { | |
| 2080 Lisp_Object codesys = | |
| 2081 coding_stream_canonicalize_after_coding (stream); | |
| 2082 if (NILP (codesys)) | |
| 2083 return Fget_coding_system (Qidentity); | |
| 2084 return codesys; | |
| 2085 } | |
| 2086 | |
| 2087 Lisp_Object | |
| 2088 coding_stream_coding_system (Lstream *stream) | |
| 2089 { | |
| 2090 return CODING_STREAM_DATA (stream)->codesys; | |
| 2091 } | |
| 2092 | |
| 2093 /* Change the coding system associated with a stream. */ | |
| 2094 | |
| 2095 void | |
| 2096 set_coding_stream_coding_system (Lstream *lstr, Lisp_Object codesys) | |
| 2097 { | |
| 2098 struct coding_stream *str = CODING_STREAM_DATA (lstr); | |
| 2099 if (EQ (str->orig_codesys, codesys)) | |
| 2100 return; | |
| 2101 /* We do the equivalent of closing the stream, destroying it, and | |
| 2102 reinitializing it. This includes flushing out the data and signalling | |
| 2103 EOF, if we're a writing stream; we also replace the type-specific data | |
| 2104 with the data appropriate for the new coding system. */ | |
| 2105 if (!NILP (str->codesys)) | |
| 2106 { | |
| 2107 if (lstr->flags & LSTREAM_FL_WRITE) | |
| 2108 { | |
| 2109 Lstream_flush (lstr); | |
| 2110 str->eof = 1; | |
| 2111 coding_writer (lstr, 0, 0); | |
| 2112 str->eof = 0; | |
| 2113 } | |
| 2114 MAYBE_XCODESYSMETH (str->codesys, finalize_coding_stream, (str)); | |
| 2115 } | |
| 2116 str->orig_codesys = codesys; | |
| 2117 str->codesys = coding_system_real_canonical (codesys); | |
| 2118 | |
| 2119 if (str->data) | |
| 2120 { | |
| 1726 | 2121 xfree (str->data, void *); |
| 771 | 2122 str->data = 0; |
| 428 | 2123 } |
| 771 | 2124 if (XCODING_SYSTEM_METHODS (str->codesys)->coding_data_size) |
| 1204 | 2125 { |
| 2126 str->data = | |
| 2127 xmalloc_and_zero (XCODING_SYSTEM_METHODS (str->codesys)-> | |
| 2128 coding_data_size); | |
| 2129 str->type = XCODING_SYSTEM_METHODS (str->codesys)->enumtype; | |
| 2130 } | |
| 771 | 2131 MAYBE_XCODESYSMETH (str->codesys, init_coding_stream, (str)); |
| 2132 /* The new coding system may have different ideas regarding whether its | |
| 2133 ends are characters or bytes. */ | |
| 2134 set_coding_character_mode (lstr); | |
| 2135 } | |
| 2136 | |
| 2137 /* WARNING WARNING WARNING WARNING!!!!! If you open up a coding | |
| 2138 stream for writing, no automatic code detection will be performed. | |
| 2139 The reason for this is that automatic code detection requires a | |
| 2140 seekable input. Things will also fail if you open a coding | |
| 2141 stream for reading using a non-fully-specified coding system and | |
| 2142 a non-seekable input stream. */ | |
| 2143 | |
| 2144 static Lisp_Object | |
| 2145 make_coding_stream_1 (Lstream *stream, Lisp_Object codesys, | |
| 800 | 2146 const char *mode, enum encode_decode direction, |
| 802 | 2147 int flags) |
| 771 | 2148 { |
| 2149 Lstream *lstr = Lstream_new (lstream_coding, mode); | |
| 2150 struct coding_stream *str = CODING_STREAM_DATA (lstr); | |
| 2151 | |
| 2152 codesys = Fget_coding_system (codesys); | |
| 2153 xzero (*str); | |
| 2154 str->codesys = Qnil; | |
| 2155 str->orig_codesys = Qnil; | |
| 2156 str->us = lstr; | |
| 2157 str->other_end = stream; | |
| 2158 str->convert_to = Dynarr_new (unsigned_char); | |
| 2159 str->convert_from = Dynarr_new (unsigned_char); | |
| 2160 str->direction = direction; | |
| 814 | 2161 if (flags & LSTREAM_FL_NO_CLOSE_OTHER) |
| 802 | 2162 str->no_close_other = 1; |
| 814 | 2163 if (flags & LSTREAM_FL_READ_ONE_BYTE_AT_A_TIME) |
| 802 | 2164 str->one_byte_at_a_time = 1; |
| 814 | 2165 if (!(flags & LSTREAM_FL_NO_INIT_CHAR_MODE_WHEN_READING)) |
| 2166 str->set_char_mode_on_us_when_reading = 1; | |
| 802 | 2167 |
| 771 | 2168 set_coding_stream_coding_system (lstr, codesys); |
| 793 | 2169 return wrap_lstream (lstr); |
| 771 | 2170 } |
| 2171 | |
| 814 | 2172 /* FLAGS: |
| 2173 | |
| 2174 LSTREAM_FL_NO_CLOSE_OTHER | |
| 2175 Don't close STREAM (the stream at the other end) when this stream is | |
| 2176 closed. | |
| 2177 | |
| 2178 LSTREAM_FL_READ_ONE_BYTE_AT_A_TIME | |
| 2179 When reading from STREAM, read and process one byte at a time rather | |
| 2180 than in large chunks. This is for reading from TTY's, so we don't | |
| 2181 block. #### We should instead create a non-blocking filedesc stream | |
| 2182 that emulates the behavior as necessary using select(), when the | |
| 2183 fcntls don't work. (As seems to be the case on Cygwin.) | |
| 2184 | |
| 2185 LSTREAM_FL_NO_INIT_CHAR_MODE_WHEN_READING | |
| 2186 When reading from STREAM, read and process one byte at a time rather | |
| 2187 than in large chunks. This is for reading from TTY's, so we don't | |
| 2188 block. #### We should instead create a non-blocking filedesc stream | |
| 2189 that emulates the behavior as necessary using select(), when the | |
| 2190 fcntls don't work. (As seems to be the case on Cygwin.) | |
| 2191 */ | |
| 771 | 2192 Lisp_Object |
| 2193 make_coding_input_stream (Lstream *stream, Lisp_Object codesys, | |
| 802 | 2194 enum encode_decode direction, int flags) |
| 771 | 2195 { |
| 800 | 2196 return make_coding_stream_1 (stream, codesys, "r", direction, |
| 802 | 2197 flags); |
| 771 | 2198 } |
| 2199 | |
| 814 | 2200 /* FLAGS: |
| 2201 | |
| 2202 LSTREAM_FL_NO_CLOSE_OTHER | |
| 2203 Don't close STREAM (the stream at the other end) when this stream is | |
| 2204 closed. | |
| 2205 */ | |
| 771 | 2206 Lisp_Object |
| 2207 make_coding_output_stream (Lstream *stream, Lisp_Object codesys, | |
| 802 | 2208 enum encode_decode direction, int flags) |
| 771 | 2209 { |
| 800 | 2210 return make_coding_stream_1 (stream, codesys, "w", direction, |
| 802 | 2211 flags); |
| 771 | 2212 } |
| 2213 | |
| 2214 static Lisp_Object | |
| 2215 encode_decode_coding_region (Lisp_Object start, Lisp_Object end, | |
| 2216 Lisp_Object coding_system, Lisp_Object buffer, | |
| 2217 enum encode_decode direction) | |
| 2218 { | |
| 2219 Charbpos b, e; | |
| 2220 struct buffer *buf = decode_buffer (buffer, 0); | |
| 2221 Lisp_Object instream = Qnil, to_outstream = Qnil, outstream = Qnil; | |
| 2222 Lisp_Object from_outstream = Qnil, auto_outstream = Qnil; | |
| 2223 Lisp_Object lb_outstream = Qnil; | |
| 2224 Lisp_Object next; | |
| 2225 Lstream *istr, *ostr; | |
| 2226 struct gcpro gcpro1, gcpro2, gcpro3, gcpro4, gcpro5; | |
| 2227 struct gcpro ngcpro1; | |
| 2228 int source_char, sink_char; | |
| 2229 | |
| 2230 get_buffer_range_char (buf, start, end, &b, &e, 0); | |
| 2231 barf_if_buffer_read_only (buf, b, e); | |
| 2232 | |
| 2233 GCPRO5 (instream, to_outstream, outstream, from_outstream, lb_outstream); | |
| 2234 NGCPRO1 (auto_outstream); | |
| 2235 | |
| 2236 coding_system = Fget_coding_system (coding_system); | |
| 2237 source_char = encode_decode_source_sink_type_is_char (coding_system, | |
| 2238 CODING_SOURCE, | |
| 2239 direction); | |
| 2240 sink_char = encode_decode_source_sink_type_is_char (coding_system, | |
| 2241 CODING_SINK, | |
| 2242 direction); | |
| 2243 | |
| 2244 /* Order is IN <---> [TO] -> OUT -> [FROM] -> [AUTODETECT-EOL] -> LB */ | |
| 2245 instream = make_lisp_buffer_input_stream (buf, b, e, 0); | |
| 2246 next = lb_outstream = make_lisp_buffer_output_stream (buf, b, 0); | |
| 2247 | |
| 2248 if (direction == CODING_DECODE && | |
| 2249 XCODING_SYSTEM_EOL_TYPE (coding_system) == EOL_AUTODETECT) | |
| 2250 next = auto_outstream = | |
| 2251 make_coding_output_stream | |
| 800 | 2252 (XLSTREAM (next), Fget_coding_system (Qconvert_eol_autodetect), |
| 2253 CODING_DECODE, 0); | |
| 771 | 2254 |
| 2255 if (!sink_char) | |
| 2256 next = from_outstream = | |
| 800 | 2257 make_coding_output_stream (XLSTREAM (next), Qbinary, CODING_DECODE, 0); |
| 771 | 2258 outstream = make_coding_output_stream (XLSTREAM (next), coding_system, |
| 800 | 2259 direction, 0); |
| 771 | 2260 if (!source_char) |
| 428 | 2261 { |
| 771 | 2262 to_outstream = |
| 2263 make_coding_output_stream (XLSTREAM (outstream), | |
| 800 | 2264 Qbinary, CODING_ENCODE, 0); |
| 771 | 2265 ostr = XLSTREAM (to_outstream); |
| 2266 } | |
| 2267 else | |
| 2268 ostr = XLSTREAM (outstream); | |
| 2269 istr = XLSTREAM (instream); | |
| 2270 | |
| 2271 /* The chain of streams looks like this: | |
| 2272 | |
| 2297 | 2273 [BUFFER] <----- (( read from/send to loop )) |
| 771 | 2274 ------> [CHAR->BYTE i.e. ENCODE AS BINARY if source is |
| 2275 in bytes] | |
| 2276 ------> [ENCODE/DECODE AS SPECIFIED] | |
| 2277 ------> [BYTE->CHAR i.e. DECODE AS BINARY | |
| 2278 if sink is in bytes] | |
| 2279 ------> [AUTODETECT EOL if | |
| 2280 we're decoding and | |
| 2281 coding system calls | |
| 2282 for this] | |
| 2283 ------> [BUFFER] | |
| 2284 */ | |
| 2367 | 2285 |
| 2286 /* #### See comment | |
| 2287 | |
| 2288 EFFICIENCY OF CODING CONVERSION WITH MULTIPLE COPIES/CHAINS | |
| 2289 | |
| 2290 in text.c. | |
| 2291 */ | |
| 2292 | |
| 771 | 2293 while (1) |
| 2294 { | |
| 2295 char tempbuf[1024]; /* some random amount */ | |
| 2296 Charbpos newpos, even_newer_pos; | |
| 2297 Charbpos oldpos = lisp_buffer_stream_startpos (istr); | |
| 2298 Bytecount size_in_bytes = | |
| 2299 Lstream_read (istr, tempbuf, sizeof (tempbuf)); | |
| 2300 | |
| 2301 if (!size_in_bytes) | |
| 2302 break; | |
| 2303 newpos = lisp_buffer_stream_startpos (istr); | |
| 2304 Lstream_write (ostr, tempbuf, size_in_bytes); | |
| 2305 even_newer_pos = lisp_buffer_stream_startpos (istr); | |
| 2306 buffer_delete_range (buf, even_newer_pos - (newpos - oldpos), | |
| 2307 even_newer_pos, 0); | |
| 428 | 2308 } |
| 771 | 2309 |
| 2310 { | |
| 2311 Charcount retlen = | |
| 2312 lisp_buffer_stream_startpos (XLSTREAM (instream)) - b; | |
| 2313 Lstream_close (istr); | |
| 2314 Lstream_close (ostr); | |
| 2315 NUNGCPRO; | |
| 2316 UNGCPRO; | |
| 2317 Lstream_delete (istr); | |
| 2318 if (!NILP (from_outstream)) | |
| 2319 Lstream_delete (XLSTREAM (from_outstream)); | |
| 2320 Lstream_delete (XLSTREAM (outstream)); | |
| 2321 if (!NILP (to_outstream)) | |
| 2322 Lstream_delete (XLSTREAM (to_outstream)); | |
| 2323 if (!NILP (auto_outstream)) | |
| 2324 Lstream_delete (XLSTREAM (auto_outstream)); | |
| 2325 Lstream_delete (XLSTREAM (lb_outstream)); | |
| 2326 return make_int (retlen); | |
| 2327 } | |
| 2328 } | |
| 2329 | |
| 3302 | 2330 DEFUN ("decode-coding-region", Fdecode_coding_region, 3, 4, |
| 2331 "*r\nzDecode from coding system: \ni", /* | |
| 771 | 2332 Decode the text between START and END which is encoded in CODING-SYSTEM. |
| 2333 This is useful if you've read in encoded text from a file without decoding | |
| 2334 it (e.g. you read in a JIS-formatted file but used the `binary' or | |
| 2335 `no-conversion' coding system, so that it shows up as "^[$B!<!+^[(B"). | |
| 2336 Return length of decoded text. | |
| 3302 | 2337 BUFFER defaults to the current buffer if unspecified, and when interactive. |
| 771 | 2338 */ |
| 2339 (start, end, coding_system, buffer)) | |
| 2340 { | |
| 2341 return encode_decode_coding_region (start, end, coding_system, buffer, | |
| 2342 CODING_DECODE); | |
| 2343 } | |
| 2344 | |
| 3302 | 2345 DEFUN ("encode-coding-region", Fencode_coding_region, 3, 4, |
| 2346 "*r\nzEncode to coding system: \ni", /* | |
| 771 | 2347 Encode the text between START and END using CODING-SYSTEM. |
| 2348 This will, for example, convert Japanese characters into stuff such as | |
| 3302 | 2349 "^[$B!<!+^[(B" if you use the JIS encoding. Return length of encoded text. |
| 2350 BUFFER defaults to the current buffer if unspecified, and when interactive. | |
| 771 | 2351 */ |
| 2352 (start, end, coding_system, buffer)) | |
| 2353 { | |
| 2354 return encode_decode_coding_region (start, end, coding_system, buffer, | |
| 2355 CODING_ENCODE); | |
| 428 | 2356 } |
| 2357 | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2358 DEFUN ("query-coding-region", Fquery_coding_region, 3, 7, 0, /* |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2359 Work out whether CODING-SYSTEM can losslessly encode a region. |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2360 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2361 START and END are the beginning and end of the region to check. |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2362 CODING-SYSTEM is the coding system to try. |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2363 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2364 Optional argument BUFFER is the buffer to check, and defaults to the current |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2365 buffer. |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2366 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2367 IGNORE-INVALID-SEQUENCESP, also an optional argument, says to treat XEmacs |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2368 characters which have an unambiguous encoded representation, despite being |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2369 undefined in what they represent, as encodable. These chiefly arise with |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2370 variable-length encodings like UTF-8 and UTF-16, where an invalid sequence |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2371 is passed through to XEmacs as a sequence of characters with a defined |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2372 correspondence to the octets on disk, but no non-error semantics; see the |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2373 `invalid-sequence-coding-system' argument to `set-language-info'. |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2374 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2375 They can also arise with fixed-length encodings like ISO 8859-7, where |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2376 certain octets on disk have undefined values, and treating them as |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2377 corresponding to the ISO 8859-1 characters with the same numerical values |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2378 may lead to data that is not understood by other applications. |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2379 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2380 Optional argument ERRORP says to signal a `text-conversion-error' if some |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2381 character in the region cannot be encoded, and defaults to nil. |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2382 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2383 Optional argument HIGHLIGHT says to display unencodable characters in the |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2384 region using `query-coding-warning-face'. It defaults to nil. |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2385 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2386 This function can return multiple values; the intention is that callers use |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2387 `multiple-value-bind' or the related CL multiple value functions to deal |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2388 with it. The first result is `t' if the region can be encoded using |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2389 CODING-SYSTEM, or `nil' if not. If the region cannot be encoded using |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2390 CODING-SYSTEM, the second result is a range table describing the positions |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2391 of the unencodable characters. |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2392 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2393 Ranges that describe characters that would be ignored were |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2394 IGNORE-INVALID-SEQUENCESP non-nil map to the symbol `invalid-sequence'; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2395 other ranges map to the symbol `unencodable'. If IGNORE-INVALID-SEQUENCESP |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2396 is non-nil, all ranges will map to the symbol `unencodable'. See |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2397 `make-range-table' for more details of range tables. |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2398 */ |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2399 (start, end, coding_system, buffer, ignore_invalid_sequencesp, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2400 errorp, highlight)) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2401 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2402 Charbpos b, e; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2403 struct buffer *buf = decode_buffer (buffer, 1); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2404 Lisp_Object result; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2405 int flags = 0, speccount = specpdl_depth (); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2406 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2407 coding_system = Fget_coding_system (coding_system); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2408 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2409 get_buffer_range_char (buf, start, end, &b, &e, 0); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2410 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2411 if (buf != current_buffer) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2412 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2413 record_unwind_protect (save_current_buffer_restore, Fcurrent_buffer ()); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2414 set_buffer_internal (buf); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2415 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2416 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2417 record_unwind_protect (save_excursion_restore, save_excursion_save ()); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2418 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2419 BUF_SET_PT (buf, b); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2420 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2421 if (!NILP (ignore_invalid_sequencesp)) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2422 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2423 flags |= QUERY_METHOD_IGNORE_INVALID_SEQUENCES; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2424 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2425 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2426 if (!NILP (errorp)) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2427 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2428 flags |= QUERY_METHOD_ERRORP; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2429 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2430 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2431 if (!NILP (highlight)) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2432 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2433 flags |= QUERY_METHOD_HIGHLIGHT; |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2434 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2435 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2436 result = XCODESYSMETH_OR_GIVEN (coding_system, query, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2437 (coding_system, buf, e, flags), Qunbound); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2438 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2439 if (UNBOUNDP (result)) |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2440 { |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2441 signal_error (Qtext_conversion_error, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2442 "Coding system doesn't say what it can encode", |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2443 XCODING_SYSTEM_NAME (coding_system)); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2444 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2445 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2446 result = (NILP (result)) ? Qt : values2 (Qnil, result); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2447 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2448 return unbind_to_1 (speccount, result); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2449 } |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2450 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
2451 |
| 428 | 2452 |
| 2453 /************************************************************************/ | |
| 771 | 2454 /* Chain methods */ |
| 428 | 2455 /************************************************************************/ |
| 2456 | |
| 771 | 2457 /* #### Need a way to create "opposite-direction" coding systems. */ |
| 2458 | |
| 2459 /* Chain two or more coding systems together to make a combination coding | |
| 2460 system. */ | |
| 2461 | |
| 2462 struct chain_coding_system | |
| 2463 { | |
| 2464 /* List of coding systems, in decode order */ | |
| 2465 Lisp_Object *chain; | |
| 2466 /* Number of coding systems in list */ | |
| 2467 int count; | |
| 2468 /* Coding system to return as a result of canonicalize-after-coding */ | |
| 2469 Lisp_Object canonicalize_after_coding; | |
| 2470 }; | |
| 2471 | |
| 2472 struct chain_coding_stream | |
| 2473 { | |
| 2474 int initted; | |
| 2475 /* Lstreams for chain coding system */ | |
| 2476 Lisp_Object *lstreams; | |
| 2477 int lstream_count; | |
| 2478 }; | |
| 2479 | |
| 1204 | 2480 static const struct memory_description chain_coding_system_description[] = { |
| 2481 { XD_INT, offsetof (struct chain_coding_system, count) }, | |
| 2367 | 2482 { XD_BLOCK_PTR, offsetof (struct chain_coding_system, chain), |
| 2551 | 2483 XD_INDIRECT (0, 0), { &lisp_object_description } }, |
| 1204 | 2484 { XD_LISP_OBJECT, offsetof (struct chain_coding_system, |
| 2485 canonicalize_after_coding) }, | |
| 771 | 2486 { XD_END } |
| 2487 }; | |
| 2488 | |
| 1204 | 2489 static const struct memory_description chain_coding_stream_description_1 [] = { |
| 2490 { XD_INT, offsetof (struct chain_coding_stream, lstream_count) }, | |
| 2367 | 2491 { XD_BLOCK_PTR, offsetof (struct chain_coding_stream, lstreams), |
| 2551 | 2492 XD_INDIRECT (0, 0), { &lisp_object_description } }, |
| 771 | 2493 { XD_END } |
| 2494 }; | |
| 2495 | |
| 1204 | 2496 const struct sized_memory_description chain_coding_stream_description = { |
| 2497 sizeof (struct chain_coding_stream), chain_coding_stream_description_1 | |
| 2498 }; | |
| 2499 | |
| 2500 DEFINE_CODING_SYSTEM_TYPE_WITH_DATA (chain); | |
| 2501 | |
| 771 | 2502 static Lisp_Object |
| 2503 chain_canonicalize (Lisp_Object codesys) | |
| 2504 { | |
| 2505 /* We make use of the fact that this method is called at init time, after | |
| 2506 properties have been parsed. init_method is called too early. */ | |
| 2507 /* #### It's not clear we need this whole chain-canonicalize mechanism | |
| 2508 any more. */ | |
| 2509 Lisp_Object chain = Flist (XCODING_SYSTEM_CHAIN_COUNT (codesys), | |
| 2510 XCODING_SYSTEM_CHAIN_CHAIN (codesys)); | |
| 2511 chain = Fcons (XCODING_SYSTEM_PRE_WRITE_CONVERSION (codesys), | |
| 2512 Fcons (XCODING_SYSTEM_POST_READ_CONVERSION (codesys), | |
| 2513 chain)); | |
| 2514 Fputhash (chain, codesys, Vchain_canonicalize_hash_table); | |
| 2515 return codesys; | |
| 2516 } | |
| 2517 | |
| 2518 static Lisp_Object | |
| 2519 chain_canonicalize_after_coding (struct coding_stream *str) | |
| 2520 { | |
| 2521 Lisp_Object cac = | |
| 2522 XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (str->codesys); | |
| 2523 if (!NILP (cac)) | |
| 2524 return cac; | |
| 2525 return str->codesys; | |
| 2526 #if 0 | |
| 2527 struct chain_coding_stream *data = CODING_STREAM_TYPE_DATA (str, chain); | |
| 2528 Lisp_Object us = str->codesys, codesys; | |
| 2529 int i; | |
| 2530 Lisp_Object chain; | |
| 2531 Lisp_Object tail; | |
| 2532 int changed = 0; | |
| 2533 | |
| 2534 /* #### It's not clear we need this whole chain-canonicalize mechanism | |
| 2535 any more. */ | |
| 2536 if (str->direction == CODING_ENCODE || !data->initted) | |
| 2537 return us; | |
| 2538 | |
| 2539 chain = Flist (XCODING_SYSTEM_CHAIN_COUNT (us), | |
| 2540 XCODING_SYSTEM_CHAIN_CHAIN (us)); | |
| 2541 | |
| 2542 tail = chain; | |
| 2543 for (i = 0; i < XCODING_SYSTEM_CHAIN_COUNT (us); i++) | |
| 2544 { | |
| 2545 codesys = (coding_stream_canonicalize_after_coding | |
| 2546 (XLSTREAM (data->lstreams[i]))); | |
| 2547 if (!EQ (codesys, XCAR (tail))) | |
| 2548 changed = 1; | |
| 2549 XCAR (tail) = codesys; | |
| 2550 tail = XCDR (tail); | |
| 2551 } | |
| 2552 | |
| 2553 if (!changed) | |
| 2554 return us; | |
| 2555 | |
| 2556 chain = delq_no_quit (Qnil, chain); | |
| 2557 | |
| 2558 if (NILP (XCODING_SYSTEM_PRE_WRITE_CONVERSION (us)) && | |
| 2559 NILP (XCODING_SYSTEM_POST_READ_CONVERSION (us))) | |
| 2560 { | |
| 2561 if (NILP (chain)) | |
| 2562 return Qnil; | |
| 2563 if (NILP (XCDR (chain))) | |
| 2564 return XCAR (chain); | |
| 2565 } | |
| 2566 | |
| 2567 codesys = Fgethash (Fcons (XCODING_SYSTEM_PRE_WRITE_CONVERSION (us), | |
| 2568 Fcons (XCODING_SYSTEM_POST_READ_CONVERSION (us), | |
| 2569 chain)), Vchain_canonicalize_hash_table, | |
| 2570 Qnil); | |
| 2571 if (!NILP (codesys)) | |
| 2572 return codesys; | |
| 2573 return make_internal_coding_system | |
| 2574 (us, "internal-chain-canonicalizer-wrapper", | |
| 2575 Qchain, Qunbound, list2 (Qchain, chain)); | |
| 2576 #endif /* 0 */ | |
| 2577 } | |
| 2578 | |
| 2579 static void | |
| 2580 chain_init (Lisp_Object codesys) | |
| 2581 { | |
| 2582 XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (codesys) = Qnil; | |
| 2583 } | |
| 2584 | |
| 2585 static void | |
| 2586 chain_mark (Lisp_Object codesys) | |
| 2587 { | |
| 2588 int i; | |
| 2589 | |
| 2590 for (i = 0; i < XCODING_SYSTEM_CHAIN_COUNT (codesys); i++) | |
| 2591 mark_object (XCODING_SYSTEM_CHAIN_CHAIN (codesys)[i]); | |
| 2592 mark_object (XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (codesys)); | |
| 2593 } | |
| 2594 | |
| 2595 static void | |
| 2596 chain_mark_coding_stream_1 (struct chain_coding_stream *data) | |
| 2597 { | |
| 2598 int i; | |
| 2599 | |
| 2600 for (i = 0; i < data->lstream_count; i++) | |
| 2601 mark_object (data->lstreams[i]); | |
| 2602 } | |
| 2603 | |
| 2604 static void | |
| 2605 chain_mark_coding_stream (struct coding_stream *str) | |
| 2606 { | |
| 2607 chain_mark_coding_stream_1 (CODING_STREAM_TYPE_DATA (str, chain)); | |
| 2608 } | |
| 2609 | |
| 2610 static void | |
| 2611 chain_print (Lisp_Object cs, Lisp_Object printcharfun, int escapeflag) | |
| 2612 { | |
| 2613 int i; | |
| 2614 | |
| 826 | 2615 write_c_string (printcharfun, "("); |
| 771 | 2616 for (i = 0; i < XCODING_SYSTEM_CHAIN_COUNT (cs); i++) |
| 2617 { | |
| 826 | 2618 write_c_string (printcharfun, i == 0 ? "" : "->"); |
| 771 | 2619 print_coding_system_in_print_method (XCODING_SYSTEM_CHAIN_CHAIN (cs)[i], |
| 2620 printcharfun, escapeflag); | |
| 2621 } | |
| 2622 { | |
| 2623 Lisp_Object cac = XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (cs); | |
| 2624 if (!NILP (cac)) | |
| 2625 { | |
| 2626 if (i > 0) | |
| 826 | 2627 write_c_string (printcharfun, " "); |
| 2628 write_c_string (printcharfun, "canonicalize-after-coding="); | |
| 771 | 2629 print_coding_system_in_print_method (cac, printcharfun, escapeflag); |
| 2630 } | |
| 2631 } | |
| 2632 | |
| 826 | 2633 write_c_string (printcharfun, ")"); |
| 771 | 2634 } |
| 2635 | |
| 2636 static void | |
| 2637 chain_rewind_coding_stream_1 (struct chain_coding_stream *data) | |
| 2638 { | |
| 2639 /* Each will rewind the next; there is always at least one stream (the | |
| 2640 dynarr stream at the end) if we're initted */ | |
| 2641 if (data->initted) | |
| 2642 Lstream_rewind (XLSTREAM (data->lstreams[0])); | |
| 2643 } | |
| 2644 | |
| 2645 static void | |
| 2646 chain_rewind_coding_stream (struct coding_stream *str) | |
| 2647 { | |
| 2648 chain_rewind_coding_stream_1 (CODING_STREAM_TYPE_DATA (str, chain)); | |
| 2649 } | |
| 2650 | |
| 2651 static void | |
| 2652 chain_init_coding_streams_1 (struct chain_coding_stream *data, | |
| 2653 unsigned_char_dynarr *dst, | |
| 2654 int ncodesys, Lisp_Object *codesys, | |
| 2655 enum encode_decode direction) | |
| 2656 { | |
| 2657 int i; | |
| 2658 Lisp_Object lstream_out; | |
| 2659 | |
| 2660 data->lstream_count = ncodesys + 1; | |
| 2661 data->lstreams = xnew_array (Lisp_Object, data->lstream_count); | |
| 2662 | |
| 2663 lstream_out = make_dynarr_output_stream (dst); | |
| 2664 Lstream_set_buffering (XLSTREAM (lstream_out), LSTREAM_UNBUFFERED, 0); | |
| 2665 data->lstreams[data->lstream_count - 1] = lstream_out; | |
| 2666 | |
| 2667 for (i = ncodesys - 1; i >= 0; i--) | |
| 2668 { | |
| 2669 data->lstreams[i] = | |
| 2670 make_coding_output_stream | |
| 2671 (XLSTREAM (lstream_out), | |
| 2672 codesys[direction == CODING_ENCODE ? ncodesys - (i + 1) : i], | |
| 800 | 2673 direction, 0); |
| 771 | 2674 lstream_out = data->lstreams[i]; |
| 2675 Lstream_set_buffering (XLSTREAM (lstream_out), LSTREAM_UNBUFFERED, | |
| 2676 0); | |
| 2677 } | |
| 2678 data->initted = 1; | |
| 2679 } | |
| 2680 | |
| 2681 static Bytecount | |
| 2682 chain_convert (struct coding_stream *str, const UExtbyte *src, | |
| 2683 unsigned_char_dynarr *dst, Bytecount n) | |
| 2684 { | |
| 2685 struct chain_coding_stream *data = CODING_STREAM_TYPE_DATA (str, chain); | |
| 2686 | |
| 2687 if (str->eof) | |
| 2688 { | |
| 2689 /* Each will close the next; there is always at least one stream (the | |
| 2690 dynarr stream at the end) if we're initted. We need to close now | |
| 2691 because more data may be generated. */ | |
| 2692 if (data->initted) | |
| 2693 Lstream_close (XLSTREAM (data->lstreams[0])); | |
| 2694 return n; | |
| 2695 } | |
| 2696 | |
| 2697 if (!data->initted) | |
| 2698 chain_init_coding_streams_1 | |
| 2699 (data, dst, XCODING_SYSTEM_CHAIN_COUNT (str->codesys), | |
| 2700 XCODING_SYSTEM_CHAIN_CHAIN (str->codesys), str->direction); | |
| 2701 | |
| 2702 if (Lstream_write (XLSTREAM (data->lstreams[0]), src, n) < 0) | |
| 2703 return -1; | |
| 2704 return n; | |
| 2705 } | |
| 2706 | |
| 2707 static void | |
| 2708 chain_finalize_coding_stream_1 (struct chain_coding_stream *data) | |
| 2709 { | |
| 2710 if (data->lstreams) | |
| 2711 { | |
| 2297 | 2712 /* During GC, these objects are unmarked, and are about to be freed. |
| 2713 We do NOT want them on the free list, and that will cause lots of | |
| 2714 nastiness including crashes. Just let them be freed normally. */ | |
| 771 | 2715 if (!gc_in_progress) |
| 2716 { | |
| 2717 int i; | |
| 2297 | 2718 /* Order of deletion is important here! Delete from the head of |
| 2719 the chain and work your way towards the tail. In general, | |
| 2720 when you delete an object, there should be *NO* pointers to it | |
| 2721 anywhere. Deleting back-to-front would be a problem because | |
| 2722 there are pointers going forward. If there were pointers in | |
| 2723 both directions, you'd have to disconnect the pointers to a | |
| 2724 particular object before deleting it. */ | |
| 771 | 2725 for (i = 0; i < data->lstream_count; i++) |
| 2726 Lstream_delete (XLSTREAM ((data->lstreams)[i])); | |
| 2727 } | |
| 1726 | 2728 xfree (data->lstreams, Lisp_Object *); |
| 771 | 2729 } |
| 2730 } | |
| 2731 | |
| 2732 static void | |
| 2733 chain_finalize_coding_stream (struct coding_stream *str) | |
| 2734 { | |
| 2735 chain_finalize_coding_stream_1 (CODING_STREAM_TYPE_DATA (str, chain)); | |
| 2736 } | |
| 2737 | |
| 2738 static void | |
| 2739 chain_finalize (Lisp_Object c) | |
| 2740 { | |
| 2741 if (XCODING_SYSTEM_CHAIN_CHAIN (c)) | |
| 1726 | 2742 xfree (XCODING_SYSTEM_CHAIN_CHAIN (c), Lisp_Object *); |
| 771 | 2743 } |
| 2744 | |
| 428 | 2745 static int |
| 771 | 2746 chain_putprop (Lisp_Object codesys, Lisp_Object key, Lisp_Object value) |
| 2747 { | |
| 2748 if (EQ (key, Qchain)) | |
| 2749 { | |
| 2750 Lisp_Object *cslist; | |
| 2751 int count = 0; | |
| 2752 int i; | |
| 2753 | |
| 2367 | 2754 { |
| 2755 EXTERNAL_LIST_LOOP_2 (elt, value) | |
| 2756 { | |
| 2757 Fget_coding_system (elt); | |
| 2758 count++; | |
| 2759 } | |
| 2760 } | |
| 771 | 2761 |
| 2762 cslist = xnew_array (Lisp_Object, count); | |
| 2763 XCODING_SYSTEM_CHAIN_CHAIN (codesys) = cslist; | |
| 2764 | |
| 2765 count = 0; | |
| 2367 | 2766 { |
| 2767 EXTERNAL_LIST_LOOP_2 (elt, value) | |
| 2768 { | |
| 2769 cslist[count] = Fget_coding_system (elt); | |
| 2770 count++; | |
| 2771 } | |
| 2772 } | |
| 771 | 2773 |
| 2774 XCODING_SYSTEM_CHAIN_COUNT (codesys) = count; | |
| 2775 | |
| 2776 for (i = 0; i < count - 1; i++) | |
| 2777 { | |
| 2778 if (decoding_source_sink_type_is_char (cslist[i], CODING_SINK) != | |
| 2779 decoding_source_sink_type_is_char (cslist[i + 1], CODING_SOURCE)) | |
| 2780 invalid_argument_2 ("Sink of first must match source of second", | |
| 2781 cslist[i], cslist[i + 1]); | |
| 2782 } | |
| 2783 } | |
| 2784 else if (EQ (key, Qcanonicalize_after_coding)) | |
| 2785 XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (codesys) = | |
| 2786 Fget_coding_system (value); | |
| 2787 else | |
| 2788 return 0; | |
| 2789 return 1; | |
| 2790 } | |
| 2791 | |
| 2792 static Lisp_Object | |
| 2793 chain_getprop (Lisp_Object coding_system, Lisp_Object prop) | |
| 2794 { | |
| 2795 if (EQ (prop, Qchain)) | |
| 2796 { | |
| 2797 Lisp_Object result = Qnil; | |
| 2798 int i; | |
| 2799 | |
| 2800 for (i = 0; i < XCODING_SYSTEM_CHAIN_COUNT (coding_system); i++) | |
| 2801 result = Fcons (XCODING_SYSTEM_CHAIN_CHAIN (coding_system)[i], | |
| 2802 result); | |
| 2803 | |
| 2804 return Fnreverse (result); | |
| 2805 } | |
| 2806 else if (EQ (prop, Qcanonicalize_after_coding)) | |
| 2807 return XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (coding_system); | |
| 2808 else | |
| 2809 return Qunbound; | |
| 2810 } | |
| 2811 | |
| 2812 static enum source_sink_type | |
| 2813 chain_conversion_end_type (Lisp_Object codesys) | |
| 2814 { | |
| 2815 Lisp_Object *cslist = XCODING_SYSTEM_CHAIN_CHAIN (codesys); | |
| 2816 int n = XCODING_SYSTEM_CHAIN_COUNT (codesys); | |
| 2817 int charp_source, charp_sink; | |
| 2818 | |
| 2819 if (n == 0) | |
| 2820 return DECODES_BYTE_TO_BYTE; /* arbitrary */ | |
| 2821 charp_source = decoding_source_sink_type_is_char (cslist[0], CODING_SOURCE); | |
| 2822 charp_sink = decoding_source_sink_type_is_char (cslist[n - 1], CODING_SINK); | |
| 2823 | |
| 2824 switch (charp_source * 2 + charp_sink) | |
| 2825 { | |
| 2826 case 0: return DECODES_BYTE_TO_BYTE; | |
| 2827 case 1: return DECODES_BYTE_TO_CHARACTER; | |
| 2828 case 2: return DECODES_CHARACTER_TO_BYTE; | |
| 2829 case 3: return DECODES_CHARACTER_TO_CHARACTER; | |
| 2830 } | |
| 2831 | |
| 2500 | 2832 ABORT (); |
| 771 | 2833 return DECODES_BYTE_TO_BYTE; |
| 2834 } | |
| 2835 | |
| 2836 | |
| 2837 /************************************************************************/ | |
| 2838 /* No-conversion methods */ | |
| 2839 /************************************************************************/ | |
| 2840 | |
| 2841 /* "No conversion"; used for binary files. We use quotes because there | |
| 2842 really is some conversion being applied (it does byte<->char | |
| 2843 conversion), but it appears to the user as if the text is read in | |
| 2297 | 2844 without conversion. |
| 2845 | |
| 2846 #### Shouldn't we _call_ it that, then? And while we're at it, | |
| 2847 separate it into "to_internal" and "to_external"? */ | |
| 771 | 2848 DEFINE_CODING_SYSTEM_TYPE (no_conversion); |
| 2849 | |
| 2850 /* This is used when reading in "binary" files -- i.e. files that may | |
| 2851 contain all 256 possible byte values and that are not to be | |
| 2852 interpreted as being in any particular encoding. */ | |
| 2853 static Bytecount | |
| 2854 no_conversion_convert (struct coding_stream *str, | |
| 2855 const UExtbyte *src, | |
| 2856 unsigned_char_dynarr *dst, Bytecount n) | |
| 2857 { | |
| 2858 UExtbyte c; | |
| 2859 unsigned int ch = str->ch; | |
| 2860 Bytecount orign = n; | |
| 2861 | |
| 2862 if (str->direction == CODING_DECODE) | |
| 2863 { | |
| 2864 while (n--) | |
| 2865 { | |
| 2866 c = *src++; | |
| 2867 | |
| 2868 DECODE_ADD_BINARY_CHAR (c, dst); | |
| 2869 } | |
| 2870 | |
| 2871 if (str->eof) | |
| 2872 DECODE_OUTPUT_PARTIAL_CHAR (ch, dst); | |
| 2873 } | |
| 2874 else | |
| 2875 { | |
| 2876 | |
| 2877 while (n--) | |
| 2878 { | |
| 2879 c = *src++; | |
| 826 | 2880 if (byte_ascii_p (c)) |
| 771 | 2881 { |
| 2882 assert (ch == 0); | |
| 2883 Dynarr_add (dst, c); | |
| 2884 } | |
| 2885 #ifdef MULE | |
| 867 | 2886 else if (ibyte_leading_byte_p (c)) |
| 771 | 2887 { |
| 2888 assert (ch == 0); | |
| 2889 if (c == LEADING_BYTE_LATIN_ISO8859_1 || | |
| 2890 c == LEADING_BYTE_CONTROL_1) | |
| 2891 ch = c; | |
| 2892 else | |
| 2297 | 2893 /* #### This is just plain unacceptable. */ |
| 771 | 2894 Dynarr_add (dst, '~'); /* untranslatable character */ |
| 2895 } | |
| 2896 else | |
| 2897 { | |
| 2898 if (ch == LEADING_BYTE_LATIN_ISO8859_1) | |
| 2899 Dynarr_add (dst, c); | |
| 2900 else if (ch == LEADING_BYTE_CONTROL_1) | |
| 2901 { | |
| 2902 assert (c < 0xC0); | |
| 2903 Dynarr_add (dst, c - 0x20); | |
| 2904 } | |
| 2905 /* else it should be the second or third byte of an | |
| 2906 untranslatable character, so ignore it */ | |
| 2907 ch = 0; | |
| 2908 } | |
| 2909 #endif /* MULE */ | |
| 2910 | |
| 2911 } | |
| 2912 } | |
| 2913 | |
| 2914 str->ch = ch; | |
| 2915 return orign; | |
| 2916 } | |
| 2917 | |
| 2918 DEFINE_DETECTOR (no_conversion); | |
| 2919 DEFINE_DETECTOR_CATEGORY (no_conversion, no_conversion); | |
| 2920 | |
| 2921 struct no_conversion_detector | |
| 2922 { | |
| 2923 int dummy; | |
| 2924 }; | |
| 2925 | |
| 2926 static void | |
| 2286 | 2927 no_conversion_detect (struct detection_state *st, const UExtbyte *UNUSED (src), |
| 2928 Bytecount UNUSED (n)) | |
| 771 | 2929 { |
| 2930 /* Hack until we get better handling of this stuff! */ | |
| 2931 DET_RESULT (st, no_conversion) = DET_SLIGHTLY_LIKELY; | |
| 2932 } | |
| 2933 | |
| 2934 | |
| 2935 /************************************************************************/ | |
| 2936 /* Convert-eol methods */ | |
| 2937 /************************************************************************/ | |
| 2938 | |
| 2939 /* This is used to handle end-of-line (EOL) differences. It is | |
| 2819 | 2940 character-to-character, and works (when encoding) *BEFORE* sending data to |
| 2941 the main encoding routine -- thus, that routine must handle different EOL | |
| 2942 types itself if it does line-oriented type processing. This is unavoidable | |
| 2943 because we don't know whether the output of the main encoding routine is | |
| 2944 ASCII compatible (UTF-16 is definitely not, for example). [[ sjt sez this | |
| 2945 is bogus. There should be _no_ EOL processing (or processing of any kind) | |
| 2946 after conversion to external. ]] | |
| 771 | 2947 |
| 793 | 2948 There is one parameter: `subtype', either `cr', `lf', `crlf', or nil. |
| 771 | 2949 */ |
| 2950 | |
| 2951 struct convert_eol_coding_system | |
| 2952 { | |
| 2953 enum eol_type subtype; | |
| 2132 | 2954 int dummy; /* On some architectures (eg ia64) the portable dumper can |
| 2955 produce unaligned access errors without this field. Probably | |
| 2956 because the combined structure of this structure and | |
| 2957 Lisp_Coding_System is not properly aligned. */ | |
| 771 | 2958 }; |
| 2959 | |
| 2960 #define CODING_SYSTEM_CONVERT_EOL_SUBTYPE(codesys) \ | |
| 2961 (CODING_SYSTEM_TYPE_DATA (codesys, convert_eol)->subtype) | |
| 2962 #define XCODING_SYSTEM_CONVERT_EOL_SUBTYPE(codesys) \ | |
| 2963 (XCODING_SYSTEM_TYPE_DATA (codesys, convert_eol)->subtype) | |
| 2964 | |
| 2965 struct convert_eol_coding_stream | |
| 2966 { | |
| 2967 enum eol_type actual; | |
| 2968 }; | |
| 2969 | |
| 1204 | 2970 static const struct memory_description |
| 771 | 2971 convert_eol_coding_system_description[] = { |
| 2972 { XD_END } | |
| 2973 }; | |
| 2974 | |
| 1204 | 2975 DEFINE_CODING_SYSTEM_TYPE_WITH_DATA (convert_eol); |
| 2976 | |
| 771 | 2977 static void |
| 2286 | 2978 convert_eol_print (Lisp_Object cs, Lisp_Object printcharfun, |
| 2979 int UNUSED (escapeflag)) | |
| 771 | 2980 { |
| 2981 struct convert_eol_coding_system *data = | |
| 2982 XCODING_SYSTEM_TYPE_DATA (cs, convert_eol); | |
| 2983 | |
| 2984 write_fmt_string (printcharfun, "(%s)", | |
| 2985 data->subtype == EOL_LF ? "lf" : | |
| 2986 data->subtype == EOL_CRLF ? "crlf" : | |
| 2987 data->subtype == EOL_CR ? "cr" : | |
| 793 | 2988 data->subtype == EOL_AUTODETECT ? "nil" : |
| 2500 | 2989 (ABORT(), "")); |
| 771 | 2990 } |
| 2991 | |
| 2992 static enum source_sink_type | |
| 2286 | 2993 convert_eol_conversion_end_type (Lisp_Object UNUSED (codesys)) |
| 771 | 2994 { |
| 2995 return DECODES_CHARACTER_TO_CHARACTER; | |
| 2996 } | |
| 2997 | |
| 2998 static int | |
| 2999 convert_eol_putprop (Lisp_Object codesys, | |
| 3000 Lisp_Object key, | |
| 3001 Lisp_Object value) | |
| 3002 { | |
| 3003 struct convert_eol_coding_system *data = | |
| 3004 XCODING_SYSTEM_TYPE_DATA (codesys, convert_eol); | |
| 3005 | |
| 3006 if (EQ (key, Qsubtype)) | |
| 3007 { | |
| 3008 if (EQ (value, Qlf) /* || EQ (value, Qunix) */) | |
| 3009 data->subtype = EOL_LF; | |
| 3010 else if (EQ (value, Qcrlf) /* || EQ (value, Qdos) */) | |
| 3011 data->subtype = EOL_CRLF; | |
| 3012 else if (EQ (value, Qcr) /* || EQ (value, Qmac) */) | |
| 3013 data->subtype = EOL_CR; | |
| 793 | 3014 else if (EQ (value, Qnil)) |
| 771 | 3015 data->subtype = EOL_AUTODETECT; |
| 3016 else invalid_constant ("Unrecognized eol type", value); | |
| 3017 } | |
| 3018 else | |
| 3019 return 0; | |
| 3020 return 1; | |
| 3021 } | |
| 3022 | |
| 3023 static Lisp_Object | |
| 3024 convert_eol_getprop (Lisp_Object coding_system, Lisp_Object prop) | |
| 3025 { | |
| 3026 struct convert_eol_coding_system *data = | |
| 3027 XCODING_SYSTEM_TYPE_DATA (coding_system, convert_eol); | |
| 3028 | |
| 3029 if (EQ (prop, Qsubtype)) | |
| 3030 { | |
| 3031 switch (data->subtype) | |
| 3032 { | |
| 3033 case EOL_LF: return Qlf; | |
| 3034 case EOL_CRLF: return Qcrlf; | |
| 3035 case EOL_CR: return Qcr; | |
| 793 | 3036 case EOL_AUTODETECT: return Qnil; |
| 2500 | 3037 default: ABORT (); |
| 771 | 3038 } |
| 3039 } | |
| 3040 | |
| 3041 return Qunbound; | |
| 3042 } | |
| 3043 | |
| 3044 static void | |
| 3045 convert_eol_init_coding_stream (struct coding_stream *str) | |
| 3046 { | |
| 3047 struct convert_eol_coding_stream *data = | |
| 3048 CODING_STREAM_TYPE_DATA (str, convert_eol); | |
| 3049 data->actual = XCODING_SYSTEM_CONVERT_EOL_SUBTYPE (str->codesys); | |
| 3050 } | |
| 3051 | |
| 3052 static Bytecount | |
| 867 | 3053 convert_eol_convert (struct coding_stream *str, const Ibyte *src, |
| 771 | 3054 unsigned_char_dynarr *dst, Bytecount n) |
| 3055 { | |
| 3056 if (str->direction == CODING_DECODE) | |
| 3057 { | |
| 3058 struct convert_eol_coding_stream *data = | |
| 3059 CODING_STREAM_TYPE_DATA (str, convert_eol); | |
| 3060 | |
| 3061 if (data->actual == EOL_AUTODETECT) | |
| 3062 { | |
| 3063 Bytecount n2 = n; | |
| 867 | 3064 const Ibyte *src2 = src; |
| 771 | 3065 |
| 3066 for (; n2; n2--) | |
| 3067 { | |
| 867 | 3068 Ibyte c = *src2++; |
| 771 | 3069 if (c == '\n') |
| 3070 { | |
| 3071 data->actual = EOL_LF; | |
| 3072 break; | |
| 3073 } | |
| 3074 else if (c == '\r') | |
| 3075 { | |
| 3076 if (n2 == 1) | |
| 3077 { | |
| 3078 /* If we're seeing a '\r' at the end of the data, then | |
| 3079 reject the '\r' right now so it doesn't become an | |
| 3080 issue in the code below -- unless we're at the end of | |
| 3081 the stream, in which case we can't do that (because | |
| 3082 then the '\r' will never get written out), and in any | |
| 3083 case we should be recognizing it at EOL_CR format. */ | |
| 3084 if (str->eof) | |
| 3085 data->actual = EOL_CR; | |
| 3086 else | |
| 3087 n--; | |
| 3088 break; | |
| 3089 } | |
| 3090 else if (*src2 == '\n') | |
| 3091 data->actual = EOL_CRLF; | |
| 3092 else | |
| 3093 data->actual = EOL_CR; | |
| 3094 break; | |
| 3095 } | |
| 3096 } | |
| 3097 } | |
| 3098 | |
| 3099 /* str->eof is set, the caller reached EOF on the other end and has | |
| 3100 no new data to give us. The only data we get is the data we | |
| 3101 rejected from last time. */ | |
| 3102 if (data->actual == EOL_LF || data->actual == EOL_AUTODETECT || | |
| 3103 (str->eof)) | |
| 3104 Dynarr_add_many (dst, src, n); | |
| 3105 else | |
| 3106 { | |
| 867 | 3107 const Ibyte *end = src + n; |
| 771 | 3108 while (1) |
| 3109 { | |
| 3110 /* Find the next section with no \r and add it. */ | |
| 867 | 3111 const Ibyte *runstart = src; |
| 3112 src = (Ibyte *) memchr (src, '\r', end - src); | |
| 771 | 3113 if (!src) |
| 3114 src = end; | |
| 3115 Dynarr_add_many (dst, runstart, src - runstart); | |
| 3116 /* Stop if at end ... */ | |
| 3117 if (src == end) | |
| 3118 break; | |
| 3119 /* ... else, translate as necessary. */ | |
| 3120 src++; | |
| 3121 if (data->actual == EOL_CR) | |
| 3122 Dynarr_add (dst, '\n'); | |
| 3123 /* We need to be careful here with CRLF. If we see a CR at the | |
| 3124 end of the data, we don't know if it's part of a CRLF, so we | |
| 3125 reject it. Otherwise: If it's part of a CRLF, eat it and | |
| 3126 loop; the following LF gets added next time around. If it's | |
| 3127 not part of a CRLF, add the CR and loop. The following | |
| 3128 character will be processed in the next loop iteration. This | |
| 3129 correctly handles a sequence like CR+CR+LF. */ | |
| 3130 else if (src == end) | |
| 3131 return n - 1; /* reject the CR at the end; we'll get it again | |
| 3132 next time the convert method is called */ | |
| 3133 else if (*src != '\n') | |
| 3134 Dynarr_add (dst, '\r'); | |
| 3135 } | |
| 3136 } | |
| 3137 | |
| 3138 return n; | |
| 3139 } | |
| 3140 else | |
| 3141 { | |
| 3142 enum eol_type subtype = | |
| 3143 XCODING_SYSTEM_CONVERT_EOL_SUBTYPE (str->codesys); | |
| 867 | 3144 const Ibyte *end = src + n; |
| 771 | 3145 |
| 3146 /* We try to be relatively efficient here. */ | |
| 3147 if (subtype == EOL_LF) | |
| 3148 Dynarr_add_many (dst, src, n); | |
| 3149 else | |
| 3150 { | |
| 3151 while (1) | |
| 3152 { | |
| 3153 /* Find the next section with no \n and add it. */ | |
| 867 | 3154 const Ibyte *runstart = src; |
| 3155 src = (Ibyte *) memchr (src, '\n', end - src); | |
| 771 | 3156 if (!src) |
| 3157 src = end; | |
| 3158 Dynarr_add_many (dst, runstart, src - runstart); | |
| 3159 /* Stop if at end ... */ | |
| 3160 if (src == end) | |
| 3161 break; | |
| 3162 /* ... else, skip over \n and add its translation. */ | |
| 3163 src++; | |
| 3164 Dynarr_add (dst, '\r'); | |
| 3165 if (subtype == EOL_CRLF) | |
| 3166 Dynarr_add (dst, '\n'); | |
| 3167 } | |
| 3168 } | |
| 3169 | |
| 3170 return n; | |
| 3171 } | |
| 3172 } | |
| 3173 | |
| 3174 static Lisp_Object | |
| 3175 convert_eol_canonicalize_after_coding (struct coding_stream *str) | |
| 3176 { | |
| 3177 struct convert_eol_coding_stream *data = | |
| 3178 CODING_STREAM_TYPE_DATA (str, convert_eol); | |
| 3179 | |
| 3180 if (str->direction == CODING_ENCODE) | |
| 3181 return str->codesys; | |
| 3182 | |
| 3183 switch (data->actual) | |
| 3184 { | |
| 3185 case EOL_LF: return Fget_coding_system (Qconvert_eol_lf); | |
| 3186 case EOL_CRLF: return Fget_coding_system (Qconvert_eol_crlf); | |
| 3187 case EOL_CR: return Fget_coding_system (Qconvert_eol_cr); | |
| 3188 case EOL_AUTODETECT: return str->codesys; | |
| 2500 | 3189 default: ABORT (); return Qnil; |
| 771 | 3190 } |
| 3191 } | |
| 3192 | |
| 3193 | |
| 3194 /************************************************************************/ | |
| 3195 /* Undecided methods */ | |
| 3196 /************************************************************************/ | |
| 3197 | |
| 3198 /* Do autodetection. We can autodetect the EOL type only, the coding | |
| 3199 system only, or both. We only do autodetection when decoding; when | |
| 3200 encoding, we just pass the data through. | |
| 3201 | |
| 3202 When doing just EOL detection, a coding system can be specified; if so, | |
| 3203 we will decode this data through the coding system before doing EOL | |
| 3204 detection. The reason for specifying this is so that | |
| 3205 canonicalize-after-coding works: We will canonicalize the specified | |
| 3206 coding system into the appropriate EOL type. When doing both coding and | |
| 3207 EOL detection, we do similar canonicalization, and also catch situations | |
| 3208 where the EOL type is overspecified, i.e. the detected coding system | |
| 3209 specifies an EOL type, and either switch to the equivalent | |
| 3210 non-EOL-processing coding system (if possible), or terminate EOL | |
| 3211 detection and use the specified EOL type. This prevents data from being | |
| 3212 EOL-processed twice. | |
| 3213 */ | |
| 3214 | |
| 3215 struct undecided_coding_system | |
| 3216 { | |
| 3217 int do_eol, do_coding; | |
| 3218 Lisp_Object cs; | |
| 3219 }; | |
| 3220 | |
| 3221 struct undecided_coding_stream | |
| 3222 { | |
| 3223 Lisp_Object actual; | |
| 3224 /* Either 2 or 3 lstreams here; see undecided_convert */ | |
| 3225 struct chain_coding_stream c; | |
| 3226 | |
| 3227 struct detection_state *st; | |
| 3228 }; | |
| 3229 | |
| 1204 | 3230 static const struct memory_description undecided_coding_system_description[] = { |
| 3231 { XD_LISP_OBJECT, offsetof (struct undecided_coding_system, cs) }, | |
| 771 | 3232 { XD_END } |
| 3233 }; | |
| 3234 | |
| 1204 | 3235 static const struct memory_description undecided_coding_stream_description_1 [] = { |
| 3236 { XD_LISP_OBJECT, offsetof (struct undecided_coding_stream, actual) }, | |
| 2367 | 3237 { XD_BLOCK_ARRAY, offsetof (struct undecided_coding_stream, c), |
| 2551 | 3238 1, { &chain_coding_stream_description } }, |
| 1204 | 3239 { XD_END } |
| 3240 }; | |
| 3241 | |
| 3242 const struct sized_memory_description undecided_coding_stream_description = { | |
| 3243 sizeof (struct undecided_coding_stream), undecided_coding_stream_description_1 | |
| 3244 }; | |
| 3245 | |
| 3246 DEFINE_CODING_SYSTEM_TYPE_WITH_DATA (undecided); | |
| 3247 | |
| 771 | 3248 static void |
| 3249 undecided_init (Lisp_Object codesys) | |
| 3250 { | |
| 3251 struct undecided_coding_system *data = | |
| 3252 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
| 3253 | |
| 3254 data->cs = Qnil; | |
| 3255 } | |
| 3256 | |
| 3257 static void | |
| 3258 undecided_mark (Lisp_Object codesys) | |
| 3259 { | |
| 3260 struct undecided_coding_system *data = | |
| 3261 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
| 3262 | |
| 3263 mark_object (data->cs); | |
| 3264 } | |
| 3265 | |
| 3266 static void | |
| 3267 undecided_print (Lisp_Object cs, Lisp_Object printcharfun, int escapeflag) | |
| 3268 { | |
| 3269 struct undecided_coding_system *data = | |
| 3270 XCODING_SYSTEM_TYPE_DATA (cs, undecided); | |
| 3271 int need_space = 0; | |
| 3272 | |
| 826 | 3273 write_c_string (printcharfun, "("); |
| 771 | 3274 if (data->do_eol) |
| 3275 { | |
| 826 | 3276 write_c_string (printcharfun, "do-eol"); |
| 771 | 3277 need_space = 1; |
| 3278 } | |
| 3279 if (data->do_coding) | |
| 3280 { | |
| 3281 if (need_space) | |
| 826 | 3282 write_c_string (printcharfun, " "); |
| 3283 write_c_string (printcharfun, "do-coding"); | |
| 771 | 3284 need_space = 1; |
| 3285 } | |
| 3286 if (!NILP (data->cs)) | |
| 3287 { | |
| 3288 if (need_space) | |
| 826 | 3289 write_c_string (printcharfun, " "); |
| 3290 write_c_string (printcharfun, "coding-system="); | |
| 771 | 3291 print_coding_system_in_print_method (data->cs, printcharfun, escapeflag); |
| 3292 } | |
| 826 | 3293 write_c_string (printcharfun, ")"); |
| 771 | 3294 } |
| 3295 | |
| 3296 static void | |
| 3297 undecided_mark_coding_stream (struct coding_stream *str) | |
| 3298 { | |
| 1204 | 3299 mark_object (CODING_STREAM_TYPE_DATA (str, undecided)->actual); |
| 771 | 3300 chain_mark_coding_stream_1 (&CODING_STREAM_TYPE_DATA (str, undecided)->c); |
| 3301 } | |
| 3302 | |
| 3303 static int | |
| 3304 undecided_putprop (Lisp_Object codesys, Lisp_Object key, Lisp_Object value) | |
| 3305 { | |
| 3306 struct undecided_coding_system *data = | |
| 3307 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
| 3308 | |
| 3309 if (EQ (key, Qdo_eol)) | |
| 3310 data->do_eol = 1; | |
| 3311 else if (EQ (key, Qdo_coding)) | |
| 3312 data->do_coding = 1; | |
| 3313 else if (EQ (key, Qcoding_system)) | |
| 3314 data->cs = get_coding_system_for_text_file (value, 0); | |
| 3315 else | |
| 3316 return 0; | |
| 3317 return 1; | |
| 3318 } | |
| 3319 | |
| 3320 static Lisp_Object | |
| 3321 undecided_getprop (Lisp_Object codesys, Lisp_Object prop) | |
| 3322 { | |
| 3323 struct undecided_coding_system *data = | |
| 3324 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
| 3325 | |
| 3326 if (EQ (prop, Qdo_eol)) | |
| 3327 return data->do_eol ? Qt : Qnil; | |
| 3328 if (EQ (prop, Qdo_coding)) | |
| 3329 return data->do_coding ? Qt : Qnil; | |
| 3330 if (EQ (prop, Qcoding_system)) | |
| 3331 return data->cs; | |
| 3332 return Qunbound; | |
| 3333 } | |
| 3334 | |
| 3335 static struct detection_state * | |
| 3336 allocate_detection_state (void) | |
| 3337 { | |
| 3338 int i; | |
| 3339 Bytecount size = MAX_ALIGN_SIZE (sizeof (struct detection_state)); | |
| 3340 struct detection_state *block; | |
| 3341 | |
| 3342 for (i = 0; i < coding_detector_count; i++) | |
| 3343 size += MAX_ALIGN_SIZE (Dynarr_at (all_coding_detectors, i).data_size); | |
| 3344 | |
| 3345 block = (struct detection_state *) xmalloc_and_zero (size); | |
| 3346 | |
| 3347 size = MAX_ALIGN_SIZE (sizeof (struct detection_state)); | |
| 3348 for (i = 0; i < coding_detector_count; i++) | |
| 3349 { | |
| 3350 block->data_offset[i] = size; | |
| 3351 size += MAX_ALIGN_SIZE (Dynarr_at (all_coding_detectors, i).data_size); | |
| 3352 } | |
| 3353 | |
| 3354 return block; | |
| 3355 } | |
| 3356 | |
| 3357 static void | |
| 3358 free_detection_state (struct detection_state *st) | |
| 3359 { | |
| 3360 int i; | |
| 3361 | |
| 3362 for (i = 0; i < coding_detector_count; i++) | |
| 3363 { | |
| 3364 if (Dynarr_at (all_coding_detectors, i).finalize_detection_state_method) | |
| 3365 Dynarr_at (all_coding_detectors, i).finalize_detection_state_method | |
| 3366 (st); | |
| 3367 } | |
| 3368 | |
| 1726 | 3369 xfree (st, struct detection_state *); |
| 771 | 3370 } |
| 3371 | |
| 3372 static int | |
| 3373 coding_category_symbol_to_id (Lisp_Object symbol) | |
| 428 | 3374 { |
| 3375 int i; | |
| 3376 | |
| 3377 CHECK_SYMBOL (symbol); | |
| 771 | 3378 for (i = 0; i < coding_detector_count; i++) |
| 3379 { | |
| 3380 detector_category_dynarr *cats = | |
| 3381 Dynarr_at (all_coding_detectors, i).cats; | |
| 3382 int j; | |
| 3383 | |
| 3384 for (j = 0; j < Dynarr_length (cats); j++) | |
| 3385 if (EQ (Dynarr_at (cats, j).sym, symbol)) | |
| 3386 return Dynarr_at (cats, j).id; | |
| 3387 } | |
| 3388 | |
| 563 | 3389 invalid_constant ("Unrecognized coding category", symbol); |
| 1204 | 3390 RETURN_NOT_REACHED (0); |
| 428 | 3391 } |
| 3392 | |
| 771 | 3393 static Lisp_Object |
| 3394 coding_category_id_to_symbol (int id) | |
| 428 | 3395 { |
| 3396 int i; | |
| 771 | 3397 |
| 3398 for (i = 0; i < coding_detector_count; i++) | |
| 3399 { | |
| 3400 detector_category_dynarr *cats = | |
| 3401 Dynarr_at (all_coding_detectors, i).cats; | |
| 3402 int j; | |
| 3403 | |
| 3404 for (j = 0; j < Dynarr_length (cats); j++) | |
| 3405 if (id == Dynarr_at (cats, j).id) | |
| 3406 return Dynarr_at (cats, j).sym; | |
| 3407 } | |
| 3408 | |
| 2500 | 3409 ABORT (); |
| 771 | 3410 return Qnil; /* (usually) not reached */ |
| 428 | 3411 } |
| 3412 | |
| 771 | 3413 static Lisp_Object |
| 3414 detection_result_number_to_symbol (enum detection_result result) | |
| 428 | 3415 { |
| 1494 | 3416 /* let compiler warn if not all enumerators are handled */ |
| 3417 switch (result) { | |
| 3418 #define FROB(sym, num) case num: return (sym) | |
| 771 | 3419 FROB (Qnear_certainty, DET_NEAR_CERTAINTY); |
| 3420 FROB (Qquite_probable, DET_QUITE_PROBABLE); | |
| 3421 FROB (Qsomewhat_likely, DET_SOMEWHAT_LIKELY); | |
| 1494 | 3422 FROB (Qslightly_likely, DET_SLIGHTLY_LIKELY); |
| 771 | 3423 FROB (Qas_likely_as_unlikely, DET_AS_LIKELY_AS_UNLIKELY); |
| 3424 FROB (Qsomewhat_unlikely, DET_SOMEWHAT_UNLIKELY); | |
| 3425 FROB (Qquite_improbable, DET_QUITE_IMPROBABLE); | |
| 3426 FROB (Qnearly_impossible, DET_NEARLY_IMPOSSIBLE); | |
| 3427 #undef FROB | |
| 1494 | 3428 } |
| 771 | 3429 |
| 2500 | 3430 ABORT (); |
| 771 | 3431 return Qnil; /* (usually) not reached */ |
| 3432 } | |
| 3433 | |
| 778 | 3434 #if 0 /* not used */ |
| 771 | 3435 static enum detection_result |
| 3436 detection_result_symbol_to_number (Lisp_Object symbol) | |
| 3437 { | |
| 1494 | 3438 /* using switch here would be bad style, and doesn't help */ |
| 771 | 3439 #define FROB(sym, num) if (EQ (symbol, sym)) return (num) |
| 3440 FROB (Qnear_certainty, DET_NEAR_CERTAINTY); | |
| 3441 FROB (Qquite_probable, DET_QUITE_PROBABLE); | |
| 3442 FROB (Qsomewhat_likely, DET_SOMEWHAT_LIKELY); | |
| 1494 | 3443 FROB (Qslightly_likely, DET_SLIGHTLY_LIKELY); |
| 771 | 3444 FROB (Qas_likely_as_unlikely, DET_AS_LIKELY_AS_UNLIKELY); |
| 3445 FROB (Qsomewhat_unlikely, DET_SOMEWHAT_UNLIKELY); | |
| 3446 FROB (Qquite_improbable, DET_QUITE_IMPROBABLE); | |
| 3447 FROB (Qnearly_impossible, DET_NEARLY_IMPOSSIBLE); | |
| 3448 #undef FROB | |
| 3449 | |
| 3450 invalid_constant ("Unrecognized detection result", symbol); | |
| 3451 return ((enum detection_result) 0); /* not reached */ | |
| 3452 } | |
| 778 | 3453 #endif /* 0 */ |
| 771 | 3454 |
| 3455 /* Set all detection results for a given detector to a specified value. */ | |
| 3456 void | |
| 3457 set_detection_results (struct detection_state *st, int detector, int given) | |
| 3458 { | |
| 3459 detector_category_dynarr *cats = | |
| 3460 Dynarr_at (all_coding_detectors, detector).cats; | |
| 3461 int i; | |
| 3462 | |
| 3463 for (i = 0; i < Dynarr_length (cats); i++) | |
| 3464 st->categories[Dynarr_at (cats, i).id] = given; | |
| 3465 } | |
| 428 | 3466 |
| 3467 static int | |
| 3468 acceptable_control_char_p (int c) | |
| 3469 { | |
| 3470 switch (c) | |
| 3471 { | |
| 3472 /* Allow and ignore control characters that you might | |
| 3473 reasonably see in a text file */ | |
| 3474 case '\r': | |
| 3475 case '\n': | |
| 3476 case '\t': | |
| 3477 case 7: /* bell */ | |
| 3478 case 8: /* backspace */ | |
| 3479 case 11: /* vertical tab */ | |
| 3480 case 12: /* form feed */ | |
| 3481 case 26: /* MS-DOS C-z junk */ | |
| 3482 case 31: /* '^_' -- for info */ | |
| 3483 return 1; | |
| 3484 default: | |
| 3485 return 0; | |
| 3486 } | |
| 3487 } | |
| 3488 | |
| 771 | 3489 #ifdef DEBUG_XEMACS |
| 3490 | |
| 3491 static UExtbyte | |
| 3492 hex_digit_to_char (int digit) | |
| 428 | 3493 { |
| 771 | 3494 if (digit < 10) |
| 3495 return digit + '0'; | |
| 3496 else | |
| 3497 return digit - 10 + 'A'; | |
| 428 | 3498 } |
| 3499 | |
| 771 | 3500 static void |
| 3501 output_bytes_in_ascii_and_hex (const UExtbyte *src, Bytecount n) | |
| 428 | 3502 { |
| 3425 | 3503 Extbyte *ascii = alloca_array (Extbyte, n + 1); |
| 3504 Extbyte *hex = alloca_array (Extbyte, 3 * n + 1); | |
| 771 | 3505 int i; |
| 3413 | 3506 DECLARE_EISTRING (eistr_ascii); |
| 3507 DECLARE_EISTRING (eistr_hex); | |
| 771 | 3508 |
| 3509 for (i = 0; i < n; i++) | |
| 428 | 3510 { |
| 3425 | 3511 Extbyte c = src[i]; |
| 771 | 3512 if (c < 0x20) |
| 3513 ascii[i] = '.'; | |
| 428 | 3514 else |
| 771 | 3515 ascii[i] = c; |
| 3516 hex[3 * i] = hex_digit_to_char (c >> 4); | |
| 3517 hex[3 * i + 1] = hex_digit_to_char (c & 0xF); | |
| 3518 hex[3 * i + 2] = ' '; | |
| 428 | 3519 } |
| 771 | 3520 ascii[i] = '\0'; |
| 3521 hex[3 * i - 1] = '\0'; | |
| 3413 | 3522 |
| 3523 eicpy_ext(eistr_hex, hex, Qbinary); | |
| 3524 eicpy_ext(eistr_ascii, ascii, Qbinary); | |
| 3525 | |
| 3425 | 3526 stderr_out ("%s %s", eidata(eistr_ascii), eidata(eistr_hex)); |
| 428 | 3527 } |
| 3528 | |
| 771 | 3529 #endif /* DEBUG_XEMACS */ |
| 3530 | |
| 3531 /* Attempt to determine the encoding of the given text. Before calling | |
| 3532 this function for the first time, you must zero out the detection state. | |
| 428 | 3533 |
| 3534 Returns: | |
| 3535 | |
| 771 | 3536 0 == keep going |
| 3537 1 == stop | |
| 428 | 3538 */ |
| 3539 | |
| 3540 static int | |
| 771 | 3541 detect_coding_type (struct detection_state *st, const UExtbyte *src, |
| 3542 Bytecount n) | |
| 428 | 3543 { |
| 771 | 3544 Bytecount n2 = n; |
| 3545 const UExtbyte *src2 = src; | |
| 3546 int i; | |
| 3547 | |
| 3548 #ifdef DEBUG_XEMACS | |
| 3549 if (!NILP (Vdebug_coding_detection)) | |
| 3550 { | |
| 3551 int bytes = min (16, n); | |
| 3552 stderr_out ("detect_coding_type: processing %ld bytes\n", n); | |
| 3553 stderr_out ("First %d: ", bytes); | |
| 3554 output_bytes_in_ascii_and_hex (src, bytes); | |
| 3555 stderr_out ("\nLast %d: ", bytes); | |
| 3556 output_bytes_in_ascii_and_hex (src + n - bytes, bytes); | |
| 3557 stderr_out ("\n"); | |
| 3558 } | |
| 3559 #endif /* DEBUG_XEMACS */ | |
| 428 | 3560 if (!st->seen_non_ascii) |
| 3561 { | |
| 771 | 3562 for (; n2; n2--, src2++) |
| 428 | 3563 { |
| 771 | 3564 UExtbyte c = *src2; |
| 428 | 3565 if ((c < 0x20 && !acceptable_control_char_p (c)) || c >= 0x80) |
| 3566 { | |
| 3567 st->seen_non_ascii = 1; | |
| 3568 break; | |
| 3569 } | |
| 3570 } | |
| 3571 } | |
| 3572 | |
| 771 | 3573 for (i = 0; i < coding_detector_count; i++) |
| 3574 Dynarr_at (all_coding_detectors, i).detect_method (st, src, n); | |
| 3575 | |
| 3576 st->bytes_seen += n; | |
| 3577 | |
| 3578 #ifdef DEBUG_XEMACS | |
| 3579 if (!NILP (Vdebug_coding_detection)) | |
| 3580 { | |
| 3581 stderr_out ("seen_non_ascii: %d\n", st->seen_non_ascii); | |
| 1494 | 3582 if (coding_detector_category_count <= 0) |
| 3583 stderr_out ("found %d detector categories\n", | |
| 3584 coding_detector_category_count); | |
| 771 | 3585 for (i = 0; i < coding_detector_category_count; i++) |
| 3586 stderr_out_lisp | |
| 3587 ("%s: %s\n", | |
| 3588 2, | |
| 3589 coding_category_id_to_symbol (i), | |
| 3590 detection_result_number_to_symbol ((enum detection_result) | |
| 3591 st->categories[i])); | |
| 3592 } | |
| 3593 #endif /* DEBUG_XEMACS */ | |
| 3594 | |
| 3595 { | |
| 3596 int not_unlikely = 0; | |
| 3597 int retval; | |
| 3598 | |
| 3599 for (i = 0; i < coding_detector_category_count; i++) | |
| 3600 if (st->categories[i] >= 0) | |
| 3601 not_unlikely++; | |
| 3602 | |
| 3603 retval = (not_unlikely <= 1 | |
| 3604 #if 0 /* this is bogus */ | |
| 3605 || st->bytes_seen >= MAX_BYTES_PROCESSED_FOR_DETECTION | |
| 428 | 3606 #endif |
| 771 | 3607 ); |
| 3608 | |
| 3609 #ifdef DEBUG_XEMACS | |
| 3610 if (!NILP (Vdebug_coding_detection)) | |
| 3611 stderr_out ("detect_coding_type: returning %d (%s)\n", | |
| 3612 retval, retval ? "stop" : "keep going"); | |
| 3613 #endif /* DEBUG_XEMACS */ | |
| 3614 | |
| 3615 return retval; | |
| 428 | 3616 } |
| 3617 } | |
| 3618 | |
| 3619 static Lisp_Object | |
| 771 | 3620 detected_coding_system (struct detection_state *st) |
| 428 | 3621 { |
| 771 | 3622 int i; |
| 3623 int even = 1; | |
| 3624 | |
| 3625 if (st->seen_non_ascii) | |
| 3626 { | |
| 3627 for (i = 0; i < coding_detector_category_count; i++) | |
| 3628 if (st->categories[i] != DET_AS_LIKELY_AS_UNLIKELY) | |
| 3629 { | |
| 3630 even = 0; | |
| 3631 break; | |
| 3632 } | |
| 3633 } | |
| 3634 | |
| 3635 /* #### Here we are ignoring the results of detection when it's all | |
| 3636 ASCII. This is obviously a bad thing. But we need to fix up the | |
| 3637 existing detection methods somewhat before we can switch. */ | |
| 3638 if (even) | |
| 428 | 3639 { |
| 3640 /* If the file was entirely or basically ASCII, use the | |
| 3641 default value of `buffer-file-coding-system'. */ | |
| 3642 Lisp_Object retval = | |
| 3643 XBUFFER (Vbuffer_defaults)->buffer_file_coding_system; | |
| 3644 if (!NILP (retval)) | |
| 3645 { | |
| 771 | 3646 retval = find_coding_system_for_text_file (retval, 0); |
| 428 | 3647 if (NILP (retval)) |
| 3648 { | |
| 3649 warn_when_safe | |
| 3650 (Qbad_variable, Qwarning, | |
| 3651 "Invalid `default-buffer-file-coding-system', set to nil"); | |
| 3652 XBUFFER (Vbuffer_defaults)->buffer_file_coding_system = Qnil; | |
| 3653 } | |
| 3654 } | |
| 3655 if (NILP (retval)) | |
| 4100 | 3656 retval = Fget_coding_system (Qbinary); |
| 428 | 3657 return retval; |
| 3658 } | |
| 3659 else | |
| 3660 { | |
| 771 | 3661 int likelihood; |
| 3662 Lisp_Object retval = Qnil; | |
| 3663 | |
| 3664 /* Look through the coding categories first by likelihood and then by | |
| 3665 priority and find the first one that is allowed. */ | |
| 3666 | |
| 3667 for (likelihood = DET_HIGHEST; likelihood >= DET_LOWEST; likelihood--) | |
| 428 | 3668 { |
| 771 | 3669 for (i = 0; i < coding_detector_category_count; i++) |
| 3670 { | |
| 3671 int cat = coding_category_by_priority[i]; | |
| 3672 if (st->categories[cat] == likelihood && | |
| 3673 !NILP (coding_category_system[cat])) | |
| 3674 { | |
| 3675 retval = (get_coding_system_for_text_file | |
| 3676 (coding_category_system[cat], 0)); | |
| 3677 if (likelihood < DET_AS_LIKELY_AS_UNLIKELY) | |
| 3678 warn_when_safe_lispobj | |
| 3679 (intern ("detection"), | |
| 793 | 3680 Qwarning, |
| 771 | 3681 emacs_sprintf_string_lisp |
| 3682 ( | |
| 3683 "Detected coding %s is unlikely to be correct (likelihood == `%s')", | |
| 3684 Qnil, 2, XCODING_SYSTEM_NAME (retval), | |
| 3685 detection_result_number_to_symbol | |
| 3686 ((enum detection_result) likelihood))); | |
| 3687 return retval; | |
| 3688 } | |
| 3689 } | |
| 428 | 3690 } |
| 771 | 3691 |
| 3692 return Fget_coding_system (Qraw_text); | |
| 428 | 3693 } |
| 3694 } | |
| 3695 | |
| 1347 | 3696 /* Look for a coding system in the string (skipping over leading |
| 3697 blanks). If found, return it, otherwise nil. */ | |
| 3698 | |
| 3699 static Lisp_Object | |
| 2531 | 3700 snarf_coding_system (const UExtbyte *p, Bytecount len) |
| 1347 | 3701 { |
| 3702 Bytecount n; | |
| 2531 | 3703 UExtbyte *name; |
| 1347 | 3704 |
| 3705 while (*p == ' ' || *p == '\t') p++, len--; | |
| 3706 len = min (len, 1000); | |
| 3707 name = alloca_ibytes (len + 1); | |
| 3708 memcpy (name, p, len); | |
| 3709 name[len] = '\0'; | |
| 3710 | |
| 3711 /* Get coding system name */ | |
| 3712 /* Characters valid in a MIME charset name (rfc 1521), | |
| 3713 and in a Lisp symbol name. */ | |
| 3714 n = qxestrspn (name, | |
| 3715 "ABCDEFGHIJKLMNOPQRSTUVWXYZ" | |
| 3716 "abcdefghijklmnopqrstuvwxyz" | |
| 3717 "0123456789" | |
| 3718 "!$%&*+-.^_{|}~"); | |
| 3719 if (n > 0) | |
| 3720 { | |
| 3721 name[n] = '\0'; | |
| 2531 | 3722 /* This call to intern_int() is OK because we already verified that |
| 3723 there are only ASCII characters in the string */ | |
| 3724 return find_coding_system_for_text_file (intern_int ((Ibyte *) name), 0); | |
| 1347 | 3725 } |
| 3726 | |
| 3727 return Qnil; | |
| 3728 } | |
| 3729 | |
| 428 | 3730 /* Given a seekable read stream and potential coding system and EOL type |
| 3731 as specified, do any autodetection that is called for. If the | |
| 3732 coding system and/or EOL type are not `autodetect', they will be left | |
| 3733 alone; but this function will never return an autodetect coding system | |
| 3734 or EOL type. | |
| 3735 | |
| 3736 This function does not automatically fetch subsidiary coding systems; | |
| 3737 that should be unnecessary with the explicit eol-type argument. */ | |
| 3738 | |
| 3739 #define LENGTH(string_constant) (sizeof (string_constant) - 1) | |
| 3740 | |
| 771 | 3741 static Lisp_Object |
| 3742 unwind_free_detection_state (Lisp_Object opaque) | |
| 3743 { | |
| 3744 struct detection_state *st = | |
| 3745 (struct detection_state *) get_opaque_ptr (opaque); | |
| 3746 free_detection_state (st); | |
| 3747 free_opaque_ptr (opaque); | |
| 3748 return Qnil; | |
| 3749 } | |
| 3750 | |
| 1347 | 3751 /* #### This duplicates code in `find-coding-system-magic-cookie-in-file' |
| 3752 in files.el. Look into combining them. */ | |
| 3753 | |
| 771 | 3754 static Lisp_Object |
| 3755 look_for_coding_system_magic_cookie (const UExtbyte *data, Bytecount len) | |
| 428 | 3756 { |
| 771 | 3757 const UExtbyte *p; |
| 3758 const UExtbyte *scan_end; | |
| 2531 | 3759 Bytecount cookie_len; |
| 771 | 3760 |
| 3761 /* Look for initial "-*-"; mode line prefix */ | |
| 3762 for (p = data, | |
| 3763 scan_end = data + len - LENGTH ("-*-coding:?-*-"); | |
| 3764 p <= scan_end | |
| 3765 && *p != '\n' | |
| 3766 && *p != '\r'; | |
| 3767 p++) | |
| 3768 if (*p == '-' && *(p+1) == '*' && *(p+2) == '-') | |
| 3769 { | |
| 3770 const UExtbyte *local_vars_beg = p + 3; | |
| 3771 /* Look for final "-*-"; mode line suffix */ | |
| 3772 for (p = local_vars_beg, | |
| 3773 scan_end = data + len - LENGTH ("-*-"); | |
| 3774 p <= scan_end | |
| 428 | 3775 && *p != '\n' |
| 3776 && *p != '\r'; | |
| 771 | 3777 p++) |
| 3778 if (*p == '-' && *(p+1) == '*' && *(p+2) == '-') | |
| 3779 { | |
| 3780 const UExtbyte *suffix = p; | |
| 3781 /* Look for "coding:" */ | |
| 3782 for (p = local_vars_beg, | |
| 3783 scan_end = suffix - LENGTH ("coding:?"); | |
| 3784 p <= scan_end; | |
| 3785 p++) | |
| 3786 if (memcmp ("coding:", p, LENGTH ("coding:")) == 0 | |
| 3787 && (p == local_vars_beg | |
| 3788 || (*(p-1) == ' ' || | |
| 3789 *(p-1) == '\t' || | |
| 3790 *(p-1) == ';'))) | |
| 3791 { | |
| 3792 p += LENGTH ("coding:"); | |
| 1347 | 3793 return snarf_coding_system (p, suffix - p); |
| 771 | 3794 break; |
| 3795 } | |
| 3796 break; | |
| 3797 } | |
| 3798 break; | |
| 3799 } | |
| 3800 | |
| 2531 | 3801 /* Look for ;;;###coding system */ |
| 3802 | |
| 3803 cookie_len = LENGTH (";;;###coding system: "); | |
| 3804 | |
| 3805 for (p = data, | |
| 3806 scan_end = data + len - cookie_len; | |
| 3807 p <= scan_end; | |
| 3808 p++) | |
| 1347 | 3809 { |
| 2531 | 3810 if (*p == ';' && !memcmp (p, ";;;###coding system: ", cookie_len)) |
| 3811 { | |
| 3812 const UExtbyte *suffix; | |
| 3813 | |
| 3814 p += cookie_len; | |
| 3815 suffix = p; | |
| 3816 while (suffix < scan_end && !isspace (*suffix)) | |
| 3817 suffix++; | |
| 3818 return snarf_coding_system (p, suffix - p); | |
| 3819 } | |
| 1347 | 3820 } |
| 3821 | |
| 3822 return Qnil; | |
| 771 | 3823 } |
| 3824 | |
| 3825 static Lisp_Object | |
| 3826 determine_real_coding_system (Lstream *stream) | |
| 3827 { | |
| 3828 struct detection_state *st = allocate_detection_state (); | |
| 3829 int depth = record_unwind_protect (unwind_free_detection_state, | |
| 3830 make_opaque_ptr (st)); | |
| 3831 UExtbyte buf[4096]; | |
| 3832 Bytecount nread = Lstream_read (stream, buf, sizeof (buf)); | |
| 3833 Lisp_Object coding_system = look_for_coding_system_magic_cookie (buf, nread); | |
| 3834 | |
| 3835 if (NILP (coding_system)) | |
| 3836 { | |
| 3837 while (1) | |
| 3838 { | |
| 3839 if (detect_coding_type (st, buf, nread)) | |
| 428 | 3840 break; |
| 771 | 3841 nread = Lstream_read (stream, buf, sizeof (buf)); |
| 3842 if (nread == 0) | |
| 3843 break; | |
| 428 | 3844 } |
| 771 | 3845 |
| 3846 coding_system = detected_coding_system (st); | |
| 428 | 3847 } |
| 3848 | |
| 3849 Lstream_rewind (stream); | |
| 771 | 3850 |
| 3851 unbind_to (depth); | |
| 3852 return coding_system; | |
| 3853 } | |
| 3854 | |
| 3855 static void | |
| 3856 undecided_init_coding_stream (struct coding_stream *str) | |
| 3857 { | |
| 3858 struct undecided_coding_stream *data = | |
| 3859 CODING_STREAM_TYPE_DATA (str, undecided); | |
| 3860 struct undecided_coding_system *csdata = | |
| 3861 XCODING_SYSTEM_TYPE_DATA (str->codesys, undecided); | |
| 3862 | |
| 3863 data->actual = Qnil; | |
| 3864 | |
| 3865 if (str->direction == CODING_DECODE) | |
| 3866 { | |
| 3867 Lstream *lst = str->other_end; | |
| 3868 | |
| 3869 if ((lst->flags & LSTREAM_FL_READ) && | |
| 3870 Lstream_seekable_p (lst) && | |
| 3871 csdata->do_coding) | |
| 3872 /* We can determine the coding system now. */ | |
| 3873 data->actual = determine_real_coding_system (lst); | |
| 3874 } | |
| 1494 | 3875 |
| 3876 #ifdef DEBUG_XEMACS | |
| 3877 if (!NILP (Vdebug_coding_detection)) | |
| 3878 stderr_out_lisp ("detected coding system: %s\n", 1, data->actual); | |
| 3879 #endif /* DEBUG_XEMACS */ | |
| 771 | 3880 } |
| 3881 | |
| 3882 static void | |
| 3883 undecided_rewind_coding_stream (struct coding_stream *str) | |
| 3884 { | |
| 3885 chain_rewind_coding_stream_1 (&CODING_STREAM_TYPE_DATA (str, undecided)->c); | |
| 3886 } | |
| 3887 | |
| 3888 static void | |
| 3889 undecided_finalize_coding_stream (struct coding_stream *str) | |
| 3890 { | |
| 3891 struct undecided_coding_stream *data = | |
| 3892 CODING_STREAM_TYPE_DATA (str, undecided); | |
| 3893 | |
| 3894 chain_finalize_coding_stream_1 | |
| 3895 (&CODING_STREAM_TYPE_DATA (str, undecided)->c); | |
| 3896 if (data->st) | |
| 3897 free_detection_state (data->st); | |
| 3898 } | |
| 3899 | |
| 3900 static Lisp_Object | |
| 3901 undecided_canonicalize (Lisp_Object codesys) | |
| 3902 { | |
| 3903 struct undecided_coding_system *csdata = | |
| 3904 XCODING_SYSTEM_TYPE_DATA (codesys, undecided); | |
| 3905 if (!csdata->do_eol && !csdata->do_coding) | |
| 3906 return NILP (csdata->cs) ? Fget_coding_system (Qbinary) : csdata->cs; | |
| 3907 if (csdata->do_eol && !csdata->do_coding && NILP (csdata->cs)) | |
| 3908 return Fget_coding_system (Qconvert_eol_autodetect); | |
| 3909 return codesys; | |
| 3910 } | |
| 3911 | |
| 3912 static Bytecount | |
| 3913 undecided_convert (struct coding_stream *str, const UExtbyte *src, | |
| 3914 unsigned_char_dynarr *dst, Bytecount n) | |
| 3915 { | |
| 3916 int first_time = 0; | |
| 3917 | |
| 3918 if (str->direction == CODING_DECODE) | |
| 3919 { | |
| 3920 /* At this point, we have only the following possibilities: | |
| 3921 | |
| 3922 do_eol && do_coding | |
| 3923 do_coding only | |
| 3924 do_eol only and a coding system was specified | |
| 3925 | |
| 3926 Other possibilities are removed during undecided_canonicalize. | |
| 3927 | |
| 3928 Therefore, our substreams are either | |
| 3929 | |
| 3930 lstream_coding -> lstream_dynarr, or | |
| 3931 lstream_coding -> lstream_eol -> lstream_dynarr. | |
| 3932 */ | |
| 3933 struct undecided_coding_system *csdata = | |
| 3934 XCODING_SYSTEM_TYPE_DATA (str->codesys, undecided); | |
| 3935 struct undecided_coding_stream *data = | |
| 3936 CODING_STREAM_TYPE_DATA (str, undecided); | |
| 3937 | |
| 3938 if (str->eof) | |
| 3939 { | |
| 3940 /* Each will close the next. We need to close now because more | |
| 3941 data may be generated. */ | |
| 3942 if (data->c.initted) | |
| 3943 Lstream_close (XLSTREAM (data->c.lstreams[0])); | |
| 3944 return n; | |
| 3945 } | |
| 3946 | |
| 3947 if (!data->c.initted) | |
| 3948 { | |
| 3949 data->c.lstream_count = csdata->do_eol ? 3 : 2; | |
| 3950 data->c.lstreams = xnew_array (Lisp_Object, data->c.lstream_count); | |
| 3951 | |
| 3952 data->c.lstreams[data->c.lstream_count - 1] = | |
| 3953 make_dynarr_output_stream (dst); | |
| 3954 Lstream_set_buffering | |
| 3955 (XLSTREAM (data->c.lstreams[data->c.lstream_count - 1]), | |
| 3956 LSTREAM_UNBUFFERED, 0); | |
| 3957 if (csdata->do_eol) | |
| 3958 { | |
| 3959 data->c.lstreams[1] = | |
| 3960 make_coding_output_stream | |
| 3961 (XLSTREAM (data->c.lstreams[data->c.lstream_count - 1]), | |
| 3962 Fget_coding_system (Qconvert_eol_autodetect), | |
| 800 | 3963 CODING_DECODE, 0); |
| 771 | 3964 Lstream_set_buffering |
| 3965 (XLSTREAM (data->c.lstreams[1]), | |
| 3966 LSTREAM_UNBUFFERED, 0); | |
| 3967 } | |
| 3968 | |
| 3969 data->c.lstreams[0] = | |
| 3970 make_coding_output_stream | |
| 3971 (XLSTREAM (data->c.lstreams[1]), | |
| 3972 /* Substitute binary if we need to detect the encoding */ | |
| 3973 csdata->do_coding ? Qbinary : csdata->cs, | |
| 800 | 3974 CODING_DECODE, 0); |
| 771 | 3975 Lstream_set_buffering (XLSTREAM (data->c.lstreams[0]), |
| 3976 LSTREAM_UNBUFFERED, 0); | |
| 3977 | |
| 3978 first_time = 1; | |
| 3979 data->c.initted = 1; | |
| 3980 } | |
| 3981 | |
| 3982 /* If necessary, do encoding-detection now. We do this when we're a | |
| 3983 writing stream or a non-seekable reading stream, meaning that we | |
| 3984 can't just process the whole input, rewind, and start over. */ | |
| 3985 | |
| 3986 if (csdata->do_coding) | |
| 3987 { | |
| 3988 int actual_was_nil = NILP (data->actual); | |
| 3989 if (NILP (data->actual)) | |
| 3990 { | |
| 3991 if (!data->st) | |
| 3992 data->st = allocate_detection_state (); | |
| 3993 if (first_time) | |
| 3994 /* #### This is cheesy. What we really ought to do is buffer | |
| 3995 up a certain minimum amount of data to get a better result. | |
| 3996 */ | |
| 3997 data->actual = look_for_coding_system_magic_cookie (src, n); | |
| 3998 if (NILP (data->actual)) | |
| 3999 { | |
| 4000 /* #### This is cheesy. What we really ought to do is buffer | |
| 4001 up a certain minimum amount of data so as to get a less | |
| 4002 random result when doing subprocess detection. */ | |
| 4003 detect_coding_type (data->st, src, n); | |
| 4004 data->actual = detected_coding_system (data->st); | |
| 4100 | 4005 /* kludge to prevent infinite recursion */ |
| 4006 if (XCODING_SYSTEM(data->actual)->methods->enumtype == undecided_coding_system) | |
| 4007 data->actual = Fget_coding_system (Qbinary); | |
| 771 | 4008 } |
| 4009 } | |
| 4010 /* We need to set the detected coding system if we actually have | |
| 4011 such a coding system but didn't before. That is the case | |
| 4012 either when we just detected it in the previous code or when | |
| 4013 it was detected during undecided_init_coding_stream(). We | |
| 4014 can check for that using first_time. */ | |
| 4015 if (!NILP (data->actual) && (actual_was_nil || first_time)) | |
| 4016 { | |
| 4017 /* If the detected coding system doesn't allow for EOL | |
| 4018 autodetection, try to get the equivalent that does; | |
| 4019 otherwise, disable EOL detection (overriding whatever | |
| 4020 may already have been detected). */ | |
| 4021 if (XCODING_SYSTEM_EOL_TYPE (data->actual) != EOL_AUTODETECT) | |
| 4022 { | |
| 4023 if (!NILP (XCODING_SYSTEM_SUBSIDIARY_PARENT (data->actual))) | |
| 4024 data->actual = | |
| 4025 XCODING_SYSTEM_SUBSIDIARY_PARENT (data->actual); | |
| 4026 else if (data->c.lstream_count == 3) | |
| 4027 set_coding_stream_coding_system | |
| 4028 (XLSTREAM (data->c.lstreams[1]), | |
| 4029 Fget_coding_system (Qidentity)); | |
| 4030 } | |
| 4031 set_coding_stream_coding_system | |
| 4032 (XLSTREAM (data->c.lstreams[0]), data->actual); | |
| 4033 } | |
| 4034 } | |
| 4035 | |
| 4036 if (Lstream_write (XLSTREAM (data->c.lstreams[0]), src, n) < 0) | |
| 4037 return -1; | |
| 4038 return n; | |
| 4039 } | |
| 4040 else | |
| 4041 return no_conversion_convert (str, src, dst, n); | |
| 4042 } | |
| 4043 | |
| 4044 static Lisp_Object | |
| 4045 undecided_canonicalize_after_coding (struct coding_stream *str) | |
| 4046 { | |
| 4047 struct undecided_coding_stream *data = | |
| 4048 CODING_STREAM_TYPE_DATA (str, undecided); | |
| 4049 Lisp_Object ret, eolret; | |
| 4050 | |
| 4051 if (str->direction == CODING_ENCODE) | |
| 4052 return str->codesys; | |
| 4053 | |
| 4054 if (!data->c.initted) | |
|
4647
e4ed58cb0e5b
Fix bugs with #'find-file, 0-length files, & coding-system-for-read specified.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
4055 return str->codesys; |
| 771 | 4056 |
| 4057 ret = coding_stream_canonicalize_after_coding | |
| 4058 (XLSTREAM (data->c.lstreams[0])); | |
| 4059 if (NILP (ret)) | |
|
4647
e4ed58cb0e5b
Fix bugs with #'find-file, 0-length files, & coding-system-for-read specified.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
4060 ret = str->codesys; |
| 771 | 4061 if (XCODING_SYSTEM_EOL_TYPE (ret) != EOL_AUTODETECT) |
| 4062 return ret; | |
| 4063 eolret = coding_stream_canonicalize_after_coding | |
| 4064 (XLSTREAM (data->c.lstreams[1])); | |
| 4065 if (!EQ (XCODING_SYSTEM_TYPE (eolret), Qconvert_eol)) | |
| 4066 return ret; | |
| 4067 return | |
| 4068 Fsubsidiary_coding_system (ret, Fcoding_system_property (eolret, | |
| 4069 Qsubtype)); | |
| 4070 } | |
| 4071 | |
| 4072 | |
| 4073 /************************************************************************/ | |
| 4074 /* Lisp interface: Coding category functions and detection */ | |
| 4075 /************************************************************************/ | |
| 4076 | |
| 4077 DEFUN ("coding-category-list", Fcoding_category_list, 0, 0, 0, /* | |
| 4078 Return a list of all recognized coding categories. | |
| 4079 */ | |
| 4080 ()) | |
| 4081 { | |
| 4082 int i; | |
| 4083 Lisp_Object list = Qnil; | |
| 4084 | |
| 4085 for (i = 0; i < coding_detector_count; i++) | |
| 4086 { | |
| 4087 detector_category_dynarr *cats = | |
| 4088 Dynarr_at (all_coding_detectors, i).cats; | |
| 4089 int j; | |
| 4090 | |
| 4091 for (j = 0; j < Dynarr_length (cats); j++) | |
| 4092 list = Fcons (Dynarr_at (cats, j).sym, list); | |
| 4093 } | |
| 4094 | |
| 4095 return Fnreverse (list); | |
| 4096 } | |
| 4097 | |
| 4098 DEFUN ("set-coding-priority-list", Fset_coding_priority_list, 1, 1, 0, /* | |
| 4099 Change the priority order of the coding categories. | |
| 4100 LIST should be list of coding categories, in descending order of | |
| 4101 priority. Unspecified coding categories will be lower in priority | |
| 4102 than all specified ones, in the same relative order they were in | |
| 4103 previously. | |
| 4104 */ | |
| 4105 (list)) | |
| 4106 { | |
| 4107 int *category_to_priority = | |
| 4108 alloca_array (int, coding_detector_category_count); | |
| 4109 int i, j; | |
| 4110 | |
| 4111 /* First generate a list that maps coding categories to priorities. */ | |
| 4112 | |
| 4113 for (i = 0; i < coding_detector_category_count; i++) | |
| 4114 category_to_priority[i] = -1; | |
| 4115 | |
| 4116 /* Highest priority comes from the specified list. */ | |
| 4117 i = 0; | |
| 2367 | 4118 { |
| 4119 EXTERNAL_LIST_LOOP_2 (elt, list) | |
| 4120 { | |
| 4121 int cat = coding_category_symbol_to_id (elt); | |
| 4122 | |
| 4123 if (category_to_priority[cat] >= 0) | |
| 4124 sferror ("Duplicate coding category in list", elt); | |
| 4125 category_to_priority[cat] = i++; | |
| 4126 } | |
| 4127 } | |
| 771 | 4128 |
| 4129 /* Now go through the existing categories by priority to retrieve | |
| 4130 the categories not yet specified and preserve their priority | |
| 4131 order. */ | |
| 4132 for (j = 0; j < coding_detector_category_count; j++) | |
| 4133 { | |
| 4134 int cat = coding_category_by_priority[j]; | |
| 4135 if (category_to_priority[cat] < 0) | |
| 4136 category_to_priority[cat] = i++; | |
| 4137 } | |
| 4138 | |
| 4139 /* Now we need to construct the inverse of the mapping we just | |
| 4140 constructed. */ | |
| 4141 | |
| 4142 for (i = 0; i < coding_detector_category_count; i++) | |
| 4143 coding_category_by_priority[category_to_priority[i]] = i; | |
| 4144 | |
| 4145 /* Phew! That was confusing. */ | |
| 4146 return Qnil; | |
| 4147 } | |
| 4148 | |
| 4149 DEFUN ("coding-priority-list", Fcoding_priority_list, 0, 0, 0, /* | |
| 4150 Return a list of coding categories in descending order of priority. | |
| 4151 */ | |
| 4152 ()) | |
| 4153 { | |
| 4154 int i; | |
| 4155 Lisp_Object list = Qnil; | |
| 4156 | |
| 4157 for (i = 0; i < coding_detector_category_count; i++) | |
| 4158 list = | |
| 4159 Fcons (coding_category_id_to_symbol (coding_category_by_priority[i]), | |
| 4160 list); | |
| 4161 return Fnreverse (list); | |
| 4162 } | |
| 4163 | |
| 4164 DEFUN ("set-coding-category-system", Fset_coding_category_system, 2, 2, 0, /* | |
| 4165 Change the coding system associated with a coding category. | |
| 4166 */ | |
| 4167 (coding_category, coding_system)) | |
| 4168 { | |
| 4169 coding_category_system[coding_category_symbol_to_id (coding_category)] = | |
| 4170 Fget_coding_system (coding_system); | |
| 4171 return Qnil; | |
| 4172 } | |
| 4173 | |
| 4174 DEFUN ("coding-category-system", Fcoding_category_system, 1, 1, 0, /* | |
| 4175 Return the coding system associated with a coding category. | |
| 4176 */ | |
| 4177 (coding_category)) | |
| 4178 { | |
| 4179 Lisp_Object sys = | |
| 4180 coding_category_system[coding_category_symbol_to_id (coding_category)]; | |
| 4181 | |
| 4182 if (!NILP (sys)) | |
| 4183 return XCODING_SYSTEM_NAME (sys); | |
| 4184 return Qnil; | |
| 4185 } | |
| 4186 | |
| 800 | 4187 /* Detect the encoding of STREAM. Assumes stream is at the begnning and will |
| 4188 read through to the end of STREAM, leaving it there but open. */ | |
| 4189 | |
| 771 | 4190 Lisp_Object |
| 4191 detect_coding_stream (Lisp_Object stream) | |
| 4192 { | |
| 4193 Lisp_Object val = Qnil; | |
| 4194 struct gcpro gcpro1, gcpro2, gcpro3; | |
| 4195 UExtbyte random_buffer[65536]; | |
| 4196 Lisp_Object binary_instream = | |
| 4197 make_coding_input_stream | |
| 4198 (XLSTREAM (stream), Qbinary, | |
| 814 | 4199 CODING_ENCODE, LSTREAM_FL_NO_CLOSE_OTHER); |
| 771 | 4200 Lisp_Object decstream = |
| 4201 make_coding_input_stream | |
| 4202 (XLSTREAM (binary_instream), | |
| 800 | 4203 Qundecided, CODING_DECODE, 0); |
| 771 | 4204 Lstream *decstr = XLSTREAM (decstream); |
| 4205 | |
| 4206 GCPRO3 (decstream, stream, binary_instream); | |
| 4207 /* Read and discard all data; detection happens as a side effect of this, | |
| 4208 and we examine what was detected afterwards. */ | |
| 4209 while (Lstream_read (decstr, random_buffer, sizeof (random_buffer)) > 0) | |
| 4210 ; | |
| 4211 | |
| 4212 val = coding_stream_detected_coding_system (decstr); | |
| 4213 Lstream_close (decstr); | |
| 4214 Lstream_delete (decstr); | |
| 4215 Lstream_delete (XLSTREAM (binary_instream)); | |
| 4216 UNGCPRO; | |
| 4217 return val; | |
| 428 | 4218 } |
| 4219 | |
| 4220 DEFUN ("detect-coding-region", Fdetect_coding_region, 2, 3, 0, /* | |
| 4221 Detect coding system of the text in the region between START and END. | |
| 444 | 4222 Return a list of possible coding systems ordered by priority. |
| 3025 | 4223 If only ASCII characters are found, return `undecided' or one of |
| 428 | 4224 its subsidiary coding systems according to a detected end-of-line |
| 4225 type. Optional arg BUFFER defaults to the current buffer. | |
| 4226 */ | |
| 4227 (start, end, buffer)) | |
| 4228 { | |
| 4229 Lisp_Object val = Qnil; | |
| 4230 struct buffer *buf = decode_buffer (buffer, 0); | |
| 665 | 4231 Charbpos b, e; |
| 771 | 4232 Lisp_Object lb_instream; |
| 428 | 4233 |
| 4234 get_buffer_range_char (buf, start, end, &b, &e, 0); | |
| 4235 lb_instream = make_lisp_buffer_input_stream (buf, b, e, 0); | |
| 771 | 4236 |
| 4237 val = detect_coding_stream (lb_instream); | |
| 4238 Lstream_delete (XLSTREAM (lb_instream)); | |
| 428 | 4239 return val; |
| 4240 } | |
| 4241 | |
| 4242 | |
| 771 | 4243 |
| 4244 #ifdef DEBUG_XEMACS | |
| 4245 | |
| 428 | 4246 /************************************************************************/ |
| 771 | 4247 /* Internal methods */ |
| 4248 /************************************************************************/ | |
| 4249 | |
| 4250 /* Raw (internally-formatted) data. */ | |
| 4251 DEFINE_CODING_SYSTEM_TYPE (internal); | |
| 428 | 4252 |
| 665 | 4253 static Bytecount |
| 2286 | 4254 internal_convert (struct coding_stream *UNUSED (str), const UExtbyte *src, |
| 771 | 4255 unsigned_char_dynarr *dst, Bytecount n) |
| 4256 { | |
| 4257 Bytecount orign = n; | |
| 4258 Dynarr_add_many (dst, src, n); | |
| 4259 return orign; | |
| 4260 } | |
| 4261 | |
| 4262 #endif /* DEBUG_XEMACS */ | |
| 4263 | |
| 4264 | |
| 4265 | |
| 4266 #ifdef HAVE_ZLIB | |
| 4267 | |
| 4268 /************************************************************************/ | |
| 4269 /* Gzip methods */ | |
| 4270 /************************************************************************/ | |
| 4271 | |
| 4272 struct gzip_coding_system | |
| 428 | 4273 { |
| 771 | 4274 int level; /* 0 through 9, or -1 for default */ |
| 4275 }; | |
| 4276 | |
| 4277 #define CODING_SYSTEM_GZIP_LEVEL(codesys) \ | |
| 4278 (CODING_SYSTEM_TYPE_DATA (codesys, gzip)->level) | |
| 4279 #define XCODING_SYSTEM_GZIP_LEVEL(codesys) \ | |
| 4280 (XCODING_SYSTEM_TYPE_DATA (codesys, gzip)->level) | |
| 4281 | |
| 4282 struct gzip_coding_stream | |
| 428 | 4283 { |
| 771 | 4284 z_stream stream; |
| 4285 int stream_initted; | |
| 4286 int reached_eof; /* #### this should be handled by the caller, once we | |
| 4287 return LSTREAM_EOF */ | |
| 4288 }; | |
| 4289 | |
| 1204 | 4290 static const struct memory_description |
| 771 | 4291 gzip_coding_system_description[] = { |
| 4292 { XD_END } | |
| 4293 }; | |
| 4294 | |
| 1204 | 4295 DEFINE_CODING_SYSTEM_TYPE_WITH_DATA (gzip); |
| 4296 | |
| 771 | 4297 enum source_sink_type |
| 4298 gzip_conversion_end_type (Lisp_Object codesys) | |
| 4299 { | |
| 4300 return DECODES_BYTE_TO_BYTE; | |
| 428 | 4301 } |
| 4302 | |
| 4303 static void | |
| 771 | 4304 gzip_init (Lisp_Object codesys) |
| 4305 { | |
| 4306 struct gzip_coding_system *data = XCODING_SYSTEM_TYPE_DATA (codesys, gzip); | |
| 4307 data->level = -1; | |
| 4308 } | |
| 4309 | |
| 4310 static void | |
| 4311 gzip_print (Lisp_Object cs, Lisp_Object printcharfun, int escapeflag) | |
| 428 | 4312 { |
| 771 | 4313 struct gzip_coding_system *data = XCODING_SYSTEM_TYPE_DATA (cs, gzip); |
| 4314 | |
| 826 | 4315 write_c_string (printcharfun, "("); |
| 771 | 4316 if (data->level == -1) |
| 826 | 4317 write_c_string (printcharfun, "default"); |
| 771 | 4318 else |
| 4319 print_internal (make_int (data->level), printcharfun, 0); | |
| 826 | 4320 write_c_string (printcharfun, ")"); |
| 428 | 4321 } |
| 4322 | |
| 4323 static int | |
| 771 | 4324 gzip_putprop (Lisp_Object codesys, Lisp_Object key, Lisp_Object value) |
| 428 | 4325 { |
| 771 | 4326 struct gzip_coding_system *data = XCODING_SYSTEM_TYPE_DATA (codesys, gzip); |
| 4327 | |
| 4328 if (EQ (key, Qlevel)) | |
| 428 | 4329 { |
| 771 | 4330 if (EQ (value, Qdefault)) |
| 4331 data->level = -1; | |
| 4332 else | |
| 428 | 4333 { |
| 771 | 4334 CHECK_INT (value); |
| 4335 check_int_range (XINT (value), 0, 9); | |
| 4336 data->level = XINT (value); | |
| 428 | 4337 } |
| 4338 } | |
| 4339 else | |
| 771 | 4340 return 0; |
| 4341 return 1; | |
| 428 | 4342 } |
| 4343 | |
| 4344 static Lisp_Object | |
| 771 | 4345 gzip_getprop (Lisp_Object coding_system, Lisp_Object prop) |
| 428 | 4346 { |
| 771 | 4347 struct gzip_coding_system *data = |
| 4348 XCODING_SYSTEM_TYPE_DATA (coding_system, gzip); | |
| 4349 | |
| 4350 if (EQ (prop, Qlevel)) | |
| 428 | 4351 { |
| 771 | 4352 if (data->level == -1) |
| 4353 return Qdefault; | |
| 4354 return make_int (data->level); | |
| 428 | 4355 } |
| 771 | 4356 |
| 4357 return Qunbound; | |
| 428 | 4358 } |
| 4359 | |
| 4360 static void | |
| 771 | 4361 gzip_init_coding_stream (struct coding_stream *str) |
| 428 | 4362 { |
| 771 | 4363 struct gzip_coding_stream *data = CODING_STREAM_TYPE_DATA (str, gzip); |
| 4364 if (data->stream_initted) | |
| 428 | 4365 { |
| 771 | 4366 if (str->direction == CODING_DECODE) |
| 4367 inflateEnd (&data->stream); | |
| 4368 else | |
| 4369 deflateEnd (&data->stream); | |
| 4370 data->stream_initted = 0; | |
| 428 | 4371 } |
| 771 | 4372 data->reached_eof = 0; |
| 428 | 4373 } |
| 4374 | |
| 4375 static void | |
| 771 | 4376 gzip_rewind_coding_stream (struct coding_stream *str) |
| 428 | 4377 { |
| 771 | 4378 gzip_init_coding_stream (str); |
| 428 | 4379 } |
| 4380 | |
| 771 | 4381 static Bytecount |
| 4382 gzip_convert (struct coding_stream *str, | |
| 4383 const UExtbyte *src, | |
| 4384 unsigned_char_dynarr *dst, Bytecount n) | |
| 428 | 4385 { |
| 771 | 4386 struct gzip_coding_stream *data = CODING_STREAM_TYPE_DATA (str, gzip); |
| 4387 int zerr; | |
| 4388 if (str->direction == CODING_DECODE) | |
| 428 | 4389 { |
| 771 | 4390 if (data->reached_eof) |
| 4391 return n; /* eat the data */ | |
| 4392 | |
| 4393 if (!data->stream_initted) | |
| 428 | 4394 { |
| 771 | 4395 xzero (data->stream); |
| 4396 if (inflateInit (&data->stream) != Z_OK) | |
| 4397 return LSTREAM_ERROR; | |
| 4398 data->stream_initted = 1; | |
| 428 | 4399 } |
| 771 | 4400 |
| 4401 data->stream.next_in = (Bytef *) src; | |
| 4402 data->stream.avail_in = n; | |
| 4403 | |
| 4404 /* Normally we stop when we've fed all data to the decompressor; but | |
| 4405 if we're at the end of the input, and the decompressor hasn't | |
| 4406 reported EOF, we need to keep going, as there might be more output | |
| 4407 to generate. Z_OK from the decompressor means input was processed | |
| 4408 or output was generated; if neither, we break out of the loop. | |
| 4409 Other return values are: | |
| 4410 | |
| 4411 Z_STREAM_END EOF from decompressor | |
| 4412 Z_DATA_ERROR Corrupted data | |
| 4413 Z_BUF_ERROR No progress possible (this should happen if | |
| 4414 we try to feed it an incomplete file) | |
| 4415 Z_MEM_ERROR Out of memory | |
| 4416 Z_STREAM_ERROR (should never happen) | |
| 4417 Z_NEED_DICT (#### when will this happen?) | |
| 4418 */ | |
| 4419 while (data->stream.avail_in > 0 || str->eof) | |
| 4420 { | |
| 4421 /* Reserve an output buffer of the same size as the input buffer; | |
| 4422 if that's not enough, we keep reserving the same size. */ | |
| 4423 Bytecount reserved = n; | |
| 4424 Dynarr_add_many (dst, 0, reserved); | |
| 4425 /* Careful here! Don't retrieve the pointer until after | |
| 4426 reserving the space, or it might be bogus */ | |
| 4427 data->stream.next_out = | |
| 4428 Dynarr_atp (dst, Dynarr_length (dst) - reserved); | |
| 4429 data->stream.avail_out = reserved; | |
| 4430 zerr = inflate (&data->stream, Z_NO_FLUSH); | |
| 4431 /* Lop off the unused portion */ | |
| 4432 Dynarr_set_size (dst, Dynarr_length (dst) - data->stream.avail_out); | |
| 4433 if (zerr != Z_OK) | |
| 4434 break; | |
| 4435 } | |
| 4436 | |
| 4437 if (zerr == Z_STREAM_END) | |
| 4438 data->reached_eof = 1; | |
| 4439 | |
| 4440 if ((Bytecount) data->stream.avail_in < n) | |
| 4441 return n - data->stream.avail_in; | |
| 4442 | |
| 4443 if (zerr == Z_OK || zerr == Z_STREAM_END) | |
| 4444 return 0; | |
| 4445 | |
| 4446 return LSTREAM_ERROR; | |
| 428 | 4447 } |
| 4448 else | |
| 4449 { | |
| 771 | 4450 if (!data->stream_initted) |
| 4451 { | |
| 4452 int level = XCODING_SYSTEM_GZIP_LEVEL (str->codesys); | |
| 4453 xzero (data->stream); | |
| 4454 if (deflateInit (&data->stream, | |
| 4455 level == -1 ? Z_DEFAULT_COMPRESSION : level) != | |
| 4456 Z_OK) | |
| 4457 return LSTREAM_ERROR; | |
| 4458 data->stream_initted = 1; | |
| 428 | 4459 } |
| 771 | 4460 |
| 4461 data->stream.next_in = (Bytef *) src; | |
| 4462 data->stream.avail_in = n; | |
| 4463 | |
| 4464 /* Normally we stop when we've fed all data to the compressor; but if | |
| 4465 we're at the end of the input, and the compressor hasn't reported | |
| 4466 EOF, we need to keep going, as there might be more output to | |
| 4467 generate. (To signal EOF on our end, we set the FLUSH parameter | |
| 4468 to Z_FINISH; when all data is output, Z_STREAM_END will be | |
| 4469 returned.) Z_OK from the compressor means input was processed or | |
| 4470 output was generated; if neither, we break out of the loop. Other | |
| 4471 return values are: | |
| 4472 | |
| 4473 Z_STREAM_END EOF from compressor | |
| 4474 Z_BUF_ERROR No progress possible (should never happen) | |
| 4475 Z_STREAM_ERROR (should never happen) | |
| 4476 */ | |
| 4477 while (data->stream.avail_in > 0 || str->eof) | |
| 4478 { | |
| 4479 /* Reserve an output buffer of the same size as the input buffer; | |
| 4480 if that's not enough, we keep reserving the same size. */ | |
| 4481 Bytecount reserved = n; | |
| 4482 Dynarr_add_many (dst, 0, reserved); | |
| 4483 /* Careful here! Don't retrieve the pointer until after | |
| 4484 reserving the space, or it might be bogus */ | |
| 4485 data->stream.next_out = | |
| 4486 Dynarr_atp (dst, Dynarr_length (dst) - reserved); | |
| 4487 data->stream.avail_out = reserved; | |
| 4488 zerr = | |
| 4489 deflate (&data->stream, | |
| 4490 str->eof ? Z_FINISH : Z_NO_FLUSH); | |
| 4491 /* Lop off the unused portion */ | |
| 4492 Dynarr_set_size (dst, Dynarr_length (dst) - data->stream.avail_out); | |
| 4493 if (zerr != Z_OK) | |
| 4494 break; | |
| 4495 } | |
| 4496 | |
| 4497 if ((Bytecount) data->stream.avail_in < n) | |
| 4498 return n - data->stream.avail_in; | |
| 4499 | |
| 4500 if (zerr == Z_OK || zerr == Z_STREAM_END) | |
| 4501 return 0; | |
| 4502 | |
| 4503 return LSTREAM_ERROR; | |
| 428 | 4504 } |
| 4505 } | |
| 4506 | |
| 771 | 4507 #endif /* HAVE_ZLIB */ |
| 428 | 4508 |
| 4509 | |
| 4510 /************************************************************************/ | |
| 4511 /* Initialization */ | |
| 4512 /************************************************************************/ | |
| 4513 | |
| 4514 void | |
| 4515 syms_of_file_coding (void) | |
| 4516 { | |
| 442 | 4517 INIT_LRECORD_IMPLEMENTATION (coding_system); |
| 4518 | |
| 771 | 4519 DEFSUBR (Fvalid_coding_system_type_p); |
| 4520 DEFSUBR (Fcoding_system_type_list); | |
| 428 | 4521 DEFSUBR (Fcoding_system_p); |
| 4303 | 4522 DEFSUBR (Fautoload_coding_system); |
| 428 | 4523 DEFSUBR (Ffind_coding_system); |
| 4524 DEFSUBR (Fget_coding_system); | |
| 4525 DEFSUBR (Fcoding_system_list); | |
| 4526 DEFSUBR (Fcoding_system_name); | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4527 DEFSUBR (Fmake_coding_system_internal); |
| 428 | 4528 DEFSUBR (Fcopy_coding_system); |
| 440 | 4529 DEFSUBR (Fcoding_system_canonical_name_p); |
| 4530 DEFSUBR (Fcoding_system_alias_p); | |
| 4531 DEFSUBR (Fcoding_system_aliasee); | |
| 428 | 4532 DEFSUBR (Fdefine_coding_system_alias); |
| 4533 DEFSUBR (Fsubsidiary_coding_system); | |
| 771 | 4534 DEFSUBR (Fcoding_system_base); |
| 4535 DEFSUBR (Fcoding_system_used_for_io); | |
| 428 | 4536 |
| 4537 DEFSUBR (Fcoding_system_type); | |
| 771 | 4538 DEFSUBR (Fcoding_system_description); |
| 428 | 4539 DEFSUBR (Fcoding_system_property); |
| 4540 | |
| 4541 DEFSUBR (Fcoding_category_list); | |
| 4542 DEFSUBR (Fset_coding_priority_list); | |
| 4543 DEFSUBR (Fcoding_priority_list); | |
| 4544 DEFSUBR (Fset_coding_category_system); | |
| 4545 DEFSUBR (Fcoding_category_system); | |
| 4546 | |
| 4547 DEFSUBR (Fdetect_coding_region); | |
| 4548 DEFSUBR (Fdecode_coding_region); | |
| 4549 DEFSUBR (Fencode_coding_region); | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4550 DEFSUBR (Fquery_coding_region); |
| 563 | 4551 DEFSYMBOL_MULTIWORD_PREDICATE (Qcoding_systemp); |
| 4552 DEFSYMBOL (Qno_conversion); | |
| 771 | 4553 DEFSYMBOL (Qconvert_eol); |
| 4554 DEFSYMBOL (Qconvert_eol_autodetect); | |
| 4555 DEFSYMBOL (Qconvert_eol_lf); | |
| 4556 DEFSYMBOL (Qconvert_eol_cr); | |
| 4557 DEFSYMBOL (Qconvert_eol_crlf); | |
| 563 | 4558 DEFSYMBOL (Qraw_text); |
| 771 | 4559 |
| 563 | 4560 DEFSYMBOL (Qmnemonic); |
| 4561 DEFSYMBOL (Qeol_type); | |
| 4562 DEFSYMBOL (Qpost_read_conversion); | |
| 4563 DEFSYMBOL (Qpre_write_conversion); | |
| 4564 | |
| 771 | 4565 DEFSYMBOL (Qtranslation_table_for_decode); |
| 4566 DEFSYMBOL (Qtranslation_table_for_encode); | |
| 4567 DEFSYMBOL (Qsafe_chars); | |
| 4568 DEFSYMBOL (Qsafe_charsets); | |
| 4569 DEFSYMBOL (Qmime_charset); | |
| 4570 DEFSYMBOL (Qvalid_codes); | |
| 4571 | |
| 563 | 4572 DEFSYMBOL (Qcr); |
| 4573 DEFSYMBOL (Qlf); | |
| 4574 DEFSYMBOL (Qcrlf); | |
| 4575 DEFSYMBOL (Qeol_cr); | |
| 4576 DEFSYMBOL (Qeol_lf); | |
| 4577 DEFSYMBOL (Qeol_crlf); | |
| 4578 DEFSYMBOL (Qencode); | |
| 4579 DEFSYMBOL (Qdecode); | |
| 428 | 4580 |
| 771 | 4581 DEFSYMBOL (Qnear_certainty); |
| 4582 DEFSYMBOL (Qquite_probable); | |
| 4583 DEFSYMBOL (Qsomewhat_likely); | |
| 1494 | 4584 DEFSYMBOL (Qslightly_likely); |
| 771 | 4585 DEFSYMBOL (Qas_likely_as_unlikely); |
| 4586 DEFSYMBOL (Qsomewhat_unlikely); | |
| 4587 DEFSYMBOL (Qquite_improbable); | |
| 4588 DEFSYMBOL (Qnearly_impossible); | |
| 4589 | |
| 4590 DEFSYMBOL (Qdo_eol); | |
| 4591 DEFSYMBOL (Qdo_coding); | |
| 4592 | |
| 4593 DEFSYMBOL (Qcanonicalize_after_coding); | |
| 4594 | |
| 4303 | 4595 DEFSYMBOL (Qposix_charset_to_coding_system_hash); |
| 4596 | |
| 771 | 4597 DEFSYMBOL (Qescape_quoted); |
| 4598 | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4599 DEFSYMBOL (Qquery_coding_warning_face); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4600 DEFSYMBOL (Qaliases); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4601 DEFSYMBOL (Qcharset_skip_chars_string); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4602 |
| 771 | 4603 #ifdef HAVE_ZLIB |
| 4604 DEFSYMBOL (Qgzip); | |
| 4605 #endif | |
| 4606 | |
| 428 | 4607 } |
| 4608 | |
| 4609 void | |
| 4610 lstream_type_create_file_coding (void) | |
| 4611 { | |
| 771 | 4612 LSTREAM_HAS_METHOD (coding, reader); |
| 4613 LSTREAM_HAS_METHOD (coding, writer); | |
| 4614 LSTREAM_HAS_METHOD (coding, rewinder); | |
| 4615 LSTREAM_HAS_METHOD (coding, seekable_p); | |
| 4616 LSTREAM_HAS_METHOD (coding, marker); | |
| 4617 LSTREAM_HAS_METHOD (coding, flusher); | |
| 4618 LSTREAM_HAS_METHOD (coding, closer); | |
| 4619 LSTREAM_HAS_METHOD (coding, finalizer); | |
| 4620 } | |
| 4621 | |
| 4622 void | |
| 4623 coding_system_type_create (void) | |
| 4624 { | |
| 4625 int i; | |
| 4626 | |
| 4627 staticpro (&Vcoding_system_hash_table); | |
| 4628 Vcoding_system_hash_table = | |
| 4629 make_lisp_hash_table (50, HASH_TABLE_NON_WEAK, HASH_TABLE_EQ); | |
| 4630 | |
| 4631 the_coding_system_type_entry_dynarr = Dynarr_new (coding_system_type_entry); | |
| 2367 | 4632 dump_add_root_block_ptr (&the_coding_system_type_entry_dynarr, |
| 771 | 4633 &csted_description); |
| 4634 | |
| 4635 Vcoding_system_type_list = Qnil; | |
| 4636 staticpro (&Vcoding_system_type_list); | |
| 4637 | |
| 4638 /* Initialize to something reasonable ... */ | |
| 4639 for (i = 0; i < MAX_DETECTOR_CATEGORIES; i++) | |
| 4640 { | |
| 4641 coding_category_system[i] = Qnil; | |
| 1204 | 4642 dump_add_root_lisp_object (&coding_category_system[i]); |
| 771 | 4643 coding_category_by_priority[i] = i; |
| 4644 } | |
| 4645 | |
| 4646 dump_add_opaque (coding_category_by_priority, | |
| 4647 sizeof (coding_category_by_priority)); | |
| 4648 | |
| 4649 all_coding_detectors = Dynarr_new2 (detector_dynarr, struct detector); | |
| 2367 | 4650 dump_add_root_block_ptr (&all_coding_detectors, |
| 771 | 4651 &detector_dynarr_description); |
| 4652 | |
| 4653 dump_add_opaque_int (&coding_system_tick); | |
| 4654 dump_add_opaque_int (&coding_detector_count); | |
| 4655 dump_add_opaque_int (&coding_detector_category_count); | |
| 4656 | |
| 4657 INITIALIZE_CODING_SYSTEM_TYPE (no_conversion, | |
| 4658 "no-conversion-coding-system-p"); | |
| 4659 CODING_SYSTEM_HAS_METHOD (no_conversion, convert); | |
| 4660 | |
| 4661 INITIALIZE_DETECTOR (no_conversion); | |
| 4662 DETECTOR_HAS_METHOD (no_conversion, detect); | |
| 4663 INITIALIZE_DETECTOR_CATEGORY (no_conversion, no_conversion); | |
| 4664 | |
| 4665 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA (convert_eol, | |
| 4666 "convert-eol-coding-system-p"); | |
| 4667 CODING_SYSTEM_HAS_METHOD (convert_eol, print); | |
| 4668 CODING_SYSTEM_HAS_METHOD (convert_eol, convert); | |
| 4669 CODING_SYSTEM_HAS_METHOD (convert_eol, getprop); | |
| 4670 CODING_SYSTEM_HAS_METHOD (convert_eol, putprop); | |
| 4671 CODING_SYSTEM_HAS_METHOD (convert_eol, conversion_end_type); | |
| 4672 CODING_SYSTEM_HAS_METHOD (convert_eol, canonicalize_after_coding); | |
| 4673 CODING_SYSTEM_HAS_METHOD (convert_eol, init_coding_stream); | |
| 4674 | |
| 4675 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA (undecided, | |
| 4676 "undecided-coding-system-p"); | |
| 4677 CODING_SYSTEM_HAS_METHOD (undecided, init); | |
| 4678 CODING_SYSTEM_HAS_METHOD (undecided, mark); | |
| 4679 CODING_SYSTEM_HAS_METHOD (undecided, print); | |
| 4680 CODING_SYSTEM_HAS_METHOD (undecided, convert); | |
| 4681 CODING_SYSTEM_HAS_METHOD (undecided, putprop); | |
| 4682 CODING_SYSTEM_HAS_METHOD (undecided, getprop); | |
| 4683 CODING_SYSTEM_HAS_METHOD (undecided, init_coding_stream); | |
| 4684 CODING_SYSTEM_HAS_METHOD (undecided, rewind_coding_stream); | |
| 4685 CODING_SYSTEM_HAS_METHOD (undecided, finalize_coding_stream); | |
| 4686 CODING_SYSTEM_HAS_METHOD (undecided, mark_coding_stream); | |
| 4687 CODING_SYSTEM_HAS_METHOD (undecided, canonicalize); | |
| 4688 CODING_SYSTEM_HAS_METHOD (undecided, canonicalize_after_coding); | |
| 4689 | |
| 4690 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA (chain, "chain-coding-system-p"); | |
| 4691 | |
| 4692 CODING_SYSTEM_HAS_METHOD (chain, print); | |
| 4693 CODING_SYSTEM_HAS_METHOD (chain, canonicalize); | |
| 4694 CODING_SYSTEM_HAS_METHOD (chain, init); | |
| 4695 CODING_SYSTEM_HAS_METHOD (chain, mark); | |
| 4696 CODING_SYSTEM_HAS_METHOD (chain, mark_coding_stream); | |
| 4697 CODING_SYSTEM_HAS_METHOD (chain, convert); | |
| 4698 CODING_SYSTEM_HAS_METHOD (chain, rewind_coding_stream); | |
| 4699 CODING_SYSTEM_HAS_METHOD (chain, finalize_coding_stream); | |
| 4700 CODING_SYSTEM_HAS_METHOD (chain, finalize); | |
| 4701 CODING_SYSTEM_HAS_METHOD (chain, putprop); | |
| 4702 CODING_SYSTEM_HAS_METHOD (chain, getprop); | |
| 4703 CODING_SYSTEM_HAS_METHOD (chain, conversion_end_type); | |
| 4704 CODING_SYSTEM_HAS_METHOD (chain, canonicalize_after_coding); | |
| 4705 | |
| 4706 #ifdef DEBUG_XEMACS | |
| 4707 INITIALIZE_CODING_SYSTEM_TYPE (internal, "internal-coding-system-p"); | |
| 4708 CODING_SYSTEM_HAS_METHOD (internal, convert); | |
| 4709 #endif | |
| 4710 | |
| 4711 #ifdef HAVE_ZLIB | |
| 4712 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA (gzip, "gzip-coding-system-p"); | |
| 4713 CODING_SYSTEM_HAS_METHOD (gzip, conversion_end_type); | |
| 4714 CODING_SYSTEM_HAS_METHOD (gzip, convert); | |
| 4715 CODING_SYSTEM_HAS_METHOD (gzip, init); | |
| 4716 CODING_SYSTEM_HAS_METHOD (gzip, print); | |
| 4717 CODING_SYSTEM_HAS_METHOD (gzip, init_coding_stream); | |
| 4718 CODING_SYSTEM_HAS_METHOD (gzip, rewind_coding_stream); | |
| 4719 CODING_SYSTEM_HAS_METHOD (gzip, putprop); | |
| 4720 CODING_SYSTEM_HAS_METHOD (gzip, getprop); | |
| 4721 #endif | |
| 4722 } | |
| 4723 | |
| 4724 void | |
| 4725 reinit_coding_system_type_create (void) | |
| 4726 { | |
| 4727 REINITIALIZE_CODING_SYSTEM_TYPE (no_conversion); | |
| 4728 REINITIALIZE_CODING_SYSTEM_TYPE (convert_eol); | |
| 4729 REINITIALIZE_CODING_SYSTEM_TYPE (undecided); | |
| 4730 REINITIALIZE_CODING_SYSTEM_TYPE (chain); | |
| 4731 #if 0 | |
| 4732 REINITIALIZE_CODING_SYSTEM_TYPE (text_file_wrapper); | |
| 4733 #endif /* 0 */ | |
| 4734 #ifdef DEBUG_XEMACS | |
| 4735 REINITIALIZE_CODING_SYSTEM_TYPE (internal); | |
| 4736 #endif | |
| 4737 #ifdef HAVE_ZLIB | |
| 4738 REINITIALIZE_CODING_SYSTEM_TYPE (gzip); | |
| 4739 #endif | |
| 4740 } | |
| 4741 | |
| 4742 void | |
| 4743 reinit_vars_of_file_coding (void) | |
| 4744 { | |
| 428 | 4745 } |
| 4746 | |
| 4747 void | |
| 4748 vars_of_file_coding (void) | |
| 4749 { | |
| 771 | 4750 /* We always have file-coding support */ |
| 428 | 4751 Fprovide (intern ("file-coding")); |
| 4752 | |
| 1347 | 4753 QScoding_system_cookie = build_string (";;;###coding system: "); |
| 4754 staticpro (&QScoding_system_cookie); | |
| 4755 | |
| 1242 | 4756 #ifdef HAVE_DEFAULT_EOL_DETECTION |
| 2297 | 4757 /* #### Find a more appropriate place for this comment. |
| 4758 WARNING: The existing categories are intimately tied to the function | |
| 1242 | 4759 `coding-system-category' in coding.el. If you change a category, or |
| 4760 change the layout of any coding system associated with a category, you | |
| 4761 need to check that function and make sure it's written properly. */ | |
| 4762 | |
| 4763 Fprovide (intern ("unix-default-eol-detection")); | |
| 4764 #endif | |
| 4765 | |
| 428 | 4766 DEFVAR_LISP ("keyboard-coding-system", &Vkeyboard_coding_system /* |
| 3142 | 4767 Default coding system used for TTY and X11 keyboard input. |
| 4768 Under X11, used only to interpet the character for a key event when that | |
| 4769 event has a KeySym of NoSymbol but does have an associated string keysym, | |
| 4770 something that's seen with input methods. | |
| 4771 | |
| 4772 If you need to set these things to different coding systems, call the | |
| 4773 function `set-console-tty-coding-system' for the TTY and use this variable | |
| 4774 for X11. | |
| 428 | 4775 */ ); |
| 4776 Vkeyboard_coding_system = Qnil; | |
| 4777 | |
| 4778 DEFVAR_LISP ("terminal-coding-system", &Vterminal_coding_system /* | |
| 4779 Coding system used for TTY display output. | |
| 4780 Not used under a windowing system. | |
| 4781 */ ); | |
| 4782 Vterminal_coding_system = Qnil; | |
| 4783 | |
| 4784 DEFVAR_LISP ("coding-system-for-read", &Vcoding_system_for_read /* | |
| 440 | 4785 Overriding coding system used when reading from a file or process. |
| 4786 You should bind this variable with `let', but do not set it globally. | |
| 4787 If this is non-nil, it specifies the coding system that will be used | |
| 4788 to decode input on read operations, such as from a file or process. | |
| 4789 It overrides `buffer-file-coding-system-for-read', | |
| 428 | 4790 `insert-file-contents-pre-hook', etc. Use those variables instead of |
| 440 | 4791 this one for permanent changes to the environment. */ ); |
| 428 | 4792 Vcoding_system_for_read = Qnil; |
| 4793 | |
| 4794 DEFVAR_LISP ("coding-system-for-write", | |
| 4795 &Vcoding_system_for_write /* | |
| 440 | 4796 Overriding coding system used when writing to a file or process. |
| 4797 You should bind this variable with `let', but do not set it globally. | |
| 4798 If this is non-nil, it specifies the coding system that will be used | |
| 4799 to encode output for write operations, such as to a file or process. | |
| 4800 It overrides `buffer-file-coding-system', `write-region-pre-hook', etc. | |
| 4801 Use those variables instead of this one for permanent changes to the | |
| 4802 environment. */ ); | |
| 428 | 4803 Vcoding_system_for_write = Qnil; |
| 4804 | |
| 4805 DEFVAR_LISP ("file-name-coding-system", &Vfile_name_coding_system /* | |
| 4806 Coding system used to convert pathnames when accessing files. | |
| 4807 */ ); | |
| 4808 Vfile_name_coding_system = Qnil; | |
| 4809 | |
| 4810 DEFVAR_BOOL ("enable-multibyte-characters", &enable_multibyte_characters /* | |
| 771 | 4811 Setting this has no effect. It is purely for FSF compatibility. |
| 428 | 4812 */ ); |
| 4813 enable_multibyte_characters = 1; | |
| 771 | 4814 |
| 4815 Vchain_canonicalize_hash_table = | |
| 4816 make_lisp_hash_table (50, HASH_TABLE_NON_WEAK, HASH_TABLE_EQUAL); | |
| 4817 staticpro (&Vchain_canonicalize_hash_table); | |
| 4818 | |
| 4819 #ifdef DEBUG_XEMACS | |
| 4820 DEFVAR_LISP ("debug-coding-detection", &Vdebug_coding_detection /* | |
| 4821 If non-nil, display debug information about detection operations in progress. | |
| 4822 Information is displayed on stderr. | |
| 4823 */ ); | |
| 4824 Vdebug_coding_detection = Qnil; | |
| 4825 #endif | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4826 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4827 #ifdef MULE |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4828 Vdefault_query_coding_region_chartab_cache |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4829 = make_lisp_hash_table (25, HASH_TABLE_NON_WEAK, HASH_TABLE_EQUAL); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4830 staticpro (&Vdefault_query_coding_region_chartab_cache); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4831 #endif |
| 428 | 4832 } |
| 4833 | |
| 2297 | 4834 /* #### reformat this for consistent appearance? */ |
| 4835 | |
| 428 | 4836 void |
| 4837 complex_vars_of_file_coding (void) | |
| 4838 { | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4839 Fmake_coding_system_internal |
| 771 | 4840 (Qconvert_eol_cr, Qconvert_eol, |
| 4841 build_msg_string ("Convert CR to LF"), | |
| 4842 nconc2 (list6 (Qdocumentation, | |
| 4843 build_msg_string ( | |
| 4844 "Converts CR (used to mark the end of a line on Macintosh systems) to LF\n" | |
| 4845 "(used internally and under Unix to mark the end of a line)."), | |
| 4846 Qmnemonic, build_string ("CR->LF"), | |
| 4847 Qsubtype, Qcr), | |
| 4848 /* VERY IMPORTANT! Tell make-coding-system not to generate | |
| 4849 subsidiaries -- it needs the coding systems we're creating | |
| 4850 to do so! */ | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4851 list4 (Qeol_type, Qlf, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4852 Qsafe_charsets, Qt))); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4853 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4854 Fmake_coding_system_internal |
| 771 | 4855 (Qconvert_eol_lf, Qconvert_eol, |
| 4856 build_msg_string ("Convert LF to LF (do nothing)"), | |
| 4857 nconc2 (list6 (Qdocumentation, | |
| 4858 build_msg_string ( | |
| 4859 "Do nothing."), | |
| 4860 Qmnemonic, build_string ("LF->LF"), | |
| 4861 Qsubtype, Qlf), | |
| 4862 /* VERY IMPORTANT! Tell make-coding-system not to generate | |
| 4863 subsidiaries -- it needs the coding systems we're creating | |
| 4864 to do so! */ | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4865 list4 (Qeol_type, Qlf, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4866 Qsafe_charsets, Qt))); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4867 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4868 Fmake_coding_system_internal |
| 771 | 4869 (Qconvert_eol_crlf, Qconvert_eol, |
| 4870 build_msg_string ("Convert CRLF to LF"), | |
| 4871 nconc2 (list6 (Qdocumentation, | |
| 4872 build_msg_string ( | |
| 4873 "Converts CR+LF (used to mark the end of a line on Macintosh systems) to LF\n" | |
| 4874 "(used internally and under Unix to mark the end of a line)."), | |
| 4875 Qmnemonic, build_string ("CRLF->LF"), | |
| 4876 Qsubtype, Qcrlf), | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4877 |
| 771 | 4878 /* VERY IMPORTANT! Tell make-coding-system not to generate |
| 4879 subsidiaries -- it needs the coding systems we're creating | |
| 4880 to do so! */ | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4881 list4 (Qeol_type, Qlf, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4882 Qsafe_charsets, Qt))); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4883 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4884 Fmake_coding_system_internal |
| 771 | 4885 (Qconvert_eol_autodetect, Qconvert_eol, |
| 4886 build_msg_string ("Autodetect EOL type"), | |
| 4887 nconc2 (list6 (Qdocumentation, | |
| 4888 build_msg_string ( | |
| 4889 "Autodetect the end-of-line type."), | |
| 4890 Qmnemonic, build_string ("Auto-EOL"), | |
| 793 | 4891 Qsubtype, Qnil), |
| 771 | 4892 /* VERY IMPORTANT! Tell make-coding-system not to generate |
| 4893 subsidiaries -- it needs the coding systems we're creating | |
| 4894 to do so! */ | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4895 list4 (Qeol_type, Qlf, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4896 Qsafe_charsets, Qt))); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4897 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4898 Fmake_coding_system_internal |
| 771 | 4899 (Qundecided, Qundecided, |
| 4900 build_msg_string ("Undecided (auto-detect)"), | |
| 4901 nconc2 (list4 (Qdocumentation, | |
| 4902 build_msg_string | |
| 4903 ("Automatically detects the correct encoding."), | |
| 4904 Qmnemonic, build_string ("Auto")), | |
| 4905 list6 (Qdo_eol, Qt, Qdo_coding, Qt, | |
| 4906 /* We do EOL detection ourselves so we don't need to be | |
| 4907 wrapped in an EOL detector. (It doesn't actually hurt, | |
| 4908 though, I don't think.) */ | |
| 4909 Qeol_type, Qlf))); | |
| 4910 | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4911 Fmake_coding_system_internal |
| 771 | 4912 (intern ("undecided-dos"), Qundecided, |
| 4913 build_msg_string ("Undecided (auto-detect) (CRLF)"), | |
| 4914 nconc2 (list4 (Qdocumentation, | |
| 4915 build_msg_string | |
| 4916 ("Automatically detects the correct encoding; EOL type of CRLF forced."), | |
| 4917 Qmnemonic, build_string ("Auto")), | |
| 4918 list4 (Qdo_coding, Qt, | |
| 4919 Qeol_type, Qcrlf))); | |
| 4920 | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4921 Fmake_coding_system_internal |
| 771 | 4922 (intern ("undecided-unix"), Qundecided, |
| 4923 build_msg_string ("Undecided (auto-detect) (LF)"), | |
| 4924 nconc2 (list4 (Qdocumentation, | |
| 4925 build_msg_string | |
| 4926 ("Automatically detects the correct encoding; EOL type of LF forced."), | |
| 4927 Qmnemonic, build_string ("Auto")), | |
| 4928 list4 (Qdo_coding, Qt, | |
| 4929 Qeol_type, Qlf))); | |
| 4930 | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4931 Fmake_coding_system_internal |
| 771 | 4932 (intern ("undecided-mac"), Qundecided, |
| 4933 build_msg_string ("Undecided (auto-detect) (CR)"), | |
| 4934 nconc2 (list4 (Qdocumentation, | |
| 4935 build_msg_string | |
| 4936 ("Automatically detects the correct encoding; EOL type of CR forced."), | |
| 4937 Qmnemonic, build_string ("Auto")), | |
| 4938 list4 (Qdo_coding, Qt, | |
| 4939 Qeol_type, Qcr))); | |
| 4940 | |
| 428 | 4941 /* Need to create this here or we're really screwed. */ |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4942 Fmake_coding_system_internal |
| 428 | 4943 (Qraw_text, Qno_conversion, |
| 771 | 4944 build_msg_string ("Raw Text"), |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4945 nconc2 (list4 (Qdocumentation, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4946 build_msg_string ("Raw text converts only line-break " |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4947 "codes, and acts otherwise like " |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4948 "`binary'."), |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4949 Qmnemonic, build_string ("Raw")), |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4950 #ifdef MULE |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4951 list2 (Qsafe_charsets, list3 (Vcharset_ascii, Vcharset_control_1, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4952 Vcharset_latin_iso8859_1)))); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4953 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4954 #else |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4955 Qnil)); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4956 #endif |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4957 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4958 Fmake_coding_system_internal |
| 428 | 4959 (Qbinary, Qno_conversion, |
| 771 | 4960 build_msg_string ("Binary"), |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4961 nconc2 (list6 (Qdocumentation, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4962 build_msg_string ( |
| 771 | 4963 "This coding system is as close as it comes to doing no conversion.\n" |
| 4964 "On input, each byte is converted directly into the character\n" | |
| 4965 "with the corresponding code -- i.e. from the `ascii', `control-1',\n" | |
| 4966 "or `latin-1' character sets. On output, these characters are\n" | |
| 4967 "converted back to the corresponding bytes, and other characters\n" | |
| 4968 "are converted to the default character, i.e. `~'."), | |
|
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4969 Qeol_type, Qlf, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4970 Qmnemonic, build_string ("Binary")), |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4971 #ifdef MULE |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4972 list2 (Qsafe_charsets, list3 (Vcharset_ascii, Vcharset_control_1, |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4973 Vcharset_latin_iso8859_1)))); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4974 |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4975 #else |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4976 Qnil)); |
|
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4647
diff
changeset
|
4977 #endif |
| 428 | 4978 |
| 771 | 4979 /* Formerly aliased to raw-text! Completely bogus and not even the same |
| 4980 as FSF Emacs. */ | |
| 4981 Fdefine_coding_system_alias (Qno_conversion, Qbinary); | |
| 4982 Fdefine_coding_system_alias (intern ("no-conversion-unix"), | |
| 4983 intern ("raw-text-unix")); | |
| 4984 Fdefine_coding_system_alias (intern ("no-conversion-dos"), | |
| 4985 intern ("raw-text-dos")); | |
| 4986 Fdefine_coding_system_alias (intern ("no-conversion-mac"), | |
| 4987 intern ("raw-text-mac")); | |
| 4988 | |
| 1318 | 4989 /* These three below will get their defaults set correctly |
| 4990 in code-init.el. We init them now so we can handle stuff at dump | |
| 771 | 4991 time before we get to code-init.el. */ |
| 1318 | 4992 Fdefine_coding_system_alias (Qnative, Qbinary); |
| 440 | 4993 Fdefine_coding_system_alias (Qterminal, Qbinary); |
| 4994 Fdefine_coding_system_alias (Qkeyboard, Qbinary); | |
| 4995 | |
| 1318 | 4996 Fdefine_coding_system_alias (Qfile_name, Qnative); |
| 771 | 4997 Fdefine_coding_system_alias (Qidentity, Qconvert_eol_lf); |
| 4998 | |
| 428 | 4999 /* Need this for bootstrapping */ |
| 771 | 5000 coding_category_system[detector_category_no_conversion] = |
| 428 | 5001 Fget_coding_system (Qraw_text); |
| 5002 } |
