comparison src/file-coding.c @ 4690:257b468bf2ca

Move the #'query-coding-region implementation to C.

This is necessary because there is no reasonable way to access the corresponding mswindows-multibyte functionality from Lisp, and we need such functionality if we're going to have a reliable and portable #'query-coding-region implementation. However, this change doesn't yet provide #'query-coding-region for the mswindows-multibyte coding systems; there should be no functional differences between an XEmacs with this change and one without it.

src/ChangeLog addition:

2009-09-19 Aidan Kehoe <kehoea@parhasard.net>

Move the #'query-coding-region implementation to C. This is necessary because there is no reasonable way to access the corresponding mswindows-multibyte functionality from Lisp, and we need such functionality if we're going to have a reliable and portable #'query-coding-region implementation. However, this change doesn't yet provide #'query-coding-region for the mswindows-multibyte coding systems; there should be no functional differences between an XEmacs with this change and one without it.

* mule-coding.c (struct fixed_width_coding_system): Add a new coding system type, fixed_width, and implement it. It uses the CCL infrastructure but has a much simpler creation API, and its own query_method, formerly in lisp/mule/mule-coding.el.
* unicode.c: Move the Unicode query method implementation here from unicode.el.
* lisp.h: Declare Fmake_coding_system_internal, Fcopy_range_table here.
* intl-win32.c (complex_vars_of_intl_win32): Use Fmake_coding_system_internal, not Fmake_coding_system.
* general-slots.h: Add Qsucceeded, Qunencodable, Qinvalid_sequence here.
* file-coding.h (enum coding_system_variant): Add fixed_width_coding_system here.
(struct coding_system_methods): Add query_method and query_lstream_method to the coding system methods. Provide flags for the query methods. Declare the default query method; initialise it correctly in INITIALIZE_CODING_SYSTEM_TYPE.
* file-coding.c (default_query_method): New function, the default query method for coding systems that do not set it. Moved from coding.el.
(make_coding_system_1): Accept new elements in PROPS in #'make-coding-system: aliases, a list of aliases; safe-chars and safe-charsets (these were previously accepted but not saved); and category.
(Fmake_coding_system_internal): New function, what used to be #'make-coding-system; on Mule builds, we've now moved some of the functionality of this to Lisp.
(Fcoding_system_canonical_name_p): Move this earlier in the file, since it's now called from within make_coding_system_1.
(Fquery_coding_region): Move the implementation of this here, from coding.el.
(complex_vars_of_file_coding): Call Fmake_coding_system_internal, not Fmake_coding_system; specify safe-charsets properties when we're a mule build.
* extents.h (mouse_highlight_priority, Fset_extent_priority, Fset_extent_face, Fmap_extents): Make these available to other C files.

lisp/ChangeLog addition:

2009-09-19 Aidan Kehoe <kehoea@parhasard.net>

Move the #'query-coding-region implementation to C.

* coding.el: Consolidate code that depends on the presence or absence of Mule at the end of this file.
(default-query-coding-region, query-coding-region): Move these functions to C.
(default-query-coding-region-safe-charset-skip-chars-map): Remove this variable; the corresponding C variable is Vdefault_query_coding_region_chartab_cache in file-coding.c.
(query-coding-string): Update docstring to reflect actual multiple values; be more careful about not modifying a range table that we're currently mapping over.
(encode-coding-char): Make the implementation of this simpler.
(featurep 'mule): Autoload #'make-coding-system from mule/make-coding-system.el if we're a mule build; provide an appropriate compiler macro. Do various non-mule compatibility things if we're not a mule build.
* update-elc.el (additional-dump-dependencies): Add mule/make-coding-system as a dump time dependency if we're a mule build.
* unicode.el (ccl-encode-to-ucs-2): (decode-char): (encode-char): Move these earlier in the file, for the sake of some byte compile warnings.
(unicode-query-coding-region): Move this to unicode.c.
* mule/make-coding-system.el: New file, not dumped. Contains the functionality to rework the arguments necessary for fixed-width coding systems, and contains the implementation of #'make-coding-system, which now calls #'make-coding-system-internal.
* mule/vietnamese.el (viscii):
* mule/latin.el (iso-8859-2): (windows-1250): (iso-8859-3): (iso-8859-4): (iso-8859-14): (iso-8859-15): (iso-8859-16): (iso-8859-9): (macintosh): (windows-1252):
* mule/hebrew.el (iso-8859-8):
* mule/greek.el (iso-8859-7): (windows-1253):
* mule/cyrillic.el (iso-8859-5): (koi8-r): (koi8-u): (windows-1251): (alternativnyj): (koi8-ru): (koi8-t): (koi8-c): (koi8-o):
* mule/arabic.el (iso-8859-6): (windows-1256): Move all these coding systems to being of type fixed-width, not of type CCL. This allows the distinct query-coding-region for them to be in C, something which will eventually allow us to implement query-coding-region for the mswindows-multibyte coding systems.
* mule/general-late.el (posix-charset-to-coding-system-hash): Document why we're pre-emptively persuading the byte compiler that the ELC for this file needs to be written using escape-quoted. Call #'set-unicode-query-skip-chars-args, now the Unicode query-coding-region implementation is in C.
* mule/thai-xtis.el (tis-620): Don't bother checking whether we're XEmacs or not here.
* mule/mule-coding.el: Move the eight bit fixed-width functionality from this file to make-coding-system.el.

tests/ChangeLog addition:

2009-09-19 Aidan Kehoe <kehoea@parhasard.net>

* automated/mule-tests.el: Check a coding system's type, not an 8-bit-fixed property, for whether that coding system should be treated as a fixed-width coding system.
* automated/query-coding-tests.el: Don't test the query coding functionality for mswindows-multibyte coding systems; it's not yet implemented.
author Aidan Kehoe <kehoea@parhasard.net>
date Sat, 19 Sep 2009 22:53:13 +0100
parents e4ed58cb0e5b
children a9833e8a32ec e0db3c197671
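
The core of default_query_method, as it appears in the comparison for this changeset, is a scan that collects each maximal run of characters the coding system cannot encode into a start-closed, end-open range. The following is a minimal standalone sketch of that scanning idea only, not the XEmacs implementation. It assumes plain byte offsets rather than Charbpos buffer positions, and the names fail_range, is_safe and find_unencodable_ranges are hypothetical stand-ins for the range table, the safe-chars char table lookup and the query method respectively.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one range-table entry: the half-open
   interval [start, end), matching Qstart_closed_end_open. */
struct fail_range
{
  size_t start;
  size_t end;
};

/* Hypothetical stand-in for the safe-chars lookup.  For illustration
   only, a byte counts as encodable when it is 7-bit ASCII. */
static int
is_safe (unsigned char ch)
{
  return ch < 0x80;
}

/* Scan text[0..len) and record each maximal run of unencodable bytes.
   Returns the number of runs found; only the first max_out runs are
   stored in out. */
static size_t
find_unencodable_ranges (const unsigned char *text, size_t len,
                         struct fail_range *out, size_t max_out)
{
  size_t pos = 0, count = 0;

  assert (text != NULL || len == 0);
  assert (out != NULL || max_out == 0);

  while (pos < len)
    {
      size_t start;

      if (is_safe (text[pos]))
        {
          pos++;
          continue;
        }

      /* Start of a failing run; extend it while bytes stay unsafe. */
      start = pos;
      while (pos < len && !is_safe (text[pos]))
        pos++;

      if (count < max_out)
        {
          out[count].start = start;
          out[count].end = pos;
        }
      count++;
    }

  return count;
}
```

Calling find_unencodable_ranges on the five bytes 'a', 0xE9, 0xE8, 'b', 0xFF reports two runs, covering offsets 1 to 3 and 4 to 5. A return value of zero plays the role of the nil result in the C function, meaning nothing in the region failed.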
4689:0636c6ccb430 4690:257b468bf2ca
76 #include "elhash.h" 76 #include "elhash.h"
77 #include "insdel.h" 77 #include "insdel.h"
78 #include "lstream.h" 78 #include "lstream.h"
79 #include "opaque.h" 79 #include "opaque.h"
80 #include "file-coding.h" 80 #include "file-coding.h"
81 #include "extents.h"
82 #include "rangetab.h"
83 #include "chartab.h"
81 84
82 #ifdef HAVE_ZLIB 85 #ifdef HAVE_ZLIB
83 #include "zlib.h" 86 #include "zlib.h"
84 #endif 87 #endif
85 88
87 Lisp_Object Vterminal_coding_system; 90 Lisp_Object Vterminal_coding_system;
88 Lisp_Object Vcoding_system_for_read; 91 Lisp_Object Vcoding_system_for_read;
89 Lisp_Object Vcoding_system_for_write; 92 Lisp_Object Vcoding_system_for_write;
90 Lisp_Object Vfile_name_coding_system; 93 Lisp_Object Vfile_name_coding_system;
91 94
95 Lisp_Object Qaliases, Qcharset_skip_chars_string;
96
92 #ifdef DEBUG_XEMACS 97 #ifdef DEBUG_XEMACS
93 Lisp_Object Vdebug_coding_detection; 98 Lisp_Object Vdebug_coding_detection;
99 #endif
100
101 #ifdef MULE
102 extern Lisp_Object Vcharset_ascii, Vcharset_control_1,
103 Vcharset_latin_iso8859_1;
94 #endif 104 #endif
95 105
96 typedef struct coding_system_type_entry 106 typedef struct coding_system_type_entry
97 { 107 {
98 struct coding_system_methods *meths; 108 struct coding_system_methods *meths;
415 valid_coding_system_type_p (Lisp_Object type) 425 valid_coding_system_type_p (Lisp_Object type)
416 { 426 {
417 return decode_coding_system_type (type, ERROR_ME_NOT) != 0; 427 return decode_coding_system_type (type, ERROR_ME_NOT) != 0;
418 } 428 }
419 429
430 #ifdef MULE
431 static Lisp_Object Vdefault_query_coding_region_chartab_cache;
432
433 /* Non-static because it's used in INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA. */
434 Lisp_Object
435 default_query_method (Lisp_Object codesys, struct buffer *buf,
436 Charbpos end, int flags)
437 {
438 Charbpos pos = BUF_PT (buf), fail_range_start, fail_range_end;
439 Charbpos pos_byte = BYTE_BUF_PT (buf);
440 Lisp_Object safe_charsets = XCODING_SYSTEM_SAFE_CHARSETS (codesys);
441 Lisp_Object safe_chars = XCODING_SYSTEM_SAFE_CHARS (codesys),
442 result = Qnil;
443 enum query_coding_failure_reasons failed_reason,
444 previous_failed_reason = query_coding_succeeded;
445
446 /* safe-charsets of t means the coding system can encode everything. */
447 if (EQ (Qnil, safe_chars))
448 {
449 if (EQ (Qt, safe_charsets))
450 {
451 return Qnil;
452 }
453
454 /* If we've no information on what characters the coding system can
455 encode, give up. */
456 if (EQ (Qnil, safe_charsets) && EQ (Qnil, safe_chars))
457 {
458 return Qunbound;
459 }
460
461 safe_chars = Fgethash (safe_charsets,
462 Vdefault_query_coding_region_chartab_cache,
463 Qnil);
464 if (NILP (safe_chars))
465 {
466 safe_chars = Fmake_char_table (Qgeneric);
467 {
468 EXTERNAL_LIST_LOOP_2 (safe_charset, safe_charsets)
469 Fput_char_table (safe_charset, Qt, safe_chars);
470 }
471
472 Fputhash (safe_charsets, safe_chars,
473 Vdefault_query_coding_region_chartab_cache);
474 }
475 }
476
477 if (flags & QUERY_METHOD_HIGHLIGHT &&
478 /* If we're being called really early, live without highlights getting
479 cleared properly: */
480 !(UNBOUNDP (XSYMBOL (Qquery_coding_clear_highlights)->function)))
481 {
482 /* It's okay to call Lisp here, the only non-stack object we may have
483 allocated up to this point is safe_chars, and that's
484 reachable from its entry in
485 Vdefault_query_coding_region_chartab_cache */
486 call3 (Qquery_coding_clear_highlights, make_int (pos), make_int (end),
487 wrap_buffer (buf));
488 }
489
490 while (pos < end)
491 {
492 Ichar ch = BYTE_BUF_FETCH_CHAR (buf, pos_byte);
493 if (!EQ (Qnil, get_char_table (ch, safe_chars)))
494 {
495 pos++;
496 INC_BYTEBPOS (buf, pos_byte);
497 }
498 else
499 {
500 fail_range_start = pos;
501 while ((pos < end) &&
502 (EQ (Qnil, get_char_table (ch, safe_chars))
503 && (failed_reason = query_coding_unencodable))
504 && (previous_failed_reason == query_coding_succeeded
505 || previous_failed_reason == failed_reason))
506 {
507 pos++;
508 INC_BYTEBPOS (buf, pos_byte);
509 ch = BYTE_BUF_FETCH_CHAR (buf, pos_byte);
510 previous_failed_reason = failed_reason;
511 }
512
513 if (fail_range_start == pos)
514 {
515 /* The character can actually be encoded; move on. */
516 pos++;
517 INC_BYTEBPOS (buf, pos_byte);
518 }
519 else
520 {
521 assert (previous_failed_reason == query_coding_unencodable);
522
523 if (flags & QUERY_METHOD_ERRORP)
524 {
525 DECLARE_EISTRING (error_details);
526
527 eicpy_ascii (error_details, "Cannot encode ");
528 eicat_lstr (error_details,
529 make_string_from_buffer (buf, fail_range_start,
530 pos -
531 fail_range_start));
532 eicat_ascii (error_details, " using coding system");
533
534 signal_error (Qtext_conversion_error,
535 (const CIbyte *)(eidata (error_details)),
536 XCODING_SYSTEM_NAME (codesys));
537 }
538
539 if (NILP (result))
540 {
541 result = Fmake_range_table (Qstart_closed_end_open);
542 }
543
544 fail_range_end = pos;
545
546 Fput_range_table (make_int (fail_range_start),
547 make_int (fail_range_end),
548 Qunencodable,
549 result);
550 previous_failed_reason = query_coding_succeeded;
551
552 if (flags & QUERY_METHOD_HIGHLIGHT)
553 {
554 Lisp_Object extent
555 = Fmake_extent (make_int (fail_range_start),
556 make_int (fail_range_end),
557 wrap_buffer (buf));
558
559 Fset_extent_priority
560 (extent, make_int (2 + mouse_highlight_priority));
561 Fset_extent_face (extent, Qquery_coding_warning_face);
562 }
563 }
564 }
565 }
566
567 return result;
568 }
569 #else
570 Lisp_Object
571 default_query_method (Lisp_Object UNUSED (codesys),
572 struct buffer * UNUSED (buf),
573 Charbpos UNUSED (end), int UNUSED (flags))
574 {
575 return Qnil;
576 }
577 #endif /* defined MULE */
578
420 DEFUN ("valid-coding-system-type-p", Fvalid_coding_system_type_p, 1, 1, 0, /* 579 DEFUN ("valid-coding-system-type-p", Fvalid_coding_system_type_p, 1, 1, 0, /*
421 Given a CODING-SYSTEM-TYPE, return non-nil if it is valid. 580 Given a CODING-SYSTEM-TYPE, return non-nil if it is valid.
422 Valid types depend on how XEmacs was compiled but may include 581 Valid types depend on how XEmacs was compiled but may include
423 `undecided', `chain', `integer', `ccl', `iso2022', `big5', `shift-jis', 582 `undecided', `chain', `integer', `ccl', `iso2022', `big5', `shift-jis',
424 `utf-16', `ucs-4', `utf-8', etc. 583 `utf-16', `ucs-4', `utf-8', etc.
980 XCODING_SYSTEM_SUBSIDIARY_PARENT (sub_codesys) = codesys; 1139 XCODING_SYSTEM_SUBSIDIARY_PARENT (sub_codesys) = codesys;
981 XCODING_SYSTEM (codesys)->eol[eol] = sub_codesys; 1140 XCODING_SYSTEM (codesys)->eol[eol] = sub_codesys;
982 } 1141 }
983 } 1142 }
984 1143
1144 DEFUN ("coding-system-canonical-name-p", Fcoding_system_canonical_name_p,
1145 1, 1, 0, /*
1146 Return t if OBJECT names a coding system, and is not a coding system alias.
1147 */
1148 (object))
1149 {
1150 return CODING_SYSTEMP (Fgethash (object, Vcoding_system_hash_table, Qnil))
1151 ? Qt : Qnil;
1152 }
1153
985 /* Basic function to create new coding systems. For `make-coding-system', 1154 /* Basic function to create new coding systems. For `make-coding-system',
986 NAME-OR-EXISTING is the NAME argument, PREFIX is null, and TYPE, 1155 NAME-OR-EXISTING is the NAME argument, PREFIX is null, and TYPE,
987 DESCRIPTION, and PROPS are the same. All created coding systems are put 1156 DESCRIPTION, and PROPS are the same. All created coding systems are put
988 in a hash table indexed by NAME. 1157 in a hash table indexed by NAME.
989 1158
1028 Lisp_Coding_System *cs; 1197 Lisp_Coding_System *cs;
1029 int need_to_setup_eol_systems = 1; 1198 int need_to_setup_eol_systems = 1;
1030 enum eol_type eol_wrapper = EOL_AUTODETECT; 1199 enum eol_type eol_wrapper = EOL_AUTODETECT;
1031 struct coding_system_methods *meths; 1200 struct coding_system_methods *meths;
1032 Lisp_Object csobj; 1201 Lisp_Object csobj;
1033 Lisp_Object defmnem = Qnil; 1202 Lisp_Object defmnem = Qnil, aliases = Qnil;
1034 1203
1035 if (NILP (type)) 1204 if (NILP (type))
1036 type = Qundecided; 1205 type = Qundecided;
1037 meths = decode_coding_system_type (type, ERROR_ME); 1206 meths = decode_coding_system_type (type, ERROR_ME);
1038 1207
1117 1286
1118 else if (EQ (key, Qpost_read_conversion)) 1287 else if (EQ (key, Qpost_read_conversion))
1119 CODING_SYSTEM_POST_READ_CONVERSION (cs) = value; 1288 CODING_SYSTEM_POST_READ_CONVERSION (cs) = value;
1120 else if (EQ (key, Qpre_write_conversion)) 1289 else if (EQ (key, Qpre_write_conversion))
1121 CODING_SYSTEM_PRE_WRITE_CONVERSION (cs) = value; 1290 CODING_SYSTEM_PRE_WRITE_CONVERSION (cs) = value;
1291 else if (EQ (key, Qaliases))
1292 {
1293 EXTERNAL_LIST_LOOP_2 (alias, value)
1294 {
1295 CHECK_SYMBOL (alias);
1296
1297 if (!NILP (Fcoding_system_canonical_name_p (alias)))
1298 {
1299 invalid_change ("Symbol is the canonical name of a "
1300 "coding system and cannot be redefined",
1301 alias);
1302 }
1303 }
1304 aliases = value;
1305 }
1122 /* FSF compatibility */ 1306 /* FSF compatibility */
1123 else if (EQ (key, Qtranslation_table_for_decode)) 1307 else if (EQ (key, Qtranslation_table_for_decode))
1124 ; 1308 ;
1125 else if (EQ (key, Qtranslation_table_for_encode)) 1309 else if (EQ (key, Qtranslation_table_for_encode))
1126 ; 1310 ;
1127 else if (EQ (key, Qsafe_chars)) 1311 else if (EQ (key, Qsafe_chars))
1128 CODING_SYSTEM_SAFE_CHARS (cs) = value; 1312 {
1313 CHECK_CHAR_TABLE (value);
1314 CODING_SYSTEM_SAFE_CHARS (cs) = value;
1315 }
1129 else if (EQ (key, Qsafe_charsets)) 1316 else if (EQ (key, Qsafe_charsets))
1130 CODING_SYSTEM_SAFE_CHARSETS (cs) = value; 1317 {
1318 if (!EQ (Qt, value)
1319 /* Would be nice to actually do this check, but there are
1320 some order conflicts with japanese.el and
1321 mule-coding.el */
1322 && 0)
1323 {
1324 #ifdef MULE
1325 EXTERNAL_LIST_LOOP_2 (safe_charset, value)
1326 CHECK_CHARSET (Ffind_charset (safe_charset));
1327 #endif
1328 }
1329
1330 CODING_SYSTEM_SAFE_CHARSETS (cs) = value;
1331 }
1332 else if (EQ (key, Qcategory))
1333 {
1334 Fput (name_or_existing, intern ("coding-system-property"),
1335 Fplist_put (Fget (name_or_existing,
1336 intern ("coding-system-property"),
1337 Qnil),
1338 Qcategory, value));
1339 }
1131 else if (EQ (key, Qmime_charset)) 1340 else if (EQ (key, Qmime_charset))
1132 ; 1341 ;
1133 else if (EQ (key, Qvalid_codes)) 1342 else if (EQ (key, Qvalid_codes))
1134 ; 1343 ;
1135 else 1344 else
1184 Qconvert_eol_crlf), 1393 Qconvert_eol_crlf),
1185 Qcanonicalize_after_coding, 1394 Qcanonicalize_after_coding,
1186 csobj)); 1395 csobj));
1187 } 1396 }
1188 XCODING_SYSTEM_EOL_TYPE (csobj) = eol_wrapper; 1397 XCODING_SYSTEM_EOL_TYPE (csobj) = eol_wrapper;
1398
1399 {
1400 EXTERNAL_LIST_LOOP_2 (alias, aliases)
1401 Fdefine_coding_system_alias (alias, csobj);
1402 }
1189 } 1403 }
1190 1404
1191 return csobj; 1405 return csobj;
1192 } 1406 }
1193 1407
1197 Lisp_Object props) 1411 Lisp_Object props)
1198 { 1412 {
1199 return make_coding_system_1 (existing, prefix, type, description, props); 1413 return make_coding_system_1 (existing, prefix, type, description, props);
1200 } 1414 }
1201 1415
1202 DEFUN ("make-coding-system", Fmake_coding_system, 2, 4, 0, /* 1416 DEFUN ("make-coding-system-internal", Fmake_coding_system_internal, 2, 4, 0, /*
1203 Register symbol NAME as a coding system. 1417 See `make-coding-system'. This does much of the work of that function.
1204 1418
1205 TYPE describes the conversion method used and should be one of 1419 Without Mule support, it does all the work of that function, and an alias
1206 1420 exists, mapping `make-coding-system' to
1207 nil or `undecided' 1421 `make-coding-system-internal'. You'll need a non-Mule XEmacs to read the
1208 Automatic conversion. XEmacs attempts to detect the coding system 1422 complete docstring. Or you can just read it in make-coding-system.el;
1209 used in the file. 1423 something like the following should work:
1210 `chain' 1424
1211 Chain two or more coding systems together to make a combination coding 1425 \\[find-function-other-window] find-file RET \\[find-file] mule/make-coding-system.el RET
1212 system.
1213 `no-conversion'
1214 No conversion. Use this for binary files and such. On output,
1215 graphic characters that are not in ASCII or Latin-1 will be
1216 replaced by a ?. (For a no-conversion-encoded buffer, these
1217 characters will only be present if you explicitly insert them.)
1218 `convert-eol'
1219 Convert CRLF sequences or CR to LF.
1220 `shift-jis'
1221 Shift-JIS (a Japanese encoding commonly used in PC operating systems).
1222 `unicode'
1223 Any Unicode encoding (UCS-4, UTF-8, UTF-16, etc.).
1224 `mswindows-unicode-to-multibyte'
1225 (MS Windows only) Converts from Windows Unicode to Windows Multibyte
1226 (any code page encoding) upon encoding, and the other way upon decoding.
1227 `mswindows-multibyte'
1228 Converts to or from Windows Multibyte (any code page encoding).
1229 This is resolved into a chain of `mswindows-unicode' and
1230 `mswindows-unicode-to-multibyte'.
1231 `iso2022'
1232 Any ISO2022-compliant encoding. Among other things, this includes
1233 JIS (the Japanese encoding commonly used for e-mail), EUC (the
1234 standard Unix encoding for Japanese and other languages), and
1235 Compound Text (the encoding used in X11). You can specify more
1236 specific information about the conversion with the PROPS argument.
1237 `big5'
1238 Big5 (the encoding commonly used for Mandarin Chinese in Taiwan).
1239 `ccl'
1240 The conversion is performed using a user-written pseudo-code
1241 program. CCL (Code Conversion Language) is the name of this
1242 pseudo-code.
1243 `gzip'
1244 GZIP compression format.
1245 `internal'
1246 Write out or read in the raw contents of the memory representing
1247 the buffer's text. This is primarily useful for debugging
1248 purposes, and is only enabled when XEmacs has been compiled with
1249 DEBUG_XEMACS defined (via the --debug configure option).
1250 WARNING: Reading in a file using `internal' conversion can result
1251 in an internal inconsistency in the memory representing a
1252 buffer's text, which will produce unpredictable results and may
1253 cause XEmacs to crash. Under normal circumstances you should
1254 never use `internal' conversion.
1255
1256 DESCRIPTION is a short English phrase describing the coding system,
1257 suitable for use as a menu item. (See also the `documentation' property
1258 below.)
1259
1260 PROPS is a property list, describing the specific nature of the
1261 character set. Recognized properties are:
1262
1263 `mnemonic'
1264 String to be displayed in the modeline when this coding system is
1265 active.
1266
1267 `documentation'
1268 Detailed documentation on the coding system.
1269
1270 `eol-type'
1271 End-of-line conversion to be used. It should be one of
1272
1273 nil
1274 Automatically detect the end-of-line type (LF, CRLF,
1275 or CR). Also generate subsidiary coding systems named
1276 `NAME-unix', `NAME-dos', and `NAME-mac', that are
1277 identical to this coding system but have an EOL-TYPE
1278 value of `lf', `crlf', and `cr', respectively.
1279 `lf'
1280 The end of a line is marked externally using ASCII LF.
1281 Since this is also the way that XEmacs represents an
1282 end-of-line internally, specifying this option results
1283 in no end-of-line conversion. This is the standard
1284 format for Unix text files.
1285 `crlf'
1286 The end of a line is marked externally using ASCII
1287 CRLF. This is the standard format for MS-DOS text
1288 files.
1289 `cr'
1290 The end of a line is marked externally using ASCII CR.
1291 This is the standard format for Macintosh text files.
1292 t
1293 Automatically detect the end-of-line type but do not
1294 generate subsidiary coding systems. (This value is
1295 converted to nil when stored internally, and
1296 `coding-system-property' will return nil.)
1297
1298 `post-read-conversion'
1299 The value is a function to call after some text is inserted and
1300 decoded by the coding system itself and before any functions in
1301 `after-change-functions' are called. (#### Not actually true in
1302 XEmacs. `after-change-functions' will be called twice if
1303 `post-read-conversion' changes something.) The argument of this
1304 function is the same as for a function in
1305 `after-insert-file-functions', i.e. LENGTH of the text inserted,
1306 with point at the head of the text to be decoded.
1307
1308 `pre-write-conversion'
1309 The value is a function to call after all functions in
1310 `write-region-annotate-functions' and `buffer-file-format' are
1311 called, and before the text is encoded by the coding system itself.
1312 The arguments to this function are the same as those of a function
1313 in `write-region-annotate-functions', i.e. FROM and TO, specifying
1314 a region of text.
1315
1316
1317
1318 The following properties are allowed for FSF compatibility but currently
1319 ignored:
1320
1321 `translation-table-for-decode'
1322 The value is a translation table to be applied on decoding. See
1323 the function `make-translation-table' for the format of translation
1324 table. This is not applicable to CCL-based coding systems.
1325
1326 `translation-table-for-encode'
1327 The value is a translation table to be applied on encoding. This is
1328 not applicable to CCL-based coding systems.
1329
1330 `mime-charset'
1331 The value is a symbol of which name is `MIME-charset' parameter of
1332 the coding system.
1333
1334 `valid-codes' (meaningful only for a coding system based on CCL)
1335 The value is a list to indicate valid byte ranges of the encoded
1336 file. Each element of the list is an integer or a cons of integer.
1337 In the former case, the integer value is a valid byte code. In the
1338 latter case, the integers specifies the range of valid byte codes.
1339
1340 The following properties are used by `default-query-coding-region',
1341 the default implementation of `query-coding-region'. This
1342 implementation and these properties are not used by the Unicode coding
1343 systems, nor by those CCL coding systems created with
1344 `make-8-bit-coding-system'.
1345
1346 `safe-chars'
1347 The value is a char table. If a character has non-nil value in it,
1348 the character is safely supported by the coding system.
1349 Under XEmacs, for the moment, this is used in addition to the
1350 `safe-charsets' property. It does not override it as it does
1351 under GNU Emacs. #### We need to consider if we should keep this
1352 behaviour.
1353
1354 `safe-charsets'
1355 The value is a list of charsets safely supported by the coding
1356 system. For coding systems based on ISO 2022, XEmacs may try to
1357 encode characters outside these character sets, but outside of
1358 East Asia and East Asian coding systems, it is unlikely that
1359 consumers of the data will understand XEmacs' encoding.
1360 The value t means that all XEmacs character sets handles are supported.
1361
1362 The following additional property is recognized if TYPE is `convert-eol':
1363
1364 `subtype'
1365 One of `lf', `crlf', `cr' or nil (for autodetection). When decoding,
1366 the corresponding sequence will be converted to LF. When encoding,
1367 the opposite happens. This coding system converts characters to
1368 characters.
1369
1370
1371
1372 The following additional properties are recognized if TYPE is `iso2022':
1373
1374 `charset-g0'
1375 `charset-g1'
1376 `charset-g2'
1377 `charset-g3'
1378 The character set initially designated to the G0 - G3 registers.
1379 The value should be one of
1380
1381 -- A charset object (designate that character set)
1382 -- nil (do not ever use this register)
1383 -- t (no character set is initially designated to
1384 the register, but may be later on; this automatically
1385 sets the corresponding `force-g*-on-output' property)
1386
1387 `force-g0-on-output'
1388 `force-g1-on-output'
1389 `force-g2-on-output'
1390 `force-g2-on-output'
1391 If non-nil, send an explicit designation sequence on output before
1392 using the specified register.
1393
1394 `short'
1395 If non-nil, use the short forms "ESC $ @", "ESC $ A", and
1396 "ESC $ B" on output in place of the full designation sequences
1397 "ESC $ ( @", "ESC $ ( A", and "ESC $ ( B".
1398
1399 `no-ascii-eol'
1400 If non-nil, don't designate ASCII to G0 at each end of line on output.
1401 Setting this to non-nil also suppresses other state-resetting that
1402 normally happens at the end of a line.
1403
1404 `no-ascii-cntl'
1405 If non-nil, don't designate ASCII to G0 before control chars on output.
1406
1407 `seven'
1408 If non-nil, use 7-bit environment on output. Otherwise, use 8-bit
1409 environment.
1410
1411 `lock-shift'
1412 If non-nil, use locking-shift (SO/SI) instead of single-shift
1413 or designation by escape sequence.
1414
1415 `no-iso6429'
1416 If non-nil, don't use ISO6429's direction specification.
1417
1418 `escape-quoted'
1419 If non-nil, literal control characters that are the same as
1420 the beginning of a recognized ISO2022 or ISO6429 escape sequence
1421 (in particular, ESC (0x1B), SO (0x0E), SI (0x0F), SS2 (0x8E),
1422 SS3 (0x8F), and CSI (0x9B)) are "quoted" with an escape character
1423 so that they can be properly distinguished from an escape sequence.
1424 (Note that doing this results in a non-portable encoding.) This
1425 encoding flag is used for byte-compiled files. Note that ESC
1426 is a good choice for a quoting character because there are no
1427 escape sequences whose second byte is a character from the Control-0
1428 or Control-1 character sets; this is explicitly disallowed by the
1429 ISO2022 standard.
1430
1431 `input-charset-conversion'
1432 A list of conversion specifications, specifying conversion of
1433 characters in one charset to another when decoding is performed.
1434 Each specification is a list of two elements: the source charset,
1435 and the destination charset.
1436
1437 `output-charset-conversion'
1438 A list of conversion specifications, specifying conversion of
1439 characters in one charset to another when encoding is performed.
1440 The form of each specification is the same as for
1441 `input-charset-conversion'.
1442
1443
1444
1445 The following additional properties are recognized (and required)
1446 if TYPE is `ccl':
1447
1448 `decode'
1449 CCL program used for decoding (converting to internal format).
1450
1451 `encode'
1452 CCL program used for encoding (converting to external format).
1453
1454
1455 The following additional properties are recognized if TYPE is `chain':
1456
1457 `chain'
1458 List of coding systems to be chained together, in decoding order.
1459
1460 `canonicalize-after-coding'
1461 Coding system to be returned by the detector routines in place of
1462 this coding system.
1463
1464
1465
1466 The following additional properties are recognized if TYPE is `unicode':
1467
1468 `unicode-type'
1469 One of `utf-16', `utf-8', `ucs-4', or `utf-7' (the latter is not
1470 yet implemented). `utf-16' is the basic two-byte encoding;
1471 `ucs-4' is the four-byte encoding; `utf-8' is an ASCII-compatible
1472 variable-width 8-bit encoding; `utf-7' is a 7-bit encoding using
1473 only characters that will safely pass through all mail gateways.
1474 [[ This should be \"transformation format\". There should also be
1475 `ucs-2' (or `bmp' -- no surrogates) and `utf-32' (range checked). ]]
1476
1477 `little-endian'
1478 If non-nil, `utf-16' and `ucs-4' will write out the groups of two
1479 or four bytes little-endian instead of big-endian. This is required,
1480 for example, under Windows.
1481
1482 `need-bom'
1483 If non-nil, a byte order mark (BOM, or Unicode FFFE) should be
1484 written out at the beginning of the data. This serves both to
1485 identify the endianness of the following data and to mark the
1486 data as Unicode (at least, this is how Windows uses it).
1487 [[ The correct term is \"signature\", since this technique may also
1488 be used with UTF-8. That is the term used in the standard. ]]
1489
1490
1491 The following additional properties are recognized if TYPE is
1492 `mswindows-multibyte':
1493
1494 `code-page'
1495 Either a number (specifying a particular code page) or one of the
1496 symbols `ansi', `oem', `mac', or `ebcdic', specifying the ANSI,
1497 OEM, Macintosh, or EBCDIC code page associated with a particular
1498 locale (given by the `locale' property). NOTE: EBCDIC code pages
1499 only exist in Windows 2000 and later.
1500
1501 `locale'
1502 If `code-page' is a symbol, this specifies the locale whose code
1503 page of the corresponding type should be used. This should be
1504 one of the following: A cons of two strings, (LANGUAGE
1505 . SUBLANGUAGE) (see `mswindows-set-current-locale'); a string (a
1506 language; SUBLANG_DEFAULT, i.e. the default sublanguage, is
1507 used); or one of the symbols `current', `user-default', or
1508 `system-default', corresponding to the values of
1509 `mswindows-current-locale', `mswindows-user-default-locale', or
1510 `mswindows-system-default-locale', respectively.
1511
1512
1513
1514 The following additional properties are recognized if TYPE is `undecided':
1515 \[[ Doesn't GNU use \"detect-*\" for the following two? ]]
1516
1517 `do-eol'
1518 Do EOL detection.
1519
1520 `do-coding'
1521 Do encoding detection.
1522
1523 `coding-system'
1524 If encoding detection is not done, use the specified coding system
1525 to do decoding. This is used internally when implementing coding
1526 systems with an EOL type that specifies autodetection (the default),
1527 so that the detector routines return the proper subsidiary.
1528
1529
1530
1531 The following additional property is recognized if TYPE is `gzip':
1532
1533 `level'
1534 Compression level: 0 through 9, or `default' (currently 6).
1535 1426
1536 */ 1427 */
1537 (name, type, description, props)) 1428 (name, type, description, props))
1538 { 1429 {
1539 return make_coding_system_1 (name, 0, type, description, props); 1430 return make_coding_system_1 (name, 0, type, description, props);
1573 Lisp_Coding_System *from = XCODING_SYSTEM (old_coding_system); 1464 Lisp_Coding_System *from = XCODING_SYSTEM (old_coding_system);
1574 COPY_SIZED_LCRECORD (to, from, sizeof_coding_system (from)); 1465 COPY_SIZED_LCRECORD (to, from, sizeof_coding_system (from));
1575 to->name = new_name; 1466 to->name = new_name;
1576 } 1467 }
1577 return new_coding_system; 1468 return new_coding_system;
1578 }
1579
1580 DEFUN ("coding-system-canonical-name-p", Fcoding_system_canonical_name_p,
1581 1, 1, 0, /*
1582 Return t if OBJECT names a coding system, and is not a coding system alias.
1583 */
1584 (object))
1585 {
1586 return CODING_SYSTEMP (Fgethash (object, Vcoding_system_hash_table, Qnil))
1587 ? Qt : Qnil;
1588 } 1469 }
1589 1470
1590 /* #### Shouldn't this really be a find/get pair? */ 1471 /* #### Shouldn't this really be a find/get pair? */
1591 1472
1592 DEFUN ("coding-system-alias-p", Fcoding_system_alias_p, 1, 1, 0, /* 1473 DEFUN ("coding-system-alias-p", Fcoding_system_alias_p, 1, 1, 0, /*
2472 (start, end, coding_system, buffer)) 2353 (start, end, coding_system, buffer))
2473 { 2354 {
2474 return encode_decode_coding_region (start, end, coding_system, buffer, 2355 return encode_decode_coding_region (start, end, coding_system, buffer,
2475 CODING_ENCODE); 2356 CODING_ENCODE);
2476 } 2357 }
2358
2359 DEFUN ("query-coding-region", Fquery_coding_region, 3, 7, 0, /*
2360 Work out whether CODING-SYSTEM can losslessly encode a region.
2361
2362 START and END are the beginning and end of the region to check.
2363 CODING-SYSTEM is the coding system to try.
2364
2365 Optional argument BUFFER is the buffer to check, and defaults to the current
2366 buffer.
2367
2368 IGNORE-INVALID-SEQUENCESP, also an optional argument, says to treat XEmacs
2369 characters which have an unambiguous encoded representation, despite being
2370 undefined in what they represent, as encodable. These chiefly arise with
2371 variable-length encodings like UTF-8 and UTF-16, where an invalid sequence
2372 is passed through to XEmacs as a sequence of characters with a defined
2373 correspondence to the octets on disk, but no non-error semantics; see the
2374 `invalid-sequence-coding-system' argument to `set-language-info'.
2375
2376 They can also arise with fixed-length encodings like ISO 8859-7, where
2377 certain octets on disk have undefined values, and treating them as
2378 corresponding to the ISO 8859-1 characters with the same numerical values
2379 may lead to data that is not understood by other applications.
2380
2381 Optional argument ERRORP says to signal a `text-conversion-error' if some
2382 character in the region cannot be encoded, and defaults to nil.
2383
2384 Optional argument HIGHLIGHT says to display unencodable characters in the
2385 region using `query-coding-warning-face'. It defaults to nil.
2386
2387 This function can return multiple values; the intention is that callers use
2388 `multiple-value-bind' or the related CL multiple value functions to deal
2389 with it. The first result is `t' if the region can be encoded using
2390 CODING-SYSTEM, or `nil' if not. If the region cannot be encoded using
2391 CODING-SYSTEM, the second result is a range table describing the positions
2392 of the unencodable characters.
2393
2394 Ranges that describe characters that would be ignored were
2395 IGNORE-INVALID-SEQUENCESP non-nil map to the symbol `invalid-sequence';
2396 other ranges map to the symbol `unencodable'. If IGNORE-INVALID-SEQUENCESP
2397 is non-nil, all ranges will map to the symbol `unencodable'. See
2398 `make-range-table' for more details of range tables.
2399 */
2400 (start, end, coding_system, buffer, ignore_invalid_sequencesp,
2401 errorp, highlight))
2402 {
2403 Charbpos b, e;
2404 struct buffer *buf = decode_buffer (buffer, 1);
2405 Lisp_Object result;
2406 int flags = 0, speccount = specpdl_depth ();
2407
2408 coding_system = Fget_coding_system (coding_system);
2409
2410 get_buffer_range_char (buf, start, end, &b, &e, 0);
2411
2412 if (buf != current_buffer)
2413 {
2414 record_unwind_protect (save_current_buffer_restore, Fcurrent_buffer ());
2415 set_buffer_internal (buf);
2416 }
2417
2418 record_unwind_protect (save_excursion_restore, save_excursion_save ());
2419
2420 BUF_SET_PT (buf, b);
2421
2422 if (!NILP (ignore_invalid_sequencesp))
2423 {
2424 flags |= QUERY_METHOD_IGNORE_INVALID_SEQUENCES;
2425 }
2426
2427 if (!NILP (errorp))
2428 {
2429 flags |= QUERY_METHOD_ERRORP;
2430 }
2431
2432 if (!NILP (highlight))
2433 {
2434 flags |= QUERY_METHOD_HIGHLIGHT;
2435 }
2436
2437 result = XCODESYSMETH_OR_GIVEN (coding_system, query,
2438 (coding_system, buf, e, flags), Qunbound);
2439
2440 if (UNBOUNDP (result))
2441 {
2442 signal_error (Qtext_conversion_error,
2443 "Coding system doesn't say what it can encode",
2444 XCODING_SYSTEM_NAME (coding_system));
2445 }
2446
2447 result = (NILP (result)) ? Qt : values2 (Qnil, result);
2448
2449 return unbind_to_1 (speccount, result);
2450 }
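The docstring above says the second return value describes the positions of the unencodable characters as ranges. A minimal sketch of that idea follows. It is not the XEmacs implementation. The names `unencodable_range`, `latin1_can_encode` and `find_unencodable_ranges` are hypothetical, characters are modelled as plain integers, and ranges are half-open by assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one range-table entry: positions [start, end). */
struct unencodable_range
{
  size_t start;
  size_t end;
};

/* Hypothetical predicate for a coding system that can encode only the
   integer values 0 through 255. */
static int
latin1_can_encode (unsigned long ch)
{
  return ch <= 0xFF;
}

/* Scan TEXT and record each maximal run of characters that CAN_ENCODE
   rejects.  Returns the number of runs written to OUT.  A return value
   of 0 corresponds to the "region can be encoded" case. */
static size_t
find_unencodable_ranges (const unsigned long *text, size_t len,
                         int (*can_encode) (unsigned long),
                         struct unencodable_range *out, size_t max_out)
{
  size_t count = 0, i = 0;

  while (i < len)
    {
      size_t start;

      if (can_encode (text[i]))
        {
          i++;
          continue;
        }

      /* Extend the run as far as the characters remain unencodable. */
      start = i;
      while (i < len && !can_encode (text[i]))
        i++;

      assert (count < max_out);
      out[count].start = start;
      out[count].end = i;
      count++;
    }

  return count;
}
```

A caller in this sketch would treat a zero count as success and otherwise report the recorded ranges, mirroring the `t` versus `nil` plus range table convention described in the docstring.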
2451
2477 2452
2478 2453
2479 /************************************************************************/ 2454 /************************************************************************/
2480 /* Chain methods */ 2455 /* Chain methods */
2481 /************************************************************************/ 2456 /************************************************************************/
4548 DEFSUBR (Fautoload_coding_system); 4523 DEFSUBR (Fautoload_coding_system);
4549 DEFSUBR (Ffind_coding_system); 4524 DEFSUBR (Ffind_coding_system);
4550 DEFSUBR (Fget_coding_system); 4525 DEFSUBR (Fget_coding_system);
4551 DEFSUBR (Fcoding_system_list); 4526 DEFSUBR (Fcoding_system_list);
4552 DEFSUBR (Fcoding_system_name); 4527 DEFSUBR (Fcoding_system_name);
4553 DEFSUBR (Fmake_coding_system); 4528 DEFSUBR (Fmake_coding_system_internal);
4554 DEFSUBR (Fcopy_coding_system); 4529 DEFSUBR (Fcopy_coding_system);
4555 DEFSUBR (Fcoding_system_canonical_name_p); 4530 DEFSUBR (Fcoding_system_canonical_name_p);
4556 DEFSUBR (Fcoding_system_alias_p); 4531 DEFSUBR (Fcoding_system_alias_p);
4557 DEFSUBR (Fcoding_system_aliasee); 4532 DEFSUBR (Fcoding_system_aliasee);
4558 DEFSUBR (Fdefine_coding_system_alias); 4533 DEFSUBR (Fdefine_coding_system_alias);
4571 DEFSUBR (Fcoding_category_system); 4546 DEFSUBR (Fcoding_category_system);
4572 4547
4573 DEFSUBR (Fdetect_coding_region); 4548 DEFSUBR (Fdetect_coding_region);
4574 DEFSUBR (Fdecode_coding_region); 4549 DEFSUBR (Fdecode_coding_region);
4575 DEFSUBR (Fencode_coding_region); 4550 DEFSUBR (Fencode_coding_region);
4551 DEFSUBR (Fquery_coding_region);
4576 DEFSYMBOL_MULTIWORD_PREDICATE (Qcoding_systemp); 4552 DEFSYMBOL_MULTIWORD_PREDICATE (Qcoding_systemp);
4577 DEFSYMBOL (Qno_conversion); 4553 DEFSYMBOL (Qno_conversion);
4578 DEFSYMBOL (Qconvert_eol); 4554 DEFSYMBOL (Qconvert_eol);
4579 DEFSYMBOL (Qconvert_eol_autodetect); 4555 DEFSYMBOL (Qconvert_eol_autodetect);
4580 DEFSYMBOL (Qconvert_eol_lf); 4556 DEFSYMBOL (Qconvert_eol_lf);
4618 DEFSYMBOL (Qcanonicalize_after_coding); 4594 DEFSYMBOL (Qcanonicalize_after_coding);
4619 4595
4620 DEFSYMBOL (Qposix_charset_to_coding_system_hash); 4596 DEFSYMBOL (Qposix_charset_to_coding_system_hash);
4621 4597
4622 DEFSYMBOL (Qescape_quoted); 4598 DEFSYMBOL (Qescape_quoted);
4599
4600 DEFSYMBOL (Qquery_coding_warning_face);
4601 DEFSYMBOL (Qaliases);
4602 DEFSYMBOL (Qcharset_skip_chars_string);
4623 4603
4624 #ifdef HAVE_ZLIB 4604 #ifdef HAVE_ZLIB
4625 DEFSYMBOL (Qgzip); 4605 DEFSYMBOL (Qgzip);
4626 #endif 4606 #endif
4627 4607
4842 If non-nil, display debug information about detection operations in progress. 4822 If non-nil, display debug information about detection operations in progress.
4843 Information is displayed on stderr. 4823 Information is displayed on stderr.
4844 */ ); 4824 */ );
4845 Vdebug_coding_detection = Qnil; 4825 Vdebug_coding_detection = Qnil;
4846 #endif 4826 #endif
4827
4828 #ifdef MULE
4829 Vdefault_query_coding_region_chartab_cache
4830 = make_lisp_hash_table (25, HASH_TABLE_NON_WEAK, HASH_TABLE_EQUAL);
4831 staticpro (&Vdefault_query_coding_region_chartab_cache);
4832 #endif
4847 } 4833 }
4848 4834
4849 /* #### reformat this for consistent appearance? */ 4835 /* #### reformat this for consistent appearance? */
4850 4836
4851 void 4837 void
4852 complex_vars_of_file_coding (void) 4838 complex_vars_of_file_coding (void)
4853 { 4839 {
4854 Fmake_coding_system 4840 Fmake_coding_system_internal
4855 (Qconvert_eol_cr, Qconvert_eol, 4841 (Qconvert_eol_cr, Qconvert_eol,
4856 build_msg_string ("Convert CR to LF"), 4842 build_msg_string ("Convert CR to LF"),
4857 nconc2 (list6 (Qdocumentation, 4843 nconc2 (list6 (Qdocumentation,
4858 build_msg_string ( 4844 build_msg_string (
4859 "Converts CR (used to mark the end of a line on Macintosh systems) to LF\n" 4845 "Converts CR (used to mark the end of a line on Macintosh systems) to LF\n"
4861 Qmnemonic, build_string ("CR->LF"), 4847 Qmnemonic, build_string ("CR->LF"),
4862 Qsubtype, Qcr), 4848 Qsubtype, Qcr),
4863 /* VERY IMPORTANT! Tell make-coding-system not to generate 4849 /* VERY IMPORTANT! Tell make-coding-system not to generate
4864 subsidiaries -- it needs the coding systems we're creating 4850 subsidiaries -- it needs the coding systems we're creating
4865 to do so! */ 4851 to do so! */
4866 list2 (Qeol_type, Qlf))); 4852 list4 (Qeol_type, Qlf,
4867 4853 Qsafe_charsets, Qt)));
4868 Fmake_coding_system 4854
4855 Fmake_coding_system_internal
4869 (Qconvert_eol_lf, Qconvert_eol, 4856 (Qconvert_eol_lf, Qconvert_eol,
4870 build_msg_string ("Convert LF to LF (do nothing)"), 4857 build_msg_string ("Convert LF to LF (do nothing)"),
4871 nconc2 (list6 (Qdocumentation, 4858 nconc2 (list6 (Qdocumentation,
4872 build_msg_string ( 4859 build_msg_string (
4873 "Do nothing."), 4860 "Do nothing."),
4874 Qmnemonic, build_string ("LF->LF"), 4861 Qmnemonic, build_string ("LF->LF"),
4875 Qsubtype, Qlf), 4862 Qsubtype, Qlf),
4876 /* VERY IMPORTANT! Tell make-coding-system not to generate 4863 /* VERY IMPORTANT! Tell make-coding-system not to generate
4877 subsidiaries -- it needs the coding systems we're creating 4864 subsidiaries -- it needs the coding systems we're creating
4878 to do so! */ 4865 to do so! */
4879 list2 (Qeol_type, Qlf))); 4866 list4 (Qeol_type, Qlf,
4880 4867 Qsafe_charsets, Qt)));
4881 Fmake_coding_system 4868
4869 Fmake_coding_system_internal
4882 (Qconvert_eol_crlf, Qconvert_eol, 4870 (Qconvert_eol_crlf, Qconvert_eol,
4883 build_msg_string ("Convert CRLF to LF"), 4871 build_msg_string ("Convert CRLF to LF"),
4884 nconc2 (list6 (Qdocumentation, 4872 nconc2 (list6 (Qdocumentation,
4885 build_msg_string ( 4873 build_msg_string (
4886 "Converts CR+LF (used to mark the end of a line on MS-DOS and Windows systems) to LF\n" 4874 "Converts CR+LF (used to mark the end of a line on MS-DOS and Windows systems) to LF\n"
4887 "(used internally and under Unix to mark the end of a line)."), 4875 "(used internally and under Unix to mark the end of a line)."),
4888 Qmnemonic, build_string ("CRLF->LF"), 4876 Qmnemonic, build_string ("CRLF->LF"),
4889 Qsubtype, Qcrlf), 4877 Qsubtype, Qcrlf),
4878
4890 /* VERY IMPORTANT! Tell make-coding-system not to generate 4879 /* VERY IMPORTANT! Tell make-coding-system not to generate
4891 subsidiaries -- it needs the coding systems we're creating 4880 subsidiaries -- it needs the coding systems we're creating
4892 to do so! */ 4881 to do so! */
4893 list2 (Qeol_type, Qlf))); 4882 list4 (Qeol_type, Qlf,
4894 4883 Qsafe_charsets, Qt)));
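For readers unfamiliar with what a "Convert CRLF to LF" step does on decoding, here is a small standalone sketch. The function `crlf_to_lf` is hypothetical and is not taken from the XEmacs sources. In particular, passing a lone CR through unchanged is an assumption of this sketch, not a statement about how `convert-eol-crlf` behaves.

```c
#include <assert.h>
#include <stddef.h>

/* Copy LEN bytes from SRC to DST, replacing each CR LF pair with a
   single LF.  A CR that is not followed by LF is copied unchanged
   (an assumption of this sketch).  DST must have room for LEN bytes.
   Returns the number of bytes written. */
static size_t
crlf_to_lf (const char *src, size_t len, char *dst)
{
  size_t i, out = 0;

  assert (src != NULL && dst != NULL);

  for (i = 0; i < len; i++)
    {
      /* Drop the CR of a CR LF pair; the LF is copied on the next
         iteration. */
      if (src[i] == '\r' && i + 1 < len && src[i + 1] == '\n')
        continue;
      dst[out++] = src[i];
    }

  return out;
}
```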
4895 Fmake_coding_system 4884
4885 Fmake_coding_system_internal
4896 (Qconvert_eol_autodetect, Qconvert_eol, 4886 (Qconvert_eol_autodetect, Qconvert_eol,
4897 build_msg_string ("Autodetect EOL type"), 4887 build_msg_string ("Autodetect EOL type"),
4898 nconc2 (list6 (Qdocumentation, 4888 nconc2 (list6 (Qdocumentation,
4899 build_msg_string ( 4889 build_msg_string (
4900 "Autodetect the end-of-line type."), 4890 "Autodetect the end-of-line type."),
4901 Qmnemonic, build_string ("Auto-EOL"), 4891 Qmnemonic, build_string ("Auto-EOL"),
4902 Qsubtype, Qnil), 4892 Qsubtype, Qnil),
4903 /* VERY IMPORTANT! Tell make-coding-system not to generate 4893 /* VERY IMPORTANT! Tell make-coding-system not to generate
4904 subsidiaries -- it needs the coding systems we're creating 4894 subsidiaries -- it needs the coding systems we're creating
4905 to do so! */ 4895 to do so! */
4906 list2 (Qeol_type, Qlf))); 4896 list4 (Qeol_type, Qlf,
4907 4897 Qsafe_charsets, Qt)));
4908 Fmake_coding_system 4898
4899 Fmake_coding_system_internal
4909 (Qundecided, Qundecided, 4900 (Qundecided, Qundecided,
4910 build_msg_string ("Undecided (auto-detect)"), 4901 build_msg_string ("Undecided (auto-detect)"),
4911 nconc2 (list4 (Qdocumentation, 4902 nconc2 (list4 (Qdocumentation,
4912 build_msg_string 4903 build_msg_string
4913 ("Automatically detects the correct encoding."), 4904 ("Automatically detects the correct encoding."),
4916 /* We do EOL detection ourselves so we don't need to be 4907 /* We do EOL detection ourselves so we don't need to be
4917 wrapped in an EOL detector. (It doesn't actually hurt, 4908 wrapped in an EOL detector. (It doesn't actually hurt,
4918 though, I don't think.) */ 4909 though, I don't think.) */
4919 Qeol_type, Qlf))); 4910 Qeol_type, Qlf)));
4920 4911
4921 Fmake_coding_system 4912 Fmake_coding_system_internal
4922 (intern ("undecided-dos"), Qundecided, 4913 (intern ("undecided-dos"), Qundecided,
4923 build_msg_string ("Undecided (auto-detect) (CRLF)"), 4914 build_msg_string ("Undecided (auto-detect) (CRLF)"),
4924 nconc2 (list4 (Qdocumentation, 4915 nconc2 (list4 (Qdocumentation,
4925 build_msg_string 4916 build_msg_string
4926 ("Automatically detects the correct encoding; EOL type of CRLF forced."), 4917 ("Automatically detects the correct encoding; EOL type of CRLF forced."),
4927 Qmnemonic, build_string ("Auto")), 4918 Qmnemonic, build_string ("Auto")),
4928 list4 (Qdo_coding, Qt, 4919 list4 (Qdo_coding, Qt,
4929 Qeol_type, Qcrlf))); 4920 Qeol_type, Qcrlf)));
4930 4921
4931 Fmake_coding_system 4922 Fmake_coding_system_internal
4932 (intern ("undecided-unix"), Qundecided, 4923 (intern ("undecided-unix"), Qundecided,
4933 build_msg_string ("Undecided (auto-detect) (LF)"), 4924 build_msg_string ("Undecided (auto-detect) (LF)"),
4934 nconc2 (list4 (Qdocumentation, 4925 nconc2 (list4 (Qdocumentation,
4935 build_msg_string 4926 build_msg_string
4936 ("Automatically detects the correct encoding; EOL type of LF forced."), 4927 ("Automatically detects the correct encoding; EOL type of LF forced."),
4937 Qmnemonic, build_string ("Auto")), 4928 Qmnemonic, build_string ("Auto")),
4938 list4 (Qdo_coding, Qt, 4929 list4 (Qdo_coding, Qt,
4939 Qeol_type, Qlf))); 4930 Qeol_type, Qlf)));
4940 4931
4941 Fmake_coding_system 4932 Fmake_coding_system_internal
4942 (intern ("undecided-mac"), Qundecided, 4933 (intern ("undecided-mac"), Qundecided,
4943 build_msg_string ("Undecided (auto-detect) (CR)"), 4934 build_msg_string ("Undecided (auto-detect) (CR)"),
4944 nconc2 (list4 (Qdocumentation, 4935 nconc2 (list4 (Qdocumentation,
4945 build_msg_string 4936 build_msg_string
4946 ("Automatically detects the correct encoding; EOL type of CR forced."), 4937 ("Automatically detects the correct encoding; EOL type of CR forced."),
4947 Qmnemonic, build_string ("Auto")), 4938 Qmnemonic, build_string ("Auto")),
4948 list4 (Qdo_coding, Qt, 4939 list4 (Qdo_coding, Qt,
4949 Qeol_type, Qcr))); 4940 Qeol_type, Qcr)));
4950 4941
4951 /* Need to create this here or we're really screwed. */ 4942 /* Need to create this here or we're really screwed. */
4952 Fmake_coding_system 4943 Fmake_coding_system_internal
4953 (Qraw_text, Qno_conversion, 4944 (Qraw_text, Qno_conversion,
4954 build_msg_string ("Raw Text"), 4945 build_msg_string ("Raw Text"),
4955 list4 (Qdocumentation, 4946 nconc2 (list4 (Qdocumentation,
4956 build_msg_string ("Raw text converts only line-break codes, and acts otherwise like `binary'."), 4947 build_msg_string ("Raw text converts only line-break "
4957 Qmnemonic, build_string ("Raw"))); 4948 "codes, and acts otherwise like "
4958 4949 "`binary'."),
4959 Fmake_coding_system 4950 Qmnemonic, build_string ("Raw")),
4951 #ifdef MULE
4952 list2 (Qsafe_charsets, list3 (Vcharset_ascii, Vcharset_control_1,
4953 Vcharset_latin_iso8859_1))));
4954
4955 #else
4956 Qnil));
4957 #endif
4958
4959 Fmake_coding_system_internal
4960 (Qbinary, Qno_conversion, 4960 (Qbinary, Qno_conversion,
4961 build_msg_string ("Binary"), 4961 build_msg_string ("Binary"),
4962 list6 (Qdocumentation, 4962 nconc2 (list6 (Qdocumentation,
4963 build_msg_string ( 4963 build_msg_string (
4964 "This coding system is as close as it comes to doing no conversion.\n" 4964 "This coding system is as close as it comes to doing no conversion.\n"
4965 "On input, each byte is converted directly into the character\n" 4965 "On input, each byte is converted directly into the character\n"
4966 "with the corresponding code -- i.e. from the `ascii', `control-1',\n" 4966 "with the corresponding code -- i.e. from the `ascii', `control-1',\n"
4967 "or `latin-1' character sets. On output, these characters are\n" 4967 "or `latin-1' character sets. On output, these characters are\n"
4968 "converted back to the corresponding bytes, and other characters\n" 4968 "converted back to the corresponding bytes, and other characters\n"
4969 "are converted to the default character, i.e. `~'."), 4969 "are converted to the default character, i.e. `~'."),
4970 Qeol_type, Qlf, 4970 Qeol_type, Qlf,
4971 Qmnemonic, build_string ("Binary"))); 4971 Qmnemonic, build_string ("Binary")),
4972 #ifdef MULE
4973 list2 (Qsafe_charsets, list3 (Vcharset_ascii, Vcharset_control_1,
4974 Vcharset_latin_iso8859_1))));
4975
4976 #else
4977 Qnil));
4978 #endif
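The `binary' documentation string above describes the output side: characters from `ascii', `control-1' or `latin-1' go back to the corresponding bytes, and anything else becomes the default character `~'. A rough sketch of that rule, assuming characters are modelled as integers where those three charsets occupy 0 through 255 (the real Mule internal representation differs), with the hypothetical name `binary_encode`:

```c
#include <assert.h>
#include <stddef.h>

/* Encode LEN characters from CHARS into OUT, one byte per character.
   Values 0..255 map to the byte with the same value; every other
   value is replaced by '~', the default character named in the
   documentation string.  Returns the number of bytes written. */
static size_t
binary_encode (const unsigned long *chars, size_t len, unsigned char *out)
{
  size_t i;

  assert (chars != NULL && out != NULL);

  for (i = 0; i < len; i++)
    out[i] = chars[i] <= 0xFF ? (unsigned char) chars[i]
                              : (unsigned char) '~';

  return len;
}
```

Under this model the replacement by `~' is exactly the lossy case that a query would need to report, which is why a safe-charsets list restricted to those three charsets fits these coding systems.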
4972 4979
4973 /* Formerly aliased to raw-text! Completely bogus and not even the same 4980 /* Formerly aliased to raw-text! Completely bogus and not even the same
4974 as FSF Emacs. */ 4981 as FSF Emacs. */
4975 Fdefine_coding_system_alias (Qno_conversion, Qbinary); 4982 Fdefine_coding_system_alias (Qno_conversion, Qbinary);
4976 Fdefine_coding_system_alias (intern ("no-conversion-unix"), 4983 Fdefine_coding_system_alias (intern ("no-conversion-unix"),