comparison src/file-coding.c @ 4690:257b468bf2ca
Move the #'query-coding-region implementation to C.
This is necessary because there is no reasonable way to access the
corresponding mswindows-multibyte functionality from Lisp, and we need such
functionality if we're going to have a reliable and portable
#'query-coding-region implementation. However, this change doesn't yet
provide #'query-coding-region for the mswindows-multibyte coding systems;
there should be no functional differences between an XEmacs with this change
and one without it.
src/ChangeLog addition:
2009-09-19 Aidan Kehoe <kehoea@parhasard.net>
Move the #'query-coding-region implementation to C.
This is necessary because there is no reasonable way to access the
corresponding mswindows-multibyte functionality from Lisp, and we
need such functionality if we're going to have a reliable and
portable #'query-coding-region implementation. However, this
change doesn't yet provide #'query-coding-region for the
mswindows-multibyte coding systems; there should be no functional
differences between an XEmacs with this change and one without it.
* mule-coding.c (struct fixed_width_coding_system):
Add a new coding system type, fixed_width, and implement it. It
uses the CCL infrastructure but has a much simpler creation API,
and its own query_method, formerly in lisp/mule/mule-coding.el.
* unicode.c:
Move the Unicode query method implementation here from
unicode.el.
* lisp.h: Declare Fmake_coding_system_internal, Fcopy_range_table
here.
* intl-win32.c (complex_vars_of_intl_win32):
Use Fmake_coding_system_internal, not Fmake_coding_system.
* general-slots.h: Add Qsucceeded, Qunencodable, Qinvalid_sequence
here.
* file-coding.h (enum coding_system_variant):
Add fixed_width_coding_system here.
(struct coding_system_methods):
Add query_method and query_lstream_method to the coding system
methods.
Provide flags for the query methods.
Declare the default query method; initialise it correctly in
INITIALIZE_CODING_SYSTEM_TYPE.
* file-coding.c (default_query_method):
New function, the default query method for coding systems that do
not set it. Moved from coding.el.
(make_coding_system_1):
Accept new elements in PROPS in #'make-coding-system: aliases, a
list of aliases; safe-chars and safe-charsets (these were
previously accepted but not saved); and category.
(Fmake_coding_system_internal):
New function, what used to be #'make-coding-system. On Mule
builds, we've now moved some of the functionality of this to
Lisp.
(Fcoding_system_canonical_name_p):
Move this earlier in the file, since it's now called from within
make_coding_system_1.
(Fquery_coding_region):
Move the implementation of this here, from coding.el.
(complex_vars_of_file_coding):
Call Fmake_coding_system_internal, not Fmake_coding_system;
specify safe-charsets properties when we're a mule build.
* extents.h (mouse_highlight_priority, Fset_extent_priority,
Fset_extent_face, Fmap_extents):
Make these available to other C files.
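The default_query_method entry above is easiest to follow as a scan that collects runs of unencodable characters. The comparison for file-coding.c shows it walking the buffer, skipping characters found in the safe-chars table, and recording each failing run as a start-closed, end-open range. What follows is only a minimal sketch of that scan in plain C, not the XEmacs code. The names fail_range, is_safe and query_ranges are hypothetical, the 256-entry flag array stands in for a char table, and positions are 0-based array indices rather than buffer positions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a range-table entry: a half-open interval
   [start, end) of positions that cannot be encoded. */
struct fail_range { size_t start, end; };

/* Hypothetical stand-in for the safe-chars char table: one flag per code
   point below 256; every code point above that is treated as unsafe. */
static int
is_safe (const unsigned char safe[256], uint32_t ch)
{
  return ch < 256 && safe[ch];
}

/* Scan text[pos..end) and record each maximal run of unsafe characters
   in out[].  Returns the number of runs found; at most max_out of them
   are stored. */
static size_t
query_ranges (const uint32_t *text, size_t pos, size_t end,
              const unsigned char safe[256],
              struct fail_range *out, size_t max_out)
{
  size_t count = 0;

  assert (pos <= end);
  while (pos < end)
    {
      if (is_safe (safe, text[pos]))
        {
          pos++;
          continue;
        }

      /* Start of an unencodable run; extend it as far as it goes. */
      size_t start = pos;
      while (pos < end && !is_safe (safe, text[pos]))
        pos++;

      if (count < max_out)
        {
          out[count].start = start;
          out[count].end = pos;
        }
      count++;
    }
  return count;
}
```

With only ASCII marked safe, scanning the six characters 'a', U+00E9, U+20AC, 'b', 'c', U+3042 from position 0 yields two ranges, [1, 3) and [5, 6). The real function additionally signals an error or highlights each range with an extent, depending on its flags argument.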
lisp/ChangeLog addition:
2009-09-19 Aidan Kehoe <kehoea@parhasard.net>
Move the #'query-coding-region implementation to C.
* coding.el:
Consolidate code that depends on the presence or absence of Mule
at the end of this file.
(default-query-coding-region, query-coding-region):
Move these functions to C.
(default-query-coding-region-safe-charset-skip-chars-map):
Remove this variable; the corresponding C variable is
Vdefault_query_coding_region_chartab_cache in file-coding.c.
(query-coding-string): Update docstring to reflect actual multiple
values, be more careful about not modifying a range table that
we're currently mapping over.
(encode-coding-char): Make the implementation of this simpler.
(featurep 'mule): Autoload #'make-coding-system from
mule/make-coding-system.el if we're a mule build; provide an
appropriate compiler macro.
Do various non-mule compatibility things if we're not a mule
build.
* update-elc.el (additional-dump-dependencies):
Add mule/make-coding-system as a dump time dependency if we're a
mule build.
* unicode.el (ccl-encode-to-ucs-2):
(decode-char):
(encode-char):
Move these earlier in the file, for the sake of some byte compile
warnings.
(unicode-query-coding-region):
Move this to unicode.c.
* mule/make-coding-system.el:
New file, not dumped. Contains the functionality to rework the
arguments necessary for fixed-width coding systems, and contains
the implementation of #'make-coding-system, which now calls
#'make-coding-system-internal.
* mule/vietnamese.el (viscii):
* mule/latin.el (iso-8859-2):
(windows-1250):
(iso-8859-3):
(iso-8859-4):
(iso-8859-14):
(iso-8859-15):
(iso-8859-16):
(iso-8859-9):
(macintosh):
(windows-1252):
* mule/hebrew.el (iso-8859-8):
* mule/greek.el (iso-8859-7):
(windows-1253):
* mule/cyrillic.el (iso-8859-5):
(koi8-r):
(koi8-u):
(windows-1251):
(alternativnyj):
(koi8-ru):
(koi8-t):
(koi8-c):
(koi8-o):
* mule/arabic.el (iso-8859-6):
(windows-1256):
Move all these coding systems to being of type fixed-width, not of
type CCL. This allows the distinct query-coding-region for them to
be in C, something which will eventually allow us to implement
query-coding-region for the mswindows-multibyte coding systems.
* mule/general-late.el (posix-charset-to-coding-system-hash):
Document why we're pre-emptively persuading the byte compiler that
the ELC for this file needs to be written using escape-quoted.
Call #'set-unicode-query-skip-chars-args, now the Unicode
query-coding-region implementation is in C.
* mule/thai-xtis.el (tis-620):
Don't bother checking whether we're XEmacs or not here.
* mule/mule-coding.el:
Move the eight bit fixed-width functionality from this file to
make-coding-system.el.
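The coding systems moved to type fixed-width above each map a single byte to a single character, so a query can plausibly reduce to asking whether every character has a byte. The sketch below illustrates that idea in plain C under stated assumptions: fixed_width_map, NO_MAPPING and the three functions are hypothetical names, and the linear search is chosen for brevity. The actual query method lives in mule-coding.c, which is not part of this comparison, and it builds on the CCL infrastructure rather than on a structure like this one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker for a byte with no assigned character. */
#define NO_MAPPING 0xFFFFFFFFu

/* Hypothetical model of a fixed-width coding system: byte -> code point. */
struct fixed_width_map { uint32_t decode[256]; };

/* Fill MAP with the identity mapping for ASCII and leave the upper half
   unassigned; callers then fill in the bytes their coding system defines. */
static void
fixed_width_init_ascii (struct fixed_width_map *map)
{
  for (int byte = 0; byte < 256; byte++)
    map->decode[byte] = byte < 128 ? (uint32_t) byte : NO_MAPPING;
}

/* Return the byte that encodes CH, or -1 if CH has no encoding. */
static int
fixed_width_encode (const struct fixed_width_map *map, uint32_t ch)
{
  if (ch == NO_MAPPING)
    return -1;
  for (int byte = 0; byte < 256; byte++)
    if (map->decode[byte] == ch)
      return byte;
  return -1;
}

/* Return the index of the first character in text[0..len) that cannot
   be encoded, or len if every character can be. */
static size_t
first_unencodable (const struct fixed_width_map *map,
                   const uint32_t *text, size_t len)
{
  size_t i;

  for (i = 0; i < len; i++)
    if (fixed_width_encode (map, text[i]) < 0)
      break;
  return i;
}
```

For example, after fixed_width_init_ascii and assigning U+20AC to byte 0xA4 (the position of the euro sign in ISO 8859-15), the text 'o', 'k', U+20AC, U+3042 is encodable up to index 3, where the hiragana character fails.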
tests/ChangeLog addition:
2009-09-19 Aidan Kehoe <kehoea@parhasard.net>
* automated/mule-tests.el:
Check a coding system's type, not an 8-bit-fixed property, for
whether that coding system should be treated as a fixed-width
coding system.
* automated/query-coding-tests.el:
Don't test the query coding functionality for mswindows-multibyte
coding systems; it's not yet implemented.
author | Aidan Kehoe <kehoea@parhasard.net> |
---|---|
date | Sat, 19 Sep 2009 22:53:13 +0100 |
parents | e4ed58cb0e5b |
children | a9833e8a32ec e0db3c197671 |
4689:0636c6ccb430 | 4690:257b468bf2ca |
---|---|
76 #include "elhash.h" | 76 #include "elhash.h" |
77 #include "insdel.h" | 77 #include "insdel.h" |
78 #include "lstream.h" | 78 #include "lstream.h" |
79 #include "opaque.h" | 79 #include "opaque.h" |
80 #include "file-coding.h" | 80 #include "file-coding.h" |
81 #include "extents.h" | |
82 #include "rangetab.h" | |
83 #include "chartab.h" | |
81 | 84 |
82 #ifdef HAVE_ZLIB | 85 #ifdef HAVE_ZLIB |
83 #include "zlib.h" | 86 #include "zlib.h" |
84 #endif | 87 #endif |
85 | 88 |
87 Lisp_Object Vterminal_coding_system; | 90 Lisp_Object Vterminal_coding_system; |
88 Lisp_Object Vcoding_system_for_read; | 91 Lisp_Object Vcoding_system_for_read; |
89 Lisp_Object Vcoding_system_for_write; | 92 Lisp_Object Vcoding_system_for_write; |
90 Lisp_Object Vfile_name_coding_system; | 93 Lisp_Object Vfile_name_coding_system; |
91 | 94 |
95 Lisp_Object Qaliases, Qcharset_skip_chars_string; | |
96 | |
92 #ifdef DEBUG_XEMACS | 97 #ifdef DEBUG_XEMACS |
93 Lisp_Object Vdebug_coding_detection; | 98 Lisp_Object Vdebug_coding_detection; |
99 #endif | |
100 | |
101 #ifdef MULE | |
102 extern Lisp_Object Vcharset_ascii, Vcharset_control_1, | |
103 Vcharset_latin_iso8859_1; | |
94 #endif | 104 #endif |
95 | 105 |
96 typedef struct coding_system_type_entry | 106 typedef struct coding_system_type_entry |
97 { | 107 { |
98 struct coding_system_methods *meths; | 108 struct coding_system_methods *meths; |
415 valid_coding_system_type_p (Lisp_Object type) | 425 valid_coding_system_type_p (Lisp_Object type) |
416 { | 426 { |
417 return decode_coding_system_type (type, ERROR_ME_NOT) != 0; | 427 return decode_coding_system_type (type, ERROR_ME_NOT) != 0; |
418 } | 428 } |
419 | 429 |
430 #ifdef MULE | |
431 static Lisp_Object Vdefault_query_coding_region_chartab_cache; | |
432 | |
433 /* Non-static because it's used in INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA. */ | |
434 Lisp_Object | |
435 default_query_method (Lisp_Object codesys, struct buffer *buf, | |
436 Charbpos end, int flags) | |
437 { | |
438 Charbpos pos = BUF_PT (buf), fail_range_start, fail_range_end; | |
439 Charbpos pos_byte = BYTE_BUF_PT (buf); | |
440 Lisp_Object safe_charsets = XCODING_SYSTEM_SAFE_CHARSETS (codesys); | |
441 Lisp_Object safe_chars = XCODING_SYSTEM_SAFE_CHARS (codesys), | |
442 result = Qnil; | |
443 enum query_coding_failure_reasons failed_reason, | |
444 previous_failed_reason = query_coding_succeeded; | |
445 | |
446 /* safe-charsets of t means the coding system can encode everything. */ | |
447 if (EQ (Qnil, safe_chars)) | |
448 { | |
449 if (EQ (Qt, safe_charsets)) | |
450 { | |
451 return Qnil; | |
452 } | |
453 | |
454 /* If we've no information on what characters the coding system can | |
455 encode, give up. */ | |
456 if (EQ (Qnil, safe_charsets) && EQ (Qnil, safe_chars)) | |
457 { | |
458 return Qunbound; | |
459 } | |
460 | |
461 safe_chars = Fgethash (safe_charsets, | |
462 Vdefault_query_coding_region_chartab_cache, | |
463 Qnil); | |
464 if (NILP (safe_chars)) | |
465 { | |
466 safe_chars = Fmake_char_table (Qgeneric); | |
467 { | |
468 EXTERNAL_LIST_LOOP_2 (safe_charset, safe_charsets) | |
469 Fput_char_table (safe_charset, Qt, safe_chars); | |
470 } | |
471 | |
472 Fputhash (safe_charsets, safe_chars, | |
473 Vdefault_query_coding_region_chartab_cache); | |
474 } | |
475 } | |
476 | |
477 if (flags & QUERY_METHOD_HIGHLIGHT && | |
478 /* If we're being called really early, live without highlights getting | |
479 cleared properly: */ | |
480 !(UNBOUNDP (XSYMBOL (Qquery_coding_clear_highlights)->function))) | |
481 { | |
482 /* It's okay to call Lisp here, the only non-stack object we may have | |
483 allocated up to this point is safe_chars, and that's | |
484 reachable from its entry in | |
485 Vdefault_query_coding_region_chartab_cache */ | |
486 call3 (Qquery_coding_clear_highlights, make_int (pos), make_int (end), | |
487 wrap_buffer (buf)); | |
488 } | |
489 | |
490 while (pos < end) | |
491 { | |
492 Ichar ch = BYTE_BUF_FETCH_CHAR (buf, pos_byte); | |
493 if (!EQ (Qnil, get_char_table (ch, safe_chars))) | |
494 { | |
495 pos++; | |
496 INC_BYTEBPOS (buf, pos_byte); | |
497 } | |
498 else | |
499 { | |
500 fail_range_start = pos; | |
501 while ((pos < end) && | |
502 (EQ (Qnil, get_char_table (ch, safe_chars)) | |
503 && (failed_reason = query_coding_unencodable)) | |
504 && (previous_failed_reason == query_coding_succeeded | |
505 || previous_failed_reason == failed_reason)) | |
506 { | |
507 pos++; | |
508 INC_BYTEBPOS (buf, pos_byte); | |
509 ch = BYTE_BUF_FETCH_CHAR (buf, pos_byte); | |
510 previous_failed_reason = failed_reason; | |
511 } | |
512 | |
513 if (fail_range_start == pos) | |
514 { | |
515 /* The character can actually be encoded; move on. */ | |
516 pos++; | |
517 INC_BYTEBPOS (buf, pos_byte); | |
518 } | |
519 else | |
520 { | |
521 assert (previous_failed_reason == query_coding_unencodable); | |
522 | |
523 if (flags & QUERY_METHOD_ERRORP) | |
524 { | |
525 DECLARE_EISTRING (error_details); | |
526 | |
527 eicpy_ascii (error_details, "Cannot encode "); | |
528 eicat_lstr (error_details, | |
529 make_string_from_buffer (buf, fail_range_start, | |
530 pos - | |
531 fail_range_start)); | |
532 eicat_ascii (error_details, " using coding system"); | |
533 | |
534 signal_error (Qtext_conversion_error, | |
535 (const CIbyte *)(eidata (error_details)), | |
536 XCODING_SYSTEM_NAME (codesys)); | |
537 } | |
538 | |
539 if (NILP (result)) | |
540 { | |
541 result = Fmake_range_table (Qstart_closed_end_open); | |
542 } | |
543 | |
544 fail_range_end = pos; | |
545 | |
546 Fput_range_table (make_int (fail_range_start), | |
547 make_int (fail_range_end), | |
548 Qunencodable, | |
549 result); | |
550 previous_failed_reason = query_coding_succeeded; | |
551 | |
552 if (flags & QUERY_METHOD_HIGHLIGHT) | |
553 { | |
554 Lisp_Object extent | |
555 = Fmake_extent (make_int (fail_range_start), | |
556 make_int (fail_range_end), | |
557 wrap_buffer (buf)); | |
558 | |
559 Fset_extent_priority | |
560 (extent, make_int (2 + mouse_highlight_priority)); | |
561 Fset_extent_face (extent, Qquery_coding_warning_face); | |
562 } | |
563 } | |
564 } | |
565 } | |
566 | |
567 return result; | |
568 } | |
569 #else | |
570 Lisp_Object | |
571 default_query_method (Lisp_Object UNUSED (codesys), | |
572 struct buffer * UNUSED (buf), | |
573 Charbpos UNUSED (end), int UNUSED (flags)) | |
574 { | |
575 return Qnil; | |
576 } | |
577 #endif /* defined MULE */ | |
578 | |
420 DEFUN ("valid-coding-system-type-p", Fvalid_coding_system_type_p, 1, 1, 0, /* | 579 DEFUN ("valid-coding-system-type-p", Fvalid_coding_system_type_p, 1, 1, 0, /* |
421 Given a CODING-SYSTEM-TYPE, return non-nil if it is valid. | 580 Given a CODING-SYSTEM-TYPE, return non-nil if it is valid. |
422 Valid types depend on how XEmacs was compiled but may include | 581 Valid types depend on how XEmacs was compiled but may include |
423 `undecided', `chain', `integer', `ccl', `iso2022', `big5', `shift-jis', | 582 `undecided', `chain', `integer', `ccl', `iso2022', `big5', `shift-jis', |
424 `utf-16', `ucs-4', `utf-8', etc. | 583 `utf-16', `ucs-4', `utf-8', etc. |
980 XCODING_SYSTEM_SUBSIDIARY_PARENT (sub_codesys) = codesys; | 1139 XCODING_SYSTEM_SUBSIDIARY_PARENT (sub_codesys) = codesys; |
981 XCODING_SYSTEM (codesys)->eol[eol] = sub_codesys; | 1140 XCODING_SYSTEM (codesys)->eol[eol] = sub_codesys; |
982 } | 1141 } |
983 } | 1142 } |
984 | 1143 |
1144 DEFUN ("coding-system-canonical-name-p", Fcoding_system_canonical_name_p, | |
1145 1, 1, 0, /* | |
1146 Return t if OBJECT names a coding system, and is not a coding system alias. | |
1147 */ | |
1148 (object)) | |
1149 { | |
1150 return CODING_SYSTEMP (Fgethash (object, Vcoding_system_hash_table, Qnil)) | |
1151 ? Qt : Qnil; | |
1152 } | |
1153 | |
985 /* Basic function to create new coding systems. For `make-coding-system', | 1154 /* Basic function to create new coding systems. For `make-coding-system', |
986 NAME-OR-EXISTING is the NAME argument, PREFIX is null, and TYPE, | 1155 NAME-OR-EXISTING is the NAME argument, PREFIX is null, and TYPE, |
987 DESCRIPTION, and PROPS are the same. All created coding systems are put | 1156 DESCRIPTION, and PROPS are the same. All created coding systems are put |
988 in a hash table indexed by NAME. | 1157 in a hash table indexed by NAME. |
989 | 1158 |
1028 Lisp_Coding_System *cs; | 1197 Lisp_Coding_System *cs; |
1029 int need_to_setup_eol_systems = 1; | 1198 int need_to_setup_eol_systems = 1; |
1030 enum eol_type eol_wrapper = EOL_AUTODETECT; | 1199 enum eol_type eol_wrapper = EOL_AUTODETECT; |
1031 struct coding_system_methods *meths; | 1200 struct coding_system_methods *meths; |
1032 Lisp_Object csobj; | 1201 Lisp_Object csobj; |
1033 Lisp_Object defmnem = Qnil; | 1202 Lisp_Object defmnem = Qnil, aliases = Qnil; |
1034 | 1203 |
1035 if (NILP (type)) | 1204 if (NILP (type)) |
1036 type = Qundecided; | 1205 type = Qundecided; |
1037 meths = decode_coding_system_type (type, ERROR_ME); | 1206 meths = decode_coding_system_type (type, ERROR_ME); |
1038 | 1207 |
1117 | 1286 |
1118 else if (EQ (key, Qpost_read_conversion)) | 1287 else if (EQ (key, Qpost_read_conversion)) |
1119 CODING_SYSTEM_POST_READ_CONVERSION (cs) = value; | 1288 CODING_SYSTEM_POST_READ_CONVERSION (cs) = value; |
1120 else if (EQ (key, Qpre_write_conversion)) | 1289 else if (EQ (key, Qpre_write_conversion)) |
1121 CODING_SYSTEM_PRE_WRITE_CONVERSION (cs) = value; | 1290 CODING_SYSTEM_PRE_WRITE_CONVERSION (cs) = value; |
1291 else if (EQ (key, Qaliases)) | |
1292 { | |
1293 EXTERNAL_LIST_LOOP_2 (alias, value) | |
1294 { | |
1295 CHECK_SYMBOL (alias); | |
1296 | |
1297 if (!NILP (Fcoding_system_canonical_name_p (alias))) | |
1298 { | |
1299 invalid_change ("Symbol is the canonical name of a " | |
1300 "coding system and cannot be redefined", | |
1301 alias); | |
1302 } | |
1303 } | |
1304 aliases = value; | |
1305 } | |
1122 /* FSF compatibility */ | 1306 /* FSF compatibility */ |
1123 else if (EQ (key, Qtranslation_table_for_decode)) | 1307 else if (EQ (key, Qtranslation_table_for_decode)) |
1124 ; | 1308 ; |
1125 else if (EQ (key, Qtranslation_table_for_encode)) | 1309 else if (EQ (key, Qtranslation_table_for_encode)) |
1126 ; | 1310 ; |
1127 else if (EQ (key, Qsafe_chars)) | 1311 else if (EQ (key, Qsafe_chars)) |
1128 CODING_SYSTEM_SAFE_CHARS (cs) = value; | 1312 { |
1313 CHECK_CHAR_TABLE (value); | |
1314 CODING_SYSTEM_SAFE_CHARS (cs) = value; | |
1315 } | |
1129 else if (EQ (key, Qsafe_charsets)) | 1316 else if (EQ (key, Qsafe_charsets)) |
1130 CODING_SYSTEM_SAFE_CHARSETS (cs) = value; | 1317 { |
1318 if (!EQ (Qt, value) | |
1319 /* Would be nice to actually do this check, but there are | |
1320 some order conflicts with japanese.el and | |
1321 mule-coding.el */ | |
1322 && 0) | |
1323 { | |
1324 #ifdef MULE | |
1325 EXTERNAL_LIST_LOOP_2 (safe_charset, value) | |
1326 CHECK_CHARSET (Ffind_charset (safe_charset)); | |
1327 #endif | |
1328 } | |
1329 | |
1330 CODING_SYSTEM_SAFE_CHARSETS (cs) = value; | |
1331 } | |
1332 else if (EQ (key, Qcategory)) | |
1333 { | |
1334 Fput (name_or_existing, intern ("coding-system-property"), | |
1335 Fplist_put (Fget (name_or_existing, | |
1336 intern ("coding-system-property"), | |
1337 Qnil), | |
1338 Qcategory, value)); | |
1339 } | |
1131 else if (EQ (key, Qmime_charset)) | 1340 else if (EQ (key, Qmime_charset)) |
1132 ; | 1341 ; |
1133 else if (EQ (key, Qvalid_codes)) | 1342 else if (EQ (key, Qvalid_codes)) |
1134 ; | 1343 ; |
1135 else | 1344 else |
1184 Qconvert_eol_crlf), | 1393 Qconvert_eol_crlf), |
1185 Qcanonicalize_after_coding, | 1394 Qcanonicalize_after_coding, |
1186 csobj)); | 1395 csobj)); |
1187 } | 1396 } |
1188 XCODING_SYSTEM_EOL_TYPE (csobj) = eol_wrapper; | 1397 XCODING_SYSTEM_EOL_TYPE (csobj) = eol_wrapper; |
1398 | |
1399 { | |
1400 EXTERNAL_LIST_LOOP_2 (alias, aliases) | |
1401 Fdefine_coding_system_alias (alias, csobj); | |
1402 } | |
1189 } | 1403 } |
1190 | 1404 |
1191 return csobj; | 1405 return csobj; |
1192 } | 1406 } |
1193 | 1407 |
1197 Lisp_Object props) | 1411 Lisp_Object props) |
1198 { | 1412 { |
1199 return make_coding_system_1 (existing, prefix, type, description, props); | 1413 return make_coding_system_1 (existing, prefix, type, description, props); |
1200 } | 1414 } |
1201 | 1415 |
1202 DEFUN ("make-coding-system", Fmake_coding_system, 2, 4, 0, /* | 1416 DEFUN ("make-coding-system-internal", Fmake_coding_system_internal, 2, 4, 0, /* |
1203 Register symbol NAME as a coding system. | 1417 See `make-coding-system'. This does much of the work of that function. |
1204 | 1418 |
1205 TYPE describes the conversion method used and should be one of | 1419 Without Mule support, it does all the work of that function, and an alias |
1206 | 1420 exists, mapping `make-coding-system' to |
1207 nil or `undecided' | 1421 `make-coding-system-internal'. You'll need a non-Mule XEmacs to read the |
1208 Automatic conversion. XEmacs attempts to detect the coding system | 1422 complete docstring. Or you can just read it in make-coding-system.el; |
1209 used in the file. | 1423 something like the following should work: |
1210 `chain' | 1424 |
1211 Chain two or more coding systems together to make a combination coding | 1425 \\[find-function-other-window] find-file RET \\[find-file] mule/make-coding-system.el RET |
1212 system. | |
1213 `no-conversion' | |
1214 No conversion. Use this for binary files and such. On output, | |
1215 graphic characters that are not in ASCII or Latin-1 will be | |
1216 replaced by a ?. (For a no-conversion-encoded buffer, these | |
1217 characters will only be present if you explicitly insert them.) | |
1218 `convert-eol' | |
1219 Convert CRLF sequences or CR to LF. | |
1220 `shift-jis' | |
1221 Shift-JIS (a Japanese encoding commonly used in PC operating systems). | |
1222 `unicode' | |
1223 Any Unicode encoding (UCS-4, UTF-8, UTF-16, etc.). | |
1224 `mswindows-unicode-to-multibyte' | |
1225 (MS Windows only) Converts from Windows Unicode to Windows Multibyte | |
1226 (any code page encoding) upon encoding, and the other way upon decoding. | |
1227 `mswindows-multibyte' | |
1228 Converts to or from Windows Multibyte (any code page encoding). | |
1229 This is resolved into a chain of `mswindows-unicode' and | |
1230 `mswindows-unicode-to-multibyte'. | |
1231 `iso2022' | |
1232 Any ISO2022-compliant encoding. Among other things, this includes | |
1233 JIS (the Japanese encoding commonly used for e-mail), EUC (the | |
1234 standard Unix encoding for Japanese and other languages), and | |
1235 Compound Text (the encoding used in X11). You can specify more | |
1236 specific information about the conversion with the PROPS argument. | |
1237 `big5' | |
1238 Big5 (the encoding commonly used for Mandarin Chinese in Taiwan). | |
1239 `ccl' | |
1240 The conversion is performed using a user-written pseudo-code | |
1241 program. CCL (Code Conversion Language) is the name of this | |
1242 pseudo-code. | |
1243 `gzip' | |
1244 GZIP compression format. | |
1245 `internal' | |
1246 Write out or read in the raw contents of the memory representing | |
1247 the buffer's text. This is primarily useful for debugging | |
1248 purposes, and is only enabled when XEmacs has been compiled with | |
1249 DEBUG_XEMACS defined (via the --debug configure option). | |
1250 WARNING: Reading in a file using `internal' conversion can result | |
1251 in an internal inconsistency in the memory representing a | |
1252 buffer's text, which will produce unpredictable results and may | |
1253 cause XEmacs to crash. Under normal circumstances you should | |
1254 never use `internal' conversion. | |
1255 | |
1256 DESCRIPTION is a short English phrase describing the coding system, | |
1257 suitable for use as a menu item. (See also the `documentation' property | |
1258 below.) | |
1259 | |
1260 PROPS is a property list, describing the specific nature of the | |
1261 character set. Recognized properties are: | |
1262 | |
1263 `mnemonic' | |
1264 String to be displayed in the modeline when this coding system is | |
1265 active. | |
1266 | |
1267 `documentation' | |
1268 Detailed documentation on the coding system. | |
1269 | |
1270 `eol-type' | |
1271 End-of-line conversion to be used. It should be one of | |
1272 | |
1273 nil | |
1274 Automatically detect the end-of-line type (LF, CRLF, | |
1275 or CR). Also generate subsidiary coding systems named | |
1276 `NAME-unix', `NAME-dos', and `NAME-mac', that are | |
1277 identical to this coding system but have an EOL-TYPE | |
1278 value of `lf', `crlf', and `cr', respectively. | |
1279 `lf' | |
1280 The end of a line is marked externally using ASCII LF. | |
1281 Since this is also the way that XEmacs represents an | |
1282 end-of-line internally, specifying this option results | |
1283 in no end-of-line conversion. This is the standard | |
1284 format for Unix text files. | |
1285 `crlf' | |
1286 The end of a line is marked externally using ASCII | |
1287 CRLF. This is the standard format for MS-DOS text | |
1288 files. | |
1289 `cr' | |
1290 The end of a line is marked externally using ASCII CR. | |
1291 This is the standard format for Macintosh text files. | |
1292 t | |
1293 Automatically detect the end-of-line type but do not | |
1294 generate subsidiary coding systems. (This value is | |
1295 converted to nil when stored internally, and | |
1296 `coding-system-property' will return nil.) | |
1297 | |
1298 `post-read-conversion' | |
1299 The value is a function to call after some text is inserted and | |
1300 decoded by the coding system itself and before any functions in | |
1301 `after-change-functions' are called. (#### Not actually true in | |
1302 XEmacs. `after-change-functions' will be called twice if | |
1303 `post-read-conversion' changes something.) The argument of this | |
1304 function is the same as for a function in | |
1305 `after-insert-file-functions', i.e. LENGTH of the text inserted, | |
1306 with point at the head of the text to be decoded. | |
1307 | |
1308 `pre-write-conversion' | |
1309 The value is a function to call after all functions in | |
1310 `write-region-annotate-functions' and `buffer-file-format' are | |
1311 called, and before the text is encoded by the coding system itself. | |
1312 The arguments to this function are the same as those of a function | |
1313 in `write-region-annotate-functions', i.e. FROM and TO, specifying | |
1314 a region of text. | |
1315 | |
1316 | |
1317 | |
1318 The following properties are allowed for FSF compatibility but currently | |
1319 ignored: | |
1320 | |
1321 `translation-table-for-decode' | |
1322 The value is a translation table to be applied on decoding. See | |
1323 the function `make-translation-table' for the format of translation | |
1324 table. This is not applicable to CCL-based coding systems. | |
1325 | |
1326 `translation-table-for-encode' | |
1327 The value is a translation table to be applied on encoding. This is | |
1328 not applicable to CCL-based coding systems. | |
1329 | |
1330 `mime-charset' | |
1331 The value is a symbol of which name is `MIME-charset' parameter of | |
1332 the coding system. | |
1333 | |
1334 `valid-codes' (meaningful only for a coding system based on CCL) | |
1335 The value is a list to indicate valid byte ranges of the encoded | |
1336 file. Each element of the list is an integer or a cons of integers. | |
1337 In the former case, the integer value is a valid byte code. In the | |
1338 latter case, the integers specify the range of valid byte codes. | |
1339 | |
1340 The following properties are used by `default-query-coding-region', | |
1341 the default implementation of `query-coding-region'. This | |
1342 implementation and these properties are not used by the Unicode coding | |
1343 systems, nor by those CCL coding systems created with | |
1344 `make-8-bit-coding-system'. | |
1345 | |
1346 `safe-chars' | |
1347 The value is a char table. If a character has non-nil value in it, | |
1348 the character is safely supported by the coding system. | |
1349 Under XEmacs, for the moment, this is used in addition to the | |
1350 `safe-charsets' property. It does not override it as it does | |
1351 under GNU Emacs. #### We need to consider if we should keep this | |
1352 behaviour. | |
1353 | |
1354 `safe-charsets' | |
1355 The value is a list of charsets safely supported by the coding | |
1356 system. For coding systems based on ISO 2022, XEmacs may try to | |
1357 encode characters outside these character sets, but outside of | |
1358 East Asia and East Asian coding systems, it is unlikely that | |
1359 consumers of the data will understand XEmacs' encoding. | |
1360 The value t means that all character sets XEmacs handles are supported. | |
1361 | |
1362 The following additional property is recognized if TYPE is `convert-eol': | |
1363 | |
1364 `subtype' | |
1365 One of `lf', `crlf', `cr' or nil (for autodetection). When decoding, | |
1366 the corresponding sequence will be converted to LF. When encoding, | |
1367 the opposite happens. This coding system converts characters to | |
1368 characters. | |
1369 | |
1370 | |
1371 | |
1372 The following additional properties are recognized if TYPE is `iso2022': | |
1373 | |
1374 `charset-g0' | |
1375 `charset-g1' | |
1376 `charset-g2' | |
1377 `charset-g3' | |
1378 The character set initially designated to the G0 - G3 registers. | |
1379 The value should be one of | |
1380 | |
1381 -- A charset object (designate that character set) | |
1382 -- nil (do not ever use this register) | |
1383 -- t (no character set is initially designated to | |
1384 the register, but may be later on; this automatically | |
1385 sets the corresponding `force-g*-on-output' property) | |
1386 | |
1387 `force-g0-on-output' | |
1388 `force-g1-on-output' | |
1389 `force-g2-on-output' | |
1390 `force-g3-on-output' | |
1391 If non-nil, send an explicit designation sequence on output before | |
1392 using the specified register. | |
1393 | |
1394 `short' | |
1395 If non-nil, use the short forms "ESC $ @", "ESC $ A", and | |
1396 "ESC $ B" on output in place of the full designation sequences | |
1397 "ESC $ ( @", "ESC $ ( A", and "ESC $ ( B". | |
1398 | |
1399 `no-ascii-eol' | |
1400 If non-nil, don't designate ASCII to G0 at each end of line on output. | |
1401 Setting this to non-nil also suppresses other state-resetting that | |
1402 normally happens at the end of a line. | |
1403 | |
1404 `no-ascii-cntl' | |
1405 If non-nil, don't designate ASCII to G0 before control chars on output. | |
1406 | |
1407 `seven' | |
1408 If non-nil, use 7-bit environment on output. Otherwise, use 8-bit | |
1409 environment. | |
1410 | |
1411 `lock-shift' | |
1412 If non-nil, use locking-shift (SO/SI) instead of single-shift | |
1413 or designation by escape sequence. | |
1414 | |
1415 `no-iso6429' | |
1416 If non-nil, don't use ISO6429's direction specification. | |
1417 | |
1418 `escape-quoted' | |
1419 If non-nil, literal control characters that are the same as | |
1420 the beginning of a recognized ISO2022 or ISO6429 escape sequence | |
1421 (in particular, ESC (0x1B), SO (0x0E), SI (0x0F), SS2 (0x8E), | |
1422 SS3 (0x8F), and CSI (0x9B)) are "quoted" with an escape character | |
1423 so that they can be properly distinguished from an escape sequence. | |
1424 (Note that doing this results in a non-portable encoding.) This | |
1425 encoding flag is used for byte-compiled files. Note that ESC | |
1426 is a good choice for a quoting character because there are no | |
1427 escape sequences whose second byte is a character from the Control-0 | |
1428 or Control-1 character sets; this is explicitly disallowed by the | |
1429 ISO2022 standard. | |
1430 | |
1431 `input-charset-conversion' | |
1432 A list of conversion specifications, specifying conversion of | |
1433 characters in one charset to another when decoding is performed. | |
1434 Each specification is a list of two elements: the source charset, | |
1435 and the destination charset. | |
1436 | |
1437 `output-charset-conversion' | |
1438 A list of conversion specifications, specifying conversion of | |
1439 characters in one charset to another when encoding is performed. | |
1440 The form of each specification is the same as for | |
1441 `input-charset-conversion'. | |
1442 | |
1443 | |
1444 | |
1445 The following additional properties are recognized (and required) | |
1446 if TYPE is `ccl': | |
1447 | |
1448 `decode' | |
1449 CCL program used for decoding (converting to internal format). | |
1450 | |
1451 `encode' | |
1452 CCL program used for encoding (converting to external format). | |
1453 | |
1454 | |
1455 The following additional properties are recognized if TYPE is `chain': | |
1456 | |
1457 `chain' | |
1458 List of coding systems to be chained together, in decoding order. | |
1459 | |
1460 `canonicalize-after-coding' | |
1461 Coding system to be returned by the detector routines in place of | |
1462 this coding system. | |
1463 | |
1464 | |
1465 | |
1466 The following additional properties are recognized if TYPE is `unicode': | |
1467 | |
1468 `unicode-type' | |
1469 One of `utf-16', `utf-8', `ucs-4', or `utf-7' (the latter is not | |
1470 yet implemented). `utf-16' is the basic two-byte encoding; | |
1471 `ucs-4' is the four-byte encoding; `utf-8' is an ASCII-compatible | |
1472 variable-width 8-bit encoding; `utf-7' is a 7-bit encoding using | |
1473 only characters that will safely pass through all mail gateways. | |
1474 [[ This should be \"transformation format\". There should also be | |
1475 `ucs-2' (or `bmp' -- no surrogates) and `utf-32' (range checked). ]] | |
1476 | |
1477 `little-endian' | |
1478 If non-nil, `utf-16' and `ucs-4' will write out the groups of two | |
1479 or four bytes little-endian instead of big-endian. This is required, | |
1480 for example, under Windows. | |
1481 | |
1482 `need-bom' | |
1483 If non-nil, a byte order mark (BOM, or Unicode FEFF) should be | |
1484 written out at the beginning of the data. This serves both to | |
1485 identify the endianness of the following data and to mark the | |
1486 data as Unicode (at least, this is how Windows uses it). | |
1487 [[ The correct term is \"signature\", since this technique may also | |
1488 be used with UTF-8. That is the term used in the standard. ]] | |
1489 | |
1490 | |
1491 The following additional properties are recognized if TYPE is | |
1492 `mswindows-multibyte': | |
1493 | |
1494 `code-page' | |
1495 Either a number (specifying a particular code page) or one of the | |
1496 symbols `ansi', `oem', `mac', or `ebcdic', specifying the ANSI, | |
1497 OEM, Macintosh, or EBCDIC code page associated with a particular | |
1498 locale (given by the `locale' property). NOTE: EBCDIC code pages | |
1499 only exist in Windows 2000 and later. | |
1500 | |
1501 `locale' | |
1502 If `code-page' is a symbol, this specifies the locale whose code | |
1503 page of the corresponding type should be used. This should be | |
1504 one of the following: A cons of two strings, (LANGUAGE | |
1505 . SUBLANGUAGE) (see `mswindows-set-current-locale'); a string (a | |
1506 language; SUBLANG_DEFAULT, i.e. the default sublanguage, is | |
1507 used); or one of the symbols `current', `user-default', or | |
1508 `system-default', corresponding to the values of | |
1509 `mswindows-current-locale', `mswindows-user-default-locale', or | |
1510 `mswindows-system-default-locale', respectively. | |
1511 | |
1512 | |
1513 | |
1514 The following additional properties are recognized if TYPE is `undecided': | |
1515 \[[ Doesn't GNU use \"detect-*\" for the following two? ]] | |
1516 | |
1517 `do-eol' | |
1518 Do EOL detection. | |
1519 | |
1520 `do-coding' | |
1521 Do encoding detection. | |
1522 | |
1523 `coding-system' | |
1524 If encoding detection is not done, use the specified coding system | |
1525 to do decoding. This is used internally when implementing coding | |
1526 systems with an EOL type that specifies autodetection (the default), | |
1527 so that the detector routines return the proper subsidiary. | |
1528 | |
1529 | |
1530 | |
1531 The following additional property is recognized if TYPE is `gzip': | |
1532 | |
1533 `level' | |
1534 Compression level: 0 through 9, or `default' (currently 6). | |
1535 | 1426 |
1536 */ | 1427 */ |
1537 (name, type, description, props)) | 1428 (name, type, description, props)) |
1538 { | 1429 { |
1539 return make_coding_system_1 (name, 0, type, description, props); | 1430 return make_coding_system_1 (name, 0, type, description, props); |
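The `little-endian` and `need-bom` properties described in the doc string above can be illustrated with a small standalone sketch. This is not the XEmacs encoder: the names `sketch_put_u16` and `sketch_encode_utf16` are hypothetical, and the sketch assumes the input consists only of BMP code points, so surrogate pairs are not handled. The byte order mark is the code point U+FEFF, written in the same byte order as the data that follows it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of XEmacs: store one 16-bit unit in the
   requested byte order and return the number of bytes written.  */
static size_t
sketch_put_u16 (unsigned char *out, uint16_t unit, int little_endian)
{
  if (little_endian)
    {
      out[0] = (unsigned char) (unit & 0xFF);
      out[1] = (unsigned char) (unit >> 8);
    }
  else
    {
      out[0] = (unsigned char) (unit >> 8);
      out[1] = (unsigned char) (unit & 0xFF);
    }
  return 2;
}

/* Encode N BMP code points as UTF-16.  LITTLE_ENDIAN and NEED_BOM play
   the role of the `little-endian' and `need-bom' properties.  OUT must
   have room for 2 * (N + 1) bytes.  Returns the number of bytes written.  */
static size_t
sketch_encode_utf16 (const uint16_t *chars, size_t n, int little_endian,
                     int need_bom, unsigned char *out)
{
  size_t len = 0, i;

  assert (chars != NULL && out != NULL);

  /* The signature is U+FEFF, written in the same byte order as the data.  */
  if (need_bom)
    len += sketch_put_u16 (out + len, 0xFEFF, little_endian);

  for (i = 0; i < n; i++)
    len += sketch_put_u16 (out + len, chars[i], little_endian);

  return len;
}
```

Encoding the two characters `h` and `i` with both properties set produces the six bytes FF FE 68 00 69 00; with both unset it produces the four bytes 00 68 00 69.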
1573 Lisp_Coding_System *from = XCODING_SYSTEM (old_coding_system); | 1464 Lisp_Coding_System *from = XCODING_SYSTEM (old_coding_system); |
1574 COPY_SIZED_LCRECORD (to, from, sizeof_coding_system (from)); | 1465 COPY_SIZED_LCRECORD (to, from, sizeof_coding_system (from)); |
1575 to->name = new_name; | 1466 to->name = new_name; |
1576 } | 1467 } |
1577 return new_coding_system; | 1468 return new_coding_system; |
1578 } | |
1579 | |
1580 DEFUN ("coding-system-canonical-name-p", Fcoding_system_canonical_name_p, | |
1581 1, 1, 0, /* | |
1582 Return t if OBJECT names a coding system, and is not a coding system alias. | |
1583 */ | |
1584 (object)) | |
1585 { | |
1586 return CODING_SYSTEMP (Fgethash (object, Vcoding_system_hash_table, Qnil)) | |
1587 ? Qt : Qnil; | |
1588 } | 1469 } |
1589 | 1470 |
1590 /* #### Shouldn't this really be a find/get pair? */ | 1471 /* #### Shouldn't this really be a find/get pair? */ |
1591 | 1472 |
1592 DEFUN ("coding-system-alias-p", Fcoding_system_alias_p, 1, 1, 0, /* | 1473 DEFUN ("coding-system-alias-p", Fcoding_system_alias_p, 1, 1, 0, /* |
2472 (start, end, coding_system, buffer)) | 2353 (start, end, coding_system, buffer)) |
2473 { | 2354 { |
2474 return encode_decode_coding_region (start, end, coding_system, buffer, | 2355 return encode_decode_coding_region (start, end, coding_system, buffer, |
2475 CODING_ENCODE); | 2356 CODING_ENCODE); |
2476 } | 2357 } |
2358 | |
2359 DEFUN ("query-coding-region", Fquery_coding_region, 3, 7, 0, /* | |
2360 Work out whether CODING-SYSTEM can losslessly encode a region. | |
2361 | |
2362 START and END are the beginning and end of the region to check. | |
2363 CODING-SYSTEM is the coding system to try. | |
2364 | |
2365 Optional argument BUFFER is the buffer to check, and defaults to the current | |
2366 buffer. | |
2367 | |
2368 IGNORE-INVALID-SEQUENCESP, also an optional argument, says to treat XEmacs | |
2369 characters which have an unambiguous encoded representation, despite being | |
2370 undefined in what they represent, as encodable. These chiefly arise with | |
2371 variable-length encodings like UTF-8 and UTF-16, where an invalid sequence | |
2372 is passed through to XEmacs as a sequence of characters with a defined | |
2373 correspondence to the octets on disk, but no non-error semantics; see the | |
2374 `invalid-sequence-coding-system' argument to `set-language-info'. | |
2375 | |
2376 They can also arise with fixed-length encodings like ISO 8859-7, where | |
2377 certain octets on disk have undefined values, and treating them as | |
2378 corresponding to the ISO 8859-1 characters with the same numerical values | |
2379 may lead to data that is not understood by other applications. | |
2380 | |
2381 Optional argument ERRORP says to signal a `text-conversion-error' if some | |
2382 character in the region cannot be encoded, and defaults to nil. | |
2383 | |
2384 Optional argument HIGHLIGHT says to display unencodable characters in the | |
2385 region using `query-coding-warning-face'. It defaults to nil. | |
2386 | |
2387 This function can return multiple values; the intention is that callers use | |
2388 `multiple-value-bind' or the related CL multiple value functions to deal | |
2389 with it. The first result is `t' if the region can be encoded using | |
2390 CODING-SYSTEM, or `nil' if not. If the region cannot be encoded using | |
2391 CODING-SYSTEM, the second result is a range table describing the positions | |
2392 of the unencodable characters. | |
2393 | |
2394 Ranges that describe characters that would be ignored were | |
2395 IGNORE-INVALID-SEQUENCESP non-nil map to the symbol `invalid-sequence'; | |
2396 other ranges map to the symbol `unencodable'. If IGNORE-INVALID-SEQUENCESP | |
2397 is non-nil, all ranges will map to the symbol `unencodable'. See | |
2398 `make-range-table' for more details of range tables. | |
2399 */ | |
2400 (start, end, coding_system, buffer, ignore_invalid_sequencesp, | |
2401 errorp, highlight)) | |
2402 { | |
2403 Charbpos b, e; | |
2404 struct buffer *buf = decode_buffer (buffer, 1); | |
2405 Lisp_Object result; | |
2406 int flags = 0, speccount = specpdl_depth (); | |
2407 | |
2408 coding_system = Fget_coding_system (coding_system); | |
2409 | |
2410 get_buffer_range_char (buf, start, end, &b, &e, 0); | |
2411 | |
2412 if (buf != current_buffer) | |
2413 { | |
2414 record_unwind_protect (save_current_buffer_restore, Fcurrent_buffer ()); | |
2415 set_buffer_internal (buf); | |
2416 } | |
2417 | |
2418 record_unwind_protect (save_excursion_restore, save_excursion_save ()); | |
2419 | |
2420 BUF_SET_PT (buf, b); | |
2421 | |
2422 if (!NILP (ignore_invalid_sequencesp)) | |
2423 { | |
2424 flags |= QUERY_METHOD_IGNORE_INVALID_SEQUENCES; | |
2425 } | |
2426 | |
2427 if (!NILP (errorp)) | |
2428 { | |
2429 flags |= QUERY_METHOD_ERRORP; | |
2430 } | |
2431 | |
2432 if (!NILP (highlight)) | |
2433 { | |
2434 flags |= QUERY_METHOD_HIGHLIGHT; | |
2435 } | |
2436 | |
2437 result = XCODESYSMETH_OR_GIVEN (coding_system, query, | |
2438 (coding_system, buf, e, flags), Qunbound); | |
2439 | |
2440 if (UNBOUNDP (result)) | |
2441 { | |
2442 signal_error (Qtext_conversion_error, | |
2443 "Coding system doesn't say what it can encode", | |
2444 XCODING_SYSTEM_NAME (coding_system)); | |
2445 } | |
2446 | |
2447 result = (NILP (result)) ? Qt : values2 (Qnil, result); | |
2448 | |
2449 return unbind_to_1 (speccount, result); | |
2450 } | |
2451 | |
2477 | 2452 |
2478 | 2453 |
2479 /************************************************************************/ | 2454 /************************************************************************/ |
2480 /* Chain methods */ | 2455 /* Chain methods */ |
2481 /************************************************************************/ | 2456 /************************************************************************/ |
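The doc string of `query-coding-region` in the preceding hunk says that the second result describes the positions of the unencodable characters as ranges. The following standalone sketch shows one way such ranges could be collected. It is an illustration only, not the query method this changeset adds: `sketch_range`, `sketch_latin1_encodable_p` and `sketch_query` are hypothetical names, characters are modelled as plain integer code points, and a fixed array stands in for an XEmacs range table.

```c
#include <assert.h>
#include <stddef.h>

/* One half-open range [start, end) of unencodable characters.  */
struct sketch_range
{
  size_t start;
  size_t end;
};

/* Hypothetical encodability test standing in for a coding system:
   Latin-1 can represent the code points 0x00 through 0xFF.  */
static int
sketch_latin1_encodable_p (unsigned long ch)
{
  return ch <= 0xFF;
}

/* Scan TEXT and record runs of unencodable characters in RANGES, stopping
   once MAX_RANGES runs have been stored.  Returns the number of runs
   stored; 0 means every character scanned was encodable.  */
static size_t
sketch_query (const unsigned long *text, size_t len,
              struct sketch_range *ranges, size_t max_ranges)
{
  size_t i = 0, count = 0;

  assert (text != NULL && ranges != NULL);

  while (i < len && count < max_ranges)
    {
      if (sketch_latin1_encodable_p (text[i]))
        {
          i++;
          continue;
        }

      /* Start of a run; extend it over adjacent unencodable characters.  */
      ranges[count].start = i;
      while (i < len && !sketch_latin1_encodable_p (text[i]))
        i++;
      ranges[count].end = i;
      count++;
    }

  return count;
}
```

A return value of 0 corresponds to the first result being `t`; a non-zero count corresponds to `nil` plus the collected ranges. Coalescing adjacent positions into one run keeps the result small when a whole passage is unencodable.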
4548 DEFSUBR (Fautoload_coding_system); | 4523 DEFSUBR (Fautoload_coding_system); |
4549 DEFSUBR (Ffind_coding_system); | 4524 DEFSUBR (Ffind_coding_system); |
4550 DEFSUBR (Fget_coding_system); | 4525 DEFSUBR (Fget_coding_system); |
4551 DEFSUBR (Fcoding_system_list); | 4526 DEFSUBR (Fcoding_system_list); |
4552 DEFSUBR (Fcoding_system_name); | 4527 DEFSUBR (Fcoding_system_name); |
4553 DEFSUBR (Fmake_coding_system); | 4528 DEFSUBR (Fmake_coding_system_internal); |
4554 DEFSUBR (Fcopy_coding_system); | 4529 DEFSUBR (Fcopy_coding_system); |
4555 DEFSUBR (Fcoding_system_canonical_name_p); | 4530 DEFSUBR (Fcoding_system_canonical_name_p); |
4556 DEFSUBR (Fcoding_system_alias_p); | 4531 DEFSUBR (Fcoding_system_alias_p); |
4557 DEFSUBR (Fcoding_system_aliasee); | 4532 DEFSUBR (Fcoding_system_aliasee); |
4558 DEFSUBR (Fdefine_coding_system_alias); | 4533 DEFSUBR (Fdefine_coding_system_alias); |
4571 DEFSUBR (Fcoding_category_system); | 4546 DEFSUBR (Fcoding_category_system); |
4572 | 4547 |
4573 DEFSUBR (Fdetect_coding_region); | 4548 DEFSUBR (Fdetect_coding_region); |
4574 DEFSUBR (Fdecode_coding_region); | 4549 DEFSUBR (Fdecode_coding_region); |
4575 DEFSUBR (Fencode_coding_region); | 4550 DEFSUBR (Fencode_coding_region); |
4551 DEFSUBR (Fquery_coding_region); | |
4576 DEFSYMBOL_MULTIWORD_PREDICATE (Qcoding_systemp); | 4552 DEFSYMBOL_MULTIWORD_PREDICATE (Qcoding_systemp); |
4577 DEFSYMBOL (Qno_conversion); | 4553 DEFSYMBOL (Qno_conversion); |
4578 DEFSYMBOL (Qconvert_eol); | 4554 DEFSYMBOL (Qconvert_eol); |
4579 DEFSYMBOL (Qconvert_eol_autodetect); | 4555 DEFSYMBOL (Qconvert_eol_autodetect); |
4580 DEFSYMBOL (Qconvert_eol_lf); | 4556 DEFSYMBOL (Qconvert_eol_lf); |
4618 DEFSYMBOL (Qcanonicalize_after_coding); | 4594 DEFSYMBOL (Qcanonicalize_after_coding); |
4619 | 4595 |
4620 DEFSYMBOL (Qposix_charset_to_coding_system_hash); | 4596 DEFSYMBOL (Qposix_charset_to_coding_system_hash); |
4621 | 4597 |
4622 DEFSYMBOL (Qescape_quoted); | 4598 DEFSYMBOL (Qescape_quoted); |
4599 | |
4600 DEFSYMBOL (Qquery_coding_warning_face); | |
4601 DEFSYMBOL (Qaliases); | |
4602 DEFSYMBOL (Qcharset_skip_chars_string); | |
4623 | 4603 |
4624 #ifdef HAVE_ZLIB | 4604 #ifdef HAVE_ZLIB |
4625 DEFSYMBOL (Qgzip); | 4605 DEFSYMBOL (Qgzip); |
4626 #endif | 4606 #endif |
4627 | 4607 |
4842 If non-nil, display debug information about detection operations in progress. | 4822 If non-nil, display debug information about detection operations in progress. |
4843 Information is displayed on stderr. | 4823 Information is displayed on stderr. |
4844 */ ); | 4824 */ ); |
4845 Vdebug_coding_detection = Qnil; | 4825 Vdebug_coding_detection = Qnil; |
4846 #endif | 4826 #endif |
4827 | |
4828 #ifdef MULE | |
4829 Vdefault_query_coding_region_chartab_cache | |
4830 = make_lisp_hash_table (25, HASH_TABLE_NON_WEAK, HASH_TABLE_EQUAL); | |
4831 staticpro (&Vdefault_query_coding_region_chartab_cache); | |
4832 #endif | |
4847 } | 4833 } |
4848 | 4834 |
4849 /* #### reformat this for consistent appearance? */ | 4835 /* #### reformat this for consistent appearance? */ |
4850 | 4836 |
4851 void | 4837 void |
4852 complex_vars_of_file_coding (void) | 4838 complex_vars_of_file_coding (void) |
4853 { | 4839 { |
4854 Fmake_coding_system | 4840 Fmake_coding_system_internal |
4855 (Qconvert_eol_cr, Qconvert_eol, | 4841 (Qconvert_eol_cr, Qconvert_eol, |
4856 build_msg_string ("Convert CR to LF"), | 4842 build_msg_string ("Convert CR to LF"), |
4857 nconc2 (list6 (Qdocumentation, | 4843 nconc2 (list6 (Qdocumentation, |
4858 build_msg_string ( | 4844 build_msg_string ( |
4859 "Converts CR (used to mark the end of a line on Macintosh systems) to LF\n" | 4845 "Converts CR (used to mark the end of a line on Macintosh systems) to LF\n" |
4861 Qmnemonic, build_string ("CR->LF"), | 4847 Qmnemonic, build_string ("CR->LF"), |
4862 Qsubtype, Qcr), | 4848 Qsubtype, Qcr), |
4863 /* VERY IMPORTANT! Tell make-coding-system not to generate | 4849 /* VERY IMPORTANT! Tell make-coding-system not to generate |
4864 subsidiaries -- it needs the coding systems we're creating | 4850 subsidiaries -- it needs the coding systems we're creating |
4865 to do so! */ | 4851 to do so! */ |
4866 list2 (Qeol_type, Qlf))); | 4852 list4 (Qeol_type, Qlf, |
4867 | 4853 Qsafe_charsets, Qt))); |
4868 Fmake_coding_system | 4854 |
4855 Fmake_coding_system_internal | |
4869 (Qconvert_eol_lf, Qconvert_eol, | 4856 (Qconvert_eol_lf, Qconvert_eol, |
4870 build_msg_string ("Convert LF to LF (do nothing)"), | 4857 build_msg_string ("Convert LF to LF (do nothing)"), |
4871 nconc2 (list6 (Qdocumentation, | 4858 nconc2 (list6 (Qdocumentation, |
4872 build_msg_string ( | 4859 build_msg_string ( |
4873 "Do nothing."), | 4860 "Do nothing."), |
4874 Qmnemonic, build_string ("LF->LF"), | 4861 Qmnemonic, build_string ("LF->LF"), |
4875 Qsubtype, Qlf), | 4862 Qsubtype, Qlf), |
4876 /* VERY IMPORTANT! Tell make-coding-system not to generate | 4863 /* VERY IMPORTANT! Tell make-coding-system not to generate |
4877 subsidiaries -- it needs the coding systems we're creating | 4864 subsidiaries -- it needs the coding systems we're creating |
4878 to do so! */ | 4865 to do so! */ |
4879 list2 (Qeol_type, Qlf))); | 4866 list4 (Qeol_type, Qlf, |
4880 | 4867 Qsafe_charsets, Qt))); |
4881 Fmake_coding_system | 4868 |
4869 Fmake_coding_system_internal | |
4882 (Qconvert_eol_crlf, Qconvert_eol, | 4870 (Qconvert_eol_crlf, Qconvert_eol, |
4883 build_msg_string ("Convert CRLF to LF"), | 4871 build_msg_string ("Convert CRLF to LF"), |
4884 nconc2 (list6 (Qdocumentation, | 4872 nconc2 (list6 (Qdocumentation, |
4885 build_msg_string ( | 4873 build_msg_string ( |
4886 "Converts CR+LF (used to mark the end of a line on DOS/Windows systems) to LF\n" | 4874 "Converts CR+LF (used to mark the end of a line on DOS/Windows systems) to LF\n" |
4887 "(used internally and under Unix to mark the end of a line)."), | 4875 "(used internally and under Unix to mark the end of a line)."), |
4888 Qmnemonic, build_string ("CRLF->LF"), | 4876 Qmnemonic, build_string ("CRLF->LF"), |
4889 Qsubtype, Qcrlf), | 4877 Qsubtype, Qcrlf), |
4878 | |
4890 /* VERY IMPORTANT! Tell make-coding-system not to generate | 4879 /* VERY IMPORTANT! Tell make-coding-system not to generate |
4891 subsidiaries -- it needs the coding systems we're creating | 4880 subsidiaries -- it needs the coding systems we're creating |
4892 to do so! */ | 4881 to do so! */ |
4893 list2 (Qeol_type, Qlf))); | 4882 list4 (Qeol_type, Qlf, |
4894 | 4883 Qsafe_charsets, Qt))); |
4895 Fmake_coding_system | 4884 |
4885 Fmake_coding_system_internal | |
4896 (Qconvert_eol_autodetect, Qconvert_eol, | 4886 (Qconvert_eol_autodetect, Qconvert_eol, |
4897 build_msg_string ("Autodetect EOL type"), | 4887 build_msg_string ("Autodetect EOL type"), |
4898 nconc2 (list6 (Qdocumentation, | 4888 nconc2 (list6 (Qdocumentation, |
4899 build_msg_string ( | 4889 build_msg_string ( |
4900 "Autodetect the end-of-line type."), | 4890 "Autodetect the end-of-line type."), |
4901 Qmnemonic, build_string ("Auto-EOL"), | 4891 Qmnemonic, build_string ("Auto-EOL"), |
4902 Qsubtype, Qnil), | 4892 Qsubtype, Qnil), |
4903 /* VERY IMPORTANT! Tell make-coding-system not to generate | 4893 /* VERY IMPORTANT! Tell make-coding-system not to generate |
4904 subsidiaries -- it needs the coding systems we're creating | 4894 subsidiaries -- it needs the coding systems we're creating |
4905 to do so! */ | 4895 to do so! */ |
4906 list2 (Qeol_type, Qlf))); | 4896 list4 (Qeol_type, Qlf, |
4907 | 4897 Qsafe_charsets, Qt))); |
4908 Fmake_coding_system | 4898 |
4899 Fmake_coding_system_internal | |
4909 (Qundecided, Qundecided, | 4900 (Qundecided, Qundecided, |
4910 build_msg_string ("Undecided (auto-detect)"), | 4901 build_msg_string ("Undecided (auto-detect)"), |
4911 nconc2 (list4 (Qdocumentation, | 4902 nconc2 (list4 (Qdocumentation, |
4912 build_msg_string | 4903 build_msg_string |
4913 ("Automatically detects the correct encoding."), | 4904 ("Automatically detects the correct encoding."), |
4916 /* We do EOL detection ourselves so we don't need to be | 4907 /* We do EOL detection ourselves so we don't need to be |
4917 wrapped in an EOL detector. (It doesn't actually hurt, | 4908 wrapped in an EOL detector. (It doesn't actually hurt, |
4918 though, I don't think.) */ | 4909 though, I don't think.) */ |
4919 Qeol_type, Qlf))); | 4910 Qeol_type, Qlf))); |
4920 | 4911 |
4921 Fmake_coding_system | 4912 Fmake_coding_system_internal |
4922 (intern ("undecided-dos"), Qundecided, | 4913 (intern ("undecided-dos"), Qundecided, |
4923 build_msg_string ("Undecided (auto-detect) (CRLF)"), | 4914 build_msg_string ("Undecided (auto-detect) (CRLF)"), |
4924 nconc2 (list4 (Qdocumentation, | 4915 nconc2 (list4 (Qdocumentation, |
4925 build_msg_string | 4916 build_msg_string |
4926 ("Automatically detects the correct encoding; EOL type of CRLF forced."), | 4917 ("Automatically detects the correct encoding; EOL type of CRLF forced."), |
4927 Qmnemonic, build_string ("Auto")), | 4918 Qmnemonic, build_string ("Auto")), |
4928 list4 (Qdo_coding, Qt, | 4919 list4 (Qdo_coding, Qt, |
4929 Qeol_type, Qcrlf))); | 4920 Qeol_type, Qcrlf))); |
4930 | 4921 |
4931 Fmake_coding_system | 4922 Fmake_coding_system_internal |
4932 (intern ("undecided-unix"), Qundecided, | 4923 (intern ("undecided-unix"), Qundecided, |
4933 build_msg_string ("Undecided (auto-detect) (LF)"), | 4924 build_msg_string ("Undecided (auto-detect) (LF)"), |
4934 nconc2 (list4 (Qdocumentation, | 4925 nconc2 (list4 (Qdocumentation, |
4935 build_msg_string | 4926 build_msg_string |
4936 ("Automatically detects the correct encoding; EOL type of LF forced."), | 4927 ("Automatically detects the correct encoding; EOL type of LF forced."), |
4937 Qmnemonic, build_string ("Auto")), | 4928 Qmnemonic, build_string ("Auto")), |
4938 list4 (Qdo_coding, Qt, | 4929 list4 (Qdo_coding, Qt, |
4939 Qeol_type, Qlf))); | 4930 Qeol_type, Qlf))); |
4940 | 4931 |
4941 Fmake_coding_system | 4932 Fmake_coding_system_internal |
4942 (intern ("undecided-mac"), Qundecided, | 4933 (intern ("undecided-mac"), Qundecided, |
4943 build_msg_string ("Undecided (auto-detect) (CR)"), | 4934 build_msg_string ("Undecided (auto-detect) (CR)"), |
4944 nconc2 (list4 (Qdocumentation, | 4935 nconc2 (list4 (Qdocumentation, |
4945 build_msg_string | 4936 build_msg_string |
4946 ("Automatically detects the correct encoding; EOL type of CR forced."), | 4937 ("Automatically detects the correct encoding; EOL type of CR forced."), |
4947 Qmnemonic, build_string ("Auto")), | 4938 Qmnemonic, build_string ("Auto")), |
4948 list4 (Qdo_coding, Qt, | 4939 list4 (Qdo_coding, Qt, |
4949 Qeol_type, Qcr))); | 4940 Qeol_type, Qcr))); |
4950 | 4941 |
4951 /* Need to create this here or we're really screwed. */ | 4942 /* Need to create this here or we're really screwed. */ |
4952 Fmake_coding_system | 4943 Fmake_coding_system_internal |
4953 (Qraw_text, Qno_conversion, | 4944 (Qraw_text, Qno_conversion, |
4954 build_msg_string ("Raw Text"), | 4945 build_msg_string ("Raw Text"), |
4955 list4 (Qdocumentation, | 4946 nconc2 (list4 (Qdocumentation, |
4956 build_msg_string ("Raw text converts only line-break codes, and acts otherwise like `binary'."), | 4947 build_msg_string ("Raw text converts only line-break " |
4957 Qmnemonic, build_string ("Raw"))); | 4948 "codes, and acts otherwise like " |
4958 | 4949 "`binary'."), |
4959 Fmake_coding_system | 4950 Qmnemonic, build_string ("Raw")), |
4951 #ifdef MULE | |
4952 list2 (Qsafe_charsets, list3 (Vcharset_ascii, Vcharset_control_1, | |
4953 Vcharset_latin_iso8859_1)))); | |
4954 | |
4955 #else | |
4956 Qnil)); | |
4957 #endif | |
4958 | |
4959 Fmake_coding_system_internal | |
4960 (Qbinary, Qno_conversion, | 4960 (Qbinary, Qno_conversion, |
4961 build_msg_string ("Binary"), | 4961 build_msg_string ("Binary"), |
4962 list6 (Qdocumentation, | 4962 nconc2 (list6 (Qdocumentation, |
4963 build_msg_string ( | 4963 build_msg_string ( |
4964 "This coding system is as close as it comes to doing no conversion.\n" | 4964 "This coding system is as close as it comes to doing no conversion.\n" |
4965 "On input, each byte is converted directly into the character\n" | 4965 "On input, each byte is converted directly into the character\n" |
4966 "with the corresponding code -- i.e. from the `ascii', `control-1',\n" | 4966 "with the corresponding code -- i.e. from the `ascii', `control-1',\n" |
4967 "or `latin-1' character sets. On output, these characters are\n" | 4967 "or `latin-1' character sets. On output, these characters are\n" |
4968 "converted back to the corresponding bytes, and other characters\n" | 4968 "converted back to the corresponding bytes, and other characters\n" |
4969 "are converted to the default character, i.e. `~'."), | 4969 "are converted to the default character, i.e. `~'."), |
4970 Qeol_type, Qlf, | 4970 Qeol_type, Qlf, |
4971 Qmnemonic, build_string ("Binary"))); | 4971 Qmnemonic, build_string ("Binary")), |
4972 #ifdef MULE | |
4973 list2 (Qsafe_charsets, list3 (Vcharset_ascii, Vcharset_control_1, | |
4974 Vcharset_latin_iso8859_1)))); | |
4975 | |
4976 #else | |
4977 Qnil)); | |
4978 #endif | |
4972 | 4979 |
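The output behaviour described in the `binary' documentation string above (characters from `ascii', `control-1' and `latin-1' go back to their bytes, everything else becomes `~') can be sketched as follows. This is a simplified model rather than the XEmacs encoder: `sketch_binary_encode` is a hypothetical name, and characters are assumed to be plain integer codes in which 0x00 through 0xFF stand for the three safe charsets.

```c
#include <assert.h>
#include <stddef.h>

/* Encode LEN characters into OUT, which needs room for LEN bytes.
   Codes 0x00..0xFF are written as the byte with the same value; anything
   else becomes '~', the default character named in the doc string.
   Returns the number of characters that had to be replaced.  */
static size_t
sketch_binary_encode (const unsigned long *chars, size_t len,
                      unsigned char *out)
{
  size_t i, replaced = 0;

  assert (chars != NULL && out != NULL);

  for (i = 0; i < len; i++)
    {
      if (chars[i] <= 0xFF)
        out[i] = (unsigned char) chars[i];
      else
        {
          out[i] = '~';
          replaced++;
        }
    }

  return replaced;
}
```

A non-zero return value marks exactly the lossy case that a safe-charsets list of `ascii', `control-1' and `latin-1' lets a query detect before any encoding takes place.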
4973 /* Formerly aliased to raw-text! Completely bogus and not even the same | 4980 /* Formerly aliased to raw-text! Completely bogus and not even the same |
4974 as FSF Emacs. */ | 4981 as FSF Emacs. */ |
4975 Fdefine_coding_system_alias (Qno_conversion, Qbinary); | 4982 Fdefine_coding_system_alias (Qno_conversion, Qbinary); |
4976 Fdefine_coding_system_alias (intern ("no-conversion-unix"), | 4983 Fdefine_coding_system_alias (intern ("no-conversion-unix"), |