comparison src/file-coding.c @ 4568:1d74a1d115ee

Add #'query-coding-region tests; do the work necessary to get them running.

lisp/ChangeLog addition:

2008-12-28 Aidan Kehoe <kehoea@parhasard.net>

	* coding.el (default-query-coding-region):
	Declare using defun*, so we can #'return-from to it on
	encountering a safe-charsets value of t. Comment out a few debug
	messages.
	(query-coding-region):
	Correct the docstring, it deals with a region, not a string.
	(unencodable-char-position):
	Correct the implementation for non-nil COUNT, special-case a zero
	value for count, treat it as one. Don't rely on dynamic scope when
	calling the main lambda.
	* unicode.el (unicode-query-coding-region):
	Comment out some debug messages here.
	* mule/mule-coding.el (8-bit-fixed-query-coding-region):
	Comment out some debug messages here.
	* code-init.el (raw-text):
	Add a safe-charsets property to this coding system.
	* mule/korean.el (iso-2022-int-1):
	* mule/korean.el (euc-kr):
	* mule/korean.el (iso-2022-kr):
	Add safe-charsets properties for these coding systems.
	* mule/japanese.el (iso-2022-jp):
	* mule/japanese.el (jis7):
	* mule/japanese.el (jis8):
	* mule/japanese.el (shift-jis):
	* mule/japanese.el (iso-2022-jp-1978-irv):
	* mule/japanese.el (euc-jp):
	Add safe-charsets properties for all these coding systems.
	* mule/iso-with-esc.el:
	Add safe-charsets properties to all the coding systems in
	here. Comment on the downside of a safe-charsets value of t for
	iso-latin-1-with-esc.
	* mule/hebrew.el (ctext-hebrew):
	Add a safe-charsets property for this coding system.
	* mule/devanagari.el (in-is13194-devanagari):
	Add a safe-charsets property for this coding system.
	* mule/chinese.el (cn-gb-2312):
	* mule/chinese.el (hz-gb-2312):
	* mule/chinese.el (big5):
	Add safe-charsets properties for these coding systems.
	* mule/latin.el (iso-8859-14):
	Add an implementation for this, using #'make-8-bit-coding-system.
	* mule/mule-coding.el (ctext):
	* mule/mule-coding.el (iso-2022-8bit-ss2):
	* mule/mule-coding.el (iso-2022-7bit-ss2):
	* mule/mule-coding.el (iso-2022-jp-2):
	* mule/mule-coding.el (iso-2022-7bit):
	* mule/mule-coding.el (iso-2022-8):
	* mule/mule-coding.el (escape-quoted):
	* mule/mule-coding.el (iso-2022-lock):
	Add safe-charsets properties for all these coding systems.

src/ChangeLog addition:

2008-12-28 Aidan Kehoe <kehoea@parhasard.net>

	* file-coding.c (Fmake_coding_system):
	Document our use of the safe-chars and safe-charsets properties,
	and the differences compared to GNU.
	(make_coding_system_1):
	Don't drop the safe-chars and safe-charsets properties.
	(Fcoding_system_property):
	Return the safe-chars and safe-charsets properties when asked for
	them.
	* file-coding.h (CODING_SYSTEM_SAFE_CHARSETS):
	* coding-system-slots.h:
	Make the safe-chars and safe-charsets slots available in these
	headers.

tests/ChangeLog addition:

2008-12-28 Aidan Kehoe <kehoea@parhasard.net>

	* automated/query-coding-tests.el:
	New file, testing the functionality of #'query-coding-region and
	#'query-coding-string.
author Aidan Kehoe <kehoea@parhasard.net>
date Sun, 28 Dec 2008 14:46:24 +0000
parents cee827542370
children 80e0588fb42f
4567:84d618b355f5 4568:1d74a1d115ee
1123 else if (EQ (key, Qtranslation_table_for_decode)) 1123 else if (EQ (key, Qtranslation_table_for_decode))
1124 ; 1124 ;
1125 else if (EQ (key, Qtranslation_table_for_encode)) 1125 else if (EQ (key, Qtranslation_table_for_encode))
1126 ; 1126 ;
1127 else if (EQ (key, Qsafe_chars)) 1127 else if (EQ (key, Qsafe_chars))
1128 ; 1128 CODING_SYSTEM_SAFE_CHARS (cs) = value;
1129 else if (EQ (key, Qsafe_charsets)) 1129 else if (EQ (key, Qsafe_charsets))
1130 ; 1130 CODING_SYSTEM_SAFE_CHARSETS (cs) = value;
1131 else if (EQ (key, Qmime_charset)) 1131 else if (EQ (key, Qmime_charset))
1132 ; 1132 ;
1133 else if (EQ (key, Qvalid_codes)) 1133 else if (EQ (key, Qvalid_codes))
1134 ; 1134 ;
1135 else 1135 else
1324 table. This is not applicable to CCL-based coding systems. 1324 table. This is not applicable to CCL-based coding systems.
1325 1325
1326 `translation-table-for-encode' 1326 `translation-table-for-encode'
1327 The value is a translation table to be applied on encoding. This is 1327 The value is a translation table to be applied on encoding. This is
1328 not applicable to CCL-based coding systems. 1328 not applicable to CCL-based coding systems.
1329 1329
1330 `safe-chars'
1331 The value is a char table. If a character has non-nil value in it,
1332 the character is safely supported by the coding system. This
1333 overrides the specification of safe-charsets.
1334
1335 `safe-charsets'
1336 The value is a list of charsets safely supported by the coding
1337 system. The value t means that all charsets Emacs handles are
1338 supported. Even if some charset is not in this list, it doesn't
1339 mean that the charset can't be encoded in the coding system;
1340 it just means that some other receiver of text encoded
1341 in the coding system won't be able to handle that charset.
1342
1343 `mime-charset' 1330 `mime-charset'
1344 The value is a symbol of which name is `MIME-charset' parameter of 1331 The value is a symbol of which name is `MIME-charset' parameter of
1345 the coding system. 1332 the coding system.
1346 1333
1347 `valid-codes' (meaningful only for a coding system based on CCL) 1334 `valid-codes' (meaningful only for a coding system based on CCL)
1348 The value is a list to indicate valid byte ranges of the encoded 1335 The value is a list to indicate valid byte ranges of the encoded
1349 file. Each element of the list is an integer or a cons of integer. 1336 file. Each element of the list is an integer or a cons of integer.
1350 In the former case, the integer value is a valid byte code. In the 1337 In the former case, the integer value is a valid byte code. In the
1351 latter case, the integers specifies the range of valid byte codes. 1338 latter case, the integers specifies the range of valid byte codes.
1352 1339
1353 1340 The following properties are used by `default-query-coding-region',
1341 the default implementation of `query-coding-region'. This
1342 implementation and these properties are not used by the Unicode coding
1343 systems, nor by those CCL coding systems created with
1344 `make-8-bit-coding-system'.
1345
1346 `safe-chars'
1347 The value is a char table. If a character has non-nil value in it,
1348 the character is safely supported by the coding system.
1349 Under XEmacs, for the moment, this is used in addition to the
1350 `safe-charsets' property. It does not override it as it does
1351 under GNU Emacs. #### We need to consider if we should keep this
1352 behaviour.
1353
1354 `safe-charsets'
1355 The value is a list of charsets safely supported by the coding
1356 system. For coding systems based on ISO 2022, XEmacs may try to
1357 encode characters outside these character sets, but outside of
1358 East Asia and East Asian coding systems, it is unlikely that
1359 consumers of the data will understand XEmacs' encoding.
1360 The value t means that all XEmacs character sets handles are supported.
1354 1361
1355 The following additional property is recognized if TYPE is `convert-eol': 1362 The following additional property is recognized if TYPE is `convert-eol':
1356 1363
1357 `subtype' 1364 `subtype'
1358 One of `lf', `crlf', `cr' or nil (for autodetection). When decoding, 1365 One of `lf', `crlf', `cr' or nil (for autodetection). When decoding,
1860 return XCODING_SYSTEM_EOL_CR (coding_system); 1867 return XCODING_SYSTEM_EOL_CR (coding_system);
1861 else if (EQ (prop, Qpost_read_conversion)) 1868 else if (EQ (prop, Qpost_read_conversion))
1862 return XCODING_SYSTEM_POST_READ_CONVERSION (coding_system); 1869 return XCODING_SYSTEM_POST_READ_CONVERSION (coding_system);
1863 else if (EQ (prop, Qpre_write_conversion)) 1870 else if (EQ (prop, Qpre_write_conversion))
1864 return XCODING_SYSTEM_PRE_WRITE_CONVERSION (coding_system); 1871 return XCODING_SYSTEM_PRE_WRITE_CONVERSION (coding_system);
1872 else if (EQ (prop, Qsafe_charsets))
1873 return XCODING_SYSTEM_SAFE_CHARSETS (coding_system);
1874 else if (EQ (prop, Qsafe_chars))
1875 return XCODING_SYSTEM_SAFE_CHARS (coding_system);
1865 else 1876 else
1866 { 1877 {
1867 Lisp_Object value = CODESYSMETH_OR_GIVEN (XCODING_SYSTEM (coding_system), 1878 Lisp_Object value = CODESYSMETH_OR_GIVEN (XCODING_SYSTEM (coding_system),
1868 getprop, 1879 getprop,
1869 (coding_system, prop), 1880 (coding_system, prop),