annotate lisp/mule/mule-coding.el @ 4604:e0a8715fdb1f

Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region. lisp/ChangeLog addition: 2009-02-07 Aidan Kehoe <kehoea@parhasard.net> * coding.el (query-coding-clear-highlights): Rename the BUFFER argument to BUFFER-OR-STRING, describe it as possibly being a string in its documentation. (default-query-coding-region): Add a new IGNORE-INVALID-SEQUENCESP argument, document that this function does not support it. Bind case-fold-search to nil, we don't want this to influence what the function thinks is encodable or not. (query-coding-region): Add a new IGNORE-INVALID-SEQUENCESP argument, document what it does; reflect this new argument in the associated compiler macro. (query-coding-string): Add a new IGNORE-INVALID-SEQUENCESP argument, document what it does. Support the HIGHLIGHT argument correctly. * unicode.el (unicode-query-coding-region): Add a new IGNORE-INVALID-SEQUENCESP argument, document what it does, implement this. Document a potential problem. Use #'query-coding-clear-highlights instead of reimplementing it ourselves. Remove some debugging messages. * mule/arabic.el (iso-8859-6): * mule/cyrillic.el (iso-8859-5): * mule/greek.el (iso-8859-7): * mule/hebrew.el (iso-8859-8): * mule/latin.el (iso-8859-2): * mule/latin.el (iso-8859-3): * mule/latin.el (iso-8859-4): * mule/latin.el (iso-8859-14): * mule/latin.el (iso-8859-15): * mule/latin.el (iso-8859-16): * mule/latin.el (iso-8859-9): * mule/latin.el (windows-1252): * mule/mule-coding.el (iso-8859-1): Avoid the assumption that characters not given an explicit mapping in these coding systems map to the ISO 8859-1 characters corresponding to the octets on disk; this makes it much more reasonable to implement the IGNORE-INVALID-SEQUENCESP argument to query-coding-region. * mule/mule-cmds.el (set-language-info): Correct the docstring. * mule/mule-cmds.el (finish-set-language-environment): Treat invalid Unicode sequences produced from invalid-sequence-coding-system and corresponding to control characters the same as control characters in redisplay. * mule/mule-cmds.el: Document that encode-coding-char is available in coding.el * mule/mule-coding.el (make-8-bit-generate-helper): Change to return the both the encode-program generated and the relevant non-ASCII charset; update the docstring to reflect this. * mule/mule-coding.el (make-8-bit-generate-encode-program-and-skip-chars-strings): Rename this function; have it return skip-chars-strings as well as the encode program. Have these skip-chars-strings use ranges for charsets, where possible. * mule/mule-coding.el (make-8-bit-create-decode-encode-tables): Revise this to allow people to specify explicitly characters that should be undefined (= corresponding to keys in unicode-error-default-translation-table), and treating unspecified octets above #x7f as undefined by default. * mule/mule-coding.el (8-bit-fixed-query-coding-region): Add a new IGNORE-INVALID-SEQUENCESP argument, implement support for it using the 8-bit-fixed-invalid-sequences-skip-chars coding system property; remove some debugging messages. * mule/mule-coding.el (make-8-bit-coding-system): This function is dumped, autoloading it makes no sense. Document what happens when characters above #x7f are not specified, implement this. * mule/vietnamese.el: Correct spelling. tests/ChangeLog addition: 2009-02-07 Aidan Kehoe <kehoea@parhasard.net> * automated/query-coding-tests.el: Add FAILING-CASE arguments to the Assert calls, making #'q-c-debug mostly unnecessary. Remove #'q-c-debug. Add new tests that use the IGNORE-INVALID-SEQUENCESP argument to #'query-coding-region; rework the existing ones to respect it.
author Aidan Kehoe <kehoea@parhasard.net>
date Sat, 07 Feb 2009 17:13:37 +0000
parents 1d74a1d115ee
children c786c3fd0740
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
502
7039e6323819 [xemacs-hg @ 2001-05-04 22:41:46 by ben]
ben
parents: 448
diff changeset
1 ;;; mule-coding.el --- Coding-system functions for Mule. -*- coding: iso-2022-7bit; -*-
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
2
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
3 ;; Copyright (C) 1995 Electrotechnical Laboratory, JAPAN.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
4 ;; Licensed to the Free Software Foundation.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
5 ;; Copyright (C) 1995 Amdahl Corporation.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
6 ;; Copyright (C) 1995 Sun Microsystems.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
7 ;; Copyright (C) 1997 MORIOKA Tomohiko
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
8 ;; Copyright (C) 2001 Ben Wing.
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
9
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
10 ;; This file is part of XEmacs.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
11
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
12 ;; XEmacs is free software; you can redistribute it and/or modify it
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
13 ;; under the terms of the GNU General Public License as published by
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
14 ;; the Free Software Foundation; either version 2, or (at your option)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
15 ;; any later version.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
16
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
17 ;; XEmacs is distributed in the hope that it will be useful, but
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
18 ;; WITHOUT ANY WARRANTY; without even the implied warranty of
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
19 ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
20 ;; General Public License for more details.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
21
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
22 ;; You should have received a copy of the GNU General Public License
444
576fb035e263 Import from CVS: tag r21-2-37
cvs
parents: 442
diff changeset
23 ;; along with XEmacs; see the file COPYING. If not, write to the
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
24 ;; Free Software Foundation, Inc., 59 Temple Place - Suite 330,
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
25 ;; Boston, MA 02111-1307, USA.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
26
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
27 ;;; Commentary:
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
28
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
29 ;;; split off of mule.el and mostly moved to coding.el
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
30
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
31 ;; Needed for make-8-bit-coding-system.
4080
476d0799d704 [xemacs-hg @ 2007-07-27 18:56:45 by aidan]
aidan
parents: 4072
diff changeset
32 (eval-when-compile (require 'ccl))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
33
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
34 ;;; Code:
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
35
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
36 (defun coding-system-force-on-output (coding-system register)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
37 "Return the 'force-on-output property of CODING-SYSTEM for the specified REGISTER."
444
576fb035e263 Import from CVS: tag r21-2-37
cvs
parents: 442
diff changeset
38 (check-type register integer)
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
39 (coding-system-property
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
40 coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
41 (case register
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
42 (0 'force-g0-on-output)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
43 (1 'force-g1-on-output)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
44 (2 'force-g2-on-output)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
45 (3 'force-g3-on-output)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
46 (t (signal 'args-out-of-range (list register 0 3))))))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
47
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
48 (defun coding-system-short (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
49 "Return the 'short property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
50 (coding-system-property coding-system 'short))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
51
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
52 (defun coding-system-no-ascii-eol (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
53 "Return the 'no-ascii-eol property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
54 (coding-system-property coding-system 'no-ascii-eol))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
55
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
56 (defun coding-system-no-ascii-cntl (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
57 "Return the 'no-ascii-cntl property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
58 (coding-system-property coding-system 'no-ascii-cntl))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
59
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
60 (defun coding-system-seven (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
61 "Return the 'seven property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
62 (coding-system-property coding-system 'seven))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
63
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
64 (defun coding-system-lock-shift (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
65 "Return the 'lock-shift property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
66 (coding-system-property coding-system 'lock-shift))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
67
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
68 ;;(defun coding-system-use-japanese-jisx0201-roman (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
69 ;; "Return the 'use-japanese-jisx0201-roman property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
70 ;; (coding-system-property coding-system 'use-japanese-jisx0201-roman))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
71
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
72 ;;(defun coding-system-use-japanese-jisx0208-1978 (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
73 ;; "Return the 'use-japanese-jisx0208-1978 property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
74 ;; (coding-system-property coding-system 'use-japanese-jisx0208-2978))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
75
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
76 (defun coding-system-no-iso6429 (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
77 "Return the 'no-iso6429 property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
78 (coding-system-property coding-system 'no-iso6429))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
79
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
80 (defun coding-system-ccl-encode (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
81 "Return the CCL 'encode property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
82 (coding-system-property coding-system 'encode))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
83
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
84 (defun coding-system-ccl-decode (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
85 "Return the CCL 'decode property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
86 (coding-system-property coding-system 'decode))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
87
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
88 (defun coding-system-iso2022-charset (coding-system register)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
89 "Return the charset initially designated to REGISTER in CODING-SYSTEM.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
90 The allowable range of REGISTER is 0 through 3."
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
91 (if (or (< register 0) (> register 3))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
92 (error 'args-out-of-range "coding-system-charset REGISTER" register 0 3))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
93 (coding-system-property coding-system (nth register '(charset-g0
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
94 charset-g1
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
95 charset-g2
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
96 charset-g3))))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
97
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
98
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
99 ;;;; Definitions of predefined coding systems
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
100
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
101 (make-coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
102 'ctext 'iso2022
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
103 "Compound Text"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
104 '(charset-g0 ascii
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
105 charset-g1 latin-iso8859-1
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
106 eol-type nil
4568
1d74a1d115ee Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4567
diff changeset
107 safe-charsets t ;; Reasonable
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
108 mnemonic "CText"))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
109
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
110 (make-coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
111 'iso-2022-8bit-ss2 'iso2022
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
112 "ISO-2022 8-bit w/SS2"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
113 '(charset-g0 ascii
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
114 charset-g1 latin-iso8859-1
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
115 charset-g2 t ;; unspecified but can be used later.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
116 short t
4568
1d74a1d115ee Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4567
diff changeset
117 safe-charsets (ascii katakana-jisx0201 japanese-jisx0208-1978
1d74a1d115ee Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4567
diff changeset
118 japanese-jisx0208 japanese-jisx0212 japanese-jisx0213-1
1d74a1d115ee Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4567
diff changeset
119 japanese-jisx0213-2)
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
120 mnemonic "ISO8/SS"
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
121 documentation "ISO 2022 based 8-bit encoding using SS2 for 96-charset"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
122 ))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
123
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
124 (make-coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
125 'iso-2022-7bit-ss2 'iso2022
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
126 "ISO-2022 7-bit w/SS2"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
127 '(charset-g0 ascii
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
128 charset-g2 t ;; unspecified but can be used later.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
129 seven t
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
130 short t
4568
1d74a1d115ee Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4567
diff changeset
131 safe-charsets t
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
132 mnemonic "ISO7/SS"
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
133 documentation "ISO 2022 based 7-bit encoding using SS2 for 96-charset"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
134 eol-type nil))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
135
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
136 ;; (copy-coding-system 'iso-2022-7bit-ss2 'iso-2022-jp-2)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
137 (make-coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
138 'iso-2022-jp-2 'iso2022
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
139 "ISO-2022-JP-2"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
140 '(charset-g0 ascii
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
141 charset-g2 t ;; unspecified but can be used later.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
142 seven t
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
143 short t
4568
1d74a1d115ee Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4567
diff changeset
144 safe-charsets t
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
145 mnemonic "ISO7/SS"
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
146 eol-type nil))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
147
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
148 (make-coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
149 'iso-2022-7bit 'iso2022
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
150 "ISO 2022 7-bit"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
151 '(charset-g0 ascii
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
152 seven t
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
153 short t
4568
1d74a1d115ee Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4567
diff changeset
154 safe-charsets t
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
155 mnemonic "ISO7"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
156 documentation "ISO-2022-based 7-bit encoding using only G0"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
157 ))
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
158
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
159 ;; compatibility for old XEmacsen
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
160 (define-coding-system-alias 'iso-2022-7 'iso-2022-7bit)
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
161
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
162 (make-coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
163 'iso-2022-8 'iso2022
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
164 "ISO-2022 8-bit"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
165 '(charset-g0 ascii
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
166 charset-g1 latin-iso8859-1
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
167 short t
4568
1d74a1d115ee Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4567
diff changeset
168 safe-charsets t
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
169 mnemonic "ISO8"
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
170 documentation "ISO-2022 eight-bit coding system. No single-shift or locking-shift."
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
171 ))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
172
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
173 (make-coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
174 'escape-quoted 'iso2022
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
175 "Escape-Quoted (for .ELC files)"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
176 '(charset-g0 ascii
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
177 charset-g1 latin-iso8859-1
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
178 eol-type lf
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
179 escape-quoted t
4568
1d74a1d115ee Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4567
diff changeset
180 safe-charsets t
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
181 mnemonic "ESC/Quot"
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
182 documentation "ISO-2022 eight-bit coding system with escape quoting; used for .ELC files."
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
183 ))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
184
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
185 (make-coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
186 'iso-2022-lock 'iso2022
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
187 "ISO-2022 w/locking-shift"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
188 '(charset-g0 ascii
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
189 charset-g1 t ;; unspecified but can be used later.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
190 seven t
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
191 lock-shift t
4568
1d74a1d115ee Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4567
diff changeset
192 safe-charsets t
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
193 mnemonic "ISO7/Lock"
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
194 documentation "ISO-2022 coding system using Locking-Shift for 96-charset."
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
195 ))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
196
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
197
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
198 ;; This is used by people writing CCL programs, but is called at runtime.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
199 (defun define-translation-hash-table (symbol table)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
200 "Define SYMBOL as the name of the hash translation TABLE for use in CCL.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
201
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
202 Analogous to `define-translation-table', but updates
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
203 `translation-hash-table-vector' and the table is for use in the CCL
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
204 `lookup-integer' and `lookup-character' functions."
4145
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
205 (check-argument-type #'symbolp symbol)
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
206 (check-argument-type #'hash-table-p table)
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
207 (let ((len (length translation-hash-table-vector))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
208 (id 0)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
209 done)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
210 (put symbol 'translation-hash-table table)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
211 (while (not done)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
212 (if (>= id len)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
213 (setq translation-hash-table-vector
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
214 (vconcat translation-hash-table-vector [nil])))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
215 (let ((slot (aref translation-hash-table-vector id)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
216 (if (or (not slot)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
217 (eq (car slot) symbol))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
218 (progn
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
219 (aset translation-hash-table-vector id (cons symbol table))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
220 (setq done t))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
221 (setq id (1+ id)))))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
222 (put symbol 'translation-hash-table-id id)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
223 id))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
224
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
225 (defvar make-8-bit-private-use-start (decode-char 'ucs #xE000)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
226 "Start of a 256 code private use area for make-8-bit-coding-system.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
227
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
228 This is used to ensure that distinct octets on disk for a given coding
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
229 system map to distinct XEmacs characters, preventing a spurious changes when
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
230 a file is read, not changed, and then written. ")
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
231
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
232 (defun make-8-bit-generate-helper (decode-table encode-table
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
233 encode-failure-octet)
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
234 "Helper function, `make-8-bit-generate-encode-program-and-skip-chars-strings',
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
235 which see.
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
236
4145
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
237 Deals with the case where ASCII and another character set can both be
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
238 encoded unambiguously and completely into the coding-system; if this is so,
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
239 returns a list comprised of such a ccl-program and the character set in
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
240 question. If not, it returns a list with both entries nil."
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
241 (let ((tentative-encode-program-parts
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
242 (eval-when-compile
4295
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
243 (let* ((vec-len 128)
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
244 (compiled
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
245 (append
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
246 (ccl-compile
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
247 `(1
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
248 (loop
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
249 (read-multibyte-character r0 r1)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
250 (if (r0 == ,(charset-id 'ascii))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
251 (write r1)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
252 ((if (r0 == #xABAB)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
253 ;; #xBFFE is a sentinel in the compiled
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
254 ;; program.
4295
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
255 ((r0 = r1 & #x7F)
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
256 (write r0 ,(make-vector vec-len #xBFFE)))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
257 ((mule-to-unicode r0 r1)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
258 (if (r0 == #xFFFD)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
259 (write #xBEEF)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
260 ((lookup-integer encode-table-sym r0 r3)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
261 (if r7
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
262 (write-multibyte-character r0 r3)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
263 (write #xBEEF))))))))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
264 (repeat)))) nil))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
265 (first-part compiled)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
266 (last-part
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
267 (member-if-not (lambda (entr) (eq #xBFFE entr))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
268 (member-if
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
269 (lambda (entr) (eq #xBFFE entr))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
270 first-part))))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
271 (while compiled
4295
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
272 (when (eq #xBFFE (cadr compiled))
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
273 (assert (= vec-len (search '(#xBFFE) (cdr compiled)
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
274 :test #'/=)) nil
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
275 "Strange ccl vector length")
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
276 (setcdr compiled nil))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
277 (setq compiled (cdr compiled)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
278 ;; Is the generated code as we expect it to be?
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
279 (assert (and (memq #xABAB first-part)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
280 (memq #xBEEF14 last-part))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
281 nil
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
282 "This code assumes that the constant #xBEEF is #xBEEF14 in \
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
283 compiled CCL code,\nand that the constant #xABAB is #xABAB. If that is
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
284 not the case, and it appears not to be--that's why you're getting this
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
285 message--it will not work. ")
4295
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
286 (list first-part last-part vec-len))))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
287 (charset-lower -1)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
288 (charset-upper -1)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
289 worth-trying known-charsets encode-program
4295
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
290 other-charset-vector ucs args-out-of-range)
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
291
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
292 (loop for char across decode-table
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
293 do (pushnew (char-charset char) known-charsets))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
294 (setq known-charsets (delq 'ascii known-charsets))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
295
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
296 (loop for known-charset in known-charsets
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
297 do
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
298 ;; This is not possible for two dimensional charsets.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
299 (when (eq 1 (charset-dimension known-charset))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
300 (setq args-out-of-range t)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
301 (if (eq 'control-1 known-charset)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
302 (setq charset-lower 0
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
303 charset-upper 31)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
304 ;; There should be a nicer way to get the limits here.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
305 (condition-case args-out-of-range
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
306 (make-char known-charset #x100)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
307 (args-out-of-range
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
308 (setq charset-lower (third args-out-of-range)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
309 charset-upper (fourth args-out-of-range)))))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
310 (loop
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
311 for i from charset-lower to charset-upper
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
312 always (and (setq ucs
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
313 (encode-char (make-char known-charset i) 'ucs))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
314 (gethash ucs encode-table))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
315 finally (setq worth-trying known-charset))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
316
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
317 ;; Only trying this for one charset at a time, the first find.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
318 (when worth-trying (return))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
319
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
320 ;; Okay, this charset is not worth trying, Try the next.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
321 (setq charset-lower -1
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
322 charset-upper -1
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
323 worth-trying nil)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
324
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
325 (when worth-trying
4295
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
326 (setq other-charset-vector
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
327 (make-vector (third tentative-encode-program-parts)
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
328 encode-failure-octet))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
329 (loop for i from charset-lower to charset-upper
4090
751ae075e76e [xemacs-hg @ 2007-08-01 13:53:32 by aidan]
aidan
parents: 4085
diff changeset
330 do (aset other-charset-vector i
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
331 (gethash (encode-char (make-char worth-trying i)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
332 'ucs) encode-table)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
333 (setq encode-program
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
334 (nsublis
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
335 (list (cons #xABAB (charset-id worth-trying)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
336 (nconc
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
337 (copy-list (first
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
338 tentative-encode-program-parts))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
339 (append other-charset-vector nil)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
340 (copy-tree (second
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
341 tentative-encode-program-parts))))))
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
342 (values encode-program worth-trying)))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
343
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
344 (defun make-8-bit-generate-encode-program-and-skip-chars-strings
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
345 (decode-table encode-table encode-failure-octet)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
346 "Generate a CCL program to encode a 8-bit fixed-width charset.
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
347
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
348 DECODE-TABLE must have 256 non-cons entries, and will be regarded as
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
349 describing a map from the octet corresponding to an offset in the
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
350 table to the that entry in the table. ENCODE-TABLE is a hash table
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
351 map from unicode values to characters in the range [0,255].
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
352 ENCODE-FAILURE-OCTET describes an integer between 0 and 255
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
353 \(inclusive) to write in the event that a character cannot be encoded. "
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
354 (check-argument-type #'vectorp decode-table)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
355 (check-argument-range (length decode-table) #x100 #x100)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
356 (check-argument-type #'hash-table-p encode-table)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
357 (check-argument-type #'integerp encode-failure-octet)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
358 (check-argument-range encode-failure-octet #x00 #xFF)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
359 (let ((encode-program nil)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
360 (general-encode-program
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
361 (eval-when-compile
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
362 (let ((prog (append
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
363 (ccl-compile
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
364 `(1
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
365 (loop
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
366 (read-multibyte-character r0 r1)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
367 (mule-to-unicode r0 r1)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
368 (if (r0 == #xFFFD)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
369 (write #xBEEF)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
370 ((lookup-integer encode-table-sym r0 r3)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
371 (if r7
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
372 (write-multibyte-character r0 r3)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
373 (write #xBEEF))))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
374 (repeat)))) nil)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
375 (assert (memq #xBEEF14 prog)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
376 nil
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
377 "This code assumes that the constant #xBEEF is #xBEEF14 \
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
378 in compiled CCL code.\nIf that is not the case, and it appears not to
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
379 be--that's why you're getting this message--it will not work. ")
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
380 prog)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
381 (encode-program-with-ascii-optimisation
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
382 (eval-when-compile
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
383 (let ((prog (append
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
384 (ccl-compile
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
385 `(1
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
386 (loop
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
387 (read-multibyte-character r0 r1)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
388 (if (r0 == ,(charset-id 'ascii))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
389 (write r1)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
390 ((mule-to-unicode r0 r1)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
391 (if (r0 == #xFFFD)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
392 (write #xBEEF)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
393 ((lookup-integer encode-table-sym r0 r3)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
394 (if r7
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
395 (write-multibyte-character r0 r3)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
396 (write #xBEEF))))))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
397 (repeat)))) nil)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
398 (assert (memq #xBEEF14 prog)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
399 nil
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
400 "This code assumes that the constant #xBEEF is #xBEEF14 \
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
401 in compiled CCL code.\nIf that is not the case, and it appears not to
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
402 be--that's why you're getting this message--it will not work. ")
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
403 prog)))
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
404 (ascii-encodes-as-itself nil)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
405 (control-1-encodes-as-itself t)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
406 (invalid-sequence-code-point-start
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
407 (eval-when-compile
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
408 (char-to-unicode
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
409 (aref (decode-coding-string "\xd8\x00\x00\x00" 'utf-16-be) 3))))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
410 further-char-set skip-chars invalid-sequences-skip-chars)
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
411
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
412 ;; Is this coding system ASCII-compatible? If so, we can avoid the hash
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
413 ;; table lookup for those characters.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
414 (loop
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
415 for i from #x00 to #x7f
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
416 always (eq (int-to-char i) (gethash i encode-table))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
417 finally (setq ascii-encodes-as-itself t))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
418
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
419 ;; Note that this logic handles EBCDIC badly. For example, CP037,
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
420 ;; MIME name ebcdic-na, has the entire repertoire of ASCII and
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
421 ;; Latin 1, and thus a more optimal ccl encode program would check
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
422 ;; for those character sets and use tables. But for now, we do a
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
423 ;; hash table lookup for every character.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
424 (if (null ascii-encodes-as-itself)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
425 ;; General encode program. Pros; general and correct. Cons;
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
426 ;; slow, a hash table lookup + mule-unicode conversion is done
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
427 ;; for every character encoding.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
428 (setq encode-program general-encode-program)
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
429 (multiple-value-setq
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
430 (encode-program further-char-set)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
431 ;; Encode program with ascii-ascii mapping (based on a
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
432 ;; character's mule character set), and one other mule
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
433 ;; character set using table-based encoding, other
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
434 ;; character sets using hash table lookups.
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
435 ;; make-8-bit-non-ascii-completely-coveredp only returns
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
436 ;; such a mapping if some non-ASCII charset with
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
437 ;; characters in decode-table is entirely covered by
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
438 ;; encode-table.
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
439 (make-8-bit-generate-helper decode-table encode-table
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
440 encode-failure-octet))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
441 (unless encode-program
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
442 ;; If make-8-bit-non-ascii-completely-coveredp returned nil,
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
443 ;; but ASCII still encodes as itself, do one-to-one mapping
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
444 ;; for ASCII, and a hash table lookup for everything else.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
445 (setq encode-program encode-program-with-ascii-optimisation)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
446
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
447 (setq encode-program
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
448 (nsublis
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
449 (list (cons #xBEEF14
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
450 (logior (lsh encode-failure-octet 8)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
451 #x14)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
452 (copy-tree encode-program)))
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
453 (loop
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
454 for i from #x80 to #x9f
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
455 do (unless (= i (aref decode-table i))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
456 (setq control-1-encodes-as-itself nil)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
457 (return)))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
458 (loop
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
459 for i from #x00 to #xFF
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
460 initially (setq skip-chars
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
461 (cond
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
462 ((and ascii-encodes-as-itself
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
463 control-1-encodes-as-itself further-char-set)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
464 (concat "\x00-\x9f" (charset-skip-chars-string
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
465 further-char-set)))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
466 ((and ascii-encodes-as-itself
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
467 control-1-encodes-as-itself)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
468 "\x00-\x9f")
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
469 ((null ascii-encodes-as-itself)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
470 (skip-chars-quote (apply #'string
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
471 (append decode-table nil))))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
472 (further-char-set
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
473 (concat (charset-skip-chars-string 'ascii)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
474 (charset-skip-chars-string further-char-set)))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
475 (t
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
476 (charset-skip-chars-string 'ascii)))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
477 invalid-sequences-skip-chars "")
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
478 with decoded-ucs = nil
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
479 with decoded = nil
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
480 with no-ascii-transparency-skip-chars-list =
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
481 (unless ascii-encodes-as-itself (append decode-table nil))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
482 ;; Can't use #'match-string here, see:
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
483 ;; http://mid.gmane.org/18829.34118.709782.704574@parhasard.net
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
484 with skip-chars-test =
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
485 #'(lambda (skip-chars-string testing)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
486 (with-temp-buffer
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
487 (insert testing)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
488 (goto-char (point-min))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
489 (skip-chars-forward skip-chars-string)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
490 (= (point) (point-max))))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
491 do
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
492 (setq decoded (aref decode-table i)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
493 decoded-ucs (char-to-unicode decoded))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
494 (cond
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
495 ((<= invalid-sequence-code-point-start decoded-ucs
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
496 (+ invalid-sequence-code-point-start #xFF))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
497 (setq invalid-sequences-skip-chars
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
498 (concat (string decoded)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
499 invalid-sequences-skip-chars))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
500 (assert (not (funcall skip-chars-test skip-chars decoded))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
501 "This char should only be skipped with \
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
502 `invalid-sequences-skip-chars', not by `skip-chars'"))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
503 ((not (funcall skip-chars-test skip-chars decoded))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
504 (if ascii-encodes-as-itself
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
505 (setq skip-chars (concat skip-chars (string decoded)))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
506 (push decoded no-ascii-transparency-skip-chars-list))))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
507 finally (unless ascii-encodes-as-itself
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
508 (setq skip-chars
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
509 (skip-chars-quote
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
510 (apply #'string
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
511 no-ascii-transparency-skip-chars-list)))))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
512 (values encode-program skip-chars invalid-sequences-skip-chars)))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
513
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
514 (defun make-8-bit-create-decode-encode-tables (unicode-map)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
515 "Return a list \(DECODE-TABLE ENCODE-TABLE) given UNICODE-MAP.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
516 UNICODE-MAP should be an alist mapping from integer octet values to
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
517 characters with UCS code points; DECODE-TABLE will be a 256-element
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
518 vector, and ENCODE-TABLE will be a hash table mapping from 256 numbers
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
519 to 256 distinct characters. "
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
520 (check-argument-type #'listp unicode-map)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
521 (let ((decode-table (make-vector 256 nil))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
522 (encode-table (make-hash-table :size 256))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
523 (private-use-start (encode-char make-8-bit-private-use-start 'ucs))
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
524 (invalid-sequence-code-point-start
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
525 (eval-when-compile
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
526 (char-to-unicode
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
527 (aref (decode-coding-string "\xd8\x00\x00\x00" 'utf-16-be) 3))))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
528 desired-ucs decode-table-entry)
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
529
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
530 (loop for (external internal)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
531 in unicode-map
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
532 do
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
533 (aset decode-table external internal)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
534 (assert (not (eq (encode-char internal 'ucs) -1))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
535 nil
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
536 "Looks like you're calling make-8-bit-coding-system in a \
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
537 dumped file, \nand you're either not providing a literal UNICODE-MAP
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
538 or PROPS. Don't do that; make-8-bit-coding-system relies on sensible
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
539 Unicode mappings being available, which they are at compile time for
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
540 dumped files (but this requires the mentioned literals), but not, for
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
541 most of them, at run time. ")
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
542
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
543 (puthash (encode-char internal 'ucs)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
544 ;; This is semantically an integer, but Dave Love's design
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
545 ;; for lookup-integer in CCL means we need to store it as a
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
546 ;; character.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
547 (int-to-char external)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
548 encode-table))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
549
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
550 ;; Now, go through the decode table. For octet values above #x7f, if the
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
551 ;; decode table entry is nil, this means that they have an undefined
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
552 ;; mapping (= they map to XEmacs characters with keys in
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
553 ;; unicode-error-default-translation-table); for octet values below or
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
554 ;; equal to #x7f, it means that they map to ASCII.
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
555
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
556 ;; If any entry (whether below or above #x7f) in the decode-table
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
557 ;; already maps to some character with a key in
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
558 ;; unicode-error-default-translation-table, it is treated as an
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
559 ;; undefined octet by `query-coding-region'. That is, it is not
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
560 ;; necessary for an octet value to be above #x7f for this to happen.
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
561
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
562 (dotimes (i 256)
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
563 (setq decode-table-entry (aref decode-table i))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
564 (if decode-table-entry
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
565 (when (get-char-table
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
566 decode-table-entry
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
567 unicode-error-default-translation-table)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
568 ;; The caller is explicitly specifying that this octet
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
569 ;; corresponds to an invalid sequence on disk:
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
570 (assert (= (get-char-table
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
571 decode-table-entry
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
572 unicode-error-default-translation-table) i)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
573 "Bad argument to `make-8-bit-coding-system'.
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
574 If you're going to designate an octet with value below #x80 as invalid
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
575 for this coding system, make sure to map it to the invalid sequence
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
576 character corresponding to its octet value on disk. "))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
577
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
578 ;; decode-table-entry is nil; either the octet is to be treated as
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
579 ;; contributing to an error sequence (when (> #x7f i)), or it should
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
580 ;; be attempted to treat it as ASCII-equivalent.
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
581 (setq desired-ucs (or (and (< i #x80) i)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
582 (+ invalid-sequence-code-point-start i)))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
583 (while (gethash desired-ucs encode-table)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
584 (assert (not (< i #x80))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
585 "UCS code point should not already be in encode-table!"
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
586 ;; There is one invalid sequence char per octet value;
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
587 ;; with eight-bit-fixed coding systems, it makes no sense
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
588 ;; for us to be multiply allocating them.
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
589 (gethash desired-ucs encode-table))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
590 (setq desired-ucs (+ private-use-start desired-ucs)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
591 private-use-start (+ private-use-start 1)))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
592 (puthash desired-ucs (int-to-char i) encode-table)
4085
023ebc75c06e [xemacs-hg @ 2007-07-28 09:32:26 by aidan]
aidan
parents: 4080
diff changeset
593 (setq desired-ucs (if (> desired-ucs #xFF)
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
594 (unicode-to-char desired-ucs)
4085
023ebc75c06e [xemacs-hg @ 2007-07-28 09:32:26 by aidan]
aidan
parents: 4080
diff changeset
595 ;; So we get Latin-1 when run at dump time,
023ebc75c06e [xemacs-hg @ 2007-07-28 09:32:26 by aidan]
aidan
parents: 4080
diff changeset
596 ;; instead of JIT-allocated characters.
023ebc75c06e [xemacs-hg @ 2007-07-28 09:32:26 by aidan]
aidan
parents: 4080
diff changeset
597 (int-to-char desired-ucs)))
023ebc75c06e [xemacs-hg @ 2007-07-28 09:32:26 by aidan]
aidan
parents: 4080
diff changeset
598 (aset decode-table i desired-ucs)))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
599 (values decode-table encode-table)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
600
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
601 (defun make-8-bit-generate-decode-program (decode-table)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
602 "Given DECODE-TABLE, generate a CCL program to decode an 8-bit charset.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
603 DECODE-TABLE must have 256 non-cons entries, and will be regarded as
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
604 describing a map from the octet corresponding to an offset in the
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
605 table to the that entry in the table. "
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
606 (check-argument-type #'vectorp decode-table)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
607 (check-argument-range (length decode-table) #x100 #x100)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
608 (let ((decode-program-parts
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
609 (eval-when-compile
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
610 (let* ((compiled
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
611 (append
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
612 (ccl-compile
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
613 `(3
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
614 ((read r0)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
615 (loop
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
616 (write-read-repeat r0 ,(make-vector
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
617 256 'sentinel)))))) nil))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
618 (first-part compiled)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
619 (last-part
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
620 (member-if-not #'symbolp
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
621 (member-if-not #'integerp first-part))))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
622 ;; Chop off the sentinel sentinel sentinel [..] part.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
623 (while compiled
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
624 (if (symbolp (cadr compiled))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
625 (setcdr compiled nil))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
626 (setq compiled (cdr compiled)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
627 (list first-part last-part)))))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
628 (nconc
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
629 ;; copy-list needed, because the structure of the literal provided
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
630 ;; by our eval-when-compile hangs around.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
631 (copy-list (first decode-program-parts))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
632 (append decode-table nil)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
633 (second decode-program-parts))))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
634
4145
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
635 (defun make-8-bit-choose-category (decode-table)
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
636 "Given DECODE-TABLE, return an appropriate coding category.
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
637 DECODE-TABLE is a 256-entry vector describing the mapping from octets on
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
638 disk to XEmacs characters for some fixed-width 8-bit coding system. "
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
639 (check-argument-type #'vectorp decode-table)
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
640 (check-argument-range (length decode-table) #x100 #x100)
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
641 (loop
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
642 named category
4559
bd1a68c34d44 Merge my change of 2008-05-14 to the query-coding-region code.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4558
diff changeset
643 for i from #x80 to #x9F
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
644 do (unless (= i (aref decode-table i))
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
645 (return-from category 'no-conversion))
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
646 finally return 'iso-8-1))
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
647
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
648 (defun 8-bit-fixed-query-coding-region (begin end coding-system &optional
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
649 buffer ignore-invalid-sequencesp
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
650 errorp highlightp)
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
651 "The `query-coding-region' implementation for 8-bit-fixed coding systems.
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
652
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
653 Uses the `8-bit-fixed-query-from-unicode' and `8-bit-fixed-query-skip-chars'
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
654 coding system properties. The former is a hash table mapping from valid
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
655 Unicode code points to on-disk octets in the coding system; the latter a set
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
656 of characters as used by `skip-chars-forward'. Both of these properties are
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
657 generated automatically by `make-8-bit-coding-system'.
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
658
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
659 See that the documentation of `query-coding-region'; see also
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
660 `make-8-bit-coding-system'. "
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
661 (check-argument-type #'coding-system-p
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
662 (setq coding-system (find-coding-system coding-system)))
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
663 (check-argument-type #'integer-or-marker-p begin)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
664 (check-argument-type #'integer-or-marker-p end)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
665 (let ((from-unicode
4551
6812571bfcb9 Fix some bugs.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4549
diff changeset
666 (or (coding-system-get coding-system '8-bit-fixed-query-from-unicode)
6812571bfcb9 Fix some bugs.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4549
diff changeset
667 (coding-system-get (coding-system-base coding-system)
6812571bfcb9 Fix some bugs.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4549
diff changeset
668 '8-bit-fixed-query-from-unicode)))
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
669 (skip-chars-arg
4551
6812571bfcb9 Fix some bugs.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4549
diff changeset
670 (or (coding-system-get coding-system '8-bit-fixed-query-skip-chars)
6812571bfcb9 Fix some bugs.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4549
diff changeset
671 (coding-system-get (coding-system-base coding-system)
6812571bfcb9 Fix some bugs.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4549
diff changeset
672 '8-bit-fixed-query-skip-chars)))
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
673 (invalid-sequences-skip-chars
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
674 (or (coding-system-get coding-system
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
675 '8-bit-fixed-invalid-sequences-skip-chars)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
676 (coding-system-get (coding-system-base coding-system)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
677 '8-bit-fixed-invalid-sequences-skip-chars)))
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
678 (ranges (make-range-table))
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
679 (case-fold-search nil)
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
680 char-after fail-range-start fail-range-end previous-fail extent
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
681 failed invalid-sequences-looking-at failed-reason
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
682 previous-failed-reason)
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
683 (check-type from-unicode hash-table)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
684 (check-type skip-chars-arg string)
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
685 (check-type invalid-sequences-skip-chars string)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
686 (setq invalid-sequences-looking-at
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
687 (if (equal "" invalid-sequences-skip-chars)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
688 ;; Regexp that will never match.
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
689 #r".\{0,0\}"
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
690 (concat "[" invalid-sequences-skip-chars "]")))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
691 (when ignore-invalid-sequencesp
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
692 (setq skip-chars-arg
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
693 (concat skip-chars-arg invalid-sequences-skip-chars)))
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
694 (save-excursion
4551
6812571bfcb9 Fix some bugs.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4549
diff changeset
695 (when highlightp
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
696 (query-coding-clear-highlights begin end buffer))
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
697 (goto-char begin buffer)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
698 (skip-chars-forward skip-chars-arg end buffer)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
699 (while (< (point buffer) end)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
700 (setq char-after (char-after (point buffer) buffer)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
701 fail-range-start (point buffer))
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
702 (while (and
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
703 (< (point buffer) end)
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
704 (or (and
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
705 (not (gethash (encode-char char-after 'ucs) from-unicode))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
706 (setq failed-reason 'unencodable))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
707 (and (not ignore-invalid-sequencesp)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
708 (looking-at invalid-sequences-looking-at buffer)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
709 (setq failed-reason 'invalid-sequence)))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
710 (or (null previous-failed-reason)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
711 (eq previous-failed-reason failed-reason)))
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
712 (forward-char 1 buffer)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
713 (setq char-after (char-after (point buffer) buffer)
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
714 failed t
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
715 previous-failed-reason failed-reason))
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
716 (if (= fail-range-start (point buffer))
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
717 ;; The character can actually be encoded by the coding
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
718 ;; system; check the characters past it.
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
719 (forward-char 1 buffer)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
720 ;; The character actually failed.
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
721 (when errorp
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
722 (error 'text-conversion-error
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
723 (format "Cannot encode %s using coding system"
4551
6812571bfcb9 Fix some bugs.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4549
diff changeset
724 (buffer-substring fail-range-start (point buffer)
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
725 buffer))
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
726 (coding-system-name coding-system)))
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
727 (assert (not (null previous-failed-reason)) t
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
728 "previous-failed-reason should always be non-nil here")
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
729 (put-range-table fail-range-start
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
730 ;; If char-after is non-nil, we're not at
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
731 ;; the end of the buffer.
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
732 (setq fail-range-end (if char-after
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
733 (point buffer)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
734 (point-max buffer)))
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
735 previous-failed-reason ranges)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
736 (setq previous-failed-reason nil)
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
737 (when highlightp
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
738 (setq extent (make-extent fail-range-start fail-range-end buffer))
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
739 (set-extent-priority extent (+ mouse-highlight-priority 2))
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
740 (set-extent-face extent 'query-coding-warning-face))
4551
6812571bfcb9 Fix some bugs.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4549
diff changeset
741 (skip-chars-forward skip-chars-arg end buffer)))
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
742 (if failed
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
743 (values nil ranges)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
744 (values t nil)))))
4145
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
745
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
746 (defun make-8-bit-coding-system (name unicode-map &optional description props)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
747 "Make and return a fixed-width 8-bit CCL coding system named NAME.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
748 NAME must be a symbol, and UNICODE-MAP a list.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
749
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
750 UNICODE-MAP is a plist describing a map from octets in the coding
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
751 system NAME (as integers) to XEmacs characters. Those XEmacs
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
752 characters will be used explicitly on decoding, but for encoding (most
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
753 relevantly, on writing to disk) XEmacs characters that map to the same
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
754 Unicode code point will be unified. This means that the ISO-8859-?
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
755 characters that map to the same Unicode code point will not be
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
756 distinct when written to disk, which is normally what is intended; it
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
757 also means that East Asian Han characters from different XEmacs
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
758 character sets will not be distinct when written to disk, which is
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
759 less often what is intended.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
760
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
761 Any octets not mapped, and with values above #x7f, will be decoded into
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
762 XEmacs characters that reflect that their values are undefined. These
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
763 characters will be displayed in a language-environment-specific way. See
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
764 `unicode-error-default-translation-table' and the
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
765 `invalid-sequence-coding-system' argument to `set-language-info'.
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
766
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
767 These characters will normally be treated as invalid when checking whether
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
768 text can be encoded with `query-coding-region'--see the
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
769 IGNORE-INVALID-SEQUENCESP argument to that function to avoid this. It is
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
770 possible to specify that octets with values less than #x80 (or indeed
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
771 greater than it) be treated in this way, by specifying explicitly that they
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
772 correspond to the character mapping to that octet in
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
773 `unicode-error-default-translation-table'. Far fewer coding systems
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
774 override the ASCII mapping, though, so this is not the default.
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
775
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
776 DESCRIPTION and PROPS are as in `make-coding-system', which see. This
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
777 function also accepts two additional (optional) properties in PROPS;
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
778 `aliases', giving a list of aliases to be initialized for this
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
779 coding-system, and `encode-failure-octet', an integer between 0 and 256 to
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
780 write in place of XEmacs characters that cannot be encoded, defaulting to
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
781 the code for tilde `~'. "
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
782 (check-argument-type #'symbolp name)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
783 (check-argument-type #'listp unicode-map)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
784 (check-argument-type #'stringp
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
785 (or description
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
786 (setq description
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
787 (format "Coding system used for %s." name))))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
788 (check-valid-plist props)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
789 (let ((encode-failure-octet (or (plist-get props 'encode-failure-octet)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
790 (char-to-int ?~)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
791 (aliases (plist-get props 'aliases))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
792 (hash-table-sym (gentemp (format "%s-encode-table" name)))
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
793 encode-program decode-program result decode-table encode-table
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
794 skip-chars invalid-sequences-skip-chars)
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
795
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
796 ;; Some more sanity checking.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
797 (check-argument-range encode-failure-octet 0 #xFF)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
798 (check-argument-type #'listp aliases)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
799
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
800 ;; Don't pass on our extra data to make-coding-system.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
801 (setq props (plist-remprop props 'encode-failure-octet)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
802 props (plist-remprop props 'aliases))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
803
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
804 (multiple-value-setq
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
805 (decode-table encode-table)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
806 (make-8-bit-create-decode-encode-tables unicode-map))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
807
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
808 ;; Register the decode-table.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
809 (define-translation-hash-table hash-table-sym encode-table)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
810
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
811 ;; Generate the programs and skip-chars strings.
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
812 (setq decode-program (make-8-bit-generate-decode-program decode-table))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
813 (multiple-value-setq
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
814 (encode-program skip-chars invalid-sequences-skip-chars)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
815 (make-8-bit-generate-encode-program-and-skip-chars-strings
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
816 decode-table encode-table encode-failure-octet))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
817
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
818 (unless (vectorp encode-program)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
819 (setq encode-program
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
820 (apply #'vector
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
821 (nsublis (list (cons 'encode-table-sym hash-table-sym))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
822 (copy-tree encode-program)))))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
823 (unless (vectorp decode-program)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
824 (setq decode-program
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
825 (apply #'vector decode-program)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
826
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
827 ;; And now generate the actual coding system.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
828 (setq result
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
829 (make-coding-system
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
830 name 'ccl
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
831 description
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
832 (plist-put (plist-put props 'decode decode-program)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
833 'encode encode-program)))
4295
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
834 (coding-system-put name '8-bit-fixed t)
4145
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
835 (coding-system-put name 'category
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
836 (make-8-bit-choose-category decode-table))
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
837 (coding-system-put name '8-bit-fixed-query-skip-chars
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
838 skip-chars)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
839 (coding-system-put name '8-bit-fixed-invalid-sequences-skip-chars
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
840 invalid-sequences-skip-chars)
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
841 (coding-system-put name '8-bit-fixed-query-from-unicode encode-table)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
842 (coding-system-put name 'query-coding-function
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
843 #'8-bit-fixed-query-coding-region)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
844 (coding-system-put (intern (format "%s-unix" name))
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
845 'query-coding-function
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
846 #'8-bit-fixed-query-coding-region)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
847 (coding-system-put (intern (format "%s-dos" name))
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
848 'query-coding-function
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
849 #'8-bit-fixed-query-coding-region)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
850 (coding-system-put (intern (format "%s-mac" name))
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
851 'query-coding-function
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
852 #'8-bit-fixed-query-coding-region)
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
853 (loop for alias in aliases
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
854 do (define-coding-system-alias alias name))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
855 result))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
856
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
857 (define-compiler-macro make-8-bit-coding-system (&whole form name unicode-map
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
858 &optional description props)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
859 ;; We provide the compiler macro (= macro that is expanded only on
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
860 ;; compilation, and that can punt to a runtime version of the
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
861 ;; associate function if necessary) not for reasons of speed, though
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
862 ;; it does speed up things at runtime a little, but because the
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
863 ;; Unicode mappings are available at compile time in the dumped
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
864 ;; files, but they are not available at run time for the vast
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
865 ;; majority of them.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
866
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
867 (if (not (and (and (consp name) (eq (car name) 'quote))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
868 (and (consp unicode-map) (eq (car unicode-map) 'quote))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
869 (and (or (and (consp props) (eq (car props) 'quote))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
870 (null props)))))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
871 ;; The call does not use literals; do it at runtime.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
872 form
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
873 (setq name (cadr name)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
874 unicode-map (cadr unicode-map)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
875 props (if props (cadr props)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
876 (let ((encode-failure-octet
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
877 (or (plist-get props 'encode-failure-octet) (char-to-int ?~)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
878 (aliases (plist-get props 'aliases))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
879 encode-program decode-program
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
880 decode-table encode-table
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
881 skip-chars invalid-sequences-skip-chars)
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
882
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
883 ;; Some sanity checking.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
884 (check-argument-range encode-failure-octet 0 #xFF)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
885 (check-argument-type #'listp aliases)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
886
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
887 ;; Don't pass on our extra data to make-coding-system.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
888 (setq props (plist-remprop props 'encode-failure-octet)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
889 props (plist-remprop props 'aliases))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
890
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
891 ;; Work out encode-table and decode-table
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
892 (multiple-value-setq
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
893 (decode-table encode-table)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
894 (make-8-bit-create-decode-encode-tables unicode-map))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
895
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
896 ;; Generate the decode and encode programs, and the skip-chars
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
897 ;; arguments.
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
898 (setq decode-program (make-8-bit-generate-decode-program decode-table))
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
899 (multiple-value-setq
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
900 (encode-program skip-chars invalid-sequences-skip-chars)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
901 (make-8-bit-generate-encode-program-and-skip-chars-strings
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
902 decode-table encode-table encode-failure-octet))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
903
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
904 ;; And return the generated code.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
905 `(let ((encode-table-sym (gentemp (format "%s-encode-table" ',name)))
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
906 (encode-table ,encode-table))
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
907 (define-translation-hash-table encode-table-sym encode-table)
4103
b4f4e0cc90f1 [xemacs-hg @ 2007-08-07 23:08:47 by aidan]
aidan
parents: 4090
diff changeset
908 (make-coding-system
b4f4e0cc90f1 [xemacs-hg @ 2007-08-07 23:08:47 by aidan]
aidan
parents: 4090
diff changeset
909 ',name 'ccl ,description
b4f4e0cc90f1 [xemacs-hg @ 2007-08-07 23:08:47 by aidan]
aidan
parents: 4090
diff changeset
910 (plist-put (plist-put ',props 'decode
b4f4e0cc90f1 [xemacs-hg @ 2007-08-07 23:08:47 by aidan]
aidan
parents: 4090
diff changeset
911 ,(apply #'vector decode-program))
b4f4e0cc90f1 [xemacs-hg @ 2007-08-07 23:08:47 by aidan]
aidan
parents: 4090
diff changeset
912 'encode
b4f4e0cc90f1 [xemacs-hg @ 2007-08-07 23:08:47 by aidan]
aidan
parents: 4090
diff changeset
913 (apply #'vector
b4f4e0cc90f1 [xemacs-hg @ 2007-08-07 23:08:47 by aidan]
aidan
parents: 4090
diff changeset
914 (nsublis
b4f4e0cc90f1 [xemacs-hg @ 2007-08-07 23:08:47 by aidan]
aidan
parents: 4090
diff changeset
915 (list (cons
b4f4e0cc90f1 [xemacs-hg @ 2007-08-07 23:08:47 by aidan]
aidan
parents: 4090
diff changeset
916 'encode-table-sym
b4f4e0cc90f1 [xemacs-hg @ 2007-08-07 23:08:47 by aidan]
aidan
parents: 4090
diff changeset
917 (symbol-value 'encode-table-sym)))
b4f4e0cc90f1 [xemacs-hg @ 2007-08-07 23:08:47 by aidan]
aidan
parents: 4090
diff changeset
918 ',encode-program))))
4295
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
919 (coding-system-put ',name '8-bit-fixed t)
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
920 (coding-system-put ',name 'category
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
921 ',(make-8-bit-choose-category decode-table))
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
922 (coding-system-put ',name '8-bit-fixed-query-skip-chars
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
923 ,skip-chars)
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
924 (coding-system-put ',name '8-bit-fixed-invalid-sequences-skip-chars
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
925 ,invalid-sequences-skip-chars)
4549
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
926 (coding-system-put ',name '8-bit-fixed-query-from-unicode encode-table)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
927 (coding-system-put ',name 'query-coding-function
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
928 #'8-bit-fixed-query-coding-region)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
929 (coding-system-put ',(intern (format "%s-unix" name))
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
930 'query-coding-function
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
931 #'8-bit-fixed-query-coding-region)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
932 (coding-system-put ',(intern (format "%s-dos" name))
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
933 'query-coding-function
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
934 #'8-bit-fixed-query-coding-region)
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
935 (coding-system-put ',(intern (format "%s-mac" name))
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
936 'query-coding-function
68d1ca56cffa First part of interactive checks that coding systems encode regions.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4299
diff changeset
937 #'8-bit-fixed-query-coding-region)
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
938 ,(macroexpand `(loop for alias in ',aliases
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
939 do (define-coding-system-alias alias
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
940 ',name)))
4103
b4f4e0cc90f1 [xemacs-hg @ 2007-08-07 23:08:47 by aidan]
aidan
parents: 4090
diff changeset
941 (find-coding-system ',name)))))
4299
f4c3ffe60a4f [xemacs-hg @ 2007-12-01 14:24:46 by aidan]
aidan
parents: 4295
diff changeset
942
f4c3ffe60a4f [xemacs-hg @ 2007-12-01 14:24:46 by aidan]
aidan
parents: 4295
diff changeset
943 ;; Ideally this would be in latin.el, but code-init.el uses it.
f4c3ffe60a4f [xemacs-hg @ 2007-12-01 14:24:46 by aidan]
aidan
parents: 4295
diff changeset
944 (make-8-bit-coding-system
f4c3ffe60a4f [xemacs-hg @ 2007-12-01 14:24:46 by aidan]
aidan
parents: 4295
diff changeset
945 'iso-8859-1
4604
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
946 (loop
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
947 for i from #x80 to #xff
e0a8715fdb1f Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
Aidan Kehoe <kehoea@parhasard.net>
parents: 4568
diff changeset
948 collect (list i (int-char i))) ;; Identical to Latin-1.
4299
f4c3ffe60a4f [xemacs-hg @ 2007-12-01 14:24:46 by aidan]
aidan
parents: 4295
diff changeset
949 "ISO-8859-1 (Latin-1)"
f4c3ffe60a4f [xemacs-hg @ 2007-12-01 14:24:46 by aidan]
aidan
parents: 4295
diff changeset
950 '(mnemonic "Latin 1"
f4c3ffe60a4f [xemacs-hg @ 2007-12-01 14:24:46 by aidan]
aidan
parents: 4295
diff changeset
951 documentation "The most used encoding of Western Europe and the Americas."
f4c3ffe60a4f [xemacs-hg @ 2007-12-01 14:24:46 by aidan]
aidan
parents: 4295
diff changeset
952 aliases (iso-latin-1 latin-1)))