annotate lisp/mule/mule-coding.el @ 4295:eded49463f9a

[xemacs-hg @ 2007-11-29 13:37:51 by aidan] Add round-trip tests for my fixed-width-8-bit CCL coding systems, fix a bug with them, take out some inadvertent debugging code of mine from mule-ccl.c.
author aidan
date Thu, 29 Nov 2007 13:38:21 +0000
parents edb00a8b4eff
children f4c3ffe60a4f
rev   line source
502
7039e6323819 [xemacs-hg @ 2001-05-04 22:41:46 by ben]
ben
parents: 448
diff changeset
1 ;;; mule-coding.el --- Coding-system functions for Mule. -*- coding: iso-2022-7bit; -*-
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
2
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
3 ;; Copyright (C) 1995 Electrotechnical Laboratory, JAPAN.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
4 ;; Licensed to the Free Software Foundation.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
5 ;; Copyright (C) 1995 Amdahl Corporation.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
6 ;; Copyright (C) 1995 Sun Microsystems.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
7 ;; Copyright (C) 1997 MORIOKA Tomohiko
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
8 ;; Copyright (C) 2001 Ben Wing.
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
9
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
10 ;; This file is part of XEmacs.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
11
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
12 ;; XEmacs is free software; you can redistribute it and/or modify it
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
13 ;; under the terms of the GNU General Public License as published by
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
14 ;; the Free Software Foundation; either version 2, or (at your option)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
15 ;; any later version.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
16
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
17 ;; XEmacs is distributed in the hope that it will be useful, but
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
18 ;; WITHOUT ANY WARRANTY; without even the implied warranty of
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
19 ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
20 ;; General Public License for more details.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
21
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
22 ;; You should have received a copy of the GNU General Public License
444
576fb035e263 Import from CVS: tag r21-2-37
cvs
parents: 442
diff changeset
23 ;; along with XEmacs; see the file COPYING. If not, write to the
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
24 ;; Free Software Foundation, Inc., 59 Temple Place - Suite 330,
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
25 ;; Boston, MA 02111-1307, USA.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
26
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
27 ;;; Commentary:
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
28
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
29 ;;; split off of mule.el and mostly moved to coding.el
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
30
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
31 ;; Needed for make-8-bit-coding-system.
4080
476d0799d704 [xemacs-hg @ 2007-07-27 18:56:45 by aidan]
aidan
parents: 4072
diff changeset
32 (eval-when-compile (require 'ccl))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
33
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
34 ;;; Code:
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
35
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
36 (defun coding-system-force-on-output (coding-system register)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
37 "Return the 'force-on-output property of CODING-SYSTEM for the specified REGISTER."
444
576fb035e263 Import from CVS: tag r21-2-37
cvs
parents: 442
diff changeset
38 (check-type register integer)
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
39 (coding-system-property
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
40 coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
41 (case register
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
42 (0 'force-g0-on-output)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
43 (1 'force-g1-on-output)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
44 (2 'force-g2-on-output)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
45 (3 'force-g3-on-output)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
46 (t (signal 'args-out-of-range (list register 0 3))))))
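;; Usage sketch (not part of the original file), assuming the ISO 2022
;; coding system `iso-2022-7bit' defined further down has been created:
;;
;;   (coding-system-force-on-output 'iso-2022-7bit 0) ; queries 'force-g0-on-output
;;   (coding-system-force-on-output 'iso-2022-7bit 4) ; signals args-out-of-range
;;
;; REGISTER must be an integer (enforced by `check-type'); only 0 through
;; 3 map to a property name, any other integer reaches the `t' clause.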
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
47
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
48 (defun coding-system-short (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
49 "Return the 'short property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
50 (coding-system-property coding-system 'short))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
51
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
52 (defun coding-system-no-ascii-eol (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
53 "Return the 'no-ascii-eol property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
54 (coding-system-property coding-system 'no-ascii-eol))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
55
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
56 (defun coding-system-no-ascii-cntl (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
57 "Return the 'no-ascii-cntl property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
58 (coding-system-property coding-system 'no-ascii-cntl))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
59
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
60 (defun coding-system-seven (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
61 "Return the 'seven property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
62 (coding-system-property coding-system 'seven))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
63
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
64 (defun coding-system-lock-shift (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
65 "Return the 'lock-shift property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
66 (coding-system-property coding-system 'lock-shift))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
67
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
68 ;;(defun coding-system-use-japanese-jisx0201-roman (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
69 ;; "Return the 'use-japanese-jisx0201-roman property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
70 ;; (coding-system-property coding-system 'use-japanese-jisx0201-roman))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
71
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
72 ;;(defun coding-system-use-japanese-jisx0208-1978 (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
73 ;; "Return the 'use-japanese-jisx0208-1978 property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
74 ;; (coding-system-property coding-system 'use-japanese-jisx0208-1978))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
75
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
76 (defun coding-system-no-iso6429 (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
77 "Return the 'no-iso6429 property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
78 (coding-system-property coding-system 'no-iso6429))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
79
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
80 (defun coding-system-ccl-encode (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
81 "Return the CCL 'encode property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
82 (coding-system-property coding-system 'encode))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
83
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
84 (defun coding-system-ccl-decode (coding-system)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
85 "Return the CCL 'decode property of CODING-SYSTEM."
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
86 (coding-system-property coding-system 'decode))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
87
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
88 (defun coding-system-iso2022-charset (coding-system register)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
89 "Return the charset initially designated to REGISTER in CODING-SYSTEM.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
90 The allowable range of REGISTER is 0 through 3."
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
91 (if (or (< register 0) (> register 3))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
92 (error 'args-out-of-range "coding-system-iso2022-charset REGISTER" register 0 3))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
93 (coding-system-property coding-system (nth register '(charset-g0
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
94 charset-g1
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
95 charset-g2
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
96 charset-g3))))
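;; Usage sketch (not part of the original file), assuming `ctext' as
;; defined below, whose property list gives ascii for charset-g0 and
;; latin-iso8859-1 for charset-g1:
;;
;;   (coding-system-iso2022-charset 'ctext 1) ; looks up the 'charset-g1 property
;;   (coding-system-iso2022-charset 'ctext 4) ; signals an error, REGISTER > 3
;;
;; REGISTER is used directly as an index into the list
;; (charset-g0 charset-g1 charset-g2 charset-g3), hence the range check.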
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
97
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
98
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
99 ;;;; Definitions of predefined coding systems
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
100
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
101 (make-coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
102 'ctext 'iso2022
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
103 "Compound Text"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
104 '(charset-g0 ascii
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
105 charset-g1 latin-iso8859-1
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
106 eol-type nil
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
107 mnemonic "CText"))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
108
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
109 (make-coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
110 'iso-8859-1 'no-conversion
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
111 "ISO-8859-1 (Latin-1)"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
112 '(eol-type nil mnemonic "Noconv"))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
113
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
114 (make-coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
115 'iso-2022-8bit-ss2 'iso2022
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
116 "ISO-2022 8-bit w/SS2"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
117 '(charset-g0 ascii
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
118 charset-g1 latin-iso8859-1
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
119 charset-g2 t ;; unspecified but can be used later.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
120 short t
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
121 mnemonic "ISO8/SS"
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
122 documentation "ISO 2022 based 8-bit encoding using SS2 for 96-charset"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
123 ))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
124
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
125 (make-coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
126 'iso-2022-7bit-ss2 'iso2022
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
127 "ISO-2022 7-bit w/SS2"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
128 '(charset-g0 ascii
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
129 charset-g2 t ;; unspecified but can be used later.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
130 seven t
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
131 short t
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
132 mnemonic "ISO7/SS"
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
133 documentation "ISO 2022 based 7-bit encoding using SS2 for 96-charset"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
134 eol-type nil))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
135
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
136 ;; (copy-coding-system 'iso-2022-7bit-ss2 'iso-2022-jp-2)
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
137 (make-coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
138 'iso-2022-jp-2 'iso2022
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
139 "ISO-2022-JP-2"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
140 '(charset-g0 ascii
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
141 charset-g2 t ;; unspecified but can be used later.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
142 seven t
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
143 short t
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
144 mnemonic "ISO7/SS"
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
145 eol-type nil))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
146
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
147 (make-coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
148 'iso-2022-7bit 'iso2022
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
149 "ISO 2022 7-bit"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
150 '(charset-g0 ascii
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
151 seven t
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
152 short t
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
153 mnemonic "ISO7"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
154 documentation "ISO-2022-based 7-bit encoding using only G0"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
155 ))
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
156
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
157 ;; compatibility for old XEmacsen
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
158 (define-coding-system-alias 'iso-2022-7 'iso-2022-7bit)
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
159
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
160 (make-coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
161 'iso-2022-8 'iso2022
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
162 "ISO-2022 8-bit"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
163 '(charset-g0 ascii
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
164 charset-g1 latin-iso8859-1
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
165 short t
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
166 mnemonic "ISO8"
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
167 documentation "ISO-2022 eight-bit coding system. No single-shift or locking-shift."
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
168 ))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
169
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
170 (make-coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
171 'escape-quoted 'iso2022
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
172 "Escape-Quoted (for .ELC files)"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
173 '(charset-g0 ascii
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
174 charset-g1 latin-iso8859-1
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
175 eol-type lf
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
176 escape-quoted t
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
177 mnemonic "ESC/Quot"
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
178 documentation "ISO-2022 eight-bit coding system with escape quoting; used for .ELC files."
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
179 ))
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
180
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
181 (make-coding-system
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
182 'iso-2022-lock 'iso2022
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
183 "ISO-2022 w/locking-shift"
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
184 '(charset-g0 ascii
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
185 charset-g1 t ;; unspecified but can be used later.
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
186 seven t
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
187 lock-shift t
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
188 mnemonic "ISO7/Lock"
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 528
diff changeset
189 documentation "ISO-2022 coding system using Locking-Shift for 96-charset."
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
190 ))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
191
333
4f79e16b1112 Import from CVS: tag r21-0-64
cvs
parents:
diff changeset
192
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
193 ;; This is used by people writing CCL programs, but is called at runtime.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
194 (defun define-translation-hash-table (symbol table)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
195 "Define SYMBOL as the name of the hash translation TABLE for use in CCL.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
196
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
197 Analogous to `define-translation-table', but updates
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
198 `translation-hash-table-vector' and the table is for use in the CCL
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
199 `lookup-integer' and `lookup-character' functions."
4145
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
200 (check-argument-type #'symbolp symbol)
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
201 (check-argument-type #'hash-table-p table)
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
202 (let ((len (length translation-hash-table-vector))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
203 (id 0)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
204 done)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
205 (put symbol 'translation-hash-table table)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
206 (while (not done)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
207 (if (>= id len)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
208 (setq translation-hash-table-vector
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
209 (vconcat translation-hash-table-vector [nil])))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
210 (let ((slot (aref translation-hash-table-vector id)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
211 (if (or (not slot)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
212 (eq (car slot) symbol))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
213 (progn
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
214 (aset translation-hash-table-vector id (cons symbol table))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
215 (setq done t))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
216 (setq id (1+ id)))))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
217 (put symbol 'translation-hash-table-id id)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
218 id))
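;; Usage sketch (not part of the original file).  The symbol
;; `my-example-encode-table' and its single entry are hypothetical and
;; chosen only for illustration:
;;
;;   (let ((table (make-hash-table :test 'eql)))
;;     (puthash #x20AC #xA4 table)  ; integer key -> integer value
;;     (define-translation-hash-table 'my-example-encode-table table))
;;
;; The call returns the index chosen in `translation-hash-table-vector'
;; and stores it on the symbol as the 'translation-hash-table-id
;; property.  Because the scan stops at the first slot that is empty or
;; already holds SYMBOL, redefining a symbol normally reuses its slot.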
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
219
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
220 (defvar make-8-bit-private-use-start (decode-char 'ucs #xE000)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
221 "Start of a 256-code private use area for `make-8-bit-coding-system'.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
222
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
223 This is used to ensure that distinct octets on disk for a given coding
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
224 system map to distinct XEmacs characters, preventing spurious changes when
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
225 a file is read, not changed, and then written.")
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
226
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
227 (defun make-8-bit-generate-helper (decode-table encode-table
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
228 encode-failure-octet)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
229 "Helper function for `make-8-bit-generate-encode-program', which see.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
230
4145
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
231 Deals with the case where ASCII and another character set can both be
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
232 encoded unambiguously and completely into the coding-system; if this is so,
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
233 returns a list corresponding to such a ccl-program. If not, it returns nil."
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
234 (let ((tentative-encode-program-parts
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
235 (eval-when-compile
4295
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
236 (let* ((vec-len 128)
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
237 (compiled
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
238 (append
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
239 (ccl-compile
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
240 `(1
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
241 (loop
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
242 (read-multibyte-character r0 r1)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
243 (if (r0 == ,(charset-id 'ascii))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
244 (write r1)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
245 ((if (r0 == #xABAB)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
246 ;; #xBFFE is a sentinel in the compiled
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
247 ;; program.
4295
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
248 ;; #xBFFE is a sentinel in the compiled
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
249 ;; program.
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
250 ((r0 = r1 & #x7F)
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
251 (write r0 ,(make-vector vec-len #xBFFE)))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
252 ((mule-to-unicode r0 r1)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
253 (if (r0 == #xFFFD)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
254 (write #xBEEF)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
255 ((lookup-integer encode-table-sym r0 r3)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
256 (if r7
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
257 (write-multibyte-character r0 r3)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
258 (write #xBEEF))))))))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
259 (repeat)))) nil))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
260 (first-part compiled)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
261 (last-part
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
                   (member-if-not (lambda (entr) (eq #xBFFE entr))
                                  (member-if
                                   (lambda (entr) (eq #xBFFE entr))
                                   first-part))))
             (while compiled
               (when (eq #xBFFE (cadr compiled))
                 (assert (= vec-len (search '(#xBFFE) (cdr compiled)
                                            :test #'/=)) nil
                         "Strange ccl vector length")
                 (setcdr compiled nil))
               (setq compiled (cdr compiled)))
             ;; Is the generated code as we expect it to be?
             (assert (and (memq #xABAB first-part)
                          (memq #xBEEF14 last-part))
                     nil
                     "This code assumes that the constant #xBEEF is #xBEEF14 in \
compiled CCL code,\nand that the constant #xABAB is #xABAB. If that is
not the case, and it appears not to be--that's why you're getting this
message--it will not work. ")
             (list first-part last-part vec-len))))
        (charset-lower -1)
        (charset-upper -1)
        worth-trying known-charsets encode-program
        other-charset-vector ucs args-out-of-range)

    (loop for char across decode-table
      do (pushnew (char-charset char) known-charsets))
    (setq known-charsets (delq 'ascii known-charsets))

    (loop for known-charset in known-charsets
      do
      ;; This is not possible for two-dimensional charsets.
      (when (eq 1 (charset-dimension known-charset))
        (setq args-out-of-range t)
        (if (eq 'control-1 known-charset)
            (setq charset-lower 0
                  charset-upper 31)
          ;; There should be a nicer way to get the limits here.
          (condition-case args-out-of-range
              (make-char known-charset #x100)
            (args-out-of-range
             (setq charset-lower (third args-out-of-range)
                   charset-upper (fourth args-out-of-range)))))
        (loop
          for i from charset-lower to charset-upper
          always (and (setq ucs
                            (encode-char (make-char known-charset i) 'ucs))
                      (gethash ucs encode-table))
          finally (setq worth-trying known-charset))

        ;; Only try this for one charset at a time: the first one found.
        (when worth-trying (return))

        ;; Okay, this charset is not worth trying; try the next.
        (setq charset-lower -1
              charset-upper -1
              worth-trying nil)))

    (when worth-trying
      (setq other-charset-vector
            (make-vector (third tentative-encode-program-parts)
                         encode-failure-octet))
      (loop for i from charset-lower to charset-upper
        do (aset other-charset-vector i
                 (gethash (encode-char (make-char worth-trying i)
                                       'ucs) encode-table)))
      (setq encode-program
            (nsublis
             (list (cons #xABAB (charset-id worth-trying)))
             (nconc
              (copy-list (first
                          tentative-encode-program-parts))
              (append other-charset-vector nil)
              (copy-tree (second
                          tentative-encode-program-parts))))))
    encode-program))

(defun make-8-bit-generate-encode-program (decode-table encode-table
                                           encode-failure-octet)
  "Generate a CCL program to encode an 8-bit fixed-width charset.

DECODE-TABLE must have 256 non-cons entries, and will be regarded as
describing a map from the octet corresponding to an offset in the
table to that entry in the table. ENCODE-TABLE is a hash table
mapping from Unicode values to characters in the range [0,255].
ENCODE-FAILURE-OCTET describes an integer between 0 and 255
\(inclusive) to write in the event that a character cannot be encoded. "
  (check-argument-type #'vectorp decode-table)
  (check-argument-range (length decode-table) #x100 #x100)
  (check-argument-type #'hash-table-p encode-table)
  (check-argument-type #'integerp encode-failure-octet)
  (check-argument-range encode-failure-octet #x00 #xFF)
  (let ((encode-program nil)
        (general-encode-program
         (eval-when-compile
           (let ((prog (append
                        (ccl-compile
                         `(1
                           (loop
                             (read-multibyte-character r0 r1)
                             (mule-to-unicode r0 r1)
                             (if (r0 == #xFFFD)
                                 (write #xBEEF)
                               ((lookup-integer encode-table-sym r0 r3)
                                (if r7
                                    (write-multibyte-character r0 r3)
                                  (write #xBEEF))))
                             (repeat)))) nil)))
             (assert (memq #xBEEF14 prog)
                     nil
                     "This code assumes that the constant #xBEEF is #xBEEF14 \
in compiled CCL code.\nIf that is not the case, and it appears not to
be--that's why you're getting this message--it will not work. ")
             prog)))
        (encode-program-with-ascii-optimisation
         (eval-when-compile
           (let ((prog (append
                        (ccl-compile
                         `(1
                           (loop
                             (read-multibyte-character r0 r1)
                             (if (r0 == ,(charset-id 'ascii))
                                 (write r1)
                               ((mule-to-unicode r0 r1)
                                (if (r0 == #xFFFD)
                                    (write #xBEEF)
                                  ((lookup-integer encode-table-sym r0 r3)
                                   (if r7
                                       (write-multibyte-character r0 r3)
                                     (write #xBEEF))))))
                             (repeat)))) nil)))
             (assert (memq #xBEEF14 prog)
                     nil
                     "This code assumes that the constant #xBEEF is #xBEEF14 \
in compiled CCL code.\nIf that is not the case, and it appears not to
be--that's why you're getting this message--it will not work. ")
             prog)))
        (ascii-encodes-as-itself nil))

    ;; Is this coding system ASCII-compatible? If so, we can avoid the hash
    ;; table lookup for those characters.
    (loop
      for i from #x00 to #x7f
      always (eq (int-to-char i) (gethash i encode-table))
      finally (setq ascii-encodes-as-itself t))

    ;; Note that this logic handles EBCDIC badly. For example, CP037,
    ;; MIME name ebcdic-na, has the entire repertoire of ASCII and
    ;; Latin 1, and thus a more efficient CCL encode program would check
    ;; for those character sets and use tables. But for now, we do a
    ;; hash table lookup for every character.
    (if (null ascii-encodes-as-itself)
        ;; General encode program. Pros: general and correct. Cons:
        ;; slow, since a hash table lookup and a mule-unicode conversion
        ;; are done for every character encoded.
        (setq encode-program general-encode-program)
      (setq encode-program
            ;; Encode program with ascii-ascii mapping (based on a
            ;; character's mule character set), and one other mule
            ;; character set using table-based encoding, other
            ;; character sets using hash table lookups.
            ;; make-8-bit-generate-helper only returns such a
            ;; program if some non-ASCII charset with characters in
            ;; decode-table is entirely covered by encode-table.
            (make-8-bit-generate-helper decode-table encode-table
                                        encode-failure-octet))
      (unless encode-program
        ;; If make-8-bit-generate-helper returned nil, but ASCII
        ;; still encodes as itself, do one-to-one mapping for ASCII,
        ;; and a hash table lookup for everything else.
        (setq encode-program encode-program-with-ascii-optimisation)))

    (setq encode-program
          (nsublis
           (list (cons #xBEEF14
                       (logior (lsh encode-failure-octet 8)
                               #x14)))
           (copy-tree encode-program)))
    encode-program))
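
;; Illustrative sketch, not part of the original file: how the final
;; `nsublis' call above swaps the #xBEEF14 placeholder for a value built
;; from ENCODE-FAILURE-OCTET.  The sample octet #x3F (ASCII `?') and the
;; toy three-element list standing in for a compiled program are
;; assumptions made for this example only.
;;
;;   (nsublis (list (cons #xBEEF14 (logior (lsh #x3F 8) #x14)))
;;            (list 1 #xBEEF14 2))
;;   ;; => (1 16148 2); 16148 is #x3F14, that is, the octet shifted left
;;   ;;    by eight bits with the placeholder's low byte #x14 kept.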

(defun make-8-bit-create-decode-encode-tables (unicode-map)
  "Return a list \(DECODE-TABLE ENCODE-TABLE) given UNICODE-MAP.
UNICODE-MAP should be an alist mapping from integer octet values to
characters with UCS code points; DECODE-TABLE will be a 256-element
vector, and ENCODE-TABLE will be a hash table mapping from 256 numbers
to 256 distinct characters. "
  (check-argument-type #'listp unicode-map)
  (let ((decode-table (make-vector 256 nil))
        (encode-table (make-hash-table :size 256))
        (private-use-start (encode-char make-8-bit-private-use-start 'ucs))
        desired-ucs)

    (loop for (external internal)
      in unicode-map
      do
      (aset decode-table external internal)
      (assert (not (eq (encode-char internal 'ucs) -1))
              nil
              "Looks like you're calling make-8-bit-coding-system in a \
dumped file, \nand you're either not providing a literal UNICODE-MAP
or PROPS. Don't do that; make-8-bit-coding-system relies on sensible
Unicode mappings being available, which they are at compile time for
dumped files (but this requires the mentioned literals), but not, for
most of them, at run time. ")

      (puthash (encode-char internal 'ucs)
               ;; This is semantically an integer, but Dave Love's design
               ;; for lookup-integer in CCL means we need to store it as a
               ;; character.
               (int-to-char external)
               encode-table))

    ;; Now, go through the decode table looking at the characters that
    ;; remain nil. If the XEmacs character with that integer is already in
    ;; the encode table, map the on-disk octet to a Unicode private use
    ;; character. Otherwise map the on-disk octet to the XEmacs character
    ;; with that numeric value, to make it clearer what it is.
    (dotimes (i 256)
      (when (null (aref decode-table i))
        ;; Find a free code point.
        (setq desired-ucs i)
        (while (gethash desired-ucs encode-table)
          ;; In the normal case, the code point chosen will be U+E0XY, where
          ;; XY is the hexadecimal octet on disk. In pathological cases
          ;; it'll be something else.
          (setq desired-ucs (+ private-use-start desired-ucs)
                private-use-start (+ private-use-start 1)))
        (puthash desired-ucs (int-to-char i) encode-table)
        (setq desired-ucs (if (> desired-ucs #xFF)
                              (decode-char 'ucs desired-ucs)
                            ;; So we get Latin-1 when run at dump time,
                            ;; instead of JIT-allocated characters.
                            (int-to-char desired-ucs)))
        (aset decode-table i desired-ucs)))
    (values decode-table encode-table)))
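
;; Usage sketch, not part of the original file.  It assumes the return
;; value can be taken apart with `first' and `second', as the docstring's
;; "list" suggests, and it uses a made-up single-entry map (octet #xA4 to
;; U+20AC).  Each map entry is written as a two-element list, since the
;; loop above destructures entries as (EXTERNAL INTERNAL).
;;
;;   (let* ((tables (make-8-bit-create-decode-encode-tables
;;                   (list (list #xA4 (decode-char 'ucs #x20AC)))))
;;          (decode-table (first tables))
;;          (encode-table (second tables)))
;;     (list (length decode-table)        ; 256, from `make-vector'
;;           (aref decode-table #xA4)     ; the character supplied for #xA4
;;           (hash-table-count encode-table)))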
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
498
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
499 (defun make-8-bit-generate-decode-program (decode-table)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
500 "Given DECODE-TABLE, generate a CCL program to decode an 8-bit charset.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
501 DECODE-TABLE must have 256 non-cons entries, and will be regarded as
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
502 describing a map from the octet corresponding to an offset in the
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
503 table to the that entry in the table. "
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
504 (check-argument-type #'vectorp decode-table)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
505 (check-argument-range (length decode-table) #x100 #x100)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
506 (let ((decode-program-parts
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
507 (eval-when-compile
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
508 (let* ((compiled
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
509 (append
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
510 (ccl-compile
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
511 `(3
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
512 ((read r0)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
513 (loop
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
514 (write-read-repeat r0 ,(make-vector
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
515 256 'sentinel)))))) nil))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
516 (first-part compiled)
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
517 (last-part
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
518 (member-if-not #'symbolp
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
519 (member-if-not #'integerp first-part))))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
520 ;; Chop off the sentinel sentinel sentinel [..] part.
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
521 (while compiled
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
522 (if (symbolp (cadr compiled))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
523 (setcdr compiled nil))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
524 (setq compiled (cdr compiled)))
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
525 (list first-part last-part)))))
526 (nconc
527 ;; copy-list needed, because the structure of the literal provided
528 ;; by our eval-when-compile hangs around.
529 (copy-list (first decode-program-parts))
530 (append decode-table nil)
531 (second decode-program-parts))))
532
4145
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
533 (defun make-8-bit-choose-category (decode-table)
534 "Given DECODE-TABLE, return an appropriate coding category.
535 DECODE-TABLE is a 256-entry vector describing the mapping from octets on
536 disk to XEmacs characters for some fixed-width 8-bit coding system. "
537 (check-argument-type #'vectorp decode-table)
538 (check-argument-range (length decode-table) #x100 #x100)
539 (block category
540 (loop
541 for i from #x80 to #xBF
542 do (unless (= i (aref decode-table i))
543 (return-from category 'no-conversion)))
544 'iso-8-1))
545
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
546 ;;;###autoload
547 (defun make-8-bit-coding-system (name unicode-map &optional description props)
548 "Make and return a fixed-width 8-bit CCL coding system named NAME.
549 NAME must be a symbol, and UNICODE-MAP a list.
550
551 UNICODE-MAP is a plist describing a map from octets in the coding
552 system NAME (as integers) to XEmacs characters. Those XEmacs
553 characters will be used explicitly on decoding, but for encoding (most
554 relevantly, on writing to disk) XEmacs characters that map to the same
555 Unicode code point will be unified. This means that the ISO-8859-?
556 characters that map to the same Unicode code point will not be
557 distinct when written to disk, which is normally what is intended; it
558 also means that East Asian Han characters from different XEmacs
559 character sets will not be distinct when written to disk, which is
560 less often what is intended.
561
562 Any octets not mapped will be decoded into the ISO 8859-1 characters with
563 the corresponding numeric value; unless another octet maps to that
564 character, in which case the Unicode private use area will be used. This
565 avoids spurious changes to files on disk when they contain octets that would
566 be otherwise remapped to the canonical values for the corresponding
567 characters in the coding system.
568
569 DESCRIPTION and PROPS are as in `make-coding-system', which see. This
570 function also accepts two additional (optional) properties in PROPS:
571 `aliases', giving a list of aliases to be initialized for this
572 coding-system, and `encode-failure-octet', an integer between 0 and 255 to
573 write in place of XEmacs characters that cannot be encoded, defaulting to
574 the code for tilde `~'. "
575 (check-argument-type #'symbolp name)
576 (check-argument-type #'listp unicode-map)
577 (check-argument-type #'stringp
578 (or description
579 (setq description
580 (format "Coding system used for %s." name))))
581 (check-valid-plist props)
582 (let ((encode-failure-octet (or (plist-get props 'encode-failure-octet)
583 (char-to-int ?~)))
584 (aliases (plist-get props 'aliases))
585 (hash-table-sym (gentemp (format "%s-encode-table" name)))
586 encode-program decode-program result decode-table encode-table)
587
588 ;; Some more sanity checking.
589 (check-argument-range encode-failure-octet 0 #xFF)
590 (check-argument-type #'listp aliases)
591
592 ;; Don't pass on our extra data to make-coding-system.
593 (setq props (plist-remprop props 'encode-failure-octet)
594 props (plist-remprop props 'aliases))
595
596 (multiple-value-setq
597 (decode-table encode-table)
598 (make-8-bit-create-decode-encode-tables unicode-map))
599
600 ;; Register the encode-table.
601 (define-translation-hash-table hash-table-sym encode-table)
602
603 ;; Generate the programs.
604 (setq decode-program (make-8-bit-generate-decode-program decode-table)
605 encode-program (make-8-bit-generate-encode-program
606 decode-table encode-table encode-failure-octet))
607 (unless (vectorp encode-program)
608 (setq encode-program
609 (apply #'vector
610 (nsublis (list (cons 'encode-table-sym hash-table-sym))
611 (copy-tree encode-program)))))
612 (unless (vectorp decode-program)
613 (setq decode-program
614 (apply #'vector decode-program)))
615
616 ;; And now generate the actual coding system.
617 (setq result
618 (make-coding-system
619 name 'ccl
620 description
621 (plist-put (plist-put props 'decode decode-program)
622 'encode encode-program)))
4295
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
623 (coding-system-put name '8-bit-fixed t)
4145
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
624 (coding-system-put name 'category
625 (make-8-bit-choose-category decode-table))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
626 (loop for alias in aliases
627 do (define-coding-system-alias alias name))
628 result))
629
630 (define-compiler-macro make-8-bit-coding-system (&whole form name unicode-map
631 &optional description props)
632
633 ;; We provide the compiler macro (= macro that is expanded only on
634 ;; compilation, and that can punt to a runtime version of the
635 ;; associated function if necessary) not for reasons of speed, though
636 ;; it does speed up things at runtime a little, but because the
637 ;; Unicode mappings are available at compile time in the dumped
638 ;; files, but they are not available at run time for the vast
639 ;; majority of them.
640
641 (if (not (and (and (consp name) (eq (car name) 'quote))
642 (and (consp unicode-map) (eq (car unicode-map) 'quote))
643 (and (or (and (consp props) (eq (car props) 'quote))
644 (null props)))))
645 ;; The call does not use literals; do it at runtime.
646 form
647 (setq name (cadr name)
648 unicode-map (cadr unicode-map)
649 props (if props (cadr props)))
650 (let ((encode-failure-octet
651 (or (plist-get props 'encode-failure-octet) (char-to-int ?~)))
652 (aliases (plist-get props 'aliases))
653 encode-program decode-program
4103
b4f4e0cc90f1 [xemacs-hg @ 2007-08-07 23:08:47 by aidan]
aidan
parents: 4090
diff changeset
654 decode-table encode-table)
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
655
656 ;; Some sanity checking.
657 (check-argument-range encode-failure-octet 0 #xFF)
658 (check-argument-type #'listp aliases)
659
660 ;; Don't pass on our extra data to make-coding-system.
661 (setq props (plist-remprop props 'encode-failure-octet)
662 props (plist-remprop props 'aliases))
663
664 ;; Work out encode-table and decode-table.
665 (multiple-value-setq
666 (decode-table encode-table)
667 (make-8-bit-create-decode-encode-tables unicode-map))
668
669 ;; Generate the decode and encode programs.
670 (setq decode-program (make-8-bit-generate-decode-program decode-table)
671 encode-program (make-8-bit-generate-encode-program
672 decode-table encode-table encode-failure-octet))
673
674 ;; And return the generated code.
675 `(let ((encode-table-sym (gentemp (format "%s-encode-table" ',name)))
4103
b4f4e0cc90f1 [xemacs-hg @ 2007-08-07 23:08:47 by aidan]
aidan
parents: 4090
diff changeset
676 ;; The case-fold-search bind shouldn't be necessary. If I take
677 ;; it out, though, I get:
678 ;;
679 ;; (invalid-read-syntax "Multiply defined symbol label" 1)
680 ;;
681 ;; when the file is byte compiled.
682 (case-fold-search t))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
683 (define-translation-hash-table encode-table-sym ,encode-table)
4103
b4f4e0cc90f1 [xemacs-hg @ 2007-08-07 23:08:47 by aidan]
aidan
parents: 4090
diff changeset
684 (make-coding-system
685 ',name 'ccl ,description
686 (plist-put (plist-put ',props 'decode
687 ,(apply #'vector decode-program))
688 'encode
689 (apply #'vector
690 (nsublis
691 (list (cons
692 'encode-table-sym
693 (symbol-value 'encode-table-sym)))
694 ',encode-program))))
4295
eded49463f9a [xemacs-hg @ 2007-11-29 13:37:51 by aidan]
aidan
parents: 4145
diff changeset
695 (coding-system-put ',name '8-bit-fixed t)
4145
edb00a8b4eff [xemacs-hg @ 2007-08-26 20:00:29 by aidan]
aidan
parents: 4103
diff changeset
696 (coding-system-put ',name 'category ',
697 (make-8-bit-choose-category decode-table))
4072
aa28d959af41 [xemacs-hg @ 2007-07-22 22:03:29 by aidan]
aidan
parents: 771
diff changeset
698 ,(macroexpand `(loop for alias in ',aliases
699 do (define-coding-system-alias alias
700 ',name)))
4103
b4f4e0cc90f1 [xemacs-hg @ 2007-08-07 23:08:47 by aidan]
aidan
parents: 4090
diff changeset
701 (find-coding-system ',name)))))