annotate lisp/code-init.el @ 4568:1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
lisp/ChangeLog addition:
2008-12-28 Aidan Kehoe <kehoea@parhasard.net>
* coding.el (default-query-coding-region):
Declare using defun*, so we can #'return-from to it on
encountering a safe-charsets value of t. Comment out a few
debug messages.
(query-coding-region):
Correct the docstring, it deals with a region, not a string.
(unencodable-char-position):
Correct the implementation for non-nil COUNT, special-case a zero
value for count, treat it as one. Don't rely on dynamic scope when
calling the main lambda.
* unicode.el (unicode-query-coding-region):
Comment out some debug messages here.
* mule/mule-coding.el (8-bit-fixed-query-coding-region):
Comment out some debug messages here.
* code-init.el (raw-text):
Add a safe-charsets property to this coding system.
* mule/korean.el (iso-2022-int-1):
* mule/korean.el (euc-kr):
* mule/korean.el (iso-2022-kr):
Add safe-charsets properties for these coding systems.
* mule/japanese.el (iso-2022-jp):
* mule/japanese.el (jis7):
* mule/japanese.el (jis8):
* mule/japanese.el (shift-jis):
* mule/japanese.el (iso-2022-jp-1978-irv):
* mule/japanese.el (euc-jp):
Add safe-charsets properties for all these coding systems.
* mule/iso-with-esc.el:
Add safe-charsets properties to all the coding systems in
here. Comment on the downside of a safe-charsets value of t for
iso-latin-1-with-esc.
* mule/hebrew.el (ctext-hebrew):
Add a safe-charsets property for this coding system.
* mule/devanagari.el (in-is13194-devanagari):
Add a safe-charsets property for this coding system.
* mule/chinese.el (cn-gb-2312):
* mule/chinese.el (hz-gb-2312):
* mule/chinese.el (big5):
Add safe-charsets properties for these coding systems.
* mule/latin.el (iso-8859-14):
Add an implementation for this, using #'make-8-bit-coding-system.
* mule/mule-coding.el (ctext):
* mule/mule-coding.el (iso-2022-8bit-ss2):
* mule/mule-coding.el (iso-2022-7bit-ss2):
* mule/mule-coding.el (iso-2022-jp-2):
* mule/mule-coding.el (iso-2022-7bit):
* mule/mule-coding.el (iso-2022-8):
* mule/mule-coding.el (escape-quoted):
* mule/mule-coding.el (iso-2022-lock):
Add safe-charsets properties for all these coding systems.
src/ChangeLog addition:
2008-12-28 Aidan Kehoe <kehoea@parhasard.net>
* file-coding.c (Fmake_coding_system):
Document our use of the safe-chars and safe-charsets properties,
and the differences compared to GNU.
(make_coding_system_1): Don't drop the safe-chars and
safe-charsets properties.
(Fcoding_system_property): Return the safe-chars and safe-charsets
properties when asked for them.
* file-coding.h (CODING_SYSTEM_SAFE_CHARSETS):
* coding-system-slots.h:
Make the safe-chars and safe-charsets slots available in these
headers.
tests/ChangeLog addition:
2008-12-28 Aidan Kehoe <kehoea@parhasard.net>
* automated/query-coding-tests.el:
New file, testing the functionality of #'query-coding-region and
#'query-coding-string.
author | Aidan Kehoe <kehoea@parhasard.net> |
---|---|
date | Sun, 28 Dec 2008 14:46:24 +0000 |
parents | 14f65fa1e69e |
children | d2ec55325515 |
rev | line source |
---|---|
771 | 1 ;;; code-init.el --- Handle coding system default values |
2 | |
1318 | 3 ;; Copyright (C) 2001, 2002, 2003 Ben Wing. |
771 | 4 |
5 ;; This file is part of XEmacs. | |
6 | |
7 ;; XEmacs is free software; you can redistribute it and/or modify it | |
8 ;; under the terms of the GNU General Public License as published by | |
9 ;; the Free Software Foundation; either version 2, or (at your option) | |
10 ;; any later version. | |
11 | |
12 ;; XEmacs is distributed in the hope that it will be useful, but | |
13 ;; WITHOUT ANY WARRANTY; without even the implied warranty of | |
14 ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU | |
15 ;; General Public License for more details. | |
16 | |
17 ;; You should have received a copy of the GNU General Public License | |
18 ;; along with XEmacs; see the file COPYING. If not, write to the | |
19 ;; Free Software Foundation, Inc., 59 Temple Place - Suite 330, | |
20 ;; Boston, MA 02111-1307, USA. | |
21 | |
22 ;;; Commentary: | |
23 | |
24 ;; Placed in a separate file so it can be loaded after the various | |
25 ;; coding systems have been created, because we'll be using them at | |
26 ;; load time. | |
27 | |
2297 | 28 ;; #### Issues (this discussion probably belongs elsewhere) |
29 ;; 1. "Big" characters are unrepresentable. Should give error, warning, | |
30 ;; not just substitute "~". | |
31 ;; 2. 21.4 compatibility? | |
32 ;; 3. make-char: non-mule barfs on non-iso8859-1. | |
33 | |
771 | 34 ;;; Code: |
35 | |
36 (defcustom eol-detection-enabled-p (or (featurep 'mule) | |
37 (memq system-type '(windows-nt | |
38 cygwin32)) | |
39 (featurep 'unix-default-eol-detection)) | |
40 "True if XEmacs automatically detects the EOL type when reading files. | |
41 Normally, this is always the case on Windows or when international (Mule) | |
42 support is compiled into this XEmacs. Otherwise, it is currently off by | |
43 default, but this may change. Don't set this; nothing will happen. Instead, | |
44 use the Options menu or `set-eol-detection'." | |
45 :group 'encoding | |
46 :type 'boolean | |
47 ;; upon initialization, we don't want the whole business of | |
48 ;; set-eol-detection to be called. We will init everything appropriately | |
49 ;; later in the same file, when reset-language-environment is called. | |
50 :initialize #'(lambda (var val) | |
1698 | 51 (setq eol-detection-enabled-p (eval val))) |
771 | 52 :set #'(lambda (var val) |
53 (set-eol-detection val) | |
54 (setq eol-detection-enabled-p val))) | |
55 | |
56 (defun set-eol-detection (flag) | |
57 "Enable (if FLAG is non-nil) or disable automatic EOL detection of files. | |
58 EOL detection is enabled by default on Windows or when international (Mule) | |
59 support is compiled into this XEmacs. Otherwise, it is currently off by | |
60 default, but this may change. NOTE: You *REALLY* should not turn off EOL | |
61 detection on Windows! Your files will have lots of annoying ^M's in them | |
62 if you do this." | |
63 (dolist (x '(buffer-file-coding-system-for-read | |
64 keyboard | |
2508 | 65 default-process-coding-system-read |
66 no-conversion-coding-system-mapping)) | |
771 | 67 (set-coding-system-variable |
68 x (coding-system-change-eol-conversion (get-coding-system-variable x) | |
2508 | 69 (if flag nil 'lf))))) |
771 | 70 |
71 (defun coding-system-current-system-configuration () | |
72 (cond ((memq system-type '(windows-nt cygwin32)) | |
73 (if (featurep 'mule) 'windows-mule 'windows-no-mule)) | |
74 ((featurep 'mule) 'unix-mule) | |
75 (eol-detection-enabled-p 'unix-no-mule-eol-detection) | |
76 (t 'unix-no-mule-no-eol-detection))) | |
77 | |
1318 | 78 ;; NOTE NOTE NOTE: These values may get overridden when the language |
79 ;; environment is initialized (set-language-environment-coding-systems). | |
771 | 80 (defvar coding-system-variable-default-value-table |
1318 | 81 '((buffer-file-coding-system-for-read |
82 binary raw-text undecided raw-text undecided) | |
83 (default-buffer-file-coding-system | |
2297 | 84 ;; #### iso-2022-8 with no eol specified? can that be OK? |
1318 | 85 binary binary iso-2022-8 raw-text-dos mswindows-multibyte-dos) |
86 (native | |
87 binary binary binary raw-text-dos mswindows-multibyte-system-default-dos) | |
88 (keyboard | |
1471 | 89 binary raw-text undecided-unix raw-text undecided-unix) |
771 | 90 ;; the `terminal' coding system is used for output to stderr. such |
91 ;; streams do automatic lf->crlf encoding in the C library, so we need | |
92 ;; to not do the same translations ourselves. | |
1318 | 93 (terminal |
94 binary binary binary binary mswindows-multibyte-unix) | |
95 (default-process-coding-system-read | |
96 binary raw-text undecided raw-text undecided) | |
97 (default-process-coding-system-write | |
98 binary binary binary raw-text mswindows-multibyte-system-default) | |
99 (no-conversion-coding-system-mapping | |
100 binary raw-text raw-text raw-text mswindows-multibyte) | |
771 | 101 )) |
102 | |
103 (defvar coding-system-default-configuration-list | |
104 '(unix-no-mule-no-eol-detection | |
105 unix-no-mule-eol-detection | |
106 unix-mule | |
107 windows-no-mule | |
108 windows-mule)) | |
109 | |
110 (defvar coding-system-default-variable-list | |
111 '(buffer-file-coding-system-for-read | |
112 default-buffer-file-coding-system | |
113 native | |
114 keyboard | |
115 terminal | |
116 default-process-coding-system-read | |
2508 | 117 default-process-coding-system-write |
118 no-conversion-coding-system-mapping)) | |
771 | 119 |
120 (defun get-coding-system-variable (var) | |
121 "Return the value of a basic coding system variable. | |
122 This is intended as a uniform interface onto the coding system settings that | |
123 control how encoding detection and conversion work. See | |
124 `coding-system-variable-default-value' for a list of the possible values of | |
125 VAR." | |
126 (case var | |
127 (buffer-file-coding-system-for-read buffer-file-coding-system-for-read) | |
128 (default-buffer-file-coding-system | |
129 (default-value 'buffer-file-coding-system)) | |
130 (native (coding-system-aliasee 'native)) | |
131 (keyboard (coding-system-aliasee 'keyboard)) | |
132 (terminal (coding-system-aliasee 'terminal)) | |
133 (default-process-coding-system-read (car default-process-coding-system)) | |
134 (default-process-coding-system-write (cdr default-process-coding-system)) | |
2508 | 135 (no-conversion-coding-system-mapping |
136 (coding-category-system 'no-conversion)) | |
771 | 137 (t (error 'invalid-constant "Invalid coding system variable" var)))) |
138 | |
139 (defun set-coding-system-variable (var value) | |
140 "Set a basic coding system variable to VALUE. | |
141 This is intended as a uniform interface onto the coding system settings that | |
142 control how encoding detection and conversion work. See | |
143 `coding-system-variable-default-value' for a list of the possible values of | |
144 VAR." | |
145 (case var | |
146 (buffer-file-coding-system-for-read | |
147 (set-buffer-file-coding-system-for-read value)) | |
148 (default-buffer-file-coding-system | |
149 (set-default-buffer-file-coding-system value)) | |
150 (native (define-coding-system-alias 'native value)) | |
151 (keyboard (set-keyboard-coding-system value)) | |
152 (terminal (set-terminal-coding-system value)) | |
153 (default-process-coding-system-read | |
154 (setq default-process-coding-system | |
155 (cons value (cdr default-process-coding-system)))) | |
156 (default-process-coding-system-write | |
157 (setq default-process-coding-system | |
158 (cons (car default-process-coding-system) value))) | |
2508 | 159 (no-conversion-coding-system-mapping |
160 (set-coding-category-system 'no-conversion value)) | |
771 | 161 (t (error 'invalid-constant "Invalid coding system variable" var)))) |
162 | |
163 (defun coding-system-variable-default-value (var &optional config) | |
164 "Return the appropriate default value for a coding system variable. | |
165 | |
166 VAR specifies the variable, and CONFIG the configuration, defaulting | |
167 to the current system configuration (as returned by | |
168 `coding-system-current-system-configuration'). | |
169 | |
170 The table of default values looks like this: (see below for abbreviations) | |
171 | |
172 | |
1471 | 173                 Unix    Unix+EOL  Unix+Mule       MSW           MSW+Mule |
174 ----------------------------------------------------------------------------- | |
175 bfcs-for-read   binary  raw-text  undecided       raw-text      undecided | |
176 default bfcs    binary  binary    iso-2022-8      raw-text-dos  MSW-MB-dos | |
177 native          binary  binary    binary          raw-text-dos  MSW-MB-SD-dos | |
178 keyboard        binary  raw-text  undecided-unix  raw-text      undecided-unix | |
179 terminal        binary  binary    binary          binary        MSW-MB-unix | |
180 process-read    binary  raw-text  undecided       raw-text      undecided | |
181 process-write   binary  binary    binary          raw-text      MSW-MB-SD | |
182 no-conv-cs      binary  raw-text  raw-text        raw-text      MSW-MB | |
771 | 183 |
184 | |
185 VAR can be one of: (abbreviations in parens) | |
186 | |
187 `buffer-file-coding-system-for-read' (bfcs-for-read) | |
188 | |
189 Lisp variable of the same name; the default coding system used when | |
190 reading in a file, in the absence of more specific settings. (See | |
191 `insert-file-contents' for a description of exactly how a file's | |
192 coding system is determined when it's read in.) | |
193 | |
194 `default-buffer-file-coding-system' (default bfcs) | |
195 | |
196 Default value of `buffer-file-coding-system', the buffer-local | |
197 variable specifying a file's coding system to be used when it is | |
198 written out. Set using `set-default-buffer-file-coding-system' (or | |
199 the primitive `setq-default'). When a file is read in, | |
200 `buffer-file-coding-system' for that file is set from the coding | |
201 system used to read the file in; the default value applies to newly | |
202 created files. | |
203 | |
204 `native' (native) | |
205 | |
206 The coding system named `native'. Changed using | |
207 `define-coding-system-alias'. Used internally when passing | |
1318 | 208 text to or from system APIs, unless the particular |
771 | 209 API specifies another coding system. |
210 | |
211 `keyboard' (keyboard) | |
212 | |
213 #### fill in | |
214 | |
215 `terminal' (terminal) | |
216 | |
217 #### fill in | |
218 | |
219 `default-process-coding-system-read' (process-read) | |
220 | |
221 #### fill in | |
222 | |
223 `default-process-coding-system-write' (process-write) | |
224 | |
225 #### fill in | |
226 | |
227 `no-conversion-coding-system-mapping' (no-conv-cs) | |
228 | |
229 Coding system used when category `no-conversion' is detected. | |
230 | |
231 | |
232 CONFIG is one of: (abbreviations in parens) | |
233 | |
234 `unix-no-mule-no-eol-detection' (Unix) | |
235 | |
236 Unix, no Mule support, no automatic EOL detection. (Controlled by | |
237 `eol-detection-enabled-p', which is set by the command-line flag | |
238 -enable-eol-detection or the configure flag --with-default-eol-detection.) | |
239 | |
240 `unix-no-mule-eol-detection' (Unix+EOL) | |
241 | |
242 Unix, no Mule support, automatic EOL detection. | |
243 | |
244 `unix-mule' (Unix+Mule) | |
245 | |
246 Unix, Mule support. | |
247 | |
248 `windows-no-mule' (MSW) | |
249 | |
250 MS Windows or Cygwin, no Mule support. | |
251 | |
252 `windows-mule' (MSW+Mule) | |
253 | |
254 MS Windows or Cygwin, Mule support. | |
255 | |
256 | |
257 The following coding system abbreviations are also used in the table: | |
258 | |
259 MSW-MB = mswindows-multibyte | |
260 MSW-MB-SD = mswindows-multibyte-system-default | |
261 " | |
262 (setq config (or config (coding-system-current-system-configuration))) | |
263 (let ((defs (cdr (assq var coding-system-variable-default-value-table)))) | |
264 (or defs (error 'invalid-constant "Invalid coding system variable" var)) | |
265 (let ((pos (position config coding-system-default-configuration-list))) | |
266 (or pos (error 'invalid-constant "Invalid coding system configuration" | |
267 config)) | |
268 (nth pos defs)))) | |
269 | |
270 (defun reset-coding-system-defaults (&optional config) | |
271 "Reset all basic coding system variables to their default values. | |
272 See `coding-system-variable-default-value'." | |
273 (setq config (or config (coding-system-current-system-configuration))) | |
274 (mapcar #'(lambda (var) | |
275 (set-coding-system-variable | |
276 var (coding-system-variable-default-value var config))) | |
277 coding-system-default-variable-list)) | |
278 | |
279 (defun reset-coding-categories-to-default () | |
280 "Reset all coding categories (used for automatic detection) to their defaults. | |
281 | |
282 The order of priorities of coding categories and the coding system | |
283 bound to each category are as follows: | |
284 | |
285 coding category coding system | |
286 -------------------------------------------------- | |
287 utf-16-little-endian-bom utf-16-little-endian | |
288 utf-16-bom utf-16-bom | |
985 | 289 utf-8-bom utf-8-bom |
771 | 290 iso-7 iso-2022-7bit |
291 no-conversion raw-text | |
292 utf-8 utf-8 | |
293 iso-8-1 iso-8859-1 | |
294 iso-8-2 ctext (iso-8859-1 alias) | |
295 iso-8-designate ctext (iso-8859-1 alias) | |
296 iso-lock-shift iso-2022-lock | |
297 shift-jis shift-jis | |
298 big5 big5 | |
299 utf-16-little-endian utf-16-little-endian | |
300 utf-16 utf-16 | |
301 ucs-4 ucs-4 | |
302 " | |
303 ;; #### What a mess! This needs to be overhauled. | |
304 | |
305 ;; The old table (from FSF synch?) was not what we use (cf mule-coding.el), | |
306 ;; and as documented iso-8-designate is inconsistent with iso-2022-8bit-ss2. | |
307 ;; The order of priorities of coding categories and the coding system | |
308 ;; bound to each category are as follows: | |
309 ;; | |
310 ;; coding category coding system | |
311 ;; -------------------------------------------------- | |
312 ;; iso-8-2 iso-8859-1 | |
313 ;; iso-8-1 iso-8859-1 | |
314 ;; iso-7 iso-2022-7bit | |
315 ;; iso-lock-shift iso-2022-lock | |
316 ;; iso-8-designate iso-2022-8bit-ss2 | |
317 ;; no-conversion raw-text | |
318 ;; shift-jis shift_jis | |
319 ;; big5 big5 | |
320 ;; ucs-4 ---- | |
321 ;; utf-8 ---- | |
322 (when (featurep 'mule) | |
323 (set-coding-category-system 'iso-7 'iso-2022-7) | |
324 (set-coding-category-system 'iso-8-1 'iso-8859-1) | |
325 (set-coding-category-system 'iso-8-2 'ctext) | |
326 (set-coding-category-system 'iso-lock-shift 'iso-2022-lock) | |
327 (set-coding-category-system 'iso-8-designate 'ctext) | |
328 (if (find-coding-system 'shift-jis) | |
329 (set-coding-category-system 'shift-jis 'shift-jis)) | |
330 (if (find-coding-system 'big5) | |
331 (set-coding-category-system 'big5 'big5)) | |
332 ) | |
333 (set-coding-category-system | |
334 'no-conversion | |
335 (coding-system-variable-default-value 'no-conversion-coding-system-mapping)) | |
336 (set-coding-category-system 'ucs-4 'ucs-4) | |
337 (set-coding-category-system 'utf-8 'utf-8) | |
985 | 338 (set-coding-category-system 'utf-8-bom 'utf-8-bom) |
771 | 339 (set-coding-category-system 'utf-16-little-endian 'utf-16-little-endian) |
340 (set-coding-category-system 'utf-16 'utf-16) | |
341 (set-coding-category-system 'utf-16-little-endian-bom | |
342 'utf-16-little-endian-bom) | |
343 (set-coding-category-system 'utf-16-bom 'utf-16-bom) | |
344 (set-coding-priority-list | |
345 (if (featurep 'mule) | |
346 '(utf-16-little-endian-bom | |
347 utf-16-bom | |
985 | 348 utf-8-bom |
771 | 349 iso-7 |
350 no-conversion | |
351 utf-8 | |
352 iso-8-1 | |
353 iso-8-2 | |
354 iso-8-designate | |
355 iso-lock-shift | |
356 shift-jis | |
357 big5 | |
358 utf-16-little-endian | |
359 utf-16 | |
360 ucs-4) | |
361 '(utf-16-little-endian-bom | |
362 utf-16-bom | |
985 | 363 utf-8-bom |
771 | 364 no-conversion |
365 utf-8 | |
366 utf-16-little-endian | |
367 utf-16 | |
368 ucs-4)))) | |
369 | |
370 (defun reset-language-environment () | |
371 "Reset coding system environment of XEmacs to the default status. | |
372 All basic coding system variables are set to their default values, as | |
373 are the coding categories used for automatic detection and their | |
374 priority. | |
375 | |
376 BE VERY CERTAIN YOU WANT TO DO THIS BEFORE DOING IT! | |
377 | |
378 For more information, see `reset-coding-system-defaults' and | |
379 `reset-coding-categories-to-default'." | |
380 (reset-coding-system-defaults) | |
381 (reset-coding-categories-to-default)) | |
382 | |
383 ;; Initialize everything so that the remaining Lisp files can contain | |
384 ;; extended characters. (They will be in ISO-7 format) | |
385 | |
386 ;; !!####!! The Lisp files should all be in UTF-8!!! That way, all | |
387 ;; special characters appear as high bits and there's no problem with | |
388 ;; the Lisp parser trying to read a Mule file and getting all screwed | |
389 ;; up. The only other thing then would be characters; we just need to | |
390 ;; modify the Lisp parser to read the stuff directly after a ? as | |
391 ;; UTF-8 and return a 30-bit value directly, and modify the character | |
392 ;; routines a bit to allow such a beast to exist. MAKE IT A POINT TO | |
393 ;; IMPLEMENT THIS AS ONE OF MY FUTURE PROJECTS. --ben | |
394 | |
395 (reset-language-environment) | |
396 | |
4568 | 397 (coding-system-put 'raw-text 'safe-charsets '(ascii control-1 latin-iso8859-1)) |
398 |
771 | 399 ;;; code-init.el ends here |
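
The lookup performed by `coding-system-variable-default-value' in the listing above can be sketched in isolation as follows. This is a minimal illustration, not the file's own code: the names `my-configurations', `my-default-table' and `my-default-value' are hypothetical, the table is trimmed to two rows copied from the listing, and plain `error' format strings stand in for the `invalid-constant' error symbol so that the sketch should also load outside XEmacs.

```elisp
;; Hypothetical names throughout; only the two table rows are taken
;; from the listing above.
(defvar my-configurations
  '(unix-no-mule-no-eol-detection
    unix-no-mule-eol-detection
    unix-mule
    windows-no-mule
    windows-mule)
  "Column order shared by every row of `my-default-table'.")

(defvar my-default-table
  '((keyboard binary raw-text undecided-unix raw-text undecided-unix)
    (terminal binary binary binary binary mswindows-multibyte-unix))
  "Each row is (VAR . DEFAULTS), with one default per configuration.")

(defun my-default-value (var config)
  "Return the default for VAR under CONFIG, or signal an error."
  (let ((row (assq var my-default-table))
        (tail (memq config my-configurations)))
    (or row (error "Invalid coding system variable: %s" var))
    (or tail (error "Invalid coding system configuration: %s" config))
    ;; The column index is the number of entries that precede CONFIG.
    (nth (- (length my-configurations) (length tail)) (cdr row))))
```

With these definitions, (my-default-value 'keyboard 'unix-mule) evaluates to undecided-unix, matching the keyboard row and Unix+Mule column of the table in the docstring. Keeping the configuration list separate from the rows means each row only has to list its values in a fixed column order.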