comparison lisp/mule/mule-coding.el @ 4568:1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
lisp/ChangeLog addition:
2008-12-28 Aidan Kehoe <kehoea@parhasard.net>
* coding.el (default-query-coding-region):
Declare using defun*, so we can #'return-from to it on
encountering a safe-charsets value of t. Comment out a few
debug messages.
(query-coding-region):
Correct the docstring; it deals with a region, not a string.
(unencodable-char-position):
Correct the implementation for non-nil COUNT. Special-case a zero
value for COUNT, treating it as one. Don't rely on dynamic scope when
calling the main lambda.
* unicode.el (unicode-query-coding-region):
Comment out some debug messages here.
* mule/mule-coding.el (8-bit-fixed-query-coding-region):
Comment out some debug messages here.
* code-init.el (raw-text):
Add a safe-charsets property to this coding system.
* mule/korean.el (iso-2022-int-1):
* mule/korean.el (euc-kr):
* mule/korean.el (iso-2022-kr):
Add safe-charsets properties for these coding systems.
* mule/japanese.el (iso-2022-jp):
* mule/japanese.el (jis7):
* mule/japanese.el (jis8):
* mule/japanese.el (shift-jis):
* mule/japanese.el (iso-2022-jp-1978-irv):
* mule/japanese.el (euc-jp):
Add safe-charsets properties for all these coding systems.
* mule/iso-with-esc.el:
Add safe-charsets properties to all the coding systems in
here. Comment on the downside of a safe-charsets value of t for
iso-latin-1-with-esc.
* mule/hebrew.el (ctext-hebrew):
Add a safe-charsets property for this coding system.
* mule/devanagari.el (in-is13194-devanagari):
Add a safe-charsets property for this coding system.
* mule/chinese.el (cn-gb-2312):
* mule/chinese.el (hz-gb-2312):
* mule/chinese.el (big5):
Add safe-charsets properties for these coding systems.
* mule/latin.el (iso-8859-14):
Add an implementation for this, using #'make-8-bit-coding-system.
* mule/mule-coding.el (ctext):
* mule/mule-coding.el (iso-2022-8bit-ss2):
* mule/mule-coding.el (iso-2022-7bit-ss2):
* mule/mule-coding.el (iso-2022-jp-2):
* mule/mule-coding.el (iso-2022-7bit):
* mule/mule-coding.el (iso-2022-8):
* mule/mule-coding.el (escape-quoted):
* mule/mule-coding.el (iso-2022-lock):
Add safe-charsets properties for all these coding systems.
src/ChangeLog addition:
2008-12-28 Aidan Kehoe <kehoea@parhasard.net>
* file-coding.c (Fmake_coding_system):
Document our use of the safe-chars and safe-charsets properties,
and the differences compared to GNU.
(make_coding_system_1): Don't drop the safe-chars and
safe-charsets properties.
(Fcoding_system_property): Return the safe-chars and safe-charsets
properties when asked for them.
* file-coding.h (CODING_SYSTEM_SAFE_CHARSETS):
* coding-system-slots.h:
Make the safe-chars and safe-charsets slots available in these
headers.
tests/ChangeLog addition:
2008-12-28 Aidan Kehoe <kehoea@parhasard.net>
* automated/query-coding-tests.el:
New file, testing the functionality of #'query-coding-region and
#'query-coding-string.
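For illustration only, the following sketch shows how the properties described above might be exercised from Lisp. It is not part of the changeset. It assumes an XEmacs build that includes this change, that #'coding-system-property reports safe-charsets as the src/ChangeLog entry describes, and that #'query-coding-region accepts START, END and CODING-SYSTEM as its first three arguments and returns two values, as the (values ...) forms in the diff suggest. The sample text is hypothetical.

```elisp
;; Sketch under the assumptions stated above; not taken from the changeset.

;; Ask for the safe-charsets property of a coding system. The diff adds
;; `safe-charsets t' to iso-2022-7bit, so this would be expected to
;; return t on a build that includes the change.
(coding-system-property 'iso-2022-7bit 'safe-charsets)

;; Query whether a region can be encoded. The first value is non-nil when
;; everything is encodable; otherwise the second value describes the
;; ranges that could not be encoded.
(with-temp-buffer
  (insert "plain ASCII sample text")   ; hypothetical sample text
  (multiple-value-bind (encodable ranges)
      (query-coding-region (point-min) (point-max) 'iso-2022-7bit)
    (if encodable
        (message "Region can be encoded")
      (message "Unencodable ranges: %S" ranges))))
```

A safe-charsets value of t lets the query return early without scanning the region, which is the case the defun* / #'return-from change in coding.el is described as handling.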
author | Aidan Kehoe <kehoea@parhasard.net> |
---|---|
date | Sun, 28 Dec 2008 14:46:24 +0000 |
parents | 84d618b355f5 |
children | e0a8715fdb1f |
comparison
4567:84d618b355f5 | 4568:1d74a1d115ee |
---|---|
102 'ctext 'iso2022 | 102 'ctext 'iso2022 |
103 "Compound Text" | 103 "Compound Text" |
104 '(charset-g0 ascii | 104 '(charset-g0 ascii |
105 charset-g1 latin-iso8859-1 | 105 charset-g1 latin-iso8859-1 |
106 eol-type nil | 106 eol-type nil |
| | 107 safe-charsets t ;; Reasonable |
107 mnemonic "CText")) | 108 mnemonic "CText")) |
108 | 109 |
109 (make-coding-system | 110 (make-coding-system |
110 'iso-2022-8bit-ss2 'iso2022 | 111 'iso-2022-8bit-ss2 'iso2022 |
111 "ISO-2022 8-bit w/SS2" | 112 "ISO-2022 8-bit w/SS2" |
112 '(charset-g0 ascii | 113 '(charset-g0 ascii |
113 charset-g1 latin-iso8859-1 | 114 charset-g1 latin-iso8859-1 |
114 charset-g2 t ;; unspecified but can be used later. | 115 charset-g2 t ;; unspecified but can be used later. |
115 short t | 116 short t |
| | 117 safe-charsets (ascii katakana-jisx0201 japanese-jisx0208-1978 |
| | 118 japanese-jisx0208 japanese-jisx0212 japanese-jisx0213-1 |
| | 119 japanese-jisx0213-2) |
116 mnemonic "ISO8/SS" | 120 mnemonic "ISO8/SS" |
117 documentation "ISO 2022 based 8-bit encoding using SS2 for 96-charset" | 121 documentation "ISO 2022 based 8-bit encoding using SS2 for 96-charset" |
118 )) | 122 )) |
119 | 123 |
120 (make-coding-system | 124 (make-coding-system |
122 "ISO-2022 7-bit w/SS2" | 126 "ISO-2022 7-bit w/SS2" |
123 '(charset-g0 ascii | 127 '(charset-g0 ascii |
124 charset-g2 t ;; unspecified but can be used later. | 128 charset-g2 t ;; unspecified but can be used later. |
125 seven t | 129 seven t |
126 short t | 130 short t |
| | 131 safe-charsets t |
127 mnemonic "ISO7/SS" | 132 mnemonic "ISO7/SS" |
128 documentation "ISO 2022 based 7-bit encoding using SS2 for 96-charset" | 133 documentation "ISO 2022 based 7-bit encoding using SS2 for 96-charset" |
129 eol-type nil)) | 134 eol-type nil)) |
130 | 135 |
131 ;; (copy-coding-system 'iso-2022-7bit-ss2 'iso-2022-jp-2) | 136 ;; (copy-coding-system 'iso-2022-7bit-ss2 'iso-2022-jp-2) |
134 "ISO-2022-JP-2" | 139 "ISO-2022-JP-2" |
135 '(charset-g0 ascii | 140 '(charset-g0 ascii |
136 charset-g2 t ;; unspecified but can be used later. | 141 charset-g2 t ;; unspecified but can be used later. |
137 seven t | 142 seven t |
138 short t | 143 short t |
| | 144 safe-charsets t |
139 mnemonic "ISO7/SS" | 145 mnemonic "ISO7/SS" |
140 eol-type nil)) | 146 eol-type nil)) |
141 | 147 |
142 (make-coding-system | 148 (make-coding-system |
143 'iso-2022-7bit 'iso2022 | 149 'iso-2022-7bit 'iso2022 |
144 "ISO 2022 7-bit" | 150 "ISO 2022 7-bit" |
145 '(charset-g0 ascii | 151 '(charset-g0 ascii |
146 seven t | 152 seven t |
147 short t | 153 short t |
| | 154 safe-charsets t |
148 mnemonic "ISO7" | 155 mnemonic "ISO7" |
149 documentation "ISO-2022-based 7-bit encoding using only G0" | 156 documentation "ISO-2022-based 7-bit encoding using only G0" |
150 )) | 157 )) |
151 | 158 |
152 ;; compatibility for old XEmacsen | 159 ;; compatibility for old XEmacsen |
156 'iso-2022-8 'iso2022 | 163 'iso-2022-8 'iso2022 |
157 "ISO-2022 8-bit" | 164 "ISO-2022 8-bit" |
158 '(charset-g0 ascii | 165 '(charset-g0 ascii |
159 charset-g1 latin-iso8859-1 | 166 charset-g1 latin-iso8859-1 |
160 short t | 167 short t |
| | 168 safe-charsets t |
161 mnemonic "ISO8" | 169 mnemonic "ISO8" |
162 documentation "ISO-2022 eight-bit coding system. No single-shift or locking-shift." | 170 documentation "ISO-2022 eight-bit coding system. No single-shift or locking-shift." |
163 )) | 171 )) |
164 | 172 |
165 (make-coding-system | 173 (make-coding-system |
167 "Escape-Quoted (for .ELC files)" | 175 "Escape-Quoted (for .ELC files)" |
168 '(charset-g0 ascii | 176 '(charset-g0 ascii |
169 charset-g1 latin-iso8859-1 | 177 charset-g1 latin-iso8859-1 |
170 eol-type lf | 178 eol-type lf |
171 escape-quoted t | 179 escape-quoted t |
| | 180 safe-charsets t |
172 mnemonic "ESC/Quot" | 181 mnemonic "ESC/Quot" |
173 documentation "ISO-2022 eight-bit coding system with escape quoting; used for .ELC files." | 182 documentation "ISO-2022 eight-bit coding system with escape quoting; used for .ELC files." |
174 )) | 183 )) |
175 | 184 |
176 (make-coding-system | 185 (make-coding-system |
178 "ISO-2022 w/locking-shift" | 187 "ISO-2022 w/locking-shift" |
179 '(charset-g0 ascii | 188 '(charset-g0 ascii |
180 charset-g1 t ;; unspecified but can be used later. | 189 charset-g1 t ;; unspecified but can be used later. |
181 seven t | 190 seven t |
182 lock-shift t | 191 lock-shift t |
| | 192 safe-charsets t |
183 mnemonic "ISO7/Lock" | 193 mnemonic "ISO7/Lock" |
184 documentation "ISO-2022 coding system using Locking-Shift for 96-charset." | 194 documentation "ISO-2022 coding system using Locking-Shift for 96-charset." |
185 )) | 195 )) |
186 | 196 |
187 | 197 |
572 (extent-face extent)) | 582 (extent-face extent)) |
573 (delete-extent extent))) buffer begin end)) | 583 (delete-extent extent))) buffer begin end)) |
574 (goto-char begin buffer) | 584 (goto-char begin buffer) |
575 (skip-chars-forward skip-chars-arg end buffer) | 585 (skip-chars-forward skip-chars-arg end buffer) |
576 (while (< (point buffer) end) | 586 (while (< (point buffer) end) |
577 (message | 587 ; (message |
578 "fail-range-start is %S, previous-fail %S, point is %S, end is %S" | 588 ; "fail-range-start is %S, previous-fail %S, point is %S, end is %S" |
579 fail-range-start previous-fail (point buffer) end) | 589 ; fail-range-start previous-fail (point buffer) end) |
580 (setq char-after (char-after (point buffer) buffer) | 590 (setq char-after (char-after (point buffer) buffer) |
581 fail-range-start (point buffer)) | 591 fail-range-start (point buffer)) |
582 (message "arguments are %S %S" | 592 ; (message "arguments are %S %S" |
583 (< (point buffer) end) | 593 ; (< (point buffer) end) |
584 (not (gethash (encode-char char-after 'ucs) from-unicode))) | 594 ; (not (gethash (encode-char char-after 'ucs) from-unicode))) |
585 (while (and | 595 (while (and |
586 (< (point buffer) end) | 596 (< (point buffer) end) |
587 (not (gethash (encode-char char-after 'ucs) from-unicode))) | 597 (not (gethash (encode-char char-after 'ucs) from-unicode))) |
588 (forward-char 1 buffer) | 598 (forward-char 1 buffer) |
589 (setq char-after (char-after (point buffer) buffer) | 599 (setq char-after (char-after (point buffer) buffer) |
591 (if (= fail-range-start (point buffer)) | 601 (if (= fail-range-start (point buffer)) |
592 ;; The character can actually be encoded by the coding | 602 ;; The character can actually be encoded by the coding |
593 ;; system; check the characters past it. | 603 ;; system; check the characters past it. |
594 (forward-char 1 buffer) | 604 (forward-char 1 buffer) |
595 ;; The character actually failed. | 605 ;; The character actually failed. |
596 (message "past the move through, point now %S" (point buffer)) | 606 ; (message "past the move through, point now %S" (point buffer)) |
597 (when errorp | 607 (when errorp |
598 (error 'text-conversion-error | 608 (error 'text-conversion-error |
599 (format "Cannot encode %s using coding system" | 609 (format "Cannot encode %s using coding system" |
600 (buffer-substring fail-range-start (point buffer) | 610 (buffer-substring fail-range-start (point buffer) |
601 buffer)) | 611 buffer)) |
606 (setq fail-range-end (if char-after | 616 (setq fail-range-end (if char-after |
607 (point buffer) | 617 (point buffer) |
608 (point-max buffer))) | 618 (point-max buffer))) |
609 t ranges) | 619 t ranges) |
610 (when highlightp | 620 (when highlightp |
611 (message "highlighting") | 621 ; (message "highlighting") |
612 (setq extent (make-extent fail-range-start fail-range-end buffer)) | 622 (setq extent (make-extent fail-range-start fail-range-end buffer)) |
613 (set-extent-priority extent (+ mouse-highlight-priority 2)) | 623 (set-extent-priority extent (+ mouse-highlight-priority 2)) |
614 (set-extent-face extent 'query-coding-warning-face)) | 624 (set-extent-face extent 'query-coding-warning-face)) |
615 (skip-chars-forward skip-chars-arg end buffer))) | 625 (skip-chars-forward skip-chars-arg end buffer))) |
616 (message "about to give the result, ranges %S" ranges) | 626 ; (message "about to give the result, ranges %S" ranges) |
617 (if failed | 627 (if failed |
618 (values nil ranges) | 628 (values nil ranges) |
619 (values t nil))))) | 629 (values t nil))))) |
620 | 630 |
621 ;;;###autoload | 631 ;;;###autoload |