comparison lisp/mule/mule-coding.el @ 4568:1d74a1d115ee

Add #'query-coding-region tests; do the work necessary to get them running.

lisp/ChangeLog addition:

2008-12-28  Aidan Kehoe  <kehoea@parhasard.net>

	* coding.el (default-query-coding-region):
	Declare using defun*, so we can #'return-from to it on encountering a safe-charsets value of t. Comment out a few debug messages.
	(query-coding-region):
	Correct the docstring, it deals with a region, not a string.
	(unencodable-char-position):
	Correct the implementation for non-nil COUNT, special-case a zero value for count, treat it as one. Don't rely on dynamic scope when calling the main lambda.
	* unicode.el (unicode-query-coding-region):
	Comment out some debug messages here.
	* mule/mule-coding.el (8-bit-fixed-query-coding-region):
	Comment out some debug messages here.
	* code-init.el (raw-text):
	Add a safe-charsets property to this coding system.
	* mule/korean.el (iso-2022-int-1):
	* mule/korean.el (euc-kr):
	* mule/korean.el (iso-2022-kr):
	Add safe-charsets properties for these coding systems.
	* mule/japanese.el (iso-2022-jp):
	* mule/japanese.el (jis7):
	* mule/japanese.el (jis8):
	* mule/japanese.el (shift-jis):
	* mule/japanese.el (iso-2022-jp-1978-irv):
	* mule/japanese.el (euc-jp):
	Add safe-charsets properties for all these coding systems.
	* mule/iso-with-esc.el:
	Add safe-charsets properties to all the coding systems in here. Comment on the downside of a safe-charsets value of t for iso-latin-1-with-esc.
	* mule/hebrew.el (ctext-hebrew):
	Add a safe-charsets property for this coding system.
	* mule/devanagari.el (in-is13194-devanagari):
	Add a safe-charsets property for this coding system.
	* mule/chinese.el (cn-gb-2312):
	* mule/chinese.el (hz-gb-2312):
	* mule/chinese.el (big5):
	Add safe-charsets properties for these coding systems.
	* mule/latin.el (iso-8859-14):
	Add an implementation for this, using #'make-8-bit-coding-system.
	* mule/mule-coding.el (ctext):
	* mule/mule-coding.el (iso-2022-8bit-ss2):
	* mule/mule-coding.el (iso-2022-7bit-ss2):
	* mule/mule-coding.el (iso-2022-jp-2):
	* mule/mule-coding.el (iso-2022-7bit):
	* mule/mule-coding.el (iso-2022-8):
	* mule/mule-coding.el (escape-quoted):
	* mule/mule-coding.el (iso-2022-lock):
	Add safe-charsets properties for all these coding systems.

src/ChangeLog addition:

2008-12-28  Aidan Kehoe  <kehoea@parhasard.net>

	* file-coding.c (Fmake_coding_system):
	Document our use of the safe-chars and safe-charsets properties, and the differences compared to GNU.
	(make_coding_system_1):
	Don't drop the safe-chars and safe-charsets properties.
	(Fcoding_system_property):
	Return the safe-chars and safe-charsets properties when asked for them.
	* file-coding.h (CODING_SYSTEM_SAFE_CHARSETS):
	* coding-system-slots.h:
	Make the safe-chars and safe-charsets slots available in these headers.

tests/ChangeLog addition:

2008-12-28  Aidan Kehoe  <kehoea@parhasard.net>

	* automated/query-coding-tests.el:
	New file, testing the functionality of #'query-coding-region and #'query-coding-string.
author Aidan Kehoe <kehoea@parhasard.net>
date Sun, 28 Dec 2008 14:46:24 +0000
parents 84d618b355f5
children e0a8715fdb1f
4567:84d618b355f5 4568:1d74a1d115ee
102 'ctext 'iso2022 102 'ctext 'iso2022
103 "Compound Text" 103 "Compound Text"
104 '(charset-g0 ascii 104 '(charset-g0 ascii
105 charset-g1 latin-iso8859-1 105 charset-g1 latin-iso8859-1
106 eol-type nil 106 eol-type nil
107 safe-charsets t ;; Reasonable
107 mnemonic "CText")) 108 mnemonic "CText"))
108 109
109 (make-coding-system 110 (make-coding-system
110 'iso-2022-8bit-ss2 'iso2022 111 'iso-2022-8bit-ss2 'iso2022
111 "ISO-2022 8-bit w/SS2" 112 "ISO-2022 8-bit w/SS2"
112 '(charset-g0 ascii 113 '(charset-g0 ascii
113 charset-g1 latin-iso8859-1 114 charset-g1 latin-iso8859-1
114 charset-g2 t ;; unspecified but can be used later. 115 charset-g2 t ;; unspecified but can be used later.
115 short t 116 short t
117 safe-charsets (ascii katakana-jisx0201 japanese-jisx0208-1978
118 japanese-jisx0208 japanese-jisx0212 japanese-jisx0213-1
119 japanese-jisx0213-2)
116 mnemonic "ISO8/SS" 120 mnemonic "ISO8/SS"
117 documentation "ISO 2022 based 8-bit encoding using SS2 for 96-charset" 121 documentation "ISO 2022 based 8-bit encoding using SS2 for 96-charset"
118 )) 122 ))
119 123
120 (make-coding-system 124 (make-coding-system
122 "ISO-2022 7-bit w/SS2" 126 "ISO-2022 7-bit w/SS2"
123 '(charset-g0 ascii 127 '(charset-g0 ascii
124 charset-g2 t ;; unspecified but can be used later. 128 charset-g2 t ;; unspecified but can be used later.
125 seven t 129 seven t
126 short t 130 short t
131 safe-charsets t
127 mnemonic "ISO7/SS" 132 mnemonic "ISO7/SS"
128 documentation "ISO 2022 based 7-bit encoding using SS2 for 96-charset" 133 documentation "ISO 2022 based 7-bit encoding using SS2 for 96-charset"
129 eol-type nil)) 134 eol-type nil))
130 135
131 ;; (copy-coding-system 'iso-2022-7bit-ss2 'iso-2022-jp-2) 136 ;; (copy-coding-system 'iso-2022-7bit-ss2 'iso-2022-jp-2)
134 "ISO-2022-JP-2" 139 "ISO-2022-JP-2"
135 '(charset-g0 ascii 140 '(charset-g0 ascii
136 charset-g2 t ;; unspecified but can be used later. 141 charset-g2 t ;; unspecified but can be used later.
137 seven t 142 seven t
138 short t 143 short t
144 safe-charsets t
139 mnemonic "ISO7/SS" 145 mnemonic "ISO7/SS"
140 eol-type nil)) 146 eol-type nil))
141 147
142 (make-coding-system 148 (make-coding-system
143 'iso-2022-7bit 'iso2022 149 'iso-2022-7bit 'iso2022
144 "ISO 2022 7-bit" 150 "ISO 2022 7-bit"
145 '(charset-g0 ascii 151 '(charset-g0 ascii
146 seven t 152 seven t
147 short t 153 short t
154 safe-charsets t
148 mnemonic "ISO7" 155 mnemonic "ISO7"
149 documentation "ISO-2022-based 7-bit encoding using only G0" 156 documentation "ISO-2022-based 7-bit encoding using only G0"
150 )) 157 ))
151 158
152 ;; compatibility for old XEmacsen 159 ;; compatibility for old XEmacsen
156 'iso-2022-8 'iso2022 163 'iso-2022-8 'iso2022
157 "ISO-2022 8-bit" 164 "ISO-2022 8-bit"
158 '(charset-g0 ascii 165 '(charset-g0 ascii
159 charset-g1 latin-iso8859-1 166 charset-g1 latin-iso8859-1
160 short t 167 short t
168 safe-charsets t
161 mnemonic "ISO8" 169 mnemonic "ISO8"
162 documentation "ISO-2022 eight-bit coding system. No single-shift or locking-shift." 170 documentation "ISO-2022 eight-bit coding system. No single-shift or locking-shift."
163 )) 171 ))
164 172
165 (make-coding-system 173 (make-coding-system
167 "Escape-Quoted (for .ELC files)" 175 "Escape-Quoted (for .ELC files)"
168 '(charset-g0 ascii 176 '(charset-g0 ascii
169 charset-g1 latin-iso8859-1 177 charset-g1 latin-iso8859-1
170 eol-type lf 178 eol-type lf
171 escape-quoted t 179 escape-quoted t
180 safe-charsets t
172 mnemonic "ESC/Quot" 181 mnemonic "ESC/Quot"
173 documentation "ISO-2022 eight-bit coding system with escape quoting; used for .ELC files." 182 documentation "ISO-2022 eight-bit coding system with escape quoting; used for .ELC files."
174 )) 183 ))
175 184
176 (make-coding-system 185 (make-coding-system
178 "ISO-2022 w/locking-shift" 187 "ISO-2022 w/locking-shift"
179 '(charset-g0 ascii 188 '(charset-g0 ascii
180 charset-g1 t ;; unspecified but can be used later. 189 charset-g1 t ;; unspecified but can be used later.
181 seven t 190 seven t
182 lock-shift t 191 lock-shift t
192 safe-charsets t
183 mnemonic "ISO7/Lock" 193 mnemonic "ISO7/Lock"
184 documentation "ISO-2022 coding system using Locking-Shift for 96-charset." 194 documentation "ISO-2022 coding system using Locking-Shift for 96-charset."
185 )) 195 ))
186 196
187 197
572 (extent-face extent)) 582 (extent-face extent))
573 (delete-extent extent))) buffer begin end)) 583 (delete-extent extent))) buffer begin end))
574 (goto-char begin buffer) 584 (goto-char begin buffer)
575 (skip-chars-forward skip-chars-arg end buffer) 585 (skip-chars-forward skip-chars-arg end buffer)
576 (while (< (point buffer) end) 586 (while (< (point buffer) end)
577 (message 587 ; (message
578 "fail-range-start is %S, previous-fail %S, point is %S, end is %S" 588 ; "fail-range-start is %S, previous-fail %S, point is %S, end is %S"
579 fail-range-start previous-fail (point buffer) end) 589 ; fail-range-start previous-fail (point buffer) end)
580 (setq char-after (char-after (point buffer) buffer) 590 (setq char-after (char-after (point buffer) buffer)
581 fail-range-start (point buffer)) 591 fail-range-start (point buffer))
582 (message "arguments are %S %S" 592 ; (message "arguments are %S %S"
583 (< (point buffer) end) 593 ; (< (point buffer) end)
584 (not (gethash (encode-char char-after 'ucs) from-unicode))) 594 ; (not (gethash (encode-char char-after 'ucs) from-unicode)))
585 (while (and 595 (while (and
586 (< (point buffer) end) 596 (< (point buffer) end)
587 (not (gethash (encode-char char-after 'ucs) from-unicode))) 597 (not (gethash (encode-char char-after 'ucs) from-unicode)))
588 (forward-char 1 buffer) 598 (forward-char 1 buffer)
589 (setq char-after (char-after (point buffer) buffer) 599 (setq char-after (char-after (point buffer) buffer)
591 (if (= fail-range-start (point buffer)) 601 (if (= fail-range-start (point buffer))
592 ;; The character can actually be encoded by the coding 602 ;; The character can actually be encoded by the coding
593 ;; system; check the characters past it. 603 ;; system; check the characters past it.
594 (forward-char 1 buffer) 604 (forward-char 1 buffer)
595 ;; The character actually failed. 605 ;; The character actually failed.
596 (message "past the move through, point now %S" (point buffer)) 606 ; (message "past the move through, point now %S" (point buffer))
597 (when errorp 607 (when errorp
598 (error 'text-conversion-error 608 (error 'text-conversion-error
599 (format "Cannot encode %s using coding system" 609 (format "Cannot encode %s using coding system"
600 (buffer-substring fail-range-start (point buffer) 610 (buffer-substring fail-range-start (point buffer)
601 buffer)) 611 buffer))
606 (setq fail-range-end (if char-after 616 (setq fail-range-end (if char-after
607 (point buffer) 617 (point buffer)
608 (point-max buffer))) 618 (point-max buffer)))
609 t ranges) 619 t ranges)
610 (when highlightp 620 (when highlightp
611 (message "highlighting") 621 ; (message "highlighting")
612 (setq extent (make-extent fail-range-start fail-range-end buffer)) 622 (setq extent (make-extent fail-range-start fail-range-end buffer))
613 (set-extent-priority extent (+ mouse-highlight-priority 2)) 623 (set-extent-priority extent (+ mouse-highlight-priority 2))
614 (set-extent-face extent 'query-coding-warning-face)) 624 (set-extent-face extent 'query-coding-warning-face))
615 (skip-chars-forward skip-chars-arg end buffer))) 625 (skip-chars-forward skip-chars-arg end buffer)))
616 (message "about to give the result, ranges %S" ranges) 626 ; (message "about to give the result, ranges %S" ranges)
617 (if failed 627 (if failed
618 (values nil ranges) 628 (values nil ranges)
619 (values t nil))))) 629 (values t nil)))))
620 630
621 ;;;###autoload 631 ;;;###autoload