comparison tests/automated/query-coding-tests.el @ 4604:e0a8715fdb1f

Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region. lisp/ChangeLog addition: 2009-02-07 Aidan Kehoe <kehoea@parhasard.net> * coding.el (query-coding-clear-highlights): Rename the BUFFER argument to BUFFER-OR-STRING, describe it as possibly being a string in its documentation. (default-query-coding-region): Add a new IGNORE-INVALID-SEQUENCESP argument, document that this function does not support it. Bind case-fold-search to nil, we don't want this to influence what the function thinks is encodable or not. (query-coding-region): Add a new IGNORE-INVALID-SEQUENCESP argument, document what it does; reflect this new argument in the associated compiler macro. (query-coding-string): Add a new IGNORE-INVALID-SEQUENCESP argument, document what it does. Support the HIGHLIGHT argument correctly. * unicode.el (unicode-query-coding-region): Add a new IGNORE-INVALID-SEQUENCESP argument, document what it does, implement this. Document a potential problem. Use #'query-coding-clear-highlights instead of reimplementing it ourselves. Remove some debugging messages. * mule/arabic.el (iso-8859-6): * mule/cyrillic.el (iso-8859-5): * mule/greek.el (iso-8859-7): * mule/hebrew.el (iso-8859-8): * mule/latin.el (iso-8859-2): * mule/latin.el (iso-8859-3): * mule/latin.el (iso-8859-4): * mule/latin.el (iso-8859-14): * mule/latin.el (iso-8859-15): * mule/latin.el (iso-8859-16): * mule/latin.el (iso-8859-9): * mule/latin.el (windows-1252): * mule/mule-coding.el (iso-8859-1): Avoid the assumption that characters not given an explicit mapping in these coding systems map to the ISO 8859-1 characters corresponding to the octets on disk; this makes it much more reasonable to implement the IGNORE-INVALID-SEQUENCESP argument to query-coding-region. * mule/mule-cmds.el (set-language-info): Correct the docstring. * mule/mule-cmds.el (finish-set-language-environment): Treat invalid Unicode sequences produced from invalid-sequence-coding-system and corresponding to control characters the same as control characters in redisplay. * mule/mule-cmds.el: Document that encode-coding-char is available in coding.el * mule/mule-coding.el (make-8-bit-generate-helper): Change to return the both the encode-program generated and the relevant non-ASCII charset; update the docstring to reflect this. * mule/mule-coding.el (make-8-bit-generate-encode-program-and-skip-chars-strings): Rename this function; have it return skip-chars-strings as well as the encode program. Have these skip-chars-strings use ranges for charsets, where possible. * mule/mule-coding.el (make-8-bit-create-decode-encode-tables): Revise this to allow people to specify explicitly characters that should be undefined (= corresponding to keys in unicode-error-default-translation-table), and treating unspecified octets above #x7f as undefined by default. * mule/mule-coding.el (8-bit-fixed-query-coding-region): Add a new IGNORE-INVALID-SEQUENCESP argument, implement support for it using the 8-bit-fixed-invalid-sequences-skip-chars coding system property; remove some debugging messages. * mule/mule-coding.el (make-8-bit-coding-system): This function is dumped, autoloading it makes no sense. Document what happens when characters above #x7f are not specified, implement this. * mule/vietnamese.el: Correct spelling. tests/ChangeLog addition: 2009-02-07 Aidan Kehoe <kehoea@parhasard.net> * automated/query-coding-tests.el: Add FAILING-CASE arguments to the Assert calls, making #'q-c-debug mostly unnecessary. Remove #'q-c-debug. Add new tests that use the IGNORE-INVALID-SEQUENCESP argument to #'query-coding-region; rework the existing ones to respect it.
author Aidan Kehoe <kehoea@parhasard.net>
date Sat, 07 Feb 2009 17:13:37 +0000
parents e6a7054a9c30
children 8cbca852bcd4
comparison
equal deleted inserted replaced
4603:202cb69c4d87 4604:e0a8715fdb1f
28 28
29 ;; Test the query-coding-region and query-coding-string implementations for 29 ;; Test the query-coding-region and query-coding-string implementations for
30 ;; some well-known coding systems. 30 ;; some well-known coding systems.
31 31
32 (require 'bytecomp) 32 (require 'bytecomp)
33
34 (defun q-c-debug (&rest aerger)
35 (let ((standard-output (get-buffer-create "query-coding-debug"))
36 (fmt (condition-case nil
37 (and (stringp (first aerger))
38 (apply #'format aerger))
39 (error nil))))
40 (if fmt
41 (progn
42 (princ (apply #'format aerger))
43 (terpri))
44 (princ "--> ")
45 (let ((i 1))
46 (dolist (sgra aerger)
47 (if (> i 1) (princ " "))
48 (princ (format "%d. " i))
49 (prin1 sgra)
50 (incf i))
51 (terpri)))))
52
53 ;; Comment this out if debugging:
54 (defalias 'q-c-debug #'ignore)
55 33
56 (when (featurep 'mule) 34 (when (featurep 'mule)
57 (let ((ascii-chars-string (apply #'string 35 (let ((ascii-chars-string (apply #'string
58 (loop for i from #x0 to #x7f 36 (loop for i from #x0 to #x7f
59 collect (int-to-char i)))) 37 collect (int-to-char i))))
62 collect (int-to-char i)))) 40 collect (int-to-char i))))
63 unix-coding-system text-conversion-error-signalled) 41 unix-coding-system text-conversion-error-signalled)
64 (with-temp-buffer 42 (with-temp-buffer
65 (insert ascii-chars-string) 43 (insert ascii-chars-string)
66 ;; First, check all the coding systems that are ASCII-transparent for 44 ;; First, check all the coding systems that are ASCII-transparent for
67 ;; ASCII-transparency in the check. 45 ;; ASCII-transparency in query-coding-region.
68 (dolist (coding-system 46 (dolist (coding-system
69 (delete-duplicates 47 (delete-duplicates
70 (mapcar #'(lambda (coding-system) 48 (mapcar #'(lambda (coding-system)
71 (unless (coding-system-alias-p coding-system) 49 (unless (coding-system-alias-p coding-system)
72 ;; We're only interested in the version with 50 ;; We're only interested in the version with
85 unix-coding-system) 63 unix-coding-system)
86 '(undecided chain)))) 64 '(undecided chain))))
87 unix-coding-system))) 65 unix-coding-system)))
88 (coding-system-list nil)) 66 (coding-system-list nil))
89 :test #'eq)) 67 :test #'eq))
90 (q-c-debug "looking at coding system %S" (coding-system-name
91 coding-system))
92 (multiple-value-bind (query-coding-succeeded query-coding-table) 68 (multiple-value-bind (query-coding-succeeded query-coding-table)
93 (query-coding-region (point-min) (point-max) coding-system) 69 (query-coding-region (point-min) (point-max) coding-system)
94 (Assert (eq t query-coding-succeeded)) 70 (Assert (eq t query-coding-succeeded)
95 (Assert (null query-coding-table))) 71 (format "checking query-coding-region ASCII-transparency, %s"
72 coding-system))
73 (Assert (null query-coding-table)
74 (format "checking query-coding-region ASCII-transparency, %s"
75 coding-system)))
96 (multiple-value-bind (query-coding-succeeded query-coding-table) 76 (multiple-value-bind (query-coding-succeeded query-coding-table)
97 (query-coding-string ascii-chars-string coding-system) 77 (query-coding-string ascii-chars-string coding-system)
98 (Assert (eq t query-coding-succeeded)) 78 (Assert (eq t query-coding-succeeded)
99 (Assert (null query-coding-table)))) 79 (format "checking query-coding-string ASCII-transparency, %s"
80 coding-system))
81 (Assert (null query-coding-table)
82 (format "checking query-coding-string ASCII-transparency, %s"
83 coding-system))))
100 (delete-region (point-min) (point-max)) 84 (delete-region (point-min) (point-max))
101 ;; Check for success from the two Latin-1 coding systems 85 ;; Check for success from the two Latin-1 coding systems
102 (insert latin-1-chars-string) 86 (insert latin-1-chars-string)
103 (multiple-value-bind (query-coding-succeeded query-coding-table) 87 (multiple-value-bind (query-coding-succeeded query-coding-table)
104 (query-coding-region (point-min) (point-max) 'iso-8859-1-unix) 88 (query-coding-region (point-min) (point-max) 'iso-8859-1-unix)
105 (Assert (eq t query-coding-succeeded)) 89 (Assert (eq t query-coding-succeeded)
106 (Assert (null query-coding-table))) 90 "checking query-coding-region iso-8859-1-transparency")
91 (Assert (null query-coding-table)
92 "checking query-coding-region iso-8859-1-transparency"))
107 (multiple-value-bind (query-coding-succeeded query-coding-table) 93 (multiple-value-bind (query-coding-succeeded query-coding-table)
108 (query-coding-string (buffer-string) 'iso-8859-1-unix) 94 (query-coding-string (buffer-string) 'iso-8859-1-unix)
109 (Assert (eq t query-coding-succeeded)) 95 (Assert (eq t query-coding-succeeded)
110 (Assert (null query-coding-table))) 96 "checking query-coding-string iso-8859-1-transparency")
97 (Assert (null query-coding-table)
98 "checking query-coding-string iso-8859-1-transparency"))
111 (multiple-value-bind (query-coding-succeeded query-coding-table) 99 (multiple-value-bind (query-coding-succeeded query-coding-table)
112 (query-coding-string (buffer-string) 'iso-latin-1-with-esc-unix) 100 (query-coding-string (buffer-string) 'iso-latin-1-with-esc-unix)
113 (Assert (eq t query-coding-succeeded)) 101 (Assert
114 (Assert (null query-coding-table))) 102 (eq t query-coding-succeeded)
103 "checking query-coding-region iso-latin-1-with-esc-transparency")
104 (Assert
105 (null query-coding-table)
106 "checking query-coding-region iso-latin-1-with-esc-transparency"))
115 ;; Make it fail, check that it fails correctly 107 ;; Make it fail, check that it fails correctly
116 (insert (decode-char 'ucs #x20AC)) ;; EURO SIGN 108 (insert (decode-char 'ucs #x20AC)) ;; EURO SIGN
117 (multiple-value-bind (query-coding-succeeded query-coding-table) 109 (multiple-value-bind (query-coding-succeeded query-coding-table)
118 (query-coding-region (point-min) (point-max) 'iso-8859-1-unix) 110 (query-coding-region (point-min) (point-max) 'iso-8859-1-unix)
119 (Assert (null query-coding-succeeded)) 111 (Assert
120 (Assert (equal query-coding-table 112 (null query-coding-succeeded)
121 #s(range-table type start-closed-end-open data 113 "checking that query-coding-region fails, U+20AC, iso-8859-1")
122 ((257 258) t))))) 114 (Assert
115 (equal query-coding-table
116 #s(range-table type start-closed-end-open data
117 ((257 258) unencodable)))
118 "checking query-coding-region fails correctly, U+20AC, iso-8859-1"))
123 (multiple-value-bind (query-coding-succeeded query-coding-table) 119 (multiple-value-bind (query-coding-succeeded query-coding-table)
124 (query-coding-region (point-min) (point-max) 120 (query-coding-region (point-min) (point-max)
125 'iso-latin-1-with-esc-unix) 121 'iso-latin-1-with-esc-unix)
126 ;; Stupidly, this succeeds. The behaviour is compatible with 122 ;; Stupidly, this succeeds. The behaviour is compatible with
127 ;; GNU, though, and we encourage people not to use 123 ;; GNU, though, and we encourage people not to use
128 ;; iso-latin-1-with-esc-unix anyway: 124 ;; iso-latin-1-with-esc-unix anyway:
129 (Assert query-coding-succeeded) 125 (Assert
130 (Assert (null query-coding-table))) 126 query-coding-succeeded
127 "checking that query-coding-region succeeds, U+20AC, \
128 iso-latin-with-esc-unix-1")
129 (Assert
130 (null query-coding-table)
131 "checking that query-coding-region succeeds, U+20AC, \
132 iso-latin-with-esc-unix-1"))
131 ;; Check that it errors correctly. 133 ;; Check that it errors correctly.
132 (setq text-conversion-error-signalled nil) 134 (setq text-conversion-error-signalled nil)
133 (condition-case nil 135 (condition-case nil
134 (query-coding-region (point-min) (point-max) 'iso-8859-1-unix nil t) 136 (query-coding-region (point-min) (point-max) 'iso-8859-1-unix
137 (current-buffer) nil t)
135 (text-conversion-error 138 (text-conversion-error
136 (setq text-conversion-error-signalled t))) 139 (setq text-conversion-error-signalled t)))
137 (Assert text-conversion-error-signalled) 140 (Assert
141 text-conversion-error-signalled
142 "checking query-coding-region signals text-conversion-error correctly")
138 (setq text-conversion-error-signalled nil) 143 (setq text-conversion-error-signalled nil)
139 (condition-case nil 144 (condition-case nil
140 (query-coding-region (point-min) (point-max) 145 (query-coding-region (point-min) (point-max)
141 'iso-latin-1-with-esc-unix nil t) 146 'iso-latin-1-with-esc-unix nil nil t)
142 (text-conversion-error 147 (text-conversion-error
143 (setq text-conversion-error-signalled t))) 148 (setq text-conversion-error-signalled t)))
144 (Assert (null text-conversion-error-signalled)) 149 (Assert
150 (null text-conversion-error-signalled)
151 "checking query-coding-region doesn't signal text-conversion-error")
145 (delete-region (point-min) (point-max)) 152 (delete-region (point-min) (point-max))
146 (insert latin-1-chars-string) 153 (insert latin-1-chars-string)
147 (decode-coding-region (point-min) (point-max) 'windows-1252-unix) 154 (decode-coding-region (point-min) (point-max) 'windows-1252-unix)
148 (goto-char (point-max)) ;; #'decode-coding-region just messed up point. 155 (goto-char (point-max)) ;; #'decode-coding-region just messed up point.
149 (multiple-value-bind (query-coding-succeeded query-coding-table) 156 (multiple-value-bind (query-coding-succeeded query-coding-table)
150 (query-coding-region (point-min) (point-max) 'windows-1252-unix) 157 (query-coding-region (point-min) (point-max) 'windows-1252-unix)
151 (Assert (eq t query-coding-succeeded)) 158 (Assert
152 (Assert (null query-coding-table))) 159 (null query-coding-succeeded)
160 "check query-coding-region fails, windows-1252, invalid-sequences")
161 (Assert
162 (equal query-coding-table
163 #s(range-table type start-closed-end-open
164 data ((130 131) invalid-sequence
165 (142 143) invalid-sequence
166 (144 146) invalid-sequence
167 (158 159) invalid-sequence)))
168 "check query-coding-region fails, windows-1252, invalid-sequences"))
169 (multiple-value-bind (query-coding-succeeded query-coding-table)
170 (query-coding-region (point-min) (point-max) 'windows-1252-unix
171 (current-buffer) t)
172 (Assert
173 (eq t query-coding-succeeded)
174 "checking that query-coding-region succeeds, U+20AC, windows-1252")
175 (Assert
176 (null query-coding-table)
177 "checking that query-coding-region succeeds, U+20AC, windows-1252"))
153 (insert ?\x80) 178 (insert ?\x80)
154 (multiple-value-bind (query-coding-succeeded query-coding-table) 179 (multiple-value-bind (query-coding-succeeded query-coding-table)
180 (query-coding-region (point-min) (point-max) 'windows-1252-unix
181 (current-buffer) t)
182 (Assert
183 (null query-coding-succeeded)
184 "checking that query-coding-region fails, U+0080, windows-1252")
185 (Assert
186 (equal query-coding-table
187 #s(range-table type start-closed-end-open data
188 ((257 258) unencodable)))
189 "checking that query-coding-region fails, U+0080, windows-1252"))
190 (multiple-value-bind (query-coding-succeeded query-coding-table)
155 (query-coding-region (point-min) (point-max) 'windows-1252-unix) 191 (query-coding-region (point-min) (point-max) 'windows-1252-unix)
156 (Assert (null query-coding-succeeded)) 192 (Assert
157 (Assert (equal query-coding-table 193 (null query-coding-succeeded)
158 #s(range-table type start-closed-end-open data 194 "check query-coding-region fails, U+0080, invalid-sequence, cp1252")
159 ((257 258) t))))) 195 (Assert
196 (equal query-coding-table
197 #s(range-table type start-closed-end-open
198 data ((130 131) invalid-sequence
199 (142 143) invalid-sequence
200 (144 146) invalid-sequence
201 (158 159) invalid-sequence
202 (257 258) unencodable)))
203 "check query-coding-region fails, U+0080, invalid-sequence, cp1252"))
160 ;; Try a similar approach with koi8-o, the koi8 variant with 204 ;; Try a similar approach with koi8-o, the koi8 variant with
161 ;; support for Old Church Slavonic. 205 ;; support for Old Church Slavonic.
162 (delete-region (point-min) (point-max)) 206 (delete-region (point-min) (point-max))
163 (insert latin-1-chars-string) 207 (insert latin-1-chars-string)
164 (decode-coding-region (point-min) (point-max) 'koi8-o-unix) 208 (decode-coding-region (point-min) (point-max) 'koi8-o-unix)
165 (multiple-value-bind (query-coding-succeeded query-coding-table) 209 (multiple-value-bind (query-coding-succeeded query-coding-table)
166 (query-coding-region (point-min) (point-max) 'koi8-o-unix) 210 (query-coding-region (point-min) (point-max) 'koi8-o-unix)
167 (Assert (eq t query-coding-succeeded)) 211 (Assert
168 (Assert (null query-coding-table))) 212 (eq t query-coding-succeeded)
213 "checking that query-coding-region succeeds, koi8-o-unix")
214 (Assert
215 (null query-coding-table)
216 "checking that query-coding-region succeeds, koi8-o-unix"))
169 (multiple-value-bind (query-coding-succeeded query-coding-table) 217 (multiple-value-bind (query-coding-succeeded query-coding-table)
170 (query-coding-region (point-min) (point-max) 'escape-quoted) 218 (query-coding-region (point-min) (point-max) 'escape-quoted)
171 (Assert (eq t query-coding-succeeded)) 219 (Assert (eq t query-coding-succeeded)
172 (Assert (null query-coding-table))) 220 "checking that query-coding-region succeeds, escape-quoted")
221 (Assert (null query-coding-table)
222 "checking that query-coding-region succeeds, escape-quoted"))
173 (multiple-value-bind (query-coding-succeeded query-coding-table) 223 (multiple-value-bind (query-coding-succeeded query-coding-table)
174 (query-coding-region (point-min) (point-max) 'windows-1252-unix) 224 (query-coding-region (point-min) (point-max) 'windows-1252-unix)
175 (Assert (null query-coding-succeeded)) 225 (Assert
176 (Assert (equal query-coding-table 226 (null query-coding-succeeded)
177 #s(range-table type start-closed-end-open 227 "checking that query-coding-region fails, windows-1252 and Cyrillic")
178 data ((129 131) t (132 133) t (139 140) t 228 (Assert
179 (141 146) t (155 156) t (157 161) t 229 (equal query-coding-table
180 (162 170) t (173 176) t (178 187) t 230 #s(range-table type start-closed-end-open
181 (189 192) t (193 257) t))))) 231 data ((129 131) unencodable
232 (132 133) unencodable
233 (139 140) unencodable
234 (141 146) unencodable
235 (155 156) unencodable
236 (157 161) unencodable
237 (162 170) unencodable
238 (173 176) unencodable
239 (178 187) unencodable
240 (189 192) unencodable
241 (193 257) unencodable)))
242 "checking that query-coding-region fails, windows-1252 and Cyrillic"))
182 (multiple-value-bind (query-coding-succeeded query-coding-table) 243 (multiple-value-bind (query-coding-succeeded query-coding-table)
183 (query-coding-region (point-min) (point-max) 'koi8-r-unix) 244 (query-coding-region (point-min) (point-max) 'koi8-r-unix)
184 (Assert (null query-coding-succeeded)) 245 (Assert
185 (Assert (equal query-coding-table 246 (null query-coding-succeeded)
186 #s(range-table type start-closed-end-open 247 "checking that query-coding-region fails, koi8-r and OCS characters")
187 data ((129 154) t (155 161) t (162 164) t 248 (Assert
188 (165 177) t (178 180) t 249 (equal query-coding-table
189 (181 192) t))))) 250 #s(range-table type start-closed-end-open
251 data ((129 154) unencodable
252 (155 161) unencodable
253 (162 164) unencodable
254 (165 177) unencodable
255 (178 180) unencodable
256 (181 192) unencodable)))
257 "checking that query-coding-region fails, koi8-r and OCS characters"))
190 ;; Check that the Unicode coding systems handle characters 258 ;; Check that the Unicode coding systems handle characters
191 ;; without Unicode mappings. 259 ;; without Unicode mappings.
192 (delete-region (point-min) (point-max)) 260 (delete-region (point-min) (point-max))
193 (insert latin-1-chars-string) 261 (insert latin-1-chars-string)
194 (decode-coding-region (point-min) (point-max) 'greek-iso-8bit-with-esc) 262 (decode-coding-region (point-min) (point-max) 'greek-iso-8bit-with-esc)
208 utf-32-mac utf-8-mac utf-16-little-endian-unix 276 utf-32-mac utf-8-mac utf-16-little-endian-unix
209 ucs-4-little-endian ucs-4-little-endian-unix utf-8 277 ucs-4-little-endian ucs-4-little-endian-unix utf-8
210 utf-16-little-endian-bom)) 278 utf-16-little-endian-bom))
211 (multiple-value-bind (query-coding-succeeded query-coding-table) 279 (multiple-value-bind (query-coding-succeeded query-coding-table)
212 (query-coding-region (point-min) (point-max) coding-system) 280 (query-coding-region (point-min) (point-max) coding-system)
213 (Assert (null query-coding-succeeded)) 281 (Assert (null query-coding-succeeded)
282 "checking unicode coding systems fail with unmapped chars")
214 (Assert (equal query-coding-table 283 (Assert (equal query-coding-table
215 #s(range-table type start-closed-end-open data 284 #s(range-table type start-closed-end-open data
216 ((173 174) t (209 210) t 285 ((173 174) unencodable
217 (254 255) t))))) 286 (209 210) unencodable
287 (254 255) unencodable)))
288 "checking unicode coding systems fail with unmapped chars"))
218 (multiple-value-bind (query-coding-succeeded query-coding-table) 289 (multiple-value-bind (query-coding-succeeded query-coding-table)
219 (query-coding-region (point-min) 173 coding-system) 290 (query-coding-region (point-min) 173 coding-system)
220 (Assert (eq t query-coding-succeeded)) 291 (Assert (eq t query-coding-succeeded)
221 (Assert (null query-coding-table))) 292 "checking unicode coding systems succeed sans unmapped chars")
293 (Assert
294 (null query-coding-table)
295 "checking unicode coding systems succeed sans unmapped chars"))
222 (multiple-value-bind (query-coding-succeeded query-coding-table) 296 (multiple-value-bind (query-coding-succeeded query-coding-table)
223 (query-coding-region 174 209 coding-system) 297 (query-coding-region 174 209 coding-system)
224 (Assert (eq t query-coding-succeeded)) 298 (Assert
225 (Assert (null query-coding-table))) 299 (eq t query-coding-succeeded)
300 "checking unicode coding systems succeed sans unmapped chars, again")
301 (Assert
302 (null query-coding-table)
303 "checking unicode coding systems succeed sans unmapped chars again"))
226 (multiple-value-bind (query-coding-succeeded query-coding-table) 304 (multiple-value-bind (query-coding-succeeded query-coding-table)
227 (query-coding-region 210 254 coding-system) 305 (query-coding-region 210 254 coding-system)
228 (Assert (eq t query-coding-succeeded)) 306 (Assert (eq t query-coding-succeeded))
229 (Assert (null query-coding-table))) 307 (Assert (null query-coding-table)))
230 ;; Check that it errors correctly. 308 ;; Check that it errors correctly.
231 (setq text-conversion-error-signalled nil) 309 (setq text-conversion-error-signalled nil)
232 (condition-case nil 310 (condition-case nil
233 (query-coding-region (point-min) (point-max) coding-system nil t) 311 (query-coding-region (point-min) (point-max) coding-system
312 (current-buffer) nil t)
234 (text-conversion-error 313 (text-conversion-error
235 (setq text-conversion-error-signalled t))) 314 (setq text-conversion-error-signalled t)))
236 (Assert text-conversion-error-signalled) 315 (Assert text-conversion-error-signalled
316 "checking that unicode coding systems error correctly")
237 (setq text-conversion-error-signalled nil) 317 (setq text-conversion-error-signalled nil)
238 (condition-case nil 318 (condition-case nil
239 (query-coding-region (point-min) 173 coding-system nil t) 319 (query-coding-region (point-min) 173 coding-system
320 (current-buffer)
321 nil t)
240 (text-conversion-error 322 (text-conversion-error
241 (setq text-conversion-error-signalled t))) 323 (setq text-conversion-error-signalled t)))
242 (Assert (null text-conversion-error-signalled))) 324 (Assert
243 325 (null text-conversion-error-signalled)
326 "checking that unicode coding systems do not error when unnecessary"))
327
328 (delete-region (point-min) (point-max))
329 (insert (decode-coding-string "\xff\xff\xff\xff"
330 'greek-iso-8bit-with-esc))
331 (insert (decode-coding-string "\xff\xff\xff\xff" 'utf-8))
332 (insert (decode-coding-string "\xff\xff\xff\xff"
333 'greek-iso-8bit-with-esc))
334 (dolist (coding-system '(utf-8 utf-16 utf-16-little-endian
335 utf-32 utf-32-little-endian))
336 (multiple-value-bind (query-coding-succeeded query-coding-table)
337 (query-coding-region (point-min) (point-max) coding-system)
338 (Assert (null query-coding-succeeded)
339 (format
340 "checking %s fails with unmapped chars and invalid seqs"
341 coding-system))
342 (Assert (equal query-coding-table
343 #s(range-table type start-closed-end-open
344 data ((1 5) unencodable
345 (5 9) invalid-sequence
346 (9 13) unencodable)))
347 (format
348 "checking %s fails with unmapped chars and invalid seqs"
349 coding-system)))
350 (multiple-value-bind (query-coding-succeeded query-coding-table)
351 (query-coding-region (point-min) (point-max) coding-system
352 (current-buffer) t)
353 (Assert (null query-coding-succeeded)
354 (format
355 "checking %s fails with unmapped chars sans invalid seqs"
356 coding-system))
357 (Assert
358 (equal query-coding-table
359 #s(range-table type start-closed-end-open
360 data ((1 5) unencodable
361 (9 13) unencodable)))
362 (format
363 "checking %s fails correctly, unmapped chars sans invalid seqs"
364 coding-system))))
244 ;; Now to test #'encode-coding-char. Most of the functionality was 365 ;; Now to test #'encode-coding-char. Most of the functionality was
245 ;; tested in the query-coding-region tests above, so we don't go into 366 ;; tested in the query-coding-region tests above, so we don't go into
246 ;; as much detail here. 367 ;; as much detail here.
247 (Assert (null (encode-coding-char 368 (Assert
248 (decode-char 'ucs #x20ac) 'iso-8859-1))) 369 (null (encode-coding-char
249 (Assert (equal "\x80" (encode-coding-char 370 (decode-char 'ucs #x20ac) 'iso-8859-1))
250 (decode-char 'ucs #x20ac) 'windows-1252))) 371 "check #'encode-coding-char doesn't think iso-8859-1 handles U+20AC")
372 (Assert
373 (equal "\x80" (encode-coding-char
374 (decode-char 'ucs #x20ac) 'windows-1252))
375 "check #'encode-coding-char doesn't think windows-1252 handles U+0080")
251 (delete-region (point-min) (point-max)) 376 (delete-region (point-min) (point-max))
252 377
253 ;; And #'unencodable-char-position. 378 ;; And #'unencodable-char-position.
254 (insert latin-1-chars-string) 379 (insert latin-1-chars-string)
255 (insert (decode-char 'ucs #x20ac)) 380 (insert (decode-char 'ucs #x20ac))
256 (Assert (= 257 (unencodable-char-position (point-min) (point-max) 381 (Assert
257 'iso-8859-1))) 382 (= 257 (unencodable-char-position (point-min) (point-max)
258 (Assert (equal '(257) (unencodable-char-position (point-min) (point-max) 383 'iso-8859-1))
259 'iso-8859-1 1))) 384 "check #'unencodable-char-position doesn't think latin-1 encodes U+20AC")
385 (Assert
386 (equal '(257) (unencodable-char-position (point-min) (point-max)
387 'iso-8859-1 1))
388 "check #'unencodable-char-position doesn't think latin-1 encodes U+20AC")
260 ;; Compatiblity, sigh: 389 ;; Compatiblity, sigh:
261 (Assert (equal '(257) (unencodable-char-position (point-min) (point-max) 390 (Assert
262 'iso-8859-1 0))) 391 (equal '(257) (unencodable-char-position (point-min) (point-max)
392 'iso-8859-1 0))
393 "check #'unencodable-char-position has some borked GNU semantics")
263 (dotimes (i 6) (insert (decode-char 'ucs #x20ac))) 394 (dotimes (i 6) (insert (decode-char 'ucs #x20ac)))
264 ;; Check if it stops at one: 395 ;; Check if it stops at one:
265 (Assert (equal '(257) (unencodable-char-position (point-min) (point-max) 396 (Assert (equal '(257) (unencodable-char-position (point-min) (point-max)
266 'iso-8859-1 1))) 397 'iso-8859-1 1))
398 "check #'unencodable-char-position stops at 1 when asked to")
267 ;; Check if it stops at four: 399 ;; Check if it stops at four:
268 (Assert (equal '(260 259 258 257) 400 (Assert (equal '(260 259 258 257)
269 (unencodable-char-position (point-min) (point-max) 401 (unencodable-char-position (point-min) (point-max)
270 'iso-8859-1 4))) 402 'iso-8859-1 4))
403 "check #'unencodable-char-position stops at 4 when asked to")
271 ;; Check whether it stops at seven: 404 ;; Check whether it stops at seven:
272 (Assert (equal '(263 262 261 260 259 258 257) 405 (Assert (equal '(263 262 261 260 259 258 257)
273 (unencodable-char-position (point-min) (point-max) 406 (unencodable-char-position (point-min) (point-max)
274 'iso-8859-1 7))) 407 'iso-8859-1 7))
408 "check #'unencodable-char-position stops at 7 when asked to")
275 ;; Check that it still stops at seven: 409 ;; Check that it still stops at seven:
276 (Assert (equal '(263 262 261 260 259 258 257) 410 (Assert (equal '(263 262 261 260 259 258 257)
277 (unencodable-char-position (point-min) (point-max) 411 (unencodable-char-position (point-min) (point-max)
278 'iso-8859-1 2000))) 412 'iso-8859-1 2000))
413 "check #'unencodable-char-position stops at 7 if 2000 asked for")
279 ;; Now, #'check-coding-systems-region. 414 ;; Now, #'check-coding-systems-region.
280 ;; UTF-8 should certainly be able to encode these characters: 415 ;; UTF-8 should certainly be able to encode these characters:
281 (Assert (eq t (check-coding-systems-region (point-min) (point-max) 416 (Assert (eq t (check-coding-systems-region (point-min) (point-max)
282 '(utf-8)))) 417 '(utf-8)))
283 (Assert (equal '((iso-8859-1 257 258 259 260 261 262 263) 418 "check #'check-coding-systems-region gives t if encoding works")
284 (windows-1252 129 131 132 133 134 135 136 137 138 139 419 (Assert
285 140 141 143 146 147 148 149 150 151 152 420 (equal '((iso-8859-1 257 258 259 260 261 262 263)
286 153 154 155 156 157 159 160)) 421 (windows-1252 129 130 131 132 133 134 135 136
287 (sort 422 137 138 139 140 141 142 143 144
288 (check-coding-systems-region (point-min) (point-max) 423 145 146 147 148 149 150 151 152
289 '(utf-8 iso-8859-1 424 153 154 155 156 157 158 159 160))
290 windows-1252)) 425 (sort
291 ;; (The sort is to make the algorithm irrelevant.) 426 (check-coding-systems-region (point-min) (point-max)
292 #'(lambda (left right) 427 '(utf-8 iso-8859-1
293 (string< (car left) (car right)))))) 428 windows-1252))
429 ;; (The sort is to make the algorithm irrelevant.)
430 #'(lambda (left right)
431 (string< (car left) (car right)))))
432 "check #'check-coding-systems-region behaves well given a list")
294 ;; Ensure that the indices are all decreased by one when passed a 433 ;; Ensure that the indices are all decreased by one when passed a
295 ;; string: 434 ;; string:
296 (Assert (equal '((iso-8859-1 256 257 258 259 260 261 262) 435 (Assert
297 (windows-1252 128 130 131 132 133 134 135 136 137 138 436 (equal '((iso-8859-1 256 257 258 259 260 261 262)
298 139 140 142 145 146 147 148 149 150 151 437 (windows-1252 128 129 130 131 132 133 134 135
299 152 153 154 155 156 158 159)) 438 136 137 138 139 140 141 142 143
300 (sort 439 144 145 146 147 148 149 150 151
301 (check-coding-systems-region (buffer-string) nil 440 152 153 154 155 156 157 158 159))
302 '(utf-8 iso-8859-1 441 (sort
303 windows-1252)) 442 (check-coding-systems-region (buffer-string) nil
304 #'(lambda (left right) 443 '(utf-8 iso-8859-1
305 (string< (car left) (car right))))))))) 444 windows-1252))
306 445 #'(lambda (left right)
446 (string< (car left) (car right)))))
447 "check #'check-coding-systems-region behaves given a string and list"))))
448
449
450