Mercurial > hg > xemacs-beta
comparison tests/automated/query-coding-tests.el @ 4604:e0a8715fdb1f
Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
lisp/ChangeLog addition:
2009-02-07 Aidan Kehoe <kehoea@parhasard.net>
* coding.el (query-coding-clear-highlights):
Rename the BUFFER argument to BUFFER-OR-STRING, describe it as
possibly being a string in its documentation.
(default-query-coding-region):
Add a new IGNORE-INVALID-SEQUENCESP argument, document that this
function does not support it.
Bind case-fold-search to nil, we don't want this to influence what the
function thinks is encodable or not.
(query-coding-region):
Add a new IGNORE-INVALID-SEQUENCESP argument, document what it
does; reflect this new argument in the associated compiler macro.
(query-coding-string):
Add a new IGNORE-INVALID-SEQUENCESP argument, document what it
does. Support the HIGHLIGHT argument correctly.
* unicode.el (unicode-query-coding-region):
Add a new IGNORE-INVALID-SEQUENCESP argument, document what it
does, implement this. Document a potential problem.
Use #'query-coding-clear-highlights instead of reimplementing it
ourselves.
Remove some debugging messages.
* mule/arabic.el (iso-8859-6):
* mule/cyrillic.el (iso-8859-5):
* mule/greek.el (iso-8859-7):
* mule/hebrew.el (iso-8859-8):
* mule/latin.el (iso-8859-2):
* mule/latin.el (iso-8859-3):
* mule/latin.el (iso-8859-4):
* mule/latin.el (iso-8859-14):
* mule/latin.el (iso-8859-15):
* mule/latin.el (iso-8859-16):
* mule/latin.el (iso-8859-9):
* mule/latin.el (windows-1252):
* mule/mule-coding.el (iso-8859-1):
Avoid the assumption that characters not given an explicit mapping
in these coding systems map to the ISO 8859-1 characters
corresponding to the octets on disk; this makes it much more
reasonable to implement the IGNORE-INVALID-SEQUENCESP argument to
query-coding-region.
* mule/mule-cmds.el (set-language-info):
Correct the docstring.
* mule/mule-cmds.el (finish-set-language-environment):
Treat invalid Unicode sequences produced from
invalid-sequence-coding-system and corresponding to control
characters the same as control characters in redisplay.
* mule/mule-cmds.el:
Document that encode-coding-char is available in coding.el
* mule/mule-coding.el (make-8-bit-generate-helper):
Change to return the both the encode-program generated and the
relevant non-ASCII charset; update the docstring to reflect this.
* mule/mule-coding.el
(make-8-bit-generate-encode-program-and-skip-chars-strings):
Rename this function; have it return skip-chars-strings as well as
the encode program. Have these skip-chars-strings use ranges for
charsets, where possible.
* mule/mule-coding.el (make-8-bit-create-decode-encode-tables):
Revise this to allow people to specify explicitly characters that
should be undefined (= corresponding to keys in
unicode-error-default-translation-table), and treating unspecified
octets above #x7f as undefined by default.
* mule/mule-coding.el (8-bit-fixed-query-coding-region):
Add a new IGNORE-INVALID-SEQUENCESP argument, implement support
for it using the 8-bit-fixed-invalid-sequences-skip-chars coding
system property; remove some debugging messages.
* mule/mule-coding.el (make-8-bit-coding-system):
This function is dumped, autoloading it makes no sense.
Document what happens when characters above #x7f are not
specified, implement this.
* mule/vietnamese.el:
Correct spelling.
tests/ChangeLog addition:
2009-02-07 Aidan Kehoe <kehoea@parhasard.net>
* automated/query-coding-tests.el:
Add FAILING-CASE arguments to the Assert calls, making #'q-c-debug
mostly unnecessary. Remove #'q-c-debug.
Add new tests that use the IGNORE-INVALID-SEQUENCESP argument to
#'query-coding-region; rework the existing ones to respect it.
author | Aidan Kehoe <kehoea@parhasard.net> |
---|---|
date | Sat, 07 Feb 2009 17:13:37 +0000 |
parents | e6a7054a9c30 |
children | 8cbca852bcd4 |
comparison
equal
deleted
inserted
replaced
4603:202cb69c4d87 | 4604:e0a8715fdb1f |
---|---|
28 | 28 |
29 ;; Test the query-coding-region and query-coding-string implementations for | 29 ;; Test the query-coding-region and query-coding-string implementations for |
30 ;; some well-known coding systems. | 30 ;; some well-known coding systems. |
31 | 31 |
32 (require 'bytecomp) | 32 (require 'bytecomp) |
33 | |
34 (defun q-c-debug (&rest aerger) | |
35 (let ((standard-output (get-buffer-create "query-coding-debug")) | |
36 (fmt (condition-case nil | |
37 (and (stringp (first aerger)) | |
38 (apply #'format aerger)) | |
39 (error nil)))) | |
40 (if fmt | |
41 (progn | |
42 (princ (apply #'format aerger)) | |
43 (terpri)) | |
44 (princ "--> ") | |
45 (let ((i 1)) | |
46 (dolist (sgra aerger) | |
47 (if (> i 1) (princ " ")) | |
48 (princ (format "%d. " i)) | |
49 (prin1 sgra) | |
50 (incf i)) | |
51 (terpri))))) | |
52 | |
53 ;; Comment this out if debugging: | |
54 (defalias 'q-c-debug #'ignore) | |
55 | 33 |
56 (when (featurep 'mule) | 34 (when (featurep 'mule) |
57 (let ((ascii-chars-string (apply #'string | 35 (let ((ascii-chars-string (apply #'string |
58 (loop for i from #x0 to #x7f | 36 (loop for i from #x0 to #x7f |
59 collect (int-to-char i)))) | 37 collect (int-to-char i)))) |
62 collect (int-to-char i)))) | 40 collect (int-to-char i)))) |
63 unix-coding-system text-conversion-error-signalled) | 41 unix-coding-system text-conversion-error-signalled) |
64 (with-temp-buffer | 42 (with-temp-buffer |
65 (insert ascii-chars-string) | 43 (insert ascii-chars-string) |
66 ;; First, check all the coding systems that are ASCII-transparent for | 44 ;; First, check all the coding systems that are ASCII-transparent for |
67 ;; ASCII-transparency in the check. | 45 ;; ASCII-transparency in query-coding-region. |
68 (dolist (coding-system | 46 (dolist (coding-system |
69 (delete-duplicates | 47 (delete-duplicates |
70 (mapcar #'(lambda (coding-system) | 48 (mapcar #'(lambda (coding-system) |
71 (unless (coding-system-alias-p coding-system) | 49 (unless (coding-system-alias-p coding-system) |
72 ;; We're only interested in the version with | 50 ;; We're only interested in the version with |
85 unix-coding-system) | 63 unix-coding-system) |
86 '(undecided chain)))) | 64 '(undecided chain)))) |
87 unix-coding-system))) | 65 unix-coding-system))) |
88 (coding-system-list nil)) | 66 (coding-system-list nil)) |
89 :test #'eq)) | 67 :test #'eq)) |
90 (q-c-debug "looking at coding system %S" (coding-system-name | |
91 coding-system)) | |
92 (multiple-value-bind (query-coding-succeeded query-coding-table) | 68 (multiple-value-bind (query-coding-succeeded query-coding-table) |
93 (query-coding-region (point-min) (point-max) coding-system) | 69 (query-coding-region (point-min) (point-max) coding-system) |
94 (Assert (eq t query-coding-succeeded)) | 70 (Assert (eq t query-coding-succeeded) |
95 (Assert (null query-coding-table))) | 71 (format "checking query-coding-region ASCII-transparency, %s" |
72 coding-system)) | |
73 (Assert (null query-coding-table) | |
74 (format "checking query-coding-region ASCII-transparency, %s" | |
75 coding-system))) | |
96 (multiple-value-bind (query-coding-succeeded query-coding-table) | 76 (multiple-value-bind (query-coding-succeeded query-coding-table) |
97 (query-coding-string ascii-chars-string coding-system) | 77 (query-coding-string ascii-chars-string coding-system) |
98 (Assert (eq t query-coding-succeeded)) | 78 (Assert (eq t query-coding-succeeded) |
99 (Assert (null query-coding-table)))) | 79 (format "checking query-coding-string ASCII-transparency, %s" |
80 coding-system)) | |
81 (Assert (null query-coding-table) | |
82 (format "checking query-coding-string ASCII-transparency, %s" | |
83 coding-system)))) | |
100 (delete-region (point-min) (point-max)) | 84 (delete-region (point-min) (point-max)) |
101 ;; Check for success from the two Latin-1 coding systems | 85 ;; Check for success from the two Latin-1 coding systems |
102 (insert latin-1-chars-string) | 86 (insert latin-1-chars-string) |
103 (multiple-value-bind (query-coding-succeeded query-coding-table) | 87 (multiple-value-bind (query-coding-succeeded query-coding-table) |
104 (query-coding-region (point-min) (point-max) 'iso-8859-1-unix) | 88 (query-coding-region (point-min) (point-max) 'iso-8859-1-unix) |
105 (Assert (eq t query-coding-succeeded)) | 89 (Assert (eq t query-coding-succeeded) |
106 (Assert (null query-coding-table))) | 90 "checking query-coding-region iso-8859-1-transparency") |
91 (Assert (null query-coding-table) | |
92 "checking query-coding-region iso-8859-1-transparency")) | |
107 (multiple-value-bind (query-coding-succeeded query-coding-table) | 93 (multiple-value-bind (query-coding-succeeded query-coding-table) |
108 (query-coding-string (buffer-string) 'iso-8859-1-unix) | 94 (query-coding-string (buffer-string) 'iso-8859-1-unix) |
109 (Assert (eq t query-coding-succeeded)) | 95 (Assert (eq t query-coding-succeeded) |
110 (Assert (null query-coding-table))) | 96 "checking query-coding-string iso-8859-1-transparency") |
97 (Assert (null query-coding-table) | |
98 "checking query-coding-string iso-8859-1-transparency")) | |
111 (multiple-value-bind (query-coding-succeeded query-coding-table) | 99 (multiple-value-bind (query-coding-succeeded query-coding-table) |
112 (query-coding-string (buffer-string) 'iso-latin-1-with-esc-unix) | 100 (query-coding-string (buffer-string) 'iso-latin-1-with-esc-unix) |
113 (Assert (eq t query-coding-succeeded)) | 101 (Assert |
114 (Assert (null query-coding-table))) | 102 (eq t query-coding-succeeded) |
103 "checking query-coding-region iso-latin-1-with-esc-transparency") | |
104 (Assert | |
105 (null query-coding-table) | |
106 "checking query-coding-region iso-latin-1-with-esc-transparency")) | |
115 ;; Make it fail, check that it fails correctly | 107 ;; Make it fail, check that it fails correctly |
116 (insert (decode-char 'ucs #x20AC)) ;; EURO SIGN | 108 (insert (decode-char 'ucs #x20AC)) ;; EURO SIGN |
117 (multiple-value-bind (query-coding-succeeded query-coding-table) | 109 (multiple-value-bind (query-coding-succeeded query-coding-table) |
118 (query-coding-region (point-min) (point-max) 'iso-8859-1-unix) | 110 (query-coding-region (point-min) (point-max) 'iso-8859-1-unix) |
119 (Assert (null query-coding-succeeded)) | 111 (Assert |
120 (Assert (equal query-coding-table | 112 (null query-coding-succeeded) |
121 #s(range-table type start-closed-end-open data | 113 "checking that query-coding-region fails, U+20AC, iso-8859-1") |
122 ((257 258) t))))) | 114 (Assert |
115 (equal query-coding-table | |
116 #s(range-table type start-closed-end-open data | |
117 ((257 258) unencodable))) | |
118 "checking query-coding-region fails correctly, U+20AC, iso-8859-1")) | |
123 (multiple-value-bind (query-coding-succeeded query-coding-table) | 119 (multiple-value-bind (query-coding-succeeded query-coding-table) |
124 (query-coding-region (point-min) (point-max) | 120 (query-coding-region (point-min) (point-max) |
125 'iso-latin-1-with-esc-unix) | 121 'iso-latin-1-with-esc-unix) |
126 ;; Stupidly, this succeeds. The behaviour is compatible with | 122 ;; Stupidly, this succeeds. The behaviour is compatible with |
127 ;; GNU, though, and we encourage people not to use | 123 ;; GNU, though, and we encourage people not to use |
128 ;; iso-latin-1-with-esc-unix anyway: | 124 ;; iso-latin-1-with-esc-unix anyway: |
129 (Assert query-coding-succeeded) | 125 (Assert |
130 (Assert (null query-coding-table))) | 126 query-coding-succeeded |
127 "checking that query-coding-region succeeds, U+20AC, \ | |
128 iso-latin-with-esc-unix-1") | |
129 (Assert | |
130 (null query-coding-table) | |
131 "checking that query-coding-region succeeds, U+20AC, \ | |
132 iso-latin-with-esc-unix-1")) | |
131 ;; Check that it errors correctly. | 133 ;; Check that it errors correctly. |
132 (setq text-conversion-error-signalled nil) | 134 (setq text-conversion-error-signalled nil) |
133 (condition-case nil | 135 (condition-case nil |
134 (query-coding-region (point-min) (point-max) 'iso-8859-1-unix nil t) | 136 (query-coding-region (point-min) (point-max) 'iso-8859-1-unix |
137 (current-buffer) nil t) | |
135 (text-conversion-error | 138 (text-conversion-error |
136 (setq text-conversion-error-signalled t))) | 139 (setq text-conversion-error-signalled t))) |
137 (Assert text-conversion-error-signalled) | 140 (Assert |
141 text-conversion-error-signalled | |
142 "checking query-coding-region signals text-conversion-error correctly") | |
138 (setq text-conversion-error-signalled nil) | 143 (setq text-conversion-error-signalled nil) |
139 (condition-case nil | 144 (condition-case nil |
140 (query-coding-region (point-min) (point-max) | 145 (query-coding-region (point-min) (point-max) |
141 'iso-latin-1-with-esc-unix nil t) | 146 'iso-latin-1-with-esc-unix nil nil t) |
142 (text-conversion-error | 147 (text-conversion-error |
143 (setq text-conversion-error-signalled t))) | 148 (setq text-conversion-error-signalled t))) |
144 (Assert (null text-conversion-error-signalled)) | 149 (Assert |
150 (null text-conversion-error-signalled) | |
151 "checking query-coding-region doesn't signal text-conversion-error") | |
145 (delete-region (point-min) (point-max)) | 152 (delete-region (point-min) (point-max)) |
146 (insert latin-1-chars-string) | 153 (insert latin-1-chars-string) |
147 (decode-coding-region (point-min) (point-max) 'windows-1252-unix) | 154 (decode-coding-region (point-min) (point-max) 'windows-1252-unix) |
148 (goto-char (point-max)) ;; #'decode-coding-region just messed up point. | 155 (goto-char (point-max)) ;; #'decode-coding-region just messed up point. |
149 (multiple-value-bind (query-coding-succeeded query-coding-table) | 156 (multiple-value-bind (query-coding-succeeded query-coding-table) |
150 (query-coding-region (point-min) (point-max) 'windows-1252-unix) | 157 (query-coding-region (point-min) (point-max) 'windows-1252-unix) |
151 (Assert (eq t query-coding-succeeded)) | 158 (Assert |
152 (Assert (null query-coding-table))) | 159 (null query-coding-succeeded) |
160 "check query-coding-region fails, windows-1252, invalid-sequences") | |
161 (Assert | |
162 (equal query-coding-table | |
163 #s(range-table type start-closed-end-open | |
164 data ((130 131) invalid-sequence | |
165 (142 143) invalid-sequence | |
166 (144 146) invalid-sequence | |
167 (158 159) invalid-sequence))) | |
168 "check query-coding-region fails, windows-1252, invalid-sequences")) | |
169 (multiple-value-bind (query-coding-succeeded query-coding-table) | |
170 (query-coding-region (point-min) (point-max) 'windows-1252-unix | |
171 (current-buffer) t) | |
172 (Assert | |
173 (eq t query-coding-succeeded) | |
174 "checking that query-coding-region succeeds, U+20AC, windows-1252") | |
175 (Assert | |
176 (null query-coding-table) | |
177 "checking that query-coding-region succeeds, U+20AC, windows-1252")) | |
153 (insert ?\x80) | 178 (insert ?\x80) |
154 (multiple-value-bind (query-coding-succeeded query-coding-table) | 179 (multiple-value-bind (query-coding-succeeded query-coding-table) |
180 (query-coding-region (point-min) (point-max) 'windows-1252-unix | |
181 (current-buffer) t) | |
182 (Assert | |
183 (null query-coding-succeeded) | |
184 "checking that query-coding-region fails, U+0080, windows-1252") | |
185 (Assert | |
186 (equal query-coding-table | |
187 #s(range-table type start-closed-end-open data | |
188 ((257 258) unencodable))) | |
189 "checking that query-coding-region fails, U+0080, windows-1252")) | |
190 (multiple-value-bind (query-coding-succeeded query-coding-table) | |
155 (query-coding-region (point-min) (point-max) 'windows-1252-unix) | 191 (query-coding-region (point-min) (point-max) 'windows-1252-unix) |
156 (Assert (null query-coding-succeeded)) | 192 (Assert |
157 (Assert (equal query-coding-table | 193 (null query-coding-succeeded) |
158 #s(range-table type start-closed-end-open data | 194 "check query-coding-region fails, U+0080, invalid-sequence, cp1252") |
159 ((257 258) t))))) | 195 (Assert |
196 (equal query-coding-table | |
197 #s(range-table type start-closed-end-open | |
198 data ((130 131) invalid-sequence | |
199 (142 143) invalid-sequence | |
200 (144 146) invalid-sequence | |
201 (158 159) invalid-sequence | |
202 (257 258) unencodable))) | |
203 "check query-coding-region fails, U+0080, invalid-sequence, cp1252")) | |
160 ;; Try a similar approach with koi8-o, the koi8 variant with | 204 ;; Try a similar approach with koi8-o, the koi8 variant with |
161 ;; support for Old Church Slavonic. | 205 ;; support for Old Church Slavonic. |
162 (delete-region (point-min) (point-max)) | 206 (delete-region (point-min) (point-max)) |
163 (insert latin-1-chars-string) | 207 (insert latin-1-chars-string) |
164 (decode-coding-region (point-min) (point-max) 'koi8-o-unix) | 208 (decode-coding-region (point-min) (point-max) 'koi8-o-unix) |
165 (multiple-value-bind (query-coding-succeeded query-coding-table) | 209 (multiple-value-bind (query-coding-succeeded query-coding-table) |
166 (query-coding-region (point-min) (point-max) 'koi8-o-unix) | 210 (query-coding-region (point-min) (point-max) 'koi8-o-unix) |
167 (Assert (eq t query-coding-succeeded)) | 211 (Assert |
168 (Assert (null query-coding-table))) | 212 (eq t query-coding-succeeded) |
213 "checking that query-coding-region succeeds, koi8-o-unix") | |
214 (Assert | |
215 (null query-coding-table) | |
216 "checking that query-coding-region succeeds, koi8-o-unix")) | |
169 (multiple-value-bind (query-coding-succeeded query-coding-table) | 217 (multiple-value-bind (query-coding-succeeded query-coding-table) |
170 (query-coding-region (point-min) (point-max) 'escape-quoted) | 218 (query-coding-region (point-min) (point-max) 'escape-quoted) |
171 (Assert (eq t query-coding-succeeded)) | 219 (Assert (eq t query-coding-succeeded) |
172 (Assert (null query-coding-table))) | 220 "checking that query-coding-region succeeds, escape-quoted") |
221 (Assert (null query-coding-table) | |
222 "checking that query-coding-region succeeds, escape-quoted")) | |
173 (multiple-value-bind (query-coding-succeeded query-coding-table) | 223 (multiple-value-bind (query-coding-succeeded query-coding-table) |
174 (query-coding-region (point-min) (point-max) 'windows-1252-unix) | 224 (query-coding-region (point-min) (point-max) 'windows-1252-unix) |
175 (Assert (null query-coding-succeeded)) | 225 (Assert |
176 (Assert (equal query-coding-table | 226 (null query-coding-succeeded) |
177 #s(range-table type start-closed-end-open | 227 "checking that query-coding-region fails, windows-1252 and Cyrillic") |
178 data ((129 131) t (132 133) t (139 140) t | 228 (Assert |
179 (141 146) t (155 156) t (157 161) t | 229 (equal query-coding-table |
180 (162 170) t (173 176) t (178 187) t | 230 #s(range-table type start-closed-end-open |
181 (189 192) t (193 257) t))))) | 231 data ((129 131) unencodable |
232 (132 133) unencodable | |
233 (139 140) unencodable | |
234 (141 146) unencodable | |
235 (155 156) unencodable | |
236 (157 161) unencodable | |
237 (162 170) unencodable | |
238 (173 176) unencodable | |
239 (178 187) unencodable | |
240 (189 192) unencodable | |
241 (193 257) unencodable))) | |
242 "checking that query-coding-region fails, windows-1252 and Cyrillic")) | |
182 (multiple-value-bind (query-coding-succeeded query-coding-table) | 243 (multiple-value-bind (query-coding-succeeded query-coding-table) |
183 (query-coding-region (point-min) (point-max) 'koi8-r-unix) | 244 (query-coding-region (point-min) (point-max) 'koi8-r-unix) |
184 (Assert (null query-coding-succeeded)) | 245 (Assert |
185 (Assert (equal query-coding-table | 246 (null query-coding-succeeded) |
186 #s(range-table type start-closed-end-open | 247 "checking that query-coding-region fails, koi8-r and OCS characters") |
187 data ((129 154) t (155 161) t (162 164) t | 248 (Assert |
188 (165 177) t (178 180) t | 249 (equal query-coding-table |
189 (181 192) t))))) | 250 #s(range-table type start-closed-end-open |
251 data ((129 154) unencodable | |
252 (155 161) unencodable | |
253 (162 164) unencodable | |
254 (165 177) unencodable | |
255 (178 180) unencodable | |
256 (181 192) unencodable))) | |
257 "checking that query-coding-region fails, koi8-r and OCS characters")) | |
190 ;; Check that the Unicode coding systems handle characters | 258 ;; Check that the Unicode coding systems handle characters |
191 ;; without Unicode mappings. | 259 ;; without Unicode mappings. |
192 (delete-region (point-min) (point-max)) | 260 (delete-region (point-min) (point-max)) |
193 (insert latin-1-chars-string) | 261 (insert latin-1-chars-string) |
194 (decode-coding-region (point-min) (point-max) 'greek-iso-8bit-with-esc) | 262 (decode-coding-region (point-min) (point-max) 'greek-iso-8bit-with-esc) |
208 utf-32-mac utf-8-mac utf-16-little-endian-unix | 276 utf-32-mac utf-8-mac utf-16-little-endian-unix |
209 ucs-4-little-endian ucs-4-little-endian-unix utf-8 | 277 ucs-4-little-endian ucs-4-little-endian-unix utf-8 |
210 utf-16-little-endian-bom)) | 278 utf-16-little-endian-bom)) |
211 (multiple-value-bind (query-coding-succeeded query-coding-table) | 279 (multiple-value-bind (query-coding-succeeded query-coding-table) |
212 (query-coding-region (point-min) (point-max) coding-system) | 280 (query-coding-region (point-min) (point-max) coding-system) |
213 (Assert (null query-coding-succeeded)) | 281 (Assert (null query-coding-succeeded) |
282 "checking unicode coding systems fail with unmapped chars") | |
214 (Assert (equal query-coding-table | 283 (Assert (equal query-coding-table |
215 #s(range-table type start-closed-end-open data | 284 #s(range-table type start-closed-end-open data |
216 ((173 174) t (209 210) t | 285 ((173 174) unencodable |
217 (254 255) t))))) | 286 (209 210) unencodable |
287 (254 255) unencodable))) | |
288 "checking unicode coding systems fail with unmapped chars")) | |
218 (multiple-value-bind (query-coding-succeeded query-coding-table) | 289 (multiple-value-bind (query-coding-succeeded query-coding-table) |
219 (query-coding-region (point-min) 173 coding-system) | 290 (query-coding-region (point-min) 173 coding-system) |
220 (Assert (eq t query-coding-succeeded)) | 291 (Assert (eq t query-coding-succeeded) |
221 (Assert (null query-coding-table))) | 292 "checking unicode coding systems succeed sans unmapped chars") |
293 (Assert | |
294 (null query-coding-table) | |
295 "checking unicode coding systems succeed sans unmapped chars")) | |
222 (multiple-value-bind (query-coding-succeeded query-coding-table) | 296 (multiple-value-bind (query-coding-succeeded query-coding-table) |
223 (query-coding-region 174 209 coding-system) | 297 (query-coding-region 174 209 coding-system) |
224 (Assert (eq t query-coding-succeeded)) | 298 (Assert |
225 (Assert (null query-coding-table))) | 299 (eq t query-coding-succeeded) |
300 "checking unicode coding systems succeed sans unmapped chars, again") | |
301 (Assert | |
302 (null query-coding-table) | |
303 "checking unicode coding systems succeed sans unmapped chars again")) | |
226 (multiple-value-bind (query-coding-succeeded query-coding-table) | 304 (multiple-value-bind (query-coding-succeeded query-coding-table) |
227 (query-coding-region 210 254 coding-system) | 305 (query-coding-region 210 254 coding-system) |
228 (Assert (eq t query-coding-succeeded)) | 306 (Assert (eq t query-coding-succeeded)) |
229 (Assert (null query-coding-table))) | 307 (Assert (null query-coding-table))) |
230 ;; Check that it errors correctly. | 308 ;; Check that it errors correctly. |
231 (setq text-conversion-error-signalled nil) | 309 (setq text-conversion-error-signalled nil) |
232 (condition-case nil | 310 (condition-case nil |
233 (query-coding-region (point-min) (point-max) coding-system nil t) | 311 (query-coding-region (point-min) (point-max) coding-system |
312 (current-buffer) nil t) | |
234 (text-conversion-error | 313 (text-conversion-error |
235 (setq text-conversion-error-signalled t))) | 314 (setq text-conversion-error-signalled t))) |
236 (Assert text-conversion-error-signalled) | 315 (Assert text-conversion-error-signalled |
316 "checking that unicode coding systems error correctly") | |
237 (setq text-conversion-error-signalled nil) | 317 (setq text-conversion-error-signalled nil) |
238 (condition-case nil | 318 (condition-case nil |
239 (query-coding-region (point-min) 173 coding-system nil t) | 319 (query-coding-region (point-min) 173 coding-system |
320 (current-buffer) | |
321 nil t) | |
240 (text-conversion-error | 322 (text-conversion-error |
241 (setq text-conversion-error-signalled t))) | 323 (setq text-conversion-error-signalled t))) |
242 (Assert (null text-conversion-error-signalled))) | 324 (Assert |
243 | 325 (null text-conversion-error-signalled) |
326 "checking that unicode coding systems do not error when unnecessary")) | |
327 | |
328 (delete-region (point-min) (point-max)) | |
329 (insert (decode-coding-string "\xff\xff\xff\xff" | |
330 'greek-iso-8bit-with-esc)) | |
331 (insert (decode-coding-string "\xff\xff\xff\xff" 'utf-8)) | |
332 (insert (decode-coding-string "\xff\xff\xff\xff" | |
333 'greek-iso-8bit-with-esc)) | |
334 (dolist (coding-system '(utf-8 utf-16 utf-16-little-endian | |
335 utf-32 utf-32-little-endian)) | |
336 (multiple-value-bind (query-coding-succeeded query-coding-table) | |
337 (query-coding-region (point-min) (point-max) coding-system) | |
338 (Assert (null query-coding-succeeded) | |
339 (format | |
340 "checking %s fails with unmapped chars and invalid seqs" | |
341 coding-system)) | |
342 (Assert (equal query-coding-table | |
343 #s(range-table type start-closed-end-open | |
344 data ((1 5) unencodable | |
345 (5 9) invalid-sequence | |
346 (9 13) unencodable))) | |
347 (format | |
348 "checking %s fails with unmapped chars and invalid seqs" | |
349 coding-system))) | |
350 (multiple-value-bind (query-coding-succeeded query-coding-table) | |
351 (query-coding-region (point-min) (point-max) coding-system | |
352 (current-buffer) t) | |
353 (Assert (null query-coding-succeeded) | |
354 (format | |
355 "checking %s fails with unmapped chars sans invalid seqs" | |
356 coding-system)) | |
357 (Assert | |
358 (equal query-coding-table | |
359 #s(range-table type start-closed-end-open | |
360 data ((1 5) unencodable | |
361 (9 13) unencodable))) | |
362 (format | |
363 "checking %s fails correctly, unmapped chars sans invalid seqs" | |
364 coding-system)))) | |
244 ;; Now to test #'encode-coding-char. Most of the functionality was | 365 ;; Now to test #'encode-coding-char. Most of the functionality was |
245 ;; tested in the query-coding-region tests above, so we don't go into | 366 ;; tested in the query-coding-region tests above, so we don't go into |
246 ;; as much detail here. | 367 ;; as much detail here. |
247 (Assert (null (encode-coding-char | 368 (Assert |
248 (decode-char 'ucs #x20ac) 'iso-8859-1))) | 369 (null (encode-coding-char |
249 (Assert (equal "\x80" (encode-coding-char | 370 (decode-char 'ucs #x20ac) 'iso-8859-1)) |
250 (decode-char 'ucs #x20ac) 'windows-1252))) | 371 "check #'encode-coding-char doesn't think iso-8859-1 handles U+20AC") |
372 (Assert | |
373 (equal "\x80" (encode-coding-char | |
374 (decode-char 'ucs #x20ac) 'windows-1252)) | |
375 "check #'encode-coding-char doesn't think windows-1252 handles U+0080") | |
251 (delete-region (point-min) (point-max)) | 376 (delete-region (point-min) (point-max)) |
252 | 377 |
253 ;; And #'unencodable-char-position. | 378 ;; And #'unencodable-char-position. |
254 (insert latin-1-chars-string) | 379 (insert latin-1-chars-string) |
255 (insert (decode-char 'ucs #x20ac)) | 380 (insert (decode-char 'ucs #x20ac)) |
256 (Assert (= 257 (unencodable-char-position (point-min) (point-max) | 381 (Assert |
257 'iso-8859-1))) | 382 (= 257 (unencodable-char-position (point-min) (point-max) |
258 (Assert (equal '(257) (unencodable-char-position (point-min) (point-max) | 383 'iso-8859-1)) |
259 'iso-8859-1 1))) | 384 "check #'unencodable-char-position doesn't think latin-1 encodes U+20AC") |
385 (Assert | |
386 (equal '(257) (unencodable-char-position (point-min) (point-max) | |
387 'iso-8859-1 1)) | |
388 "check #'unencodable-char-position doesn't think latin-1 encodes U+20AC") | |
260 ;; Compatiblity, sigh: | 389 ;; Compatiblity, sigh: |
261 (Assert (equal '(257) (unencodable-char-position (point-min) (point-max) | 390 (Assert |
262 'iso-8859-1 0))) | 391 (equal '(257) (unencodable-char-position (point-min) (point-max) |
392 'iso-8859-1 0)) | |
393 "check #'unencodable-char-position has some borked GNU semantics") | |
263 (dotimes (i 6) (insert (decode-char 'ucs #x20ac))) | 394 (dotimes (i 6) (insert (decode-char 'ucs #x20ac))) |
264 ;; Check if it stops at one: | 395 ;; Check if it stops at one: |
265 (Assert (equal '(257) (unencodable-char-position (point-min) (point-max) | 396 (Assert (equal '(257) (unencodable-char-position (point-min) (point-max) |
266 'iso-8859-1 1))) | 397 'iso-8859-1 1)) |
398 "check #'unencodable-char-position stops at 1 when asked to") | |
267 ;; Check if it stops at four: | 399 ;; Check if it stops at four: |
268 (Assert (equal '(260 259 258 257) | 400 (Assert (equal '(260 259 258 257) |
269 (unencodable-char-position (point-min) (point-max) | 401 (unencodable-char-position (point-min) (point-max) |
270 'iso-8859-1 4))) | 402 'iso-8859-1 4)) |
403 "check #'unencodable-char-position stops at 4 when asked to") | |
271 ;; Check whether it stops at seven: | 404 ;; Check whether it stops at seven: |
272 (Assert (equal '(263 262 261 260 259 258 257) | 405 (Assert (equal '(263 262 261 260 259 258 257) |
273 (unencodable-char-position (point-min) (point-max) | 406 (unencodable-char-position (point-min) (point-max) |
274 'iso-8859-1 7))) | 407 'iso-8859-1 7)) |
408 "check #'unencodable-char-position stops at 7 when asked to") | |
275 ;; Check that it still stops at seven: | 409 ;; Check that it still stops at seven: |
276 (Assert (equal '(263 262 261 260 259 258 257) | 410 (Assert (equal '(263 262 261 260 259 258 257) |
277 (unencodable-char-position (point-min) (point-max) | 411 (unencodable-char-position (point-min) (point-max) |
278 'iso-8859-1 2000))) | 412 'iso-8859-1 2000)) |
413 "check #'unencodable-char-position stops at 7 if 2000 asked for") | |
279 ;; Now, #'check-coding-systems-region. | 414 ;; Now, #'check-coding-systems-region. |
280 ;; UTF-8 should certainly be able to encode these characters: | 415 ;; UTF-8 should certainly be able to encode these characters: |
281 (Assert (eq t (check-coding-systems-region (point-min) (point-max) | 416 (Assert (eq t (check-coding-systems-region (point-min) (point-max) |
282 '(utf-8)))) | 417 '(utf-8))) |
283 (Assert (equal '((iso-8859-1 257 258 259 260 261 262 263) | 418 "check #'check-coding-systems-region gives t if encoding works") |
284 (windows-1252 129 131 132 133 134 135 136 137 138 139 | 419 (Assert |
285 140 141 143 146 147 148 149 150 151 152 | 420 (equal '((iso-8859-1 257 258 259 260 261 262 263) |
286 153 154 155 156 157 159 160)) | 421 (windows-1252 129 130 131 132 133 134 135 136 |
287 (sort | 422 137 138 139 140 141 142 143 144 |
288 (check-coding-systems-region (point-min) (point-max) | 423 145 146 147 148 149 150 151 152 |
289 '(utf-8 iso-8859-1 | 424 153 154 155 156 157 158 159 160)) |
290 windows-1252)) | 425 (sort |
291 ;; (The sort is to make the algorithm irrelevant.) | 426 (check-coding-systems-region (point-min) (point-max) |
292 #'(lambda (left right) | 427 '(utf-8 iso-8859-1 |
293 (string< (car left) (car right)))))) | 428 windows-1252)) |
429 ;; (The sort is to make the algorithm irrelevant.) | |
430 #'(lambda (left right) | |
431 (string< (car left) (car right))))) | |
432 "check #'check-coding-systems-region behaves well given a list") | |
294 ;; Ensure that the indices are all decreased by one when passed a | 433 ;; Ensure that the indices are all decreased by one when passed a |
295 ;; string: | 434 ;; string: |
296 (Assert (equal '((iso-8859-1 256 257 258 259 260 261 262) | 435 (Assert |
297 (windows-1252 128 130 131 132 133 134 135 136 137 138 | 436 (equal '((iso-8859-1 256 257 258 259 260 261 262) |
298 139 140 142 145 146 147 148 149 150 151 | 437 (windows-1252 128 129 130 131 132 133 134 135 |
299 152 153 154 155 156 158 159)) | 438 136 137 138 139 140 141 142 143 |
300 (sort | 439 144 145 146 147 148 149 150 151 |
301 (check-coding-systems-region (buffer-string) nil | 440 152 153 154 155 156 157 158 159)) |
302 '(utf-8 iso-8859-1 | 441 (sort |
303 windows-1252)) | 442 (check-coding-systems-region (buffer-string) nil |
304 #'(lambda (left right) | 443 '(utf-8 iso-8859-1 |
305 (string< (car left) (car right))))))))) | 444 windows-1252)) |
306 | 445 #'(lambda (left right) |
446 (string< (car left) (car right))))) | |
447 "check #'check-coding-systems-region behaves given a string and list")))) | |
448 | |
449 | |
450 |