comparison lisp/unicode.el @ 4690:257b468bf2ca

Move the #'query-coding-region implementation to C.

This is necessary because there is no reasonable way to access the corresponding mswindows-multibyte functionality from Lisp, and we need such functionality if we're going to have a reliable and portable #'query-coding-region implementation. However, this change doesn't yet provide #'query-coding-region for the mswindow-multibyte coding systems, there should be no functional differences between an XEmacs with this change and one without it.

src/ChangeLog addition:

2009-09-19 Aidan Kehoe <kehoea@parhasard.net>

Move the #'query-coding-region implementation to C. This is necessary because there is no reasonable way to access the corresponding mswindows-multibyte functionality from Lisp, and we need such functionality if we're going to have a reliable and portable #'query-coding-region implementation. However, this change doesn't yet provide #'query-coding-region for the mswindow-multibyte coding systems, there should be no functional differences between an XEmacs with this change and one without it.

* mule-coding.c (struct fixed_width_coding_system): Add a new coding system type, fixed_width, and implement it. It uses the CCL infrastructure but has a much simpler creation API, and its own query_method, formerly in lisp/mule/mule-coding.el.
* unicode.c: Move the Unicode query method implementation here from unicode.el.
* lisp.h: Declare Fmake_coding_system_internal, Fcopy_range_table here.
* intl-win32.c (complex_vars_of_intl_win32): Use Fmake_coding_system_internal, not Fmake_coding_system.
* general-slots.h: Add Qsucceeded, Qunencodable, Qinvalid_sequence here.
* file-coding.h (enum coding_system_variant): Add fixed_width_coding_system here.
(struct coding_system_methods): Add query_method and query_lstream_method to the coding system methods. Provide flags for the query methods. Declare the default query method; initialise it correctly in INITIALIZE_CODING_SYSTEM_TYPE.
* file-coding.c (default_query_method): New function, the default query method for coding systems that do not set it. Moved from coding.el.
(make_coding_system_1): Accept new elements in PROPS in #'make-coding-system; aliases, a list of aliases; safe-chars and safe-charsets (these were previously accepted but not saved); and category.
(Fmake_coding_system_internal): New function, what used to be #'make-coding-system--on Mule builds, we've now moved some of the functionality of this to Lisp.
(Fcoding_system_canonical_name_p): Move this earlier in the file, since it's now called from within make_coding_system_1.
(Fquery_coding_region): Move the implementation of this here, from coding.el.
(complex_vars_of_file_coding): Call Fmake_coding_system_internal, not Fmake_coding_system; specify safe-charsets properties when we're a mule build.
* extents.h (mouse_highlight_priority, Fset_extent_priority, Fset_extent_face, Fmap_extents): Make these available to other C files.

lisp/ChangeLog addition:

2009-09-19 Aidan Kehoe <kehoea@parhasard.net>

Move the #'query-coding-region implementation to C.

* coding.el: Consolidate code that depends on the presence or absence of Mule at the end of this file.
(default-query-coding-region, query-coding-region): Move these functions to C.
(default-query-coding-region-safe-charset-skip-chars-map): Remove this variable, the corresponding C variable is Vdefault_query_coding_region_chartab_cache in file-coding.c.
(query-coding-string): Update docstring to reflect actual multiple values, be more careful about not modifying a range table that we're currently mapping over.
(encode-coding-char): Make the implementation of this simpler.
(featurep 'mule): Autoload #'make-coding-system from mule/make-coding-system.el if we're a mule build; provide an appropriate compiler macro. Do various non-mule compatibility things if we're not a mule build.
* update-elc.el (additional-dump-dependencies): Add mule/make-coding-system as a dump time dependency if we're a mule build.
* unicode.el (ccl-encode-to-ucs-2):
(decode-char):
(encode-char): Move these earlier in the file, for the sake of some byte compile warnings.
(unicode-query-coding-region): Move this to unicode.c
* mule/make-coding-system.el: New file, not dumped. Contains the functionality to rework the arguments necessary for fixed-width coding systems, and contains the implementation of #'make-coding-system, which now calls #'make-coding-system-internal.
* mule/vietnamese.el (viscii):
* mule/latin.el (iso-8859-2):
(windows-1250):
(iso-8859-3):
(iso-8859-4):
(iso-8859-14):
(iso-8859-15):
(iso-8859-16):
(iso-8859-9):
(macintosh):
(windows-1252):
* mule/hebrew.el (iso-8859-8):
* mule/greek.el (iso-8859-7):
(windows-1253):
* mule/cyrillic.el (iso-8859-5):
(koi8-r):
(koi8-u):
(windows-1251):
(alternativnyj):
(koi8-ru):
(koi8-t):
(koi8-c):
(koi8-o):
* mule/arabic.el (iso-8859-6):
(windows-1256): Move all these coding systems to being of type fixed-width, not of type CCL. This allows the distinct query-coding-region for them to be in C, something which will eventually allow us to implement query-coding-region for the mswindows-multibyte coding systems.
* mule/general-late.el (posix-charset-to-coding-system-hash): Document why we're pre-emptively persuading the byte compiler that the ELC for this file needs to be written using escape-quoted. Call #'set-unicode-query-skip-chars-args, now the Unicode query-coding-region implementation is in C.
* mule/thai-xtis.el (tis-620): Don't bother checking whether we're XEmacs or not here.
* mule/mule-coding.el: Move the eight bit fixed-width functionality from this file to make-coding-system.el.

tests/ChangeLog addition:

2009-09-19 Aidan Kehoe <kehoea@parhasard.net>

* automated/mule-tests.el: Check a coding system's type, not an 8-bit-fixed property, for whether that coding system should be treated as a fixed-width coding system.
* automated/query-coding-tests.el: Don't test the query coding functionality for mswindows-multibyte coding systems, it's not yet implemented.
author Aidan Kehoe <kehoea@parhasard.net>
date Sat, 19 Sep 2009 22:53:13 +0100
parents 75e7ab37b6c8
children e29fcfd8df5f
4689:0636c6ccb430 4690:257b468bf2ca
162 composite ethiopic indian-1-column indian-2-column jit-ucs-charset-0 162 composite ethiopic indian-1-column indian-2-column jit-ucs-charset-0
163 katakana-jisx0201 lao thai-tis620 thai-xtis tibetan tibetan-1-column 163 katakana-jisx0201 lao thai-tis620 thai-xtis tibetan tibetan-1-column
164 latin-jisx0201 chinese-cns11643-3 chinese-cns11643-4 164 latin-jisx0201 chinese-cns11643-3 chinese-cns11643-4
165 chinese-cns11643-5 chinese-cns11643-6 chinese-cns11643-7))))) 165 chinese-cns11643-5 chinese-cns11643-6 chinese-cns11643-7)))))
166 166
167 (make-coding-system
168 'utf-16 'unicode
169 "UTF-16"
170 '(mnemonic "UTF-16"
171 documentation
172 "UTF-16 Unicode encoding -- the standard (almost-) fixed-width
173 two-byte encoding, with surrogates. It will be fixed-width if all
174 characters are in the BMP (Basic Multilingual Plane -- first 65536
175 codepoints). Cannot represent characters with codepoints above
176 0x10FFFF (a little more than 1,000,000). Unicode and ISO guarantee
177 never to encode any characters outside this range -- all the rest are
178 for private, corporate or internal use."
179 unicode-type utf-16))
180
181 (define-coding-system-alias 'utf-16-be 'utf-16)
182
183 (make-coding-system
184 'utf-16-bom 'unicode
185 "UTF-16 w/BOM"
186 '(mnemonic "UTF16-BOM"
187 documentation
188 "UTF-16 Unicode encoding with byte order mark (BOM) at the beginning.
189 The BOM is Unicode character U+FEFF -- i.e. the first two bytes are
190 0xFE and 0xFF, respectively, or reversed in a little-endian
191 representation. It has been sanctioned by the Unicode Consortium for
192 use at the beginning of a Unicode stream as a marker of the byte order
193 of the stream, and commonly appears in Unicode files under Microsoft
194 Windows, where it also functions as a magic cookie identifying a
195 Unicode file. The character is called \"ZERO WIDTH NO-BREAK SPACE\"
196 and is suitable as a byte-order marker because:
197
198 -- it has no displayable representation
199 -- due to its semantics it never normally appears at the beginning
200 of a stream
201 -- its reverse U+FFFE is not a legal Unicode character
202 -- neither byte sequence is at all likely in any other standard
203 encoding, particularly at the beginning of a stream
204
205 This coding system will insert a BOM at the beginning of a stream when
206 writing and strip it off when reading."
207 unicode-type utf-16
208 need-bom t))
209
210 (make-coding-system
211 'utf-16-little-endian 'unicode
212 "UTF-16 Little Endian"
213 '(mnemonic "UTF16-LE"
214 documentation
215 "Little-endian version of UTF-16 Unicode encoding.
216 See `utf-16' coding system."
217 unicode-type utf-16
218 little-endian t))
219
220 (define-coding-system-alias 'utf-16-le 'utf-16-little-endian)
221
222 (make-coding-system
223 'utf-16-little-endian-bom 'unicode
224 "UTF-16 Little Endian w/BOM"
225 '(mnemonic "MSW-Unicode"
226 documentation
227 "Little-endian version of UTF-16 Unicode encoding, with byte order mark.
228 Standard encoding for representing Unicode under MS Windows. See
229 `utf-16-bom' coding system."
230 unicode-type utf-16
231 little-endian t
232 need-bom t))
233
234 (make-coding-system
235 'ucs-4 'unicode
236 "UCS-4"
237 '(mnemonic "UCS4"
238 documentation
239 "UCS-4 Unicode encoding -- fully fixed-width four-byte encoding."
240 unicode-type ucs-4))
241
242 (make-coding-system
243 'ucs-4-little-endian 'unicode
244 "UCS-4 Little Endian"
245 '(mnemonic "UCS4-LE"
246 documentation
247 ;; #### I don't think this is permitted by ISO 10646, only Unicode.
248 ;; Call it UTF-32 instead?
249 "Little-endian version of UCS-4 Unicode encoding. See `ucs-4' coding system."
250 unicode-type ucs-4
251 little-endian t))
252
253 (make-coding-system
254 'utf-32 'unicode
255 "UTF-32"
256 '(mnemonic "UTF32"
257 documentation
258 "UTF-32 Unicode encoding -- fixed-width four-byte encoding,
259 characters less than #x10FFFF are not supported. "
260 unicode-type utf-32))
261
262 (make-coding-system
263 'utf-32-little-endian 'unicode
264 "UTF-32 Little Endian"
265 '(mnemonic "UTF32-LE"
266 documentation
267 "Little-endian version of UTF-32 Unicode encoding.
268
269 A fixed-width four-byte encoding, characters less than #x10FFFF are not
270 supported. "
271 unicode-type ucs-4 little-endian t))
272
273 (make-coding-system
274 'utf-8 'unicode
275 "UTF-8"
276 '(mnemonic "UTF8"
277 documentation "
278 UTF-8 Unicode encoding -- ASCII-compatible 8-bit variable-width encoding
279 sharing the following principles with the Mule-internal encoding:
280
281 -- All ASCII characters (codepoints 0 through 127) are represented
282 by themselves (i.e. using one byte, with the same value as the
283 ASCII codepoint), and these bytes are disjoint from bytes
284 representing non-ASCII characters.
285
286 This means that any 8-bit clean application can safely process
287 UTF-8-encoded text as it were ASCII, with no corruption (e.g. a
288 '/' byte is always a slash character, never the second byte of
289 some other character, as with Big5, so a pathname encoded in
290 UTF-8 can safely be split up into components and reassembled
291 again using standard ASCII processes).
292
293 -- Leading bytes and non-leading bytes in the encoding of a
294 character are disjoint, so moving backwards is easy.
295
296 -- Given only the leading byte, you know how many following bytes
297 are present.
298 "
299 unicode-type utf-8))
300
301 (make-coding-system
302 'utf-8-bom 'unicode
303 "UTF-8 w/BOM"
304 '(mnemonic "MSW-UTF8"
305 documentation
306 "UTF-8 Unicode encoding, with byte order mark.
307 Standard encoding for representing UTF-8 under MS Windows."
308 unicode-type utf-8
309 little-endian t
310 need-bom t))
311
312 (defun decode-char (quote-ucs code &optional restriction)
313 "FSF compatibility--return Mule character with Unicode codepoint CODE.
314 The second argument must be 'ucs, the third argument is ignored. "
315 ;; We're prepared to accept invalid Unicode in unicode-to-char, but not in
316 ;; this function, which is the API that should actually be used, since
317 ;; it's available in GNU and in Mule-UCS.
318 (check-argument-range code #x0 #x10FFFF)
319 (assert (eq quote-ucs 'ucs) t
320 "Sorry, decode-char doesn't yet support anything but the UCS. ")
321 (unicode-to-char code))
322
323 (defun encode-char (char quote-ucs &optional restriction)
324 "FSF compatibility--return the Unicode code point of CHAR.
325 The second argument must be 'ucs, the third argument is ignored. "
326 (assert (eq quote-ucs 'ucs) t
327 "Sorry, encode-char doesn't yet support anything but the UCS. ")
328 (char-to-unicode char))
329
330 (defconst ccl-encode-to-ucs-2 167 (defconst ccl-encode-to-ucs-2
331 (eval-when-compile 168 (eval-when-compile
332 (let ((pre-existing 169 (let ((pre-existing
333 ;; This is the compiled CCL program from the assert 170 ;; This is the compiled CCL program from the assert
334 ;; below. Since this file is dumped and ccl.el isn't (and 171 ;; below. Since this file is dumped and ccl.el isn't (and
368 205
369 (when (featurep 'mule) 206 (when (featurep 'mule)
370 (put 'ccl-encode-to-ucs-2 'ccl-program-idx 207 (put 'ccl-encode-to-ucs-2 'ccl-program-idx
371 (declare-fboundp 208 (declare-fboundp
372 (register-ccl-program 'ccl-encode-to-ucs-2 ccl-encode-to-ucs-2)))) 209 (register-ccl-program 'ccl-encode-to-ucs-2 ccl-encode-to-ucs-2))))
210
211 (defun decode-char (quote-ucs code &optional restriction)
212 "FSF compatibility--return Mule character with Unicode codepoint CODE.
213 The second argument must be 'ucs, the third argument is ignored. "
214 ;; We're prepared to accept invalid Unicode in unicode-to-char, but not in
215 ;; this function, which is the API that should actually be used, since
216 ;; it's available in GNU and in Mule-UCS.
217 (check-argument-range code #x0 #x10FFFF)
218 (assert (eq quote-ucs 'ucs) t
219 "Sorry, decode-char doesn't yet support anything but the UCS. ")
220 (unicode-to-char code))
221
222 (defun encode-char (char quote-ucs &optional restriction)
223 "FSF compatibility--return the Unicode code point of CHAR.
224 The second argument must be 'ucs, the third argument is ignored. "
225 (assert (eq quote-ucs 'ucs) t
226 "Sorry, encode-char doesn't yet support anything but the UCS. ")
227 (char-to-unicode char))
228
229 (make-coding-system
230 'utf-16 'unicode
231 "UTF-16"
232 '(mnemonic "UTF-16"
233 documentation
234 "UTF-16 Unicode encoding -- the standard (almost-) fixed-width
235 two-byte encoding, with surrogates. It will be fixed-width if all
236 characters are in the BMP (Basic Multilingual Plane -- first 65536
237 codepoints). Cannot represent characters with codepoints above
238 0x10FFFF (a little more than 1,000,000). Unicode and ISO guarantee
239 never to encode any characters outside this range -- all the rest are
240 for private, corporate or internal use."
241 unicode-type utf-16))
242
243 (define-coding-system-alias 'utf-16-be 'utf-16)
244
245 (make-coding-system
246 'utf-16-bom 'unicode
247 "UTF-16 w/BOM"
248 '(mnemonic "UTF16-BOM"
249 documentation
250 "UTF-16 Unicode encoding with byte order mark (BOM) at the beginning.
251 The BOM is Unicode character U+FEFF -- i.e. the first two bytes are
252 0xFE and 0xFF, respectively, or reversed in a little-endian
253 representation. It has been sanctioned by the Unicode Consortium for
254 use at the beginning of a Unicode stream as a marker of the byte order
255 of the stream, and commonly appears in Unicode files under Microsoft
256 Windows, where it also functions as a magic cookie identifying a
257 Unicode file. The character is called \"ZERO WIDTH NO-BREAK SPACE\"
258 and is suitable as a byte-order marker because:
259
260 -- it has no displayable representation
261 -- due to its semantics it never normally appears at the beginning
262 of a stream
263 -- its reverse U+FFFE is not a legal Unicode character
264 -- neither byte sequence is at all likely in any other standard
265 encoding, particularly at the beginning of a stream
266
267 This coding system will insert a BOM at the beginning of a stream when
268 writing and strip it off when reading."
269 unicode-type utf-16
270 need-bom t))
271
272 (make-coding-system
273 'utf-16-little-endian 'unicode
274 "UTF-16 Little Endian"
275 '(mnemonic "UTF16-LE"
276 documentation
277 "Little-endian version of UTF-16 Unicode encoding.
278 See `utf-16' coding system."
279 unicode-type utf-16
280 little-endian t))
281
282 (define-coding-system-alias 'utf-16-le 'utf-16-little-endian)
283
284 (make-coding-system
285 'utf-16-little-endian-bom 'unicode
286 "UTF-16 Little Endian w/BOM"
287 '(mnemonic "MSW-Unicode"
288 documentation
289 "Little-endian version of UTF-16 Unicode encoding, with byte order mark.
290 Standard encoding for representing Unicode under MS Windows. See
291 `utf-16-bom' coding system."
292 unicode-type utf-16
293 little-endian t
294 need-bom t))
295
296 (make-coding-system
297 'ucs-4 'unicode
298 "UCS-4"
299 '(mnemonic "UCS4"
300 documentation
301 "UCS-4 Unicode encoding -- fully fixed-width four-byte encoding."
302 unicode-type ucs-4))
303
304 (make-coding-system
305 'ucs-4-little-endian 'unicode
306 "UCS-4 Little Endian"
307 '(mnemonic "UCS4-LE"
308 documentation
309 ;; #### I don't think this is permitted by ISO 10646, only Unicode.
310 ;; Call it UTF-32 instead?
311 "Little-endian version of UCS-4 Unicode encoding. See `ucs-4' coding system."
312 unicode-type ucs-4
313 little-endian t))
314
315 (make-coding-system
316 'utf-32 'unicode
317 "UTF-32"
318 '(mnemonic "UTF32"
319 documentation
320 "UTF-32 Unicode encoding -- fixed-width four-byte encoding,
321 characters less than #x10FFFF are not supported. "
322 unicode-type utf-32))
323
324 (make-coding-system
325 'utf-32-little-endian 'unicode
326 "UTF-32 Little Endian"
327 '(mnemonic "UTF32-LE"
328 documentation
329 "Little-endian version of UTF-32 Unicode encoding.
330
331 A fixed-width four-byte encoding, characters less than #x10FFFF are not
332 supported. "
333 unicode-type ucs-4 little-endian t))
334
335 (make-coding-system
336 'utf-8 'unicode
337 "UTF-8"
338 '(mnemonic "UTF8"
339 documentation "
340 UTF-8 Unicode encoding -- ASCII-compatible 8-bit variable-width encoding
341 sharing the following principles with the Mule-internal encoding:
342
343 -- All ASCII characters (codepoints 0 through 127) are represented
344 by themselves (i.e. using one byte, with the same value as the
345 ASCII codepoint), and these bytes are disjoint from bytes
346 representing non-ASCII characters.
347
348 This means that any 8-bit clean application can safely process
349 UTF-8-encoded text as it were ASCII, with no corruption (e.g. a
350 '/' byte is always a slash character, never the second byte of
351 some other character, as with Big5, so a pathname encoded in
352 UTF-8 can safely be split up into components and reassembled
353 again using standard ASCII processes).
354
355 -- Leading bytes and non-leading bytes in the encoding of a
356 character are disjoint, so moving backwards is easy.
357
358 -- Given only the leading byte, you know how many following bytes
359 are present.
360 "
361 unicode-type utf-8))
362
363 (make-coding-system
364 'utf-8-bom 'unicode
365 "UTF-8 w/BOM"
366 '(mnemonic "MSW-UTF8"
367 documentation
368 "UTF-8 Unicode encoding, with byte order mark.
369 Standard encoding for representing UTF-8 under MS Windows."
370 unicode-type utf-8
371 little-endian t
372 need-bom t))
373 373
374 ;; Now, create jit-ucs-charset-0 entries for those characters in Windows 374 ;; Now, create jit-ucs-charset-0 entries for those characters in Windows
375 ;; Glyph List 4 that would otherwise end up in East Asian character sets. 375 ;; Glyph List 4 that would otherwise end up in East Asian character sets.
376 ;; 376 ;;
377 ;; WGL4 is a character repertoire from Microsoft that gives a guideline 377 ;; WGL4 is a character repertoire from Microsoft that gives a guideline
611 begin end buffer)) 611 begin end buffer))
612 612
613 ;; Sure would be nice to be able to use defface here. 613 ;; Sure would be nice to be able to use defface here.
614 (copy-face 'highlight 'unicode-invalid-sequence-warning-face) 614 (copy-face 'highlight 'unicode-invalid-sequence-warning-face)
615 615
616 (defvar unicode-query-coding-skip-chars-arg nil ;; Set in general-late.el
617 "Used by `unicode-query-coding-region' to skip chars with known mappings.")
618
619 (defun unicode-query-coding-region (begin end coding-system
620 &optional buffer ignore-invalid-sequencesp
621 errorp highlightp)
622 "The `query-coding-region' implementation for Unicode coding systems.
623
624 Supports IGNORE-INVALID-SEQUENCESP, that is, XEmacs characters that reflect
625 invalid octets on disk will be treated as encodable if this argument is
626 specified, and as not encodable if it is not specified."
627
628 ;; Potential problem here; the octets that correspond to octets from #x00
629 ;; to #x7f on disk will be treated by utf-8 and utf-7 as invalid
630 ;; sequences, and thus, in theory, encodable.
631
632 (check-argument-type #'coding-system-p
633 (setq coding-system (find-coding-system coding-system)))
634 (check-argument-type #'integer-or-marker-p begin)
635 (check-argument-type #'integer-or-marker-p end)
636 (let* ((skip-chars-arg (concat unicode-query-coding-skip-chars-arg
637 (if ignore-invalid-sequencesp
638 unicode-invalid-sequence-regexp-range
639 "")))
640 (ranges (make-range-table))
641 (looking-at-arg (concat "[" skip-chars-arg "]"))
642 (case-fold-search nil)
643 (invalid-sequence-lower-unicode-bound
644 (char-to-unicode
645 (aref (decode-coding-string "\xd8\x00\x00\x00"
646 'utf-16-be) 3)))
647 (invalid-sequence-upper-unicode-bound
648 (char-to-unicode
649 (aref (decode-coding-string "\xd8\x00\x00\xFF"
650 'utf-16-be) 3)))
651 fail-range-start fail-range-end char-after failed
652 extent char-unicode failed-reason previous-failed-reason)
653 (save-excursion
654 (when highlightp
655 (query-coding-clear-highlights begin end buffer))
656 (goto-char begin buffer)
657 (skip-chars-forward skip-chars-arg end buffer)
658 (while (< (point buffer) end)
659 (setq char-after (char-after (point buffer) buffer)
660 fail-range-start (point buffer))
661 (while (and
662 (< (point buffer) end)
663 (not (looking-at looking-at-arg))
664 (or (and
665 (= -1 (setq char-unicode (char-to-unicode char-after)))
666 (setq failed-reason 'unencodable))
667 (and (not ignore-invalid-sequencesp)
668 ;; The default case, with ignore-invalid-sequencesp
669 ;; not specified:
670 ;; If the character is in the Unicode range that
671 ;; corresponds to an invalid octet, we want to
672 ;; treat it as unencodable.
673 (<= invalid-sequence-lower-unicode-bound
674 char-unicode)
675 (<= char-unicode
676 invalid-sequence-upper-unicode-bound)
677 (setq failed-reason 'invalid-sequence)))
678 (or (null previous-failed-reason)
679 (eq previous-failed-reason failed-reason)))
680 (forward-char 1 buffer)
681 (setq char-after (char-after (point buffer) buffer)
682 failed t
683 previous-failed-reason failed-reason))
684 (if (= fail-range-start (point buffer))
685 ;; The character can actually be encoded by the coding
686 ;; system; check the characters past it.
687 (forward-char 1 buffer)
688 ;; Can't be encoded; note this.
689 (when errorp
690 (error 'text-conversion-error
691 (format "Cannot encode %s using coding system"
692 (buffer-substring fail-range-start (point buffer)
693 buffer))
694 (coding-system-name coding-system)))
695 (assert
696 (not (null previous-failed-reason)) t
697 "If we've got here, previous-failed-reason should be non-nil.")
698 (put-range-table fail-range-start
699 ;; If char-after is non-nil, we're not at
700 ;; the end of the buffer.
701 (setq fail-range-end (if char-after
702 (point buffer)
703 (point-max buffer)))
704 previous-failed-reason ranges)
705 (setq previous-failed-reason nil)
706 (when highlightp
707 (setq extent (make-extent fail-range-start fail-range-end buffer))
708 (set-extent-priority extent (+ mouse-highlight-priority 2))
709 (set-extent-face extent 'query-coding-warning-face)))
710 (skip-chars-forward skip-chars-arg end buffer))
711 (if failed
712 (values nil ranges)
713 (values t nil)))))
714
715 (loop
716 for coding-system in (coding-system-list)
717 initially (unless (featurep 'mule) (return))
718 do (when (eq 'unicode (coding-system-type coding-system))
719 (coding-system-put coding-system 'query-coding-function
720 #'unicode-query-coding-region)))
721
722 (unless (featurep 'mule) 616 (unless (featurep 'mule)
723 ;; We do this in such a roundabout way--instead of having the above defun 617 ;; We do this in such a roundabout way--instead of having the above defun
724 ;; and defvar calls inside a (when (featurep 'mule) ...) form--to have 618 ;; and defvar calls inside a (when (featurep 'mule) ...) form--to have
725 ;; make-docfile.c pick up symbol and function documentation correctly. An 619 ;; make-docfile.c pick up symbol and function documentation correctly. An
726 ;; alternative approach would be to fix make-docfile.c to be able to read 620 ;; alternative approach would be to fix make-docfile.c to be able to read