comparison src/mule-coding.c @ 4690:257b468bf2ca

Move the #'query-coding-region implementation to C. This is necessary because there is no reasonable way to access the corresponding mswindows-multibyte functionality from Lisp, and we need such functionality if we're going to have a reliable and portable #'query-coding-region implementation. However, this change doesn't yet provide #'query-coding-region for the mswindow-multibyte coding systems, there should be no functional differences between an XEmacs with this change and one without it. src/ChangeLog addition: 2009-09-19 Aidan Kehoe <kehoea@parhasard.net> Move the #'query-coding-region implementation to C. This is necessary because there is no reasonable way to access the corresponding mswindows-multibyte functionality from Lisp, and we need such functionality if we're going to have a reliable and portable #'query-coding-region implementation. However, this change doesn't yet provide #'query-coding-region for the mswindow-multibyte coding systems, there should be no functional differences between an XEmacs with this change and one without it. * mule-coding.c (struct fixed_width_coding_system): Add a new coding system type, fixed_width, and implement it. It uses the CCL infrastructure but has a much simpler creation API, and its own query_method, formerly in lisp/mule/mule-coding.el. * unicode.c: Move the Unicode query method implementation here from unicode.el. * lisp.h: Declare Fmake_coding_system_internal, Fcopy_range_table here. * intl-win32.c (complex_vars_of_intl_win32): Use Fmake_coding_system_internal, not Fmake_coding_system. * general-slots.h: Add Qsucceeded, Qunencodable, Qinvalid_sequence here. * file-coding.h (enum coding_system_variant): Add fixed_width_coding_system here. (struct coding_system_methods): Add query_method and query_lstream_method to the coding system methods. Provide flags for the query methods. Declare the default query method; initialise it correctly in INITIALIZE_CODING_SYSTEM_TYPE. * file-coding.c (default_query_method): New function, the default query method for coding systems that do not set it. Moved from coding.el. (make_coding_system_1): Accept new elements in PROPS in #'make-coding-system; aliases, a list of aliases; safe-chars and safe-charsets (these were previously accepted but not saved); and category. (Fmake_coding_system_internal): New function, what used to be #'make-coding-system--on Mule builds, we've now moved some of the functionality of this to Lisp. (Fcoding_system_canonical_name_p): Move this earlier in the file, since it's now called from within make_coding_system_1. (Fquery_coding_region): Move the implementation of this here, from coding.el. (complex_vars_of_file_coding): Call Fmake_coding_system_internal, not Fmake_coding_system; specify safe-charsets properties when we're a mule build. * extents.h (mouse_highlight_priority, Fset_extent_priority, Fset_extent_face, Fmap_extents): Make these available to other C files. lisp/ChangeLog addition: 2009-09-19 Aidan Kehoe <kehoea@parhasard.net> Move the #'query-coding-region implementation to C. * coding.el: Consolidate code that depends on the presence or absence of Mule at the end of this file. (default-query-coding-region, query-coding-region): Move these functions to C. (default-query-coding-region-safe-charset-skip-chars-map): Remove this variable, the corresponding C variable is Vdefault_query_coding_region_chartab_cache in file-coding.c. (query-coding-string): Update docstring to reflect actual multiple values, be more careful about not modifying a range table that we're currently mapping over. (encode-coding-char): Make the implementation of this simpler. (featurep 'mule): Autoload #'make-coding-system from mule/make-coding-system.el if we're a mule build; provide an appropriate compiler macro. Do various non-mule compatibility things if we're not a mule build. * update-elc.el (additional-dump-dependencies): Add mule/make-coding-system as a dump time dependency if we're a mule build. * unicode.el (ccl-encode-to-ucs-2): (decode-char): (encode-char): Move these earlier in the file, for the sake of some byte compile warnings. (unicode-query-coding-region): Move this to unicode.c * mule/make-coding-system.el: New file, not dumped. Contains the functionality to rework the arguments necessary for fixed-width coding systems, and contains the implementation of #'make-coding-system, which now calls #'make-coding-system-internal. * mule/vietnamese.el (viscii): * mule/latin.el (iso-8859-2): (windows-1250): (iso-8859-3): (iso-8859-4): (iso-8859-14): (iso-8859-15): (iso-8859-16): (iso-8859-9): (macintosh): (windows-1252): * mule/hebrew.el (iso-8859-8): * mule/greek.el (iso-8859-7): (windows-1253): * mule/cyrillic.el (iso-8859-5): (koi8-r): (koi8-u): (windows-1251): (alternativnyj): (koi8-ru): (koi8-t): (koi8-c): (koi8-o): * mule/arabic.el (iso-8859-6): (windows-1256): Move all these coding systems to being of type fixed-width, not of type CCL. This allows the distinct query-coding-region for them to be in C, something which will eventually allow us to implement query-coding-region for the mswindows-multibyte coding systems. * mule/general-late.el (posix-charset-to-coding-system-hash): Document why we're pre-emptively persuading the byte compiler that the ELC for this file needs to be written using escape-quoted. Call #'set-unicode-query-skip-chars-args, now the Unicode query-coding-region implementation is in C. * mule/thai-xtis.el (tis-620): Don't bother checking whether we're XEmacs or not here. * mule/mule-coding.el: Move the eight bit fixed-width functionality from this file to make-coding-system.el. tests/ChangeLog addition: 2009-09-19 Aidan Kehoe <kehoea@parhasard.net> * automated/mule-tests.el: Check a coding system's type, not an 8-bit-fixed property, for whether that coding system should be treated as a fixed-width coding system. * automated/query-coding-tests.el: Don't test the query coding functionality for mswindows-multibyte coding systems, it's not yet implemented.
author Aidan Kehoe <kehoea@parhasard.net>
date Sat, 19 Sep 2009 22:53:13 +0100
parents 726060ee587c
children 3ba90c659d01
comparison
equal deleted inserted replaced
4689:0636c6ccb430 4690:257b468bf2ca
34 #include "lisp.h" 34 #include "lisp.h"
35 35
36 #include "charset.h" 36 #include "charset.h"
37 #include "mule-ccl.h" 37 #include "mule-ccl.h"
38 #include "file-coding.h" 38 #include "file-coding.h"
39 #include "elhash.h"
40 #include "rangetab.h"
41 #include "buffer.h"
42 #include "extents.h"
39 43
40 Lisp_Object Qshift_jis, Qiso2022, Qbig5, Qccl; 44 Lisp_Object Qshift_jis, Qiso2022, Qbig5, Qccl;
41 45
42 Lisp_Object Qcharset_g0, Qcharset_g1, Qcharset_g2, Qcharset_g3; 46 Lisp_Object Qcharset_g0, Qcharset_g1, Qcharset_g2, Qcharset_g3;
43 Lisp_Object Qforce_g0_on_output, Qforce_g1_on_output; 47 Lisp_Object Qforce_g0_on_output, Qforce_g1_on_output;
45 Lisp_Object Qno_iso6429; 49 Lisp_Object Qno_iso6429;
46 Lisp_Object Qinput_charset_conversion, Qoutput_charset_conversion; 50 Lisp_Object Qinput_charset_conversion, Qoutput_charset_conversion;
47 Lisp_Object Qshort, Qno_ascii_eol, Qno_ascii_cntl, Qseven, Qlock_shift; 51 Lisp_Object Qshort, Qno_ascii_eol, Qno_ascii_cntl, Qseven, Qlock_shift;
48 52
49 Lisp_Object Qiso_7, Qiso_8_designate, Qiso_8_1, Qiso_8_2, Qiso_lock_shift; 53 Lisp_Object Qiso_7, Qiso_8_designate, Qiso_8_1, Qiso_8_2, Qiso_lock_shift;
54
55 Lisp_Object Qfrom_unicode, Qquery_skip_chars, Qinvalid_sequences_skip_chars;
56 Lisp_Object Qfixed_width;
50 57
51 58
52 /************************************************************************/ 59 /************************************************************************/
53 /* Shift-JIS methods */ 60 /* Shift-JIS methods */
54 /************************************************************************/ 61 /************************************************************************/
3386 else if (EQ (prop, Qencode)) 3393 else if (EQ (prop, Qencode))
3387 return XCODING_SYSTEM_CCL_ENCODE (coding_system); 3394 return XCODING_SYSTEM_CCL_ENCODE (coding_system);
3388 else 3395 else
3389 return Qunbound; 3396 return Qunbound;
3390 } 3397 }
3398
3399 /************************************************************************/
3400 /* FIXED_WIDTH methods */
3401 /************************************************************************/
3402
3403 struct fixed_width_coding_system
3404 {
3405 /* For a fixed_width coding system, these specify the CCL programs
3406 used for decoding (input) and encoding (output). */
3407 Lisp_Object decode;
3408 Lisp_Object encode;
3409 Lisp_Object from_unicode;
3410 Lisp_Object invalid_sequences_skip_chars;
3411 Lisp_Object query_skip_chars;
3412
3413 /* This is not directly accessible from Lisp; it is a concatenation of the
3414 previous two strings, used for simplicity of implementation. */
3415 Lisp_Object invalid_and_query_skip_chars;
3416 };
3417
3418 #define CODING_SYSTEM_FIXED_WIDTH_DECODE(codesys) \
3419 (CODING_SYSTEM_TYPE_DATA (codesys, fixed_width)->decode)
3420 #define CODING_SYSTEM_FIXED_WIDTH_ENCODE(codesys) \
3421 (CODING_SYSTEM_TYPE_DATA (codesys, fixed_width)->encode)
3422 #define CODING_SYSTEM_FIXED_WIDTH_FROM_UNICODE(codesys) \
3423 (CODING_SYSTEM_TYPE_DATA (codesys, fixed_width)->from_unicode)
3424 #define CODING_SYSTEM_FIXED_WIDTH_INVALID_SEQUENCES_SKIP_CHARS(codesys) \
3425 (CODING_SYSTEM_TYPE_DATA (codesys, \
3426 fixed_width)->invalid_sequences_skip_chars)
3427 #define CODING_SYSTEM_FIXED_WIDTH_QUERY_SKIP_CHARS(codesys) \
3428 (CODING_SYSTEM_TYPE_DATA (codesys, fixed_width)->query_skip_chars)
3429 #define CODING_SYSTEM_FIXED_WIDTH_INVALID_AND_QUERY_SKIP_CHARS(codesys) \
3430 (CODING_SYSTEM_TYPE_DATA (codesys, \
3431 fixed_width)->invalid_and_query_skip_chars)
3432
3433 #define XCODING_SYSTEM_FIXED_WIDTH_DECODE(codesys) \
3434 CODING_SYSTEM_FIXED_WIDTH_DECODE (XCODING_SYSTEM (codesys))
3435 #define XCODING_SYSTEM_FIXED_WIDTH_ENCODE(codesys) \
3436 CODING_SYSTEM_FIXED_WIDTH_ENCODE (XCODING_SYSTEM (codesys))
3437 #define XCODING_SYSTEM_FIXED_WIDTH_FROM_UNICODE(codesys) \
3438 (CODING_SYSTEM_FIXED_WIDTH_FROM_UNICODE (XCODING_SYSTEM (codesys)))
3439 #define XCODING_SYSTEM_FIXED_WIDTH_INVALID_SEQUENCES_SKIP_CHARS(codesys) \
3440 (CODING_SYSTEM_FIXED_WIDTH_INVALID_SEQUENCES_SKIP_CHARS \
3441 (XCODING_SYSTEM (codesys)))
3442 #define XCODING_SYSTEM_FIXED_WIDTH_QUERY_SKIP_CHARS(codesys) \
3443 (CODING_SYSTEM_FIXED_WIDTH_QUERY_SKIP_CHARS (XCODING_SYSTEM (codesys)))
3444 #define XCODING_SYSTEM_FIXED_WIDTH_INVALID_AND_QUERY_SKIP_CHARS(codesys) \
3445 (CODING_SYSTEM_FIXED_WIDTH_INVALID_AND_QUERY_SKIP_CHARS \
3446 (XCODING_SYSTEM(codesys)))
3447
3448 struct fixed_width_coding_stream
3449 {
3450 /* state of the running CCL program */
3451 struct ccl_program ccl;
3452 };
3453
3454 static const struct memory_description
3455 fixed_width_coding_system_description[] = {
3456 { XD_LISP_OBJECT, offsetof (struct fixed_width_coding_system, decode) },
3457 { XD_LISP_OBJECT, offsetof (struct fixed_width_coding_system, encode) },
3458 { XD_LISP_OBJECT, offsetof (struct fixed_width_coding_system,
3459 from_unicode) },
3460 { XD_LISP_OBJECT, offsetof (struct fixed_width_coding_system,
3461 invalid_sequences_skip_chars) },
3462 { XD_LISP_OBJECT, offsetof (struct fixed_width_coding_system,
3463 query_skip_chars) },
3464 { XD_LISP_OBJECT, offsetof (struct fixed_width_coding_system,
3465 invalid_and_query_skip_chars) },
3466 { XD_END }
3467 };
3468
3469 DEFINE_CODING_SYSTEM_TYPE_WITH_DATA (fixed_width);
3470
3471 static void
3472 fixed_width_mark (Lisp_Object codesys)
3473 {
3474 mark_object (XCODING_SYSTEM_FIXED_WIDTH_DECODE (codesys));
3475 mark_object (XCODING_SYSTEM_FIXED_WIDTH_ENCODE (codesys));
3476 mark_object (XCODING_SYSTEM_FIXED_WIDTH_FROM_UNICODE (codesys));
3477 mark_object
3478 (XCODING_SYSTEM_FIXED_WIDTH_INVALID_SEQUENCES_SKIP_CHARS (codesys));
3479 mark_object (XCODING_SYSTEM_FIXED_WIDTH_QUERY_SKIP_CHARS (codesys) );
3480 mark_object
3481 (XCODING_SYSTEM_FIXED_WIDTH_INVALID_AND_QUERY_SKIP_CHARS(codesys));
3482 }
3483
3484 static Bytecount
3485 fixed_width_convert (struct coding_stream *str, const UExtbyte *src,
3486 unsigned_char_dynarr *dst, Bytecount n)
3487 {
3488 struct fixed_width_coding_stream *data =
3489 CODING_STREAM_TYPE_DATA (str, fixed_width);
3490 Bytecount orign = n;
3491
3492 data->ccl.last_block = str->eof;
3493 /* When applying a CCL program to a stream, SRC must not be NULL -- this
3494 is a special signal to the driver that read and write operations are
3495 not allowed. The code does not actually look at what SRC points to if
3496 N == 0. */
3497 ccl_driver (&data->ccl, src ? src : (const unsigned char *) "",
3498 dst, n, 0,
3499 str->direction == CODING_DECODE ? CCL_MODE_DECODING :
3500 CCL_MODE_ENCODING);
3501 return orign;
3502 }
3503
3504 static void
3505 fixed_width_init_coding_stream (struct coding_stream *str)
3506 {
3507 struct fixed_width_coding_stream *data =
3508 CODING_STREAM_TYPE_DATA (str, fixed_width);
3509
3510 setup_ccl_program (&data->ccl,
3511 str->direction == CODING_DECODE ?
3512 XCODING_SYSTEM_FIXED_WIDTH_DECODE (str->codesys) :
3513 XCODING_SYSTEM_FIXED_WIDTH_ENCODE (str->codesys));
3514 }
3515
3516 static void
3517 fixed_width_rewind_coding_stream (struct coding_stream *str)
3518 {
3519 fixed_width_init_coding_stream (str);
3520 }
3521
3522 static void
3523 fixed_width_init (Lisp_Object codesys)
3524 {
3525 XCODING_SYSTEM_FIXED_WIDTH_DECODE (codesys) = Qnil;
3526 XCODING_SYSTEM_FIXED_WIDTH_ENCODE (codesys) = Qnil;
3527 XCODING_SYSTEM_FIXED_WIDTH_FROM_UNICODE (codesys) = Qnil;
3528 XCODING_SYSTEM_FIXED_WIDTH_INVALID_SEQUENCES_SKIP_CHARS (codesys) = Qnil;
3529 XCODING_SYSTEM_FIXED_WIDTH_QUERY_SKIP_CHARS (codesys) = Qnil;
3530 XCODING_SYSTEM_FIXED_WIDTH_INVALID_AND_QUERY_SKIP_CHARS(codesys) = Qnil;
3531 }
3532
3533 static int
3534 fixed_width_putprop (Lisp_Object codesys, Lisp_Object key,
3535 Lisp_Object value)
3536 {
3537 struct ccl_program test_ccl;
3538
3539 if (EQ (key, Qdecode) || EQ (key, Qencode))
3540 {
3541 /* Check if the CCL infrastructure thinks this is a sane CCL
3542 program: */
3543 if (setup_ccl_program (&test_ccl, value) < 0)
3544 {
3545 invalid_argument ("Invalid CCL program", value);
3546 }
3547
3548 if (EQ (key, Qdecode))
3549 {
3550 XCODING_SYSTEM_FIXED_WIDTH_DECODE (codesys) = value;
3551 }
3552 else
3553 {
3554 XCODING_SYSTEM_FIXED_WIDTH_ENCODE (codesys) = value;
3555 }
3556 }
3557 else if (EQ (key, Qfrom_unicode))
3558 {
3559 CHECK_HASH_TABLE (value);
3560 XCODING_SYSTEM_FIXED_WIDTH_FROM_UNICODE (codesys) = value;
3561 }
3562 else if (EQ (key, Qinvalid_sequences_skip_chars))
3563 {
3564 CHECK_STRING (value);
3565
3566 /* Make sure Lisp can't make our data inconsistent: */
3567 value = Fcopy_sequence (value);
3568
3569 XCODING_SYSTEM_FIXED_WIDTH_INVALID_SEQUENCES_SKIP_CHARS (codesys)
3570 = value;
3571
3572 XCODING_SYSTEM_FIXED_WIDTH_INVALID_AND_QUERY_SKIP_CHARS (codesys)
3573 = concat2 (value,
3574 XCODING_SYSTEM_FIXED_WIDTH_QUERY_SKIP_CHARS (codesys));
3575 }
3576 else if (EQ (key, Qquery_skip_chars))
3577 {
3578 CHECK_STRING (value);
3579
3580 /* Make sure Lisp can't make our data inconsistent: */
3581 value = Fcopy_sequence (value);
3582
3583 XCODING_SYSTEM_FIXED_WIDTH_QUERY_SKIP_CHARS (codesys) = value;
3584
3585 XCODING_SYSTEM_FIXED_WIDTH_INVALID_AND_QUERY_SKIP_CHARS (codesys)
3586 = concat2 (value,
3587 XCODING_SYSTEM_FIXED_WIDTH_INVALID_SEQUENCES_SKIP_CHARS
3588 (codesys));
3589 }
3590 else
3591 {
3592 return 0;
3593 }
3594
3595 return 1;
3596 }
3597
3598 static Lisp_Object
3599 fixed_width_getprop (Lisp_Object codesys, Lisp_Object prop)
3600 {
3601 if (EQ (prop, Qdecode))
3602 {
3603 return XCODING_SYSTEM_FIXED_WIDTH_DECODE (codesys);
3604 }
3605 else if (EQ (prop, Qencode))
3606 {
3607 return XCODING_SYSTEM_FIXED_WIDTH_ENCODE (codesys);
3608 }
3609 else if (EQ (prop, Qfrom_unicode))
3610 {
3611 return XCODING_SYSTEM_FIXED_WIDTH_FROM_UNICODE (codesys);
3612 }
3613 else if (EQ (prop, Qinvalid_sequences_skip_chars))
3614 {
3615 /* Make sure Lisp can't make our data inconsistent: */
3616 return
3617 Fcopy_sequence
3618 (XCODING_SYSTEM_FIXED_WIDTH_INVALID_SEQUENCES_SKIP_CHARS (codesys));
3619 }
3620 else if (EQ (prop, Qquery_skip_chars))
3621 {
3622 return
3623 Fcopy_sequence (XCODING_SYSTEM_FIXED_WIDTH_QUERY_SKIP_CHARS (codesys));
3624 }
3625
3626 return Qunbound;
3627 }
3628
3629 static Lisp_Object Vfixed_width_query_ranges_cache;
3630
3631 static Lisp_Object
3632 fixed_width_skip_chars_data_given_strings (Lisp_Object string,
3633 Lisp_Object query_skip_chars,
3634 Lisp_Object
3635 invalid_sequences_skip_chars,
3636 Binbyte *fastmap,
3637 int fastmap_len)
3638 {
3639 Lisp_Object result = Fgethash (string,
3640 Vfixed_width_query_ranges_cache,
3641 Qnil);
3642 REGISTER Ibyte *p, *pend;
3643 REGISTER Ichar c;
3644
3645 memset (fastmap, query_coding_unencodable, fastmap_len);
3646
3647 if (!NILP (result))
3648 {
3649 int i;
3650 Lisp_Object ranged;
3651 assert (RANGE_TABLEP (result));
3652 for (i = 0; i < fastmap_len; ++i)
3653 {
3654 ranged = Fget_range_table (make_int (i), result, Qnil);
3655
3656 if (EQ (ranged, Qsucceeded))
3657 {
3658 fastmap [i] = query_coding_succeeded;
3659 }
3660 else if (EQ (ranged, Qinvalid_sequence))
3661 {
3662 fastmap [i] = query_coding_invalid_sequence;
3663 }
3664 }
3665 return result;
3666 }
3667
3668 result = Fmake_range_table (Qstart_closed_end_closed);
3669
3670 p = XSTRING_DATA (query_skip_chars);
3671 pend = p + XSTRING_LENGTH (query_skip_chars);
3672
3673 while (p != pend)
3674 {
3675 c = itext_ichar (p);
3676
3677 INC_IBYTEPTR (p);
3678
3679 if (c == '\\')
3680 {
3681 if (p == pend) break;
3682 c = itext_ichar (p);
3683 INC_IBYTEPTR (p);
3684 }
3685
3686 if (p != pend && *p == '-')
3687 {
3688 Ichar cend;
3689
3690 /* Skip over the dash. */
3691 p++;
3692 if (p == pend) break;
3693 cend = itext_ichar (p);
3694
3695 Fput_range_table (make_int (c), make_int (cend), Qsucceeded,
3696 result);
3697
3698 while (c <= cend && c < fastmap_len)
3699 {
3700 fastmap[c] = query_coding_succeeded;
3701 c++;
3702 }
3703
3704 INC_IBYTEPTR (p);
3705 }
3706 else
3707 {
3708 if (c < fastmap_len)
3709 fastmap[c] = query_coding_succeeded;
3710
3711 Fput_range_table (make_int (c), make_int (c), Qsucceeded, result);
3712 }
3713 }
3714
3715
3716 p = XSTRING_DATA (invalid_sequences_skip_chars);
3717 pend = p + XSTRING_LENGTH (invalid_sequences_skip_chars);
3718
3719 while (p != pend)
3720 {
3721 c = itext_ichar (p);
3722
3723 INC_IBYTEPTR (p);
3724
3725 if (c == '\\')
3726 {
3727 if (p == pend) break;
3728 c = itext_ichar (p);
3729 INC_IBYTEPTR (p);
3730 }
3731
3732 if (p != pend && *p == '-')
3733 {
3734 Ichar cend;
3735
3736 /* Skip over the dash. */
3737 p++;
3738 if (p == pend) break;
3739 cend = itext_ichar (p);
3740
3741 Fput_range_table (make_int (c), make_int (cend), Qinvalid_sequence,
3742 result);
3743
3744 while (c <= cend && c < fastmap_len)
3745 {
3746 fastmap[c] = query_coding_invalid_sequence;
3747 c++;
3748 }
3749
3750 INC_IBYTEPTR (p);
3751 }
3752 else
3753 {
3754 if (c < fastmap_len)
3755 fastmap[c] = query_coding_invalid_sequence;
3756
3757 Fput_range_table (make_int (c), make_int (c), Qinvalid_sequence,
3758 result);
3759 }
3760 }
3761
3762 Fputhash (string, result, Vfixed_width_query_ranges_cache);
3763
3764 return result;
3765 }
3766
3767 static Lisp_Object
3768 fixed_width_query (Lisp_Object codesys, struct buffer *buf,
3769 Charbpos end, int flags)
3770 {
3771 Charbpos pos = BUF_PT (buf), fail_range_start, fail_range_end;
3772 Charbpos pos_byte = BYTE_BUF_PT (buf);
3773 Lisp_Object skip_chars_range_table, from_unicode, checked_unicode,
3774 result = Qnil;
3775 enum query_coding_failure_reasons failed_reason,
3776 previous_failed_reason = query_coding_succeeded;
3777 Binbyte fastmap[0xff];
3778
3779 from_unicode = XCODING_SYSTEM_FIXED_WIDTH_FROM_UNICODE (codesys);
3780
3781 skip_chars_range_table =
3782 fixed_width_skip_chars_data_given_strings
3783 ((flags & QUERY_METHOD_IGNORE_INVALID_SEQUENCES ?
3784 XCODING_SYSTEM_FIXED_WIDTH_INVALID_AND_QUERY_SKIP_CHARS
3785 (codesys) :
3786 XCODING_SYSTEM_FIXED_WIDTH_QUERY_SKIP_CHARS(codesys)),
3787 XCODING_SYSTEM_FIXED_WIDTH_QUERY_SKIP_CHARS(codesys),
3788 (flags & QUERY_METHOD_IGNORE_INVALID_SEQUENCES ?
3789 build_string("") :
3790 XCODING_SYSTEM_FIXED_WIDTH_INVALID_SEQUENCES_SKIP_CHARS (codesys)),
3791 fastmap, (int)(sizeof (fastmap)));
3792
3793 if (flags & QUERY_METHOD_HIGHLIGHT &&
3794 /* If we're being called really early, live without highlights getting
3795 cleared properly: */
3796 !(UNBOUNDP (XSYMBOL (Qquery_coding_clear_highlights)->function)))
3797 {
3798 /* It's okay to call Lisp here, the only non-stack object we may have
3799 allocated up to this point is skip_chars_range_table, and that's
3800 reachable from its entry in Vfixed_width_query_ranges_cache. */
3801 call3 (Qquery_coding_clear_highlights, make_int (pos), make_int (end),
3802 wrap_buffer (buf));
3803 }
3804
3805 while (pos < end)
3806 {
3807 Ichar ch = BYTE_BUF_FETCH_CHAR (buf, pos_byte);
3808 if ((ch < (int) (sizeof(fastmap))) ?
3809 (fastmap[ch] == query_coding_succeeded) :
3810 (EQ (Qsucceeded, Fget_range_table (make_int (ch),
3811 skip_chars_range_table, Qnil))))
3812 {
3813 pos++;
3814 INC_BYTEBPOS (buf, pos_byte);
3815 }
3816 else
3817 {
3818 fail_range_start = pos;
3819 while ((pos < end) &&
3820 ((!(flags & QUERY_METHOD_IGNORE_INVALID_SEQUENCES) &&
3821 EQ (Qinvalid_sequence, Fget_range_table
3822 (make_int (ch), skip_chars_range_table, Qnil))
3823 && (failed_reason = query_coding_invalid_sequence))
3824 || ((NILP ((checked_unicode =
3825 Fgethash (Fchar_to_unicode (make_char (ch)),
3826 from_unicode, Qnil))))
3827 && (failed_reason = query_coding_unencodable)))
3828 && (previous_failed_reason == query_coding_succeeded
3829 || previous_failed_reason == failed_reason))
3830 {
3831 pos++;
3832 INC_BYTEBPOS (buf, pos_byte);
3833 ch = BYTE_BUF_FETCH_CHAR (buf, pos_byte);
3834 previous_failed_reason = failed_reason;
3835 }
3836
3837 if (fail_range_start == pos)
3838 {
3839 /* The character can actually be encoded; move on. */
3840 pos++;
3841 INC_BYTEBPOS (buf, pos_byte);
3842 }
3843 else
3844 {
3845 assert (previous_failed_reason == query_coding_invalid_sequence
3846 || previous_failed_reason == query_coding_unencodable);
3847
3848 if (flags & QUERY_METHOD_ERRORP)
3849 {
3850 DECLARE_EISTRING (error_details);
3851
3852 eicpy_ascii (error_details, "Cannot encode ");
3853 eicat_lstr (error_details,
3854 make_string_from_buffer (buf, fail_range_start,
3855 pos - fail_range_start));
3856 eicat_ascii (error_details, " using coding system");
3857
3858 signal_error (Qtext_conversion_error,
3859 (const CIbyte *)(eidata (error_details)),
3860 XCODING_SYSTEM_NAME (codesys));
3861 }
3862
3863 if (NILP (result))
3864 {
3865 result = Fmake_range_table (Qstart_closed_end_open);
3866 }
3867
3868 fail_range_end = pos;
3869
3870 Fput_range_table (make_int (fail_range_start),
3871 make_int (fail_range_end),
3872 (previous_failed_reason
3873 == query_coding_unencodable ?
3874 Qunencodable : Qinvalid_sequence),
3875 result);
3876 previous_failed_reason = query_coding_succeeded;
3877
3878 if (flags & QUERY_METHOD_HIGHLIGHT)
3879 {
3880 Lisp_Object extent
3881 = Fmake_extent (make_int (fail_range_start),
3882 make_int (fail_range_end),
3883 wrap_buffer (buf));
3884
3885 Fset_extent_priority
3886 (extent, make_int (2 + mouse_highlight_priority));
3887 Fset_extent_face (extent, Qquery_coding_warning_face);
3888 }
3889 }
3890 }
3891 }
3892
3893 return result;
3894 }
3391 3895
3392 3896
3393 /************************************************************************/ 3897 /************************************************************************/
3394 /* Initialization */ 3898 /* Initialization */
3395 /************************************************************************/ 3899 /************************************************************************/
3428 DEFSYMBOL (Qiso_7); 3932 DEFSYMBOL (Qiso_7);
3429 DEFSYMBOL (Qiso_8_designate); 3933 DEFSYMBOL (Qiso_8_designate);
3430 DEFSYMBOL (Qiso_8_1); 3934 DEFSYMBOL (Qiso_8_1);
3431 DEFSYMBOL (Qiso_8_2); 3935 DEFSYMBOL (Qiso_8_2);
3432 DEFSYMBOL (Qiso_lock_shift); 3936 DEFSYMBOL (Qiso_lock_shift);
3937
3938 DEFSYMBOL (Qfrom_unicode);
3939 DEFSYMBOL (Qinvalid_sequences_skip_chars);
3940 DEFSYMBOL (Qquery_skip_chars);
3941 DEFSYMBOL (Qfixed_width);
3433 } 3942 }
3434 3943
3435 void 3944 void
3436 coding_system_type_create_mule_coding (void) 3945 coding_system_type_create_mule_coding (void)
3437 { 3946 {
3463 CODING_SYSTEM_HAS_METHOD (ccl, init_coding_stream); 3972 CODING_SYSTEM_HAS_METHOD (ccl, init_coding_stream);
3464 CODING_SYSTEM_HAS_METHOD (ccl, rewind_coding_stream); 3973 CODING_SYSTEM_HAS_METHOD (ccl, rewind_coding_stream);
3465 CODING_SYSTEM_HAS_METHOD (ccl, putprop); 3974 CODING_SYSTEM_HAS_METHOD (ccl, putprop);
3466 CODING_SYSTEM_HAS_METHOD (ccl, getprop); 3975 CODING_SYSTEM_HAS_METHOD (ccl, getprop);
3467 3976
3977 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA (fixed_width,
3978 "fixed-width-coding-system-p");
3979 CODING_SYSTEM_HAS_METHOD (fixed_width, mark);
3980 CODING_SYSTEM_HAS_METHOD (fixed_width, convert);
3981 CODING_SYSTEM_HAS_METHOD (fixed_width, query);
3982 CODING_SYSTEM_HAS_METHOD (fixed_width, init);
3983 CODING_SYSTEM_HAS_METHOD (fixed_width, init_coding_stream);
3984 CODING_SYSTEM_HAS_METHOD (fixed_width, rewind_coding_stream);
3985 CODING_SYSTEM_HAS_METHOD (fixed_width, putprop);
3986 CODING_SYSTEM_HAS_METHOD (fixed_width, getprop);
3987
3468 INITIALIZE_CODING_SYSTEM_TYPE (shift_jis, "shift-jis-coding-system-p"); 3988 INITIALIZE_CODING_SYSTEM_TYPE (shift_jis, "shift-jis-coding-system-p");
3469 CODING_SYSTEM_HAS_METHOD (shift_jis, convert); 3989 CODING_SYSTEM_HAS_METHOD (shift_jis, convert);
3470 3990
3471 INITIALIZE_DETECTOR (shift_jis); 3991 INITIALIZE_DETECTOR (shift_jis);
3472 DETECTOR_HAS_METHOD (shift_jis, detect); 3992 DETECTOR_HAS_METHOD (shift_jis, detect);
3483 void 4003 void
3484 reinit_coding_system_type_create_mule_coding (void) 4004 reinit_coding_system_type_create_mule_coding (void)
3485 { 4005 {
3486 REINITIALIZE_CODING_SYSTEM_TYPE (iso2022); 4006 REINITIALIZE_CODING_SYSTEM_TYPE (iso2022);
3487 REINITIALIZE_CODING_SYSTEM_TYPE (ccl); 4007 REINITIALIZE_CODING_SYSTEM_TYPE (ccl);
4008 REINITIALIZE_CODING_SYSTEM_TYPE (fixed_width);
3488 REINITIALIZE_CODING_SYSTEM_TYPE (shift_jis); 4009 REINITIALIZE_CODING_SYSTEM_TYPE (shift_jis);
3489 REINITIALIZE_CODING_SYSTEM_TYPE (big5); 4010 REINITIALIZE_CODING_SYSTEM_TYPE (big5);
3490 } 4011 }
3491 4012
3492 void 4013 void
3495 } 4016 }
3496 4017
3497 void 4018 void
3498 vars_of_mule_coding (void) 4019 vars_of_mule_coding (void)
3499 { 4020 {
3500 } 4021 /* This needs to be HASH_TABLE_EQ, there's a corner case where
4022 HASH_TABLE_EQUAL won't work. */
4023 Vfixed_width_query_ranges_cache
4024 = make_lisp_hash_table (32, HASH_TABLE_KEY_WEAK, HASH_TABLE_EQ);
4025 staticpro (&Vfixed_width_query_ranges_cache);
4026 }