view lisp/mule/thai-xtis.el @ 4690:257b468bf2ca

Move the #'query-coding-region implementation to C. This is necessary because there is no reasonable way to access the corresponding mswindows-multibyte functionality from Lisp, and we need such functionality if we're going to have a reliable and portable #'query-coding-region implementation. However, this change doesn't yet provide #'query-coding-region for the mswindow-multibyte coding systems, there should be no functional differences between an XEmacs with this change and one without it. src/ChangeLog addition: 2009-09-19 Aidan Kehoe <kehoea@parhasard.net> Move the #'query-coding-region implementation to C. This is necessary because there is no reasonable way to access the corresponding mswindows-multibyte functionality from Lisp, and we need such functionality if we're going to have a reliable and portable #'query-coding-region implementation. However, this change doesn't yet provide #'query-coding-region for the mswindow-multibyte coding systems, there should be no functional differences between an XEmacs with this change and one without it. * mule-coding.c (struct fixed_width_coding_system): Add a new coding system type, fixed_width, and implement it. It uses the CCL infrastructure but has a much simpler creation API, and its own query_method, formerly in lisp/mule/mule-coding.el. * unicode.c: Move the Unicode query method implementation here from unicode.el. * lisp.h: Declare Fmake_coding_system_internal, Fcopy_range_table here. * intl-win32.c (complex_vars_of_intl_win32): Use Fmake_coding_system_internal, not Fmake_coding_system. * general-slots.h: Add Qsucceeded, Qunencodable, Qinvalid_sequence here. * file-coding.h (enum coding_system_variant): Add fixed_width_coding_system here. (struct coding_system_methods): Add query_method and query_lstream_method to the coding system methods. Provide flags for the query methods. Declare the default query method; initialise it correctly in INITIALIZE_CODING_SYSTEM_TYPE. * file-coding.c (default_query_method): New function, the default query method for coding systems that do not set it. Moved from coding.el. (make_coding_system_1): Accept new elements in PROPS in #'make-coding-system; aliases, a list of aliases; safe-chars and safe-charsets (these were previously accepted but not saved); and category. (Fmake_coding_system_internal): New function, what used to be #'make-coding-system--on Mule builds, we've now moved some of the functionality of this to Lisp. (Fcoding_system_canonical_name_p): Move this earlier in the file, since it's now called from within make_coding_system_1. (Fquery_coding_region): Move the implementation of this here, from coding.el. (complex_vars_of_file_coding): Call Fmake_coding_system_internal, not Fmake_coding_system; specify safe-charsets properties when we're a mule build. * extents.h (mouse_highlight_priority, Fset_extent_priority, Fset_extent_face, Fmap_extents): Make these available to other C files. lisp/ChangeLog addition: 2009-09-19 Aidan Kehoe <kehoea@parhasard.net> Move the #'query-coding-region implementation to C. * coding.el: Consolidate code that depends on the presence or absence of Mule at the end of this file. (default-query-coding-region, query-coding-region): Move these functions to C. (default-query-coding-region-safe-charset-skip-chars-map): Remove this variable, the corresponding C variable is Vdefault_query_coding_region_chartab_cache in file-coding.c. (query-coding-string): Update docstring to reflect actual multiple values, be more careful about not modifying a range table that we're currently mapping over. (encode-coding-char): Make the implementation of this simpler. (featurep 'mule): Autoload #'make-coding-system from mule/make-coding-system.el if we're a mule build; provide an appropriate compiler macro. Do various non-mule compatibility things if we're not a mule build. * update-elc.el (additional-dump-dependencies): Add mule/make-coding-system as a dump time dependency if we're a mule build. * unicode.el (ccl-encode-to-ucs-2): (decode-char): (encode-char): Move these earlier in the file, for the sake of some byte compile warnings. (unicode-query-coding-region): Move this to unicode.c * mule/make-coding-system.el: New file, not dumped. Contains the functionality to rework the arguments necessary for fixed-width coding systems, and contains the implementation of #'make-coding-system, which now calls #'make-coding-system-internal. * mule/vietnamese.el (viscii): * mule/latin.el (iso-8859-2): (windows-1250): (iso-8859-3): (iso-8859-4): (iso-8859-14): (iso-8859-15): (iso-8859-16): (iso-8859-9): (macintosh): (windows-1252): * mule/hebrew.el (iso-8859-8): * mule/greek.el (iso-8859-7): (windows-1253): * mule/cyrillic.el (iso-8859-5): (koi8-r): (koi8-u): (windows-1251): (alternativnyj): (koi8-ru): (koi8-t): (koi8-c): (koi8-o): * mule/arabic.el (iso-8859-6): (windows-1256): Move all these coding systems to being of type fixed-width, not of type CCL. This allows the distinct query-coding-region for them to be in C, something which will eventually allow us to implement query-coding-region for the mswindows-multibyte coding systems. * mule/general-late.el (posix-charset-to-coding-system-hash): Document why we're pre-emptively persuading the byte compiler that the ELC for this file needs to be written using escape-quoted. Call #'set-unicode-query-skip-chars-args, now the Unicode query-coding-region implementation is in C. * mule/thai-xtis.el (tis-620): Don't bother checking whether we're XEmacs or not here. * mule/mule-coding.el: Move the eight bit fixed-width functionality from this file to make-coding-system.el. tests/ChangeLog addition: 2009-09-19 Aidan Kehoe <kehoea@parhasard.net> * automated/mule-tests.el: Check a coding system's type, not an 8-bit-fixed property, for whether that coding system should be treated as a fixed-width coding system. * automated/query-coding-tests.el: Don't test the query coding functionality for mswindows-multibyte coding systems, it's not yet implemented.
author Aidan Kehoe <kehoea@parhasard.net>
date Sat, 19 Sep 2009 22:53:13 +0100
parents 1d74a1d115ee
children 308d34e9f07d
line wrap: on
line source

;;; thai-xtis.el --- Support for Thai (XTIS) -*- coding: iso-2022-7bit; -*-

;; Copyright (C) 1999 Electrotechnical Laboratory, JAPAN.
;; Licensed to the Free Software Foundation.

;; Author: TAKAHASHI Naoto <ntakahas@etl.go.jp>
;;         MORIOKA Tomohiko <tomo@etl.go.jp>
;; Created: 1998-03-27 for Emacs-20.3 by TAKAHASHI Naoto
;;	    1999-03-29 imported and modified for XEmacs	by MORIOKA Tomohiko

;; Keywords: mule, multilingual, Thai, XTIS

;; This file is part of XEmacs.

;; XEmacs is free software; you can redistribute it and/or modify it
;; under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 2, or (at your option)
;; any later version.

;; XEmacs is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
;; General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with XEmacs; see the file COPYING.  If not, write to the Free
;; Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
;; 02111-1307, USA.

;;; Commentary:

;; For Thai, the pre-composed character set proposed by
;; Virach Sornlertlamvanich <virach@links.nectec.or.th> is supported.

;;; Code:

(make-charset 'thai-xtis "Precomposed Thai (XTIS by Virach)."
	      '(registries ["xtis-0"]
		dimension 2
		columns 1
		chars 94
		final ??
		graphic 0))

(define-category ?x "Precomposed Thai character.")
(modify-category-entry 'thai-xtis ?x)

(when (featurep 'xemacs)
  (let ((deflist	'(;; chars	syntax
			  ("$(?!0(B-$(?NxP0R0S0`0(B-$(?e0(B"	"w")
			  ("$(?p0(B-$(?y0(B"	"w")
			  ("$(?O0f0_0o0z0{0(B"	"_")
			  ))
	elm chars len syntax to ch i)
    (while deflist
      (setq elm (car deflist))
      (setq chars (car elm)
	    len (length chars)
	    syntax (nth 1 elm)
	    i 0)
      (while (< i len)
	(if (= (aref chars i) ?-)
	    (setq i (1+ i)
		  to (nth 1 (split-char (aref chars i))))
	  (setq ch (nth 1 (split-char (aref chars i)))
		to ch))
	(while (<= ch to)
	  (modify-syntax-entry (vector 'thai-xtis ch) syntax)
	  (setq ch (1+ ch)))
	(setq i (1+ i)))
      (setq deflist (cdr deflist))))

  (put-charset-property 'thai-xtis 'preferred-coding-system 'tis-620)
  )

;; This is the ccl-decode-thai-xtis automaton.
;;
;; "WRITE x y" == (insert (make-char 'thai-xtis x y))
;; "write x" == (insert x)
;; rx' == (tis620-to-thai-xtis-second-byte-bitpattern rx)
;; r3 == "no vower nor tone"
;; r4 == (charset-id 'thai-xtis)
;; 
;;          |               input (= r0)
;;   state  |--------------------------------------------
;;          |  consonant  |    vowel    |    tone
;; ---------+-------------+-------------+----------------
;;  r1 == 0 | r1 = r0     | WRITE r0,r3 | WRITE r0,r3
;;  r2 == 0 |             |             |
;; ---------+-------------+-------------+----------------
;;  r1 == C | WRITE r1,r3 | r2 = r0'    | WRITE r1,r3|r0'
;;  r2 == 0 | r1 = r0     |             | r1 = 0
;; ---------+-------------+-------------+----------------
;;  r1 == C | WRITE r1,r2 | WRITE r1,r2 | WRITE r1,r2|r0'
;;  r2 == V | r1 = r0     | WRITE r0,r3 | r1 = r2 = 0
;;          | r2 = 0      | r1 = r2 = 0 |
;; 
;; 
;;          |               input (= r0) 
;;   state  |-----------------------------------------
;;          |    symbol   |    ASCII    |     EOF
;; ---------+-------------+-------------+-------------
;;  r1 == 0 | WRITE r0,r3 | write r0    |
;;  r2 == 0 |             |             |
;; ---------+-------------+-------------+-------------
;;  r1 == C | WRITE r1,r3 | WRITE r1,r3 | WRITE r1,r3
;;  r2 == 0 | WRITE r0,r3 | write r0    |
;;          | r1 = 0      | r1 = 0      |
;; ---------+-------------+-------------+-------------
;;  r1 == C | WRITE r1,r2 | WRITE r1,r2 | WRITE r1,r2
;;  r2 == V | WRITE r0,r3 | write r0    |
;;          | r1 = r2 = 0 | r1 = r2 = 0 |


(eval-and-compile

;; input  : r5 = 1st byte, r6 = 2nd byte
;; Their values will be destroyed.
(define-ccl-program ccl-thai-xtis-write
  '(0
    ((r5 = ((r5 & #x7F) << 7))
     (r6 = ((r6 & #x7F) | r5))
     (write-multibyte-character r4 r6))))

(define-ccl-program ccl-thai-xtis-consonant
  '(0
    (if (r1 == 0)
	(r1 = r0)
      (if (r2 == 0)
	  ((r5 = r1) (r6 = r3) (call ccl-thai-xtis-write)
	   (r1 = r0))
	((r5 = r1) (r6 = r2) (call ccl-thai-xtis-write)
	 (r1 = r0)
	 (r2 = 0))))))

(define-ccl-program ccl-thai-xtis-vowel
  '(0
    ((if (r1 == 0)
	 ((r5 = r0) (r6 = r3) (call ccl-thai-xtis-write))
       ((if (r2 == 0)
	    (r2 = ((r0 - 204) << 3))
	  ((r5 = r1) (r6 = r2) (call ccl-thai-xtis-write)
	   (r5 = r0) (r6 = r3) (call ccl-thai-xtis-write)
	   (r1 = 0)
	   (r2 = 0))))))))

(define-ccl-program ccl-thai-xtis-vowel-d1
  '(0
    ((if (r1 == 0)
	 ((r5 = r0) (r6 = r3) (call ccl-thai-xtis-write))
       ((if (r2 == 0)
	    (r2 = #x38)
	  ((r5 = r1) (r6 = r2) (call ccl-thai-xtis-write)
	   (r5 = r0) (r6 = r3) (call ccl-thai-xtis-write)
	   (r1 = 0)
	   (r2 = 0))))))))

(define-ccl-program ccl-thai-xtis-vowel-ee
  '(0
    ((if (r1 == 0)
	 ((r5 = r0) (r6 = r3) (call ccl-thai-xtis-write))
       ((if (r2 == 0)
	    (r2 = #x78)
	  ((r5 = r1) (r6 = r2) (call ccl-thai-xtis-write)
	   (r5 = r0) (r6 = r3) (call ccl-thai-xtis-write)
	   (r1 = 0)
	   (r2 = 0))))))))

(define-ccl-program ccl-thai-xtis-tone
  '(0
    (if (r1 == 0)
	((r5 = r0) (r6 = r3) (call ccl-thai-xtis-write))
      (if (r2 == 0)
	  ((r5 = r1) (r6 = ((r0 - #xE6) | r3)) (call ccl-thai-xtis-write)
	   (r1 = 0))
	((r5 = r1) (r6 = ((r0 - #xE6) | r2)) (call ccl-thai-xtis-write)
	 (r1 = 0)
	 (r2 = 0))))))

(define-ccl-program ccl-thai-xtis-symbol
  '(0
    (if (r1 == 0)
	((r5 = r0) (r6 = r3) (call ccl-thai-xtis-write))
      (if (r2 == 0)
	  ((r5 = r1) (r6 = r3) (call ccl-thai-xtis-write)
	   (r5 = r0) (r6 = r3) (call ccl-thai-xtis-write)
	   (r1 = 0))
	((r5 = r1) (r6 = r2) (call ccl-thai-xtis-write)
	 (r5 = r0) (r6 = r3) (call ccl-thai-xtis-write)
	 (r1 = 0)
	 (r2 = 0))))))

(define-ccl-program ccl-thai-xtis-ascii
  '(0
    (if (r1 == 0)
	(write r0)
      (if (r2 == 0)
	  ((r5 = r1) (r6 = r3) (call ccl-thai-xtis-write)
	   (write r0)
	   (r1 = 0))
	((r5 = r1) (r6 = r2) (call ccl-thai-xtis-write)
	 (write r0)
	 (r1 = 0)
	 (r2 = 0))))))

(define-ccl-program ccl-thai-xtis-eof
  '(0
    (if (r1 != 0)
	(if (r2 == 0)
	    ((r5 = r1) (r6 = r3) (call ccl-thai-xtis-write))
	  ((r5 = r1) (r6 = r2) (call ccl-thai-xtis-write))))))

(define-ccl-program ccl-decode-thai-xtis
  `(4
    ((read r0)
     (r1 = 0)
     (r2 = 0)
     (r3 = #x30)
     (r4 = ,(charset-id 'thai-xtis))
     (loop
      (if (r0 < 161)
	  (call ccl-thai-xtis-ascii)
	(branch (r0 - 161)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-consonant)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-vowel-d1)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-vowel)
		(call ccl-thai-xtis-vowel)
		(call ccl-thai-xtis-vowel)
		(call ccl-thai-xtis-vowel)
		(call ccl-thai-xtis-vowel)
		(call ccl-thai-xtis-vowel)
		(call ccl-thai-xtis-vowel)
		nil
		nil
		nil
		nil
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-tone)
		(call ccl-thai-xtis-tone)
		(call ccl-thai-xtis-tone)
		(call ccl-thai-xtis-tone)
		(call ccl-thai-xtis-tone)
		(call ccl-thai-xtis-tone)
		(call ccl-thai-xtis-tone)
		(call ccl-thai-xtis-vowel-ee)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		(call ccl-thai-xtis-symbol)
		nil
		nil
		nil))
      (read r0)
      (repeat)))

    (call ccl-thai-xtis-eof)))

)

(defconst leading-code-private-21 #x9F)

(define-ccl-program ccl-encode-thai-xtis
  `(1
    ((read r0)
     (loop
      (if (r0 == ,leading-code-private-21)
	  ((read r1)
	   (if (r1 == ,(charset-id 'thai-xtis))
	       ((read r0)
		(write r0)
		(read r0)
		(r1 = (r0 & 7))
		(r0 = ((r0 - #xB0) >> 3))
		(if (r0 != 0)
		    (write r0 [0 209 212 213 214 215 216 217 218 238]))
		(if (r1 != 0)
		    (write r1 [0 231 232 233 234 235 236 237]))
		(read r0)
		(repeat))
	     ((write r0 r1)
	      (read r0)
	      (repeat))))
	(write-read-repeat r0))))))

(make-coding-system
 'tis-620 'ccl
 "TIS620 (Thai)"
 `(mnemonic "TIS620"
   decode ccl-decode-thai-xtis
   encode ccl-encode-thai-xtis
   safe-charsets (ascii thai-xtis)
   documentation "external=tis620, internal=thai-xtis"))
(coding-system-put 'tis-620 'category 'iso-8-1)

(set-language-info-alist
 "Thai-XTIS"
 '((charset thai-xtis)
   (coding-system tis-620 iso-2022-7bit)
   (tutorial . "TUTORIAL.th")
   (tutorial-coding-system . tis-620)
   (coding-priority tis-620 iso-2022-7bit)
   (sample-text . "$(?!:(B")
   (documentation . t)))

;; thai-xtis.el ends here.