view lisp/mule/arabic.el @ 4690:257b468bf2ca
Move the #'query-coding-region implementation to C.
This is necessary because there is no reasonable way to access the
corresponding mswindows-multibyte functionality from Lisp, and we need such
functionality if we're going to have a reliable and portable
#'query-coding-region implementation. However, this change doesn't yet
provide #'query-coding-region for the mswindows-multibyte coding systems,
so there should be no functional differences between an XEmacs with this
change and one without it.
src/ChangeLog addition:
2009-09-19 Aidan Kehoe <kehoea@parhasard.net>
Move the #'query-coding-region implementation to C.
This is necessary because there is no reasonable way to access the
corresponding mswindows-multibyte functionality from Lisp, and we
need such functionality if we're going to have a reliable and
portable #'query-coding-region implementation. However, this
change doesn't yet provide #'query-coding-region for the
mswindows-multibyte coding systems, so there should be no functional
differences between an XEmacs with this change and one without it.
* mule-coding.c (struct fixed_width_coding_system):
Add a new coding system type, fixed_width, and implement it. It
uses the CCL infrastructure but has a much simpler creation API,
and its own query_method, formerly in lisp/mule/mule-coding.el.
* unicode.c:
Move the Unicode query method implementation here from
unicode.el.
* lisp.h: Declare Fmake_coding_system_internal, Fcopy_range_table
here.
* intl-win32.c (complex_vars_of_intl_win32):
Use Fmake_coding_system_internal, not Fmake_coding_system.
* general-slots.h: Add Qsucceeded, Qunencodable, Qinvalid_sequence
here.
* file-coding.h (enum coding_system_variant):
Add fixed_width_coding_system here.
(struct coding_system_methods):
Add query_method and query_lstream_method to the coding system
methods.
Provide flags for the query methods.
Declare the default query method; initialise it correctly in
INITIALIZE_CODING_SYSTEM_TYPE.
* file-coding.c (default_query_method):
New function, the default query method for coding systems that do
not set it. Moved from coding.el.
(make_coding_system_1):
Accept new elements in PROPS in #'make-coding-system; aliases, a
list of aliases; safe-chars and safe-charsets (these were
previously accepted but not saved); and category.
(Fmake_coding_system_internal):
New function, what used to be #'make-coding-system--on Mule
builds, we've now moved some of the functionality of this to
Lisp.
(Fcoding_system_canonical_name_p):
Move this earlier in the file, since it's now called from within
make_coding_system_1.
(Fquery_coding_region):
Move the implementation of this here, from coding.el.
(complex_vars_of_file_coding):
Call Fmake_coding_system_internal, not Fmake_coding_system;
specify safe-charsets properties when we're a mule build.
* extents.h (mouse_highlight_priority, Fset_extent_priority,
Fset_extent_face, Fmap_extents):
Make these available to other C files.
lisp/ChangeLog addition:
2009-09-19 Aidan Kehoe <kehoea@parhasard.net>
Move the #'query-coding-region implementation to C.
* coding.el:
Consolidate code that depends on the presence or absence of Mule
at the end of this file.
(default-query-coding-region, query-coding-region):
Move these functions to C.
(default-query-coding-region-safe-charset-skip-chars-map):
Remove this variable, the corresponding C variable is
Vdefault_query_coding_region_chartab_cache in file-coding.c.
(query-coding-string): Update docstring to reflect actual multiple
values, be more careful about not modifying a range table that
we're currently mapping over.
(encode-coding-char): Make the implementation of this simpler.
(featurep 'mule): Autoload #'make-coding-system from
mule/make-coding-system.el if we're a mule build; provide an
appropriate compiler macro.
Do various non-mule compatibility things if we're not a mule
build.
* update-elc.el (additional-dump-dependencies):
Add mule/make-coding-system as a dump time dependency if we're a
mule build.
* unicode.el (ccl-encode-to-ucs-2):
(decode-char):
(encode-char):
Move these earlier in the file, for the sake of some byte compile
warnings.
(unicode-query-coding-region):
Move this to unicode.c
* mule/make-coding-system.el:
New file, not dumped. Contains the functionality to rework the
arguments necessary for fixed-width coding systems, and contains
the implementation of #'make-coding-system, which now calls
#'make-coding-system-internal.
* mule/vietnamese.el (viscii):
* mule/latin.el (iso-8859-2):
(windows-1250):
(iso-8859-3):
(iso-8859-4):
(iso-8859-14):
(iso-8859-15):
(iso-8859-16):
(iso-8859-9):
(macintosh):
(windows-1252):
* mule/hebrew.el (iso-8859-8):
* mule/greek.el (iso-8859-7):
(windows-1253):
* mule/cyrillic.el (iso-8859-5):
(koi8-r):
(koi8-u):
(windows-1251):
(alternativnyj):
(koi8-ru):
(koi8-t):
(koi8-c):
(koi8-o):
* mule/arabic.el (iso-8859-6):
(windows-1256):
Move all these coding systems to being of type fixed-width, not of
type CCL. This allows the distinct query-coding-region for them to
be in C, something which will eventually allow us to implement
query-coding-region for the mswindows-multibyte coding systems.
* mule/general-late.el (posix-charset-to-coding-system-hash):
Document why we're pre-emptively persuading the byte compiler that
the ELC for this file needs to be written using escape-quoted.
Call #'set-unicode-query-skip-chars-args, now the Unicode
query-coding-region implementation is in C.
* mule/thai-xtis.el (tis-620):
Don't bother checking whether we're XEmacs or not here.
* mule/mule-coding.el:
Move the eight bit fixed-width functionality from this file to
make-coding-system.el.
tests/ChangeLog addition:
2009-09-19 Aidan Kehoe <kehoea@parhasard.net>
* automated/mule-tests.el:
Check a coding system's type, not an 8-bit-fixed property, for
whether that coding system should be treated as a fixed-width
coding system.
* automated/query-coding-tests.el:
Don't test the query coding functionality for mswindows-multibyte
coding systems, it's not yet implemented.
author | Aidan Kehoe <kehoea@parhasard.net> |
---|---|
date | Sat, 19 Sep 2009 22:53:13 +0100 |
parents | e0a8715fdb1f |
children | a67bfb29dd8b |
;;; arabic.el --- pre-loaded support for Arabic. -*- coding: iso-2022-7bit; -*-

;; Copyright (C) 1992,93,94,95 Free Software Foundation, Inc.
;; Copyright (C) 1995 Amdahl Corporation.
;; Copyright (C) 1995 Sun Microsystems.
;; Copyright (C) 2002 Ben Wing.

;; This file is part of XEmacs.

;; XEmacs is free software; you can redistribute it and/or modify it
;; under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 2, or (at your option)
;; any later version.

;; XEmacs is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
;; General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with XEmacs; see the file COPYING. If not, write to the
;; Free Software Foundation, Inc., 59 Temple Place - Suite 330,
;; Boston, MA 02111-1307, USA.

;;; Commentary:

;; Synched up with: Mule 2.3, FSF 21.1.

;;; Code:

;; See iso-with-esc.el for commentary on the ISO standard Arabic character
;; set.

(make-coding-system
 'iso-8859-6 'fixed-width "ISO 8859-6 (Arabic)"
 '(unicode-map
   ((#x80 ?\u0080) ;; <control>
    (#x81 ?\u0081) ;; <control>
    (#x82 ?\u0082) ;; <control>
    (#x83 ?\u0083) ;; <control>
    (#x84 ?\u0084) ;; <control>
    (#x85 ?\u0085) ;; <control>
    (#x86 ?\u0086) ;; <control>
    (#x87 ?\u0087) ;; <control>
    (#x88 ?\u0088) ;; <control>
    (#x89 ?\u0089) ;; <control>
    (#x8A ?\u008A) ;; <control>
    (#x8B ?\u008B) ;; <control>
    (#x8C ?\u008C) ;; <control>
    (#x8D ?\u008D) ;; <control>
    (#x8E ?\u008E) ;; <control>
    (#x8F ?\u008F) ;; <control>
    (#x90 ?\u0090) ;; <control>
    (#x91 ?\u0091) ;; <control>
    (#x92 ?\u0092) ;; <control>
    (#x93 ?\u0093) ;; <control>
    (#x94 ?\u0094) ;; <control>
    (#x95 ?\u0095) ;; <control>
    (#x96 ?\u0096) ;; <control>
    (#x97 ?\u0097) ;; <control>
    (#x98 ?\u0098) ;; <control>
    (#x99 ?\u0099) ;; <control>
    (#x9A ?\u009A) ;; <control>
    (#x9B ?\u009B) ;; <control>
    (#x9C ?\u009C) ;; <control>
    (#x9D ?\u009D) ;; <control>
    (#x9E ?\u009E) ;; <control>
    (#x9F ?\u009F) ;; <control>
    (#xA0 ?\u00A0) ;; NO-BREAK SPACE
    (#xA4 ?\u00A4) ;; CURRENCY SIGN
    (#xAC ?\u060C) ;; ARABIC COMMA
    (#xAD ?\u00AD) ;; SOFT HYPHEN
    (#xBB ?\u061B) ;; ARABIC SEMICOLON
    (#xBF ?\u061F) ;; ARABIC QUESTION MARK
    (#xC1 ?\u0621) ;; ARABIC LETTER HAMZA
    (#xC2 ?\u0622) ;; ARABIC LETTER ALEF WITH MADDA ABOVE
    (#xC3 ?\u0623) ;; ARABIC LETTER ALEF WITH HAMZA ABOVE
    (#xC4 ?\u0624) ;; ARABIC LETTER WAW WITH HAMZA ABOVE
    (#xC5 ?\u0625) ;; ARABIC LETTER ALEF WITH HAMZA BELOW
    (#xC6 ?\u0626) ;; ARABIC LETTER YEH WITH HAMZA ABOVE
    (#xC7 ?\u0627) ;; ARABIC LETTER ALEF
    (#xC8 ?\u0628) ;; ARABIC LETTER BEH
    (#xC9 ?\u0629) ;; ARABIC LETTER TEH MARBUTA
    (#xCA ?\u062A) ;; ARABIC LETTER TEH
    (#xCB ?\u062B) ;; ARABIC LETTER THEH
    (#xCC ?\u062C) ;; ARABIC LETTER JEEM
    (#xCD ?\u062D) ;; ARABIC LETTER HAH
    (#xCE ?\u062E) ;; ARABIC LETTER KHAH
    (#xCF ?\u062F) ;; ARABIC LETTER DAL
    (#xD0 ?\u0630) ;; ARABIC LETTER THAL
    (#xD1 ?\u0631) ;; ARABIC LETTER REH
    (#xD2 ?\u0632) ;; ARABIC LETTER ZAIN
    (#xD3 ?\u0633) ;; ARABIC LETTER SEEN
    (#xD4 ?\u0634) ;; ARABIC LETTER SHEEN
    (#xD5 ?\u0635) ;; ARABIC LETTER SAD
    (#xD6 ?\u0636) ;; ARABIC LETTER DAD
    (#xD7 ?\u0637) ;; ARABIC LETTER TAH
    (#xD8 ?\u0638) ;; ARABIC LETTER ZAH
    (#xD9 ?\u0639) ;; ARABIC LETTER AIN
    (#xDA ?\u063A) ;; ARABIC LETTER GHAIN
    (#xE0 ?\u0640) ;; ARABIC TATWEEL
    (#xE1 ?\u0641) ;; ARABIC LETTER FEH
    (#xE2 ?\u0642) ;; ARABIC LETTER QAF
    (#xE3 ?\u0643) ;; ARABIC LETTER KAF
    (#xE4 ?\u0644) ;; ARABIC LETTER LAM
    (#xE5 ?\u0645) ;; ARABIC LETTER MEEM
    (#xE6 ?\u0646) ;; ARABIC LETTER NOON
    (#xE7 ?\u0647) ;; ARABIC LETTER HEH
    (#xE8 ?\u0648) ;; ARABIC LETTER WAW
    (#xE9 ?\u0649) ;; ARABIC LETTER ALEF MAKSURA
    (#xEA ?\u064A) ;; ARABIC LETTER YEH
    (#xEB ?\u064B) ;; ARABIC FATHATAN
    (#xEC ?\u064C) ;; ARABIC DAMMATAN
    (#xED ?\u064D) ;; ARABIC KASRATAN
    (#xEE ?\u064E) ;; ARABIC FATHA
    (#xEF ?\u064F) ;; ARABIC DAMMA
    (#xF0 ?\u0650) ;; ARABIC KASRA
    (#xF1 ?\u0651) ;; ARABIC SHADDA
    (#xF2 ?\u0652)) ;; ARABIC SUKUN
   mnemonic "ArISO"))

(make-coding-system
 'windows-1256 'fixed-width "Windows-1256 (Arabic)"
 '(unicode-map
   ((#x80 ?\u20AC) ;; EURO SIGN
    (#x81 ?\u067E) ;; ARABIC LETTER PEH
    (#x82 ?\u201A) ;; SINGLE LOW-9 QUOTATION MARK
    (#x83 ?\u0192) ;; LATIN SMALL LETTER F WITH HOOK
    (#x84 ?\u201E) ;; DOUBLE LOW-9 QUOTATION MARK
    (#x85 ?\u2026) ;; HORIZONTAL ELLIPSIS
    (#x86 ?\u2020) ;; DAGGER
    (#x87 ?\u2021) ;; DOUBLE DAGGER
    (#x88 ?\u02C6) ;; MODIFIER LETTER CIRCUMFLEX ACCENT
    (#x89 ?\u2030) ;; PER MILLE SIGN
    (#x8A ?\u0679) ;; ARABIC LETTER TTEH
    (#x8B ?\u2039) ;; SINGLE LEFT-POINTING ANGLE QUOTATION MARK
    (#x8C ?\u0152) ;; LATIN CAPITAL LIGATURE OE
    (#x8D ?\u0686) ;; ARABIC LETTER TCHEH
    (#x8E ?\u0698) ;; ARABIC LETTER JEH
    (#x8F ?\u0688) ;; ARABIC LETTER DDAL
    (#x90 ?\u06AF) ;; ARABIC LETTER GAF
    (#x91 ?\u2018) ;; LEFT SINGLE QUOTATION MARK
    (#x92 ?\u2019) ;; RIGHT SINGLE QUOTATION MARK
    (#x93 ?\u201C) ;; LEFT DOUBLE QUOTATION MARK
    (#x94 ?\u201D) ;; RIGHT DOUBLE QUOTATION MARK
    (#x95 ?\u2022) ;; BULLET
    (#x96 ?\u2013) ;; EN DASH
    (#x97 ?\u2014) ;; EM DASH
    (#x98 ?\u06A9) ;; ARABIC LETTER KEHEH
    (#x99 ?\u2122) ;; TRADE MARK SIGN
    (#x9A ?\u0691) ;; ARABIC LETTER RREH
    (#x9B ?\u203A) ;; SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
    (#x9C ?\u0153) ;; LATIN SMALL LIGATURE OE
    (#x9D ?\u200C) ;; ZERO WIDTH NON-JOINER
    (#x9E ?\u200D) ;; ZERO WIDTH JOINER
    (#x9F ?\u06BA) ;; ARABIC LETTER NOON GHUNNA
    (#xA0 ?\u00A0) ;; NO-BREAK SPACE
    (#xA1 ?\u060C) ;; ARABIC COMMA
    (#xA2 ?\u00A2) ;; CENT SIGN
    (#xA3 ?\u00A3) ;; POUND SIGN
    (#xA4 ?\u00A4) ;; CURRENCY SIGN
    (#xA5 ?\u00A5) ;; YEN SIGN
    (#xA6 ?\u00A6) ;; BROKEN BAR
    (#xA7 ?\u00A7) ;; SECTION SIGN
    (#xA8 ?\u00A8) ;; DIAERESIS
    (#xA9 ?\u00A9) ;; COPYRIGHT SIGN
    (#xAA ?\u06BE) ;; ARABIC LETTER HEH DOACHASHMEE
    (#xAB ?\u00AB) ;; LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
    (#xAC ?\u00AC) ;; NOT SIGN
    (#xAD ?\u00AD) ;; SOFT HYPHEN
    (#xAE ?\u00AE) ;; REGISTERED SIGN
    (#xAF ?\u00AF) ;; MACRON
    (#xB0 ?\u00B0) ;; DEGREE SIGN
    (#xB1 ?\u00B1) ;; PLUS-MINUS SIGN
    (#xB2 ?\u00B2) ;; SUPERSCRIPT TWO
    (#xB3 ?\u00B3) ;; SUPERSCRIPT THREE
    (#xB4 ?\u00B4) ;; ACUTE ACCENT
    (#xB5 ?\u00B5) ;; MICRO SIGN
    (#xB6 ?\u00B6) ;; PILCROW SIGN
    (#xB7 ?\u00B7) ;; MIDDLE DOT
    (#xB8 ?\u00B8) ;; CEDILLA
    (#xB9 ?\u00B9) ;; SUPERSCRIPT ONE
    (#xBA ?\u061B) ;; ARABIC SEMICOLON
    (#xBB ?\u00BB) ;; RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
    (#xBC ?\u00BC) ;; VULGAR FRACTION ONE QUARTER
    (#xBD ?\u00BD) ;; VULGAR FRACTION ONE HALF
    (#xBE ?\u00BE) ;; VULGAR FRACTION THREE QUARTERS
    (#xBF ?\u061F) ;; ARABIC QUESTION MARK
    (#xC0 ?\u06C1) ;; ARABIC LETTER HEH GOAL
    (#xC1 ?\u0621) ;; ARABIC LETTER HAMZA
    (#xC2 ?\u0622) ;; ARABIC LETTER ALEF WITH MADDA ABOVE
    (#xC3 ?\u0623) ;; ARABIC LETTER ALEF WITH HAMZA ABOVE
    (#xC4 ?\u0624) ;; ARABIC LETTER WAW WITH HAMZA ABOVE
    (#xC5 ?\u0625) ;; ARABIC LETTER ALEF WITH HAMZA BELOW
    (#xC6 ?\u0626) ;; ARABIC LETTER YEH WITH HAMZA ABOVE
    (#xC7 ?\u0627) ;; ARABIC LETTER ALEF
    (#xC8 ?\u0628) ;; ARABIC LETTER BEH
    (#xC9 ?\u0629) ;; ARABIC LETTER TEH MARBUTA
    (#xCA ?\u062A) ;; ARABIC LETTER TEH
    (#xCB ?\u062B) ;; ARABIC LETTER THEH
    (#xCC ?\u062C) ;; ARABIC LETTER JEEM
    (#xCD ?\u062D) ;; ARABIC LETTER HAH
    (#xCE ?\u062E) ;; ARABIC LETTER KHAH
    (#xCF ?\u062F) ;; ARABIC LETTER DAL
    (#xD0 ?\u0630) ;; ARABIC LETTER THAL
    (#xD1 ?\u0631) ;; ARABIC LETTER REH
    (#xD2 ?\u0632) ;; ARABIC LETTER ZAIN
    (#xD3 ?\u0633) ;; ARABIC LETTER SEEN
    (#xD4 ?\u0634) ;; ARABIC LETTER SHEEN
    (#xD5 ?\u0635) ;; ARABIC LETTER SAD
    (#xD6 ?\u0636) ;; ARABIC LETTER DAD
    (#xD7 ?\u00D7) ;; MULTIPLICATION SIGN
    (#xD8 ?\u0637) ;; ARABIC LETTER TAH
    (#xD9 ?\u0638) ;; ARABIC LETTER ZAH
    (#xDA ?\u0639) ;; ARABIC LETTER AIN
    (#xDB ?\u063A) ;; ARABIC LETTER GHAIN
    (#xDC ?\u0640) ;; ARABIC TATWEEL
    (#xDD ?\u0641) ;; ARABIC LETTER FEH
    (#xDE ?\u0642) ;; ARABIC LETTER QAF
    (#xDF ?\u0643) ;; ARABIC LETTER KAF
    (#xE0 ?\u00E0) ;; LATIN SMALL LETTER A WITH GRAVE
    (#xE1 ?\u0644) ;; ARABIC LETTER LAM
    (#xE2 ?\u00E2) ;; LATIN SMALL LETTER A WITH CIRCUMFLEX
    (#xE3 ?\u0645) ;; ARABIC LETTER MEEM
    (#xE4 ?\u0646) ;; ARABIC LETTER NOON
    (#xE5 ?\u0647) ;; ARABIC LETTER HEH
    (#xE6 ?\u0648) ;; ARABIC LETTER WAW
    (#xE7 ?\u00E7) ;; LATIN SMALL LETTER C WITH CEDILLA
    (#xE8 ?\u00E8) ;; LATIN SMALL LETTER E WITH GRAVE
    (#xE9 ?\u00E9) ;; LATIN SMALL LETTER E WITH ACUTE
    (#xEA ?\u00EA) ;; LATIN SMALL LETTER E WITH CIRCUMFLEX
    (#xEB ?\u00EB) ;; LATIN SMALL LETTER E WITH DIAERESIS
    (#xEC ?\u0649) ;; ARABIC LETTER ALEF MAKSURA
    (#xED ?\u064A) ;; ARABIC LETTER YEH
    (#xEE ?\u00EE) ;; LATIN SMALL LETTER I WITH CIRCUMFLEX
    (#xEF ?\u00EF) ;; LATIN SMALL LETTER I WITH DIAERESIS
    (#xF0 ?\u064B) ;; ARABIC FATHATAN
    (#xF1 ?\u064C) ;; ARABIC DAMMATAN
    (#xF2 ?\u064D) ;; ARABIC KASRATAN
    (#xF3 ?\u064E) ;; ARABIC FATHA
    (#xF4 ?\u00F4) ;; LATIN SMALL LETTER O WITH CIRCUMFLEX
    (#xF5 ?\u064F) ;; ARABIC DAMMA
    (#xF6 ?\u0650) ;; ARABIC KASRA
    (#xF7 ?\u00F7) ;; DIVISION SIGN
    (#xF8 ?\u0651) ;; ARABIC SHADDA
    (#xF9 ?\u00F9) ;; LATIN SMALL LETTER U WITH GRAVE
    (#xFA ?\u0652) ;; ARABIC SUKUN
    (#xFB ?\u00FB) ;; LATIN SMALL LETTER U WITH CIRCUMFLEX
    (#xFC ?\u00FC) ;; LATIN SMALL LETTER U WITH DIAERESIS
    (#xFD ?\u200E) ;; LEFT-TO-RIGHT MARK
    (#xFE ?\u200F) ;; RIGHT-TO-LEFT MARK
    (#xFF ?\u06D2)) ;; ARABIC LETTER YEH BARREE
   mnemonic "cp1256"
   documentation
   "This is the Windows encoding for Arabic, much superior to the ISO standard one."
   aliases (cp1256)))

;; The Mac Arabic coding systems don't have defined MIME names.

;; #### Decide what to do about the syntax of the Arabic punctuation.

;;; arabic.el ends here
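
As a rough illustration of what a unicode-map in a fixed-width coding system is for, the following Python sketch (not XEmacs code, and not how the C implementation is written) builds decode and encode tables from a few (octet, code point) pairs and then reports which characters of a string cannot be encoded, which is roughly the question #'query-coding-region answers. The names `build_tables` and `query_string` are hypothetical, the five entries are copied from the windows-1256 table in arabic.el, and treating octets below #x80 as mapping to the same-numbered code points is an assumption that the file itself does not state.

```python
# Illustrative sketch only: hypothetical helper names, not the XEmacs API.
# A small excerpt of the windows-1256 unicode-map, as (octet, code point) pairs.
UNICODE_MAP_EXCERPT = [
    (0x80, 0x20AC),  # EURO SIGN
    (0x81, 0x067E),  # ARABIC LETTER PEH
    (0xC7, 0x0627),  # ARABIC LETTER ALEF
    (0xC8, 0x0628),  # ARABIC LETTER BEH
    (0xFF, 0x06D2),  # ARABIC LETTER YEH BARREE
]


def build_tables(unicode_map):
    """Return (decode, encode) dicts for a one-octet-per-character encoding.

    Assumption: octets 0x00-0x7F map to the same-numbered code points.
    """
    decode = {octet: octet for octet in range(0x80)}
    decode.update(dict(unicode_map))
    encode = {code_point: octet for octet, code_point in decode.items()}
    return decode, encode


def query_string(text, encode):
    """Return (start, end) index ranges of unencodable characters, end exclusive."""
    ranges = []
    for i, ch in enumerate(text):
        if ord(ch) in encode:
            continue
        if ranges and ranges[-1][1] == i:
            # Extend the previous range when the unencodable run continues.
            ranges[-1] = (ranges[-1][0], i + 1)
        else:
            ranges.append((i, i + 1))
    return ranges


decode_table, encode_table = build_tables(UNICODE_MAP_EXCERPT)
```

An empty result from `query_string` means every character has an octet in the table; a non-empty result lists the runs that would be lost on encoding. Because each character maps to exactly one octet, a single table lookup per character is enough, which is the property that makes a shared query method for all fixed-width coding systems practical.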