annotate lisp/mule/vietnamese.el @ 4604:e0a8715fdb1f
Support new IGNORE-INVALID-SEQUENCESP argument, #'query-coding-region.
lisp/ChangeLog addition:
2009-02-07 Aidan Kehoe <kehoea@parhasard.net>
* coding.el (query-coding-clear-highlights):
Rename the BUFFER argument to BUFFER-OR-STRING, describe it as
possibly being a string in its documentation.
(default-query-coding-region):
Add a new IGNORE-INVALID-SEQUENCESP argument, document that this
function does not support it.
Bind case-fold-search to nil, we don't want this to influence what the
function thinks is encodable or not.
(query-coding-region):
Add a new IGNORE-INVALID-SEQUENCESP argument, document what it
does; reflect this new argument in the associated compiler macro.
(query-coding-string):
Add a new IGNORE-INVALID-SEQUENCESP argument, document what it
does. Support the HIGHLIGHT argument correctly.
* unicode.el (unicode-query-coding-region):
Add a new IGNORE-INVALID-SEQUENCESP argument, document what it
does, implement this. Document a potential problem.
Use #'query-coding-clear-highlights instead of reimplementing it
ourselves.
Remove some debugging messages.
* mule/arabic.el (iso-8859-6):
* mule/cyrillic.el (iso-8859-5):
* mule/greek.el (iso-8859-7):
* mule/hebrew.el (iso-8859-8):
* mule/latin.el (iso-8859-2):
* mule/latin.el (iso-8859-3):
* mule/latin.el (iso-8859-4):
* mule/latin.el (iso-8859-14):
* mule/latin.el (iso-8859-15):
* mule/latin.el (iso-8859-16):
* mule/latin.el (iso-8859-9):
* mule/latin.el (windows-1252):
* mule/mule-coding.el (iso-8859-1):
Avoid the assumption that characters not given an explicit mapping
in these coding systems map to the ISO 8859-1 characters
corresponding to the octets on disk; this makes it much more
reasonable to implement the IGNORE-INVALID-SEQUENCESP argument to
query-coding-region.
* mule/mule-cmds.el (set-language-info):
Correct the docstring.
* mule/mule-cmds.el (finish-set-language-environment):
Treat invalid Unicode sequences produced from
invalid-sequence-coding-system and corresponding to control
characters the same as control characters in redisplay.
* mule/mule-cmds.el:
Document that encode-coding-char is available in coding.el
* mule/mule-coding.el (make-8-bit-generate-helper):
Change to return both the encode-program generated and the
relevant non-ASCII charset; update the docstring to reflect this.
* mule/mule-coding.el
(make-8-bit-generate-encode-program-and-skip-chars-strings):
Rename this function; have it return skip-chars-strings as well as
the encode program. Have these skip-chars-strings use ranges for
charsets, where possible.
* mule/mule-coding.el (make-8-bit-create-decode-encode-tables):
Revise this to allow people to specify explicitly characters that
should be undefined (= corresponding to keys in
unicode-error-default-translation-table), and to treat unspecified
octets above #x7f as undefined by default.
* mule/mule-coding.el (8-bit-fixed-query-coding-region):
Add a new IGNORE-INVALID-SEQUENCESP argument, implement support
for it using the 8-bit-fixed-invalid-sequences-skip-chars coding
system property; remove some debugging messages.
* mule/mule-coding.el (make-8-bit-coding-system):
This function is dumped, autoloading it makes no sense.
Document what happens when characters above #x7f are not
specified, implement this.
* mule/vietnamese.el:
Correct spelling.
tests/ChangeLog addition:
2009-02-07 Aidan Kehoe <kehoea@parhasard.net>
* automated/query-coding-tests.el:
Add FAILING-CASE arguments to the Assert calls, making #'q-c-debug
mostly unnecessary. Remove #'q-c-debug.
Add new tests that use the IGNORE-INVALID-SEQUENCESP argument to
#'query-coding-region; rework the existing ones to respect it.
author | Aidan Kehoe <kehoea@parhasard.net> |
---|---|
date | Sat, 07 Feb 2009 17:13:37 +0000 |
parents | 5b55fa103aa1 |
children | 257b468bf2ca |
rev | line source |
---|---|
428 | 1 ;;; vietnamese.el --- Support for Vietnamese -*- coding: iso-2022-7bit; -*- |
2 | |
3 ;; Copyright (C) 1995 Electrotechnical Laboratory, JAPAN. | |
4 ;; Licensed to the Free Software Foundation. | |
5 ;; Copyright (C) 1997 MORIOKA Tomohiko | |
788 | 6 ;; Copyright (C) 2002 Ben Wing. |
428 | 7 |
8 ;; Keywords: multilingual, Vietnamese | |
9 | |
10 ;; This file is part of XEmacs. | |
11 | |
12 ;; XEmacs is free software; you can redistribute it and/or modify it | |
13 ;; under the terms of the GNU General Public License as published by | |
14 ;; the Free Software Foundation; either version 2, or (at your option) | |
15 ;; any later version. | |
16 | |
17 ;; XEmacs is distributed in the hope that it will be useful, but | |
18 ;; WITHOUT ANY WARRANTY; without even the implied warranty of | |
19 ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU | |
20 ;; General Public License for more details. | |
21 | |
22 ;; You should have received a copy of the GNU General Public License | |
23 ;; along with XEmacs; see the file COPYING. If not, write to the Free | |
24 ;; Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA | |
25 ;; 02111-1307, USA. | |
26 | |
27 ;;; Commentary: | |
28 | |
4604 | 29 ;; For Vietnamese, the character sets VISCII and VSCII are supported. |
428 | 30 |
31 ;;; Code: | |
32 | |
778 | 33 ;; Vietnamese VISCII. VISCII is a 1-byte character set which contains |
34 ;; more than 96 characters. Since Emacs can't handle it as one | |
35 ;; character set, it is divided into two: lower case letters and upper | |
36 ;; case letters. | |
37 (make-charset 'vietnamese-viscii-lower "VISCII1.1 lower-case" | |
38 '(dimension | |
39 1 | |
3659 | 40 registries ["VISCII1.1"] |
778 | 41 chars 96 |
42 columns 1 | |
43 direction l2r | |
44 final ?1 | |
45 graphic 1 | |
46 short-name "VISCII lower" | |
47 long-name "VISCII lower-case" | |
48 )) | |
49 | |
50 (make-charset 'vietnamese-viscii-upper "VISCII1.1 upper-case" | |
51 '(dimension | |
52 1 | |
3659 | 53 registries ["VISCII1.1"] |
778 | 54 chars 96 |
55 columns 1 | |
56 direction l2r | |
57 final ?2 | |
58 graphic 1 | |
59 short-name "VISCII upper" | |
60 long-name "VISCII upper-case" | |
61 )) | |
62 | |
63 (define-category ?v "Vietnamese character.") | |
64 (modify-category-entry 'vietnamese-viscii-lower ?v) | |
65 (modify-category-entry 'vietnamese-viscii-upper ?v) | |
66 | |
4072 | 67 (make-8-bit-coding-system |
68 'viscii | |
69 '((#x02 ?\u1EB2) ;; CAPITAL LETTER A WITH BREVE AND HOOK ABOVE | |
70 (#x05 ?\u1EB4) ;; CAPITAL LETTER A WITH BREVE AND TILDE | |
71 (#x06 ?\u1EAA) ;; CAPITAL LETTER A WITH CIRCUMFLEX AND TILDE | |
72 (#x14 ?\u1EF6) ;; CAPITAL LETTER Y WITH HOOK ABOVE | |
73 (#x19 ?\u1EF8) ;; CAPITAL LETTER Y WITH TILDE | |
74 (#x1E ?\u1EF4) ;; CAPITAL LETTER Y WITH DOT BELOW | |
75 (#x80 ?\u1EA0) ;; CAPITAL LETTER A WITH DOT BELOW | |
76 (#x81 ?\u1EAE) ;; CAPITAL LETTER A WITH BREVE AND ACUTE | |
77 (#x82 ?\u1EB0) ;; CAPITAL LETTER A WITH BREVE AND GRAVE | |
78 (#x83 ?\u1EB6) ;; CAPITAL LETTER A WITH BREVE AND DOT BELOW | |
79 (#x84 ?\u1EA4) ;; CAPITAL LETTER A WITH CIRCUMFLEX AND ACUTE | |
80 (#x85 ?\u1EA6) ;; CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE | |
81 (#x86 ?\u1EA8) ;; CAPITAL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE | |
82 (#x87 ?\u1EAC) ;; CAPITAL LETTER A WITH CIRCUMFLEX AND DOT BELOW | |
83 (#x88 ?\u1EBC) ;; CAPITAL LETTER E WITH TILDE | |
84 (#x89 ?\u1EB8) ;; CAPITAL LETTER E WITH DOT BELOW | |
85 (#x8A ?\u1EBE) ;; CAPITAL LETTER E WITH CIRCUMFLEX AND ACUTE | |
86 (#x8B ?\u1EC0) ;; CAPITAL LETTER E WITH CIRCUMFLEX AND GRAVE | |
87 (#x8C ?\u1EC2) ;; CAPITAL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE | |
88 (#x8D ?\u1EC4) ;; CAPITAL LETTER E WITH CIRCUMFLEX AND TILDE | |
89 (#x8E ?\u1EC6) ;; CAPITAL LETTER E WITH CIRCUMFLEX AND DOT BELOW | |
90 (#x8F ?\u1ED0) ;; CAPITAL LETTER O WITH CIRCUMFLEX AND ACUTE | |
91 (#x90 ?\u1ED2) ;; CAPITAL LETTER O WITH CIRCUMFLEX AND GRAVE | |
92 (#x91 ?\u1ED4) ;; CAPITAL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE | |
93 (#x92 ?\u1ED6) ;; CAPITAL LETTER O WITH CIRCUMFLEX AND TILDE | |
94 (#x93 ?\u1ED8) ;; CAPITAL LETTER O WITH CIRCUMFLEX AND DOT BELOW | |
95 (#x94 ?\u1EE2) ;; CAPITAL LETTER O WITH HORN AND DOT BELOW | |
96 (#x95 ?\u1EDA) ;; CAPITAL LETTER O WITH HORN AND ACUTE | |
97 (#x96 ?\u1EDC) ;; CAPITAL LETTER O WITH HORN AND GRAVE | |
98 (#x97 ?\u1EDE) ;; CAPITAL LETTER O WITH HORN AND HOOK ABOVE | |
99 (#x98 ?\u1ECA) ;; CAPITAL LETTER I WITH DOT BELOW | |
100 (#x99 ?\u1ECE) ;; CAPITAL LETTER O WITH HOOK ABOVE | |
101 (#x9A ?\u1ECC) ;; CAPITAL LETTER O WITH DOT BELOW | |
102 (#x9B ?\u1EC8) ;; CAPITAL LETTER I WITH HOOK ABOVE | |
103 (#x9C ?\u1EE6) ;; CAPITAL LETTER U WITH HOOK ABOVE | |
104 (#x9D ?\u0168) ;; CAPITAL LETTER U WITH TILDE | |
105 (#x9E ?\u1EE4) ;; CAPITAL LETTER U WITH DOT BELOW | |
106 (#x9F ?\u1EF2) ;; CAPITAL LETTER Y WITH GRAVE | |
107 (#xA0 ?\u00D5) ;; CAPITAL LETTER O WITH TILDE | |
108 (#xA1 ?\u1EAF) ;; SMALL LETTER A WITH BREVE AND ACUTE | |
109 (#xA2 ?\u1EB1) ;; SMALL LETTER A WITH BREVE AND GRAVE | |
110 (#xA3 ?\u1EB7) ;; SMALL LETTER A WITH BREVE AND DOT BELOW | |
111 (#xA4 ?\u1EA5) ;; SMALL LETTER A WITH CIRCUMFLEX AND ACUTE | |
112 (#xA5 ?\u1EA7) ;; SMALL LETTER A WITH CIRCUMFLEX AND GRAVE | |
113 (#xA6 ?\u1EA9) ;; SMALL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE | |
114 (#xA7 ?\u1EAD) ;; SMALL LETTER A WITH CIRCUMFLEX AND DOT BELOW | |
115 (#xA8 ?\u1EBD) ;; SMALL LETTER E WITH TILDE | |
116 (#xA9 ?\u1EB9) ;; SMALL LETTER E WITH DOT BELOW | |
117 (#xAA ?\u1EBF) ;; SMALL LETTER E WITH CIRCUMFLEX AND ACUTE | |
118 (#xAB ?\u1EC1) ;; SMALL LETTER E WITH CIRCUMFLEX AND GRAVE | |
119 (#xAC ?\u1EC3) ;; SMALL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE | |
120 (#xAD ?\u1EC5) ;; SMALL LETTER E WITH CIRCUMFLEX AND TILDE | |
121 (#xAE ?\u1EC7) ;; SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW | |
122 (#xAF ?\u1ED1) ;; SMALL LETTER O WITH CIRCUMFLEX AND ACUTE | |
123 (#xB0 ?\u1ED3) ;; SMALL LETTER O WITH CIRCUMFLEX AND GRAVE | |
124 (#xB1 ?\u1ED5) ;; SMALL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE | |
125 (#xB2 ?\u1ED7) ;; SMALL LETTER O WITH CIRCUMFLEX AND TILDE | |
126 (#xB3 ?\u1EE0) ;; CAPITAL LETTER O WITH HORN AND TILDE | |
127 (#xB4 ?\u01A0) ;; CAPITAL LETTER O WITH HORN | |
128 (#xB5 ?\u1ED9) ;; SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW | |
129 (#xB6 ?\u1EDD) ;; SMALL LETTER O WITH HORN AND GRAVE | |
130 (#xB7 ?\u1EDF) ;; SMALL LETTER O WITH HORN AND HOOK ABOVE | |
131 (#xB8 ?\u1ECB) ;; SMALL LETTER I WITH DOT BELOW | |
132 (#xB9 ?\u1EF0) ;; CAPITAL LETTER U WITH HORN AND DOT BELOW | |
133 (#xBA ?\u1EE8) ;; CAPITAL LETTER U WITH HORN AND ACUTE | |
134 (#xBB ?\u1EEA) ;; CAPITAL LETTER U WITH HORN AND GRAVE | |
135 (#xBC ?\u1EEC) ;; CAPITAL LETTER U WITH HORN AND HOOK ABOVE | |
136 (#xBD ?\u01A1) ;; SMALL LETTER O WITH HORN | |
137 (#xBE ?\u1EDB) ;; SMALL LETTER O WITH HORN AND ACUTE | |
138 (#xBF ?\u01AF) ;; CAPITAL LETTER U WITH HORN | |
139 (#xC0 ?\u00C0) ;; CAPITAL LETTER A WITH GRAVE | |
140 (#xC1 ?\u00C1) ;; CAPITAL LETTER A WITH ACUTE | |
141 (#xC2 ?\u00C2) ;; CAPITAL LETTER A WITH CIRCUMFLEX | |
142 (#xC3 ?\u00C3) ;; CAPITAL LETTER A WITH TILDE | |
143 (#xC4 ?\u1EA2) ;; CAPITAL LETTER A WITH HOOK ABOVE | |
144 (#xC5 ?\u0102) ;; CAPITAL LETTER A WITH BREVE | |
145 (#xC6 ?\u1EB3) ;; SMALL LETTER A WITH BREVE AND HOOK ABOVE | |
146 (#xC7 ?\u1EB5) ;; SMALL LETTER A WITH BREVE AND TILDE | |
147 (#xC8 ?\u00C8) ;; CAPITAL LETTER E WITH GRAVE | |
148 (#xC9 ?\u00C9) ;; CAPITAL LETTER E WITH ACUTE | |
149 (#xCA ?\u00CA) ;; CAPITAL LETTER E WITH CIRCUMFLEX | |
150 (#xCB ?\u1EBA) ;; CAPITAL LETTER E WITH HOOK ABOVE | |
151 (#xCC ?\u00CC) ;; CAPITAL LETTER I WITH GRAVE | |
152 (#xCD ?\u00CD) ;; CAPITAL LETTER I WITH ACUTE | |
153 (#xCE ?\u0128) ;; CAPITAL LETTER I WITH TILDE | |
154 (#xCF ?\u1EF3) ;; SMALL LETTER Y WITH GRAVE | |
155 (#xD0 ?\u0110) ;; CAPITAL LETTER D WITH STROKE | |
156 (#xD1 ?\u1EE9) ;; SMALL LETTER U WITH HORN AND ACUTE | |
157 (#xD2 ?\u00D2) ;; CAPITAL LETTER O WITH GRAVE | |
158 (#xD3 ?\u00D3) ;; CAPITAL LETTER O WITH ACUTE | |
159 (#xD4 ?\u00D4) ;; CAPITAL LETTER O WITH CIRCUMFLEX | |
160 (#xD5 ?\u1EA1) ;; SMALL LETTER A WITH DOT BELOW | |
161 (#xD6 ?\u1EF7) ;; SMALL LETTER Y WITH HOOK ABOVE | |
162 (#xD7 ?\u1EEB) ;; SMALL LETTER U WITH HORN AND GRAVE | |
163 (#xD8 ?\u1EED) ;; SMALL LETTER U WITH HORN AND HOOK ABOVE | |
164 (#xD9 ?\u00D9) ;; CAPITAL LETTER U WITH GRAVE | |
165 (#xDA ?\u00DA) ;; CAPITAL LETTER U WITH ACUTE | |
166 (#xDB ?\u1EF9) ;; SMALL LETTER Y WITH TILDE | |
167 (#xDC ?\u1EF5) ;; SMALL LETTER Y WITH DOT BELOW | |
168 (#xDD ?\u00DD) ;; CAPITAL LETTER Y WITH ACUTE | |
169 (#xDE ?\u1EE1) ;; SMALL LETTER O WITH HORN AND TILDE | |
170 (#xDF ?\u01B0) ;; SMALL LETTER U WITH HORN | |
171 (#xE0 ?\u00E0) ;; SMALL LETTER A WITH GRAVE | |
172 (#xE1 ?\u00E1) ;; SMALL LETTER A WITH ACUTE | |
173 (#xE2 ?\u00E2) ;; SMALL LETTER A WITH CIRCUMFLEX | |
174 (#xE3 ?\u00E3) ;; SMALL LETTER A WITH TILDE | |
175 (#xE4 ?\u1EA3) ;; SMALL LETTER A WITH HOOK ABOVE | |
176 (#xE5 ?\u0103) ;; SMALL LETTER A WITH BREVE | |
177 (#xE6 ?\u1EEF) ;; SMALL LETTER U WITH HORN AND TILDE | |
178 (#xE7 ?\u1EAB) ;; SMALL LETTER A WITH CIRCUMFLEX AND TILDE | |
179 (#xE8 ?\u00E8) ;; SMALL LETTER E WITH GRAVE | |
180 (#xE9 ?\u00E9) ;; SMALL LETTER E WITH ACUTE | |
181 (#xEA ?\u00EA) ;; SMALL LETTER E WITH CIRCUMFLEX | |
182 (#xEB ?\u1EBB) ;; SMALL LETTER E WITH HOOK ABOVE | |
183 (#xEC ?\u00EC) ;; SMALL LETTER I WITH GRAVE | |
184 (#xED ?\u00ED) ;; SMALL LETTER I WITH ACUTE | |
185 (#xEE ?\u0129) ;; SMALL LETTER I WITH TILDE | |
186 (#xEF ?\u1EC9) ;; SMALL LETTER I WITH HOOK ABOVE | |
187 (#xF0 ?\u0111) ;; SMALL LETTER D WITH STROKE | |
188 (#xF1 ?\u1EF1) ;; SMALL LETTER U WITH HORN AND DOT BELOW | |
189 (#xF2 ?\u00F2) ;; SMALL LETTER O WITH GRAVE | |
190 (#xF3 ?\u00F3) ;; SMALL LETTER O WITH ACUTE | |
191 (#xF4 ?\u00F4) ;; SMALL LETTER O WITH CIRCUMFLEX | |
192 (#xF5 ?\u00F5) ;; SMALL LETTER O WITH TILDE | |
193 (#xF6 ?\u1ECF) ;; SMALL LETTER O WITH HOOK ABOVE | |
194 (#xF7 ?\u1ECD) ;; SMALL LETTER O WITH DOT BELOW | |
195 (#xF8 ?\u1EE5) ;; SMALL LETTER U WITH DOT BELOW | |
196 (#xF9 ?\u00F9) ;; SMALL LETTER U WITH GRAVE | |
197 (#xFA ?\u00FA) ;; SMALL LETTER U WITH ACUTE | |
198 (#xFB ?\u0169) ;; SMALL LETTER U WITH TILDE | |
199 (#xFC ?\u1EE7) ;; SMALL LETTER U WITH HOOK ABOVE | |
200 (#xFD ?\u00FD) ;; SMALL LETTER Y WITH ACUTE | |
201 (#xFE ?\u1EE3) ;; SMALL LETTER O WITH HORN AND DOT BELOW | |
202 (#xFF ?\u1EEE)) ;; CAPITAL LETTER U WITH HORN AND TILDE | |
771 | 203 "VISCII 1.1 (Vietnamese)" |
4072 | 204 '(mnemonic "VISCII")) |
428 | 205 |
206 (set-language-info-alist | |
207 "Vietnamese" '((charset vietnamese-viscii-lower vietnamese-viscii-upper) | |
4133 | 208 (coding-system viscii) |
428 | 209 (coding-priority viscii) |
771 | 210 (locale "vietnamese" "vi") |
3970 | 211 ;; Not available in packages. |
212 ;; (input-method . "vietnamese-viqr") | |
428 | 213 (features viet-util) |
214 (sample-text . "Vietnamese (Tiếng Việt) Chào bạn") | |
215 (documentation . "\ | |
440 | 216 For Vietnamese, Emacs uses special charsets internally. |
428 | 217 They can be decoded from and encoded to VISCII, VSCII, and VIQR. |
218 The current setting gives higher priority to the coding system VISCII than VSCII. | |
219 If you prefer VSCII, please do: (prefer-coding-system 'vietnamese-vscii)") | |
220 )) | |
221 | |
222 ;;; vietnamese.el ends here |
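
To make the octet table above more concrete, here is a minimal sketch of how such an octet-to-code-point mapping could be applied when decoding. It is not how `make-8-bit-coding-system` is implemented: the names `viscii-sketch-map` and `viscii-sketch-decode` are hypothetical, the alist copies only four rows from the table, and code points are written as plain integers so the snippet does not depend on `?\u` reader syntax. Treating unmapped octets above #x7f as undefined follows the changeset description; because the alist is partial, low octets that VISCII does remap (such as #x02) simply pass through here.

```elisp
;; Hypothetical names throughout; only a few rows of the VISCII table.
(defvar viscii-sketch-map
  '((#xAA . #x1EBF)   ; SMALL LETTER E WITH CIRCUMFLEX AND ACUTE
    (#xAE . #x1EC7)   ; SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW
    (#xD5 . #x1EA1)   ; SMALL LETTER A WITH DOT BELOW
    (#xE0 . #x00E0))  ; SMALL LETTER A WITH GRAVE
  "Partial octet-to-code-point alist, for illustration only.")

(defun viscii-sketch-decode (octets)
  "Map each integer in OCTETS to a Unicode code point.
Octets with an entry in `viscii-sketch-map' use that entry.
Unmapped octets below #x80 pass through unchanged as ASCII.
Unmapped octets from #x80 up become nil, standing in for
\"undefined\"."
  (mapcar (lambda (octet)
            (let ((entry (assq octet viscii-sketch-map)))
              (cond (entry (cdr entry))
                    ((< octet #x80) octet)
                    (t nil))))
          octets))

;; The word "Việt" as VISCII octets: V, i, #xAE, t.
(viscii-sketch-decode '(#x56 #x69 #xAE #x74))
;; => (86 105 7879 116)
```

The result is a list of code points rather than a string, which keeps the sketch independent of any particular Emacs character representation; 7879 is #x1EC7, the code point the table assigns to octet #xAE.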