Mercurial > hg > xemacs-beta
comparison lisp/mule/cyrillic.el @ 3767:6b2ef948e140
[xemacs-hg @ 2006-12-29 18:09:38 by aidan]
etc/ChangeLog addition:
2006-12-21 Aidan Kehoe <kehoea@parhasard.net>
* unicode/unicode-consortium/8859-7.TXT:
Update the mapping to the 2003 version of ISO 8859-7.
lisp/ChangeLog addition:
2006-12-21 Aidan Kehoe <kehoea@parhasard.net>
* mule/cyrillic.el:
* mule/cyrillic.el (iso-8859-5):
* mule/cyrillic.el (cyrillic-koi8-r-encode-table):
Add syntax, case support for Cyrillic; make some parentheses more
Lispy.
* mule/european.el:
Content moved to latin.el, file deleted.
* mule/general-late.el:
If Unicode tables are to be loaded at dump time, do it here, not
in loadup.el.
* mule/greek.el:
Add syntax, case support for Greek.
* mule/latin.el:
Move the content of european.el here. Change the case table
mappings to use hexadecimal codes, to make cross reference to the
standards easier. In all cases, take character syntax from similar
characters in Latin-1, rather than deciding separately what
syntax they should take. Add (incomplete) support for case with
Turkish. Remove description of the character sets used from the
language environments' doc strings, since now that we create
variant language environments on the fly, such descriptions will
often be inaccurate. Set the native-coding-system language info
property while setting the other coding-system properties of the
language.
* mule/misc-lang.el (ipa):
Remove the language environment. The International Phonetic
_Alphabet_ is not a language; it's inane to have a corresponding
language environment in XEmacs.
* mule/mule-cmds.el (create-variant-language-environment):
Also modify the coding-priority when creating a new language
environment; document that.
* mule/mule-cmds.el (get-language-environment-from-locale):
Recognise that the 'native-coding-system language-info property
can be a list, interpret it correctly when it is one.
2006-12-21 Aidan Kehoe <kehoea@parhasard.net>
* coding.el (coding-system-category):
Use the new 'unicode-type property for finding what sort of
Unicode coding system subtype a coding system is, instead of the
overshadowed 'type property.
* dumped-lisp.el (preloaded-file-list):
mule/european.el has been removed.
* loadup.el (really-early-error-handler):
Unicode tables loaded at dump time are now in
mule/general-late.el.
* simple.el (count-lines):
Add some backslashes to parentheses in docstrings to help
fontification along.
* simple.el (what-cursor-position):
Wrap a line to fit in 80 characters.
* unicode.el:
Use the 'unicode-type property, not 'type, for setting the Unicode
coding-system subtype.
src/ChangeLog addition:
2006-12-21 Aidan Kehoe <kehoea@parhasard.net>
* file-coding.c:
Update the make-coding-system docstring to reflect unicode-type.
* general-slots.h:
New symbol, unicode-type, since 'type was being overridden when
accessing a coding system's Unicode subtype.
* intl-win32.c:
Backslash a few parentheses, to help fontification along.
* intl-win32.c (complex_vars_of_intl_win32):
Use the 'unicode-type symbol, not 'type, when creating the
Microsoft Unicode coding system.
* unicode.c (unicode_putprop):
* unicode.c (unicode_getprop):
* unicode.c (unicode_print):
Using 'type as the property name when working out what Unicode
subtype a given coding system is was broken, since there's a
general coding system property called 'type. Change the former to
use 'unicode-type instead.
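
The property-name clash described in the src/ChangeLog entry above can be modelled with a toy lookup chain. This is only a sketch in plain Emacs Lisp: `sketch-getprop` and `sketch-unicode-getprop` are hypothetical names, the property values are illustrative, and none of it is the actual C code in unicode.c. It assumes the general lookup answers for `type` itself, so a subtype-specific handler is never asked about that name and needs a distinct name such as `unicode-type`.

```elisp
;; Toy model only: these functions are hypothetical and are not part of XEmacs.
(defun sketch-unicode-getprop (props name)
  "Subtype-specific lookup: answer only for the properties it owns."
  (cond ((eq name 'unicode-type) (plist-get props 'unicode-type))
        (t nil)))

(defun sketch-getprop (props name)
  "Generic lookup: handle the general `type' property directly,
and pass every other NAME on to the subtype-specific lookup."
  (cond ((eq name 'type) (plist-get props 'type))
        (t (sketch-unicode-getprop props name))))

(let ((props '(type unicode unicode-type utf-8)))
  (list (sketch-getprop props 'type)            ; general kind of coding system
        (sketch-getprop props 'unicode-type)))  ; Unicode subtype
;; => (unicode utf-8)
```

With a separate property name, both the general kind and the Unicode subtype stay reachable through the same lookup.
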
author | aidan |
---|---|
date | Fri, 29 Dec 2006 18:09:51 +0000 |
parents | f37a5923ceba |
children | fbf54025c136 |
comparison
3766:a3dcf9d17a40 | 3767:6b2ef948e140 |
---|---|
27 ;;; Commentary: | 27 ;;; Commentary: |
28 | 28 |
29 ;; The character set ISO8859-5 is supported. KOI-8 and ALTERNATIVNYJ are | 29 ;; The character set ISO8859-5 is supported. KOI-8 and ALTERNATIVNYJ are |
30 ;; converted to ISO8859-5 internally. | 30 ;; converted to ISO8859-5 internally. |
31 | 31 |
32 ;; Windows-1251 support deleted because XEmacs has automatic support. | 32 ;; [Windows-1251 support deleted because XEmacs has automatic support.] |
33 | |
34 ;; #### We only have automatic support on Windows; that needs to be put | |
35 ;; back. Also, the Russian Wikipedia articles on KOI-8 list several other | |
36 ;; related encodings--KOI8-U (Ukrainian), KOI8-RU (simultaneous support for | |
37 ;; Russian, Belorussian, and Ukrainian), KOI8-C (for languages of the | |
38 ;; Caucasus), KOI8-O (Old Church Slavonic)--and it would be nice to have | |
39 ;; them. Beyond that, we're currently trashing lots of code points with | |
40 ;; KOI-8 R; it would be nice to leverage the Unicode support to not do that. | |
33 | 41 |
34 ;;; Code: | 42 ;;; Code: |
35 | 43 |
36 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; | 44 ;; Case table: |
37 ;;; CYRILLIC | |
38 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; | |
39 | |
40 ;; ISO-8859-5 | |
41 | |
42 (loop | 45 (loop |
43 for (upper lower) | 46 for (upper lower) |
44 in '((#xcf #xef) ; YA | 47 in '((#xcf #xef) ; YA |
45 (#xce #xee) ; YU | 48 (#xce #xee) ; YU |
46 (#xcd #xed) ; E | 49 (#xcd #xed) ; E |
92 (put-case-table-pair (make-char 'cyrillic-iso8859-5 upper) | 95 (put-case-table-pair (make-char 'cyrillic-iso8859-5 upper) |
93 (make-char 'cyrillic-iso8859-5 lower) | 96 (make-char 'cyrillic-iso8859-5 lower) |
94 case-table)) | 97 case-table)) |
95 | 98 |
96 ;; The default character syntax is now word. Pay attention to the | 99 ;; The default character syntax is now word. Pay attention to the |
97 ;; exceptions in ISO-8859-5. | 100 ;; exceptions in ISO-8859-5, copying them from ISO-8859-1. |
98 (dolist (code '(#xAD ;; SOFT HYPHEN | 101 (loop |
99 #xF0 ;; NUMERO SIGN | 102 for (latin-1 cyrillic) |
100 #xFD)) ;; SECTION SIGN | 103 in '((#xAD #xAD) ;; SOFT HYPHEN |
101 (modify-syntax-entry (make-char 'cyrillic-iso8859-5 code) ".")) | 104 (#xA7 #xFD) ;; SECTION SIGN |
102 | 105 (#xA0 #xA0)) ;; NO BREAK SPACE |
103 ;; NO-BREAK SPACE | 106 with syntax-table = (standard-syntax-table) |
104 (modify-syntax-entry (make-char 'cyrillic-iso8859-5 #xA0) " ") | 107 do (modify-syntax-entry |
108 (make-char 'cyrillic-iso8859-5 cyrillic) | |
109 (string (char-syntax (make-char 'latin-iso8859-1 latin-1))) | |
110 syntax-table)) | |
111 | |
112 ;; Take NUMERO SIGN's syntax from #. | |
113 (modify-syntax-entry (make-char 'cyrillic-iso8859-5 #xF0) | |
114 (string (char-syntax ?\# (standard-syntax-table))) | |
115 (standard-syntax-table)) | |
105 | 116 |
106 (make-coding-system | 117 (make-coding-system |
107 'iso-8859-5 'iso2022 | 118 'iso-8859-5 'iso2022 |
108 "ISO-8859-5 (Cyrillic)" | 119 "ISO-8859-5 (Cyrillic)" |
109 '(charset-g0 ascii | 120 '(charset-g0 ascii |
110 charset-g1 cyrillic-iso8859-5 | 121 charset-g1 cyrillic-iso8859-5 |
111 charset-g2 t | 122 charset-g2 t |
112 charset-g3 t | 123 charset-g3 t |
113 mnemonic "ISO8/Cyr" | 124 mnemonic "ISO8/Cyr")) |
114 )) | |
115 | 125 |
116 (set-language-info-alist | 126 (set-language-info-alist |
117 "Cyrillic-ISO" '((charset cyrillic-iso8859-5) | 127 "Cyrillic-ISO" '((charset cyrillic-iso8859-5) |
118 (tutorial . "TUTORIAL.ru") | 128 (tutorial . "TUTORIAL.ru") |
119 (coding-system iso-8859-5) | 129 (coding-system iso-8859-5) |
153 (i 0)) | 163 (i 0)) |
154 (while (< i 256) | 164 (while (< i 256) |
155 (let* ((ch (aref cyrillic-koi8-r-decode-table i)) | 165 (let* ((ch (aref cyrillic-koi8-r-decode-table i)) |
156 (split (split-char ch))) | 166 (split (split-char ch))) |
157 (cond ((eq (car split) 'cyrillic-iso8859-5) | 167 (cond ((eq (car split) 'cyrillic-iso8859-5) |
158 (aset table (logior (nth 1 split) 128) i) | 168 (aset table (logior (nth 1 split) 128) i)) |
159 ) | |
160 ((eq ch 32)) | 169 ((eq ch 32)) |
161 ((eq (car split) 'ascii) | 170 ((eq (car split) 'ascii) |
162 (aset table ch i) | 171 (aset table ch i)))) |
163 ))) | |
164 (setq i (1+ i))) | 172 (setq i (1+ i))) |
165 table) | 173 table) |
166 "Cyrillic KOI8-R encoding table.") | 174 "Cyrillic KOI8-R encoding table.") |
167 | 175 |
168 ) | 176 ) |
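
The loop that builds the KOI8-R encoding table inverts the decode table: for each external byte it looks up the internal character and records the byte at that character's position. The following is a minimal, portable sketch of the same idea, not the code from cyrillic.el. `sketch-invert-decode-table` is a hypothetical helper, `(CHARSET . CODE)` pairs stand in for the result of `split-char`, and the `nil` fill value is an assumption, since the table's real initial contents fall in lines the comparison does not show.

```elisp
;; Sketch only: `sketch-invert-decode-table' is a hypothetical helper, and
;; (CHARSET . CODE) pairs stand in for what `split-char' would return.
(defun sketch-invert-decode-table (decode)
  "Return a 256-element encode table that inverts DECODE.
DECODE is a vector indexed by external byte.  Each element is a cons
\(CHARSET . CODE), where CODE is the 7-bit position within CHARSET."
  (let ((encode (make-vector 256 nil))  ; nil = no mapping (an assumption)
        (i 0))
    (while (< i (length decode))
      (let* ((entry (aref decode i))
             (charset (car entry))
             (code (cdr entry)))
        (cond ((eq charset 'cyrillic-iso8859-5)
               ;; Set the high bit to recover the 8-bit ISO 8859-5 byte.
               (aset encode (logior code 128) i))
              ((and (eq charset 'ascii) (= code 32))) ; skip 32, as the source does
              ((eq charset 'ascii)
               (aset encode code i))))
      (setq i (1+ i)))
    encode))

(let ((decode (make-vector 256 '(ascii . 32))))
  (aset decode 65 '(ascii . 65))                   ; "A" maps to itself
  (aset decode #xe1 '(cyrillic-iso8859-5 . #x30))  ; KOI8-R 0xE1 -> ISO 8859-5 0xB0
  (let ((encode (sketch-invert-decode-table decode)))
    (list (aref encode 65) (aref encode #xb0))))
;; => (65 225)
```

In the usage example, KOI8-R byte 0xE1 (CYRILLIC CAPITAL LETTER A) decodes to ISO 8859-5 position 0xB0, so the inverted table maps index 0xB0 back to 225 (0xE1).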