annotate lisp/mule/japanese.el @ 771:943eaba38521

[xemacs-hg @ 2002-03-13 08:51:24 by ben] The big ben-mule-21-5 check-in! Various files were added and deleted. See CHANGES-ben-mule. There are still some test suite failures. No crashes, though. Many of the failures have to do with problems in the test suite itself rather than in the actual code. I'll be addressing these in the next day or so -- none of the test suite failures are at all critical. Meanwhile I'll be trying to address the biggest issues -- i.e. build or run failures, which will almost certainly happen on various platforms. All comments should be sent to ben@xemacs.org -- use a Cc: if necessary when sending to mailing lists. There will be pre- and post- tags, something like pre-ben-mule-21-5-merge-in, and post-ben-mule-21-5-merge-in.
author ben
date Wed, 13 Mar 2002 08:54:06 +0000
parents 98528da0b7fc
children 2923009caf47
rev   line source
428
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1 ;;; japanese.el --- Japanese support -*- coding: iso-2022-7bit; -*-
2
3 ;; Copyright (C) 1995 Electrotechnical Laboratory, JAPAN.
4 ;; Licensed to the Free Software Foundation.
5 ;; Copyright (C) 1997 MORIOKA Tomohiko
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
6 ;; Copyright (C) 2000, 2002 Ben Wing.
428
7
8 ;; Keywords: multilingual, Japanese
9
10 ;; This file is part of XEmacs.
11
12 ;; XEmacs is free software; you can redistribute it and/or modify it
13 ;; under the terms of the GNU General Public License as published by
14 ;; the Free Software Foundation; either version 2, or (at your option)
15 ;; any later version.
16
17 ;; XEmacs is distributed in the hope that it will be useful, but
18 ;; WITHOUT ANY WARRANTY; without even the implied warranty of
19 ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
20 ;; General Public License for more details.
21
22 ;; You should have received a copy of the GNU General Public License
23 ;; along with XEmacs; see the file COPYING. If not, write to the Free
24 ;; Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
25 ;; 02111-1307, USA.
26
771
27 ;;; Synched up with: Emacs 20.6 (international/japanese.el).
28
428
29 ;;; Commentary:
30
31 ;; For Japanese, character sets JISX0201, JISX0208, JISX0212 are
32 ;; supported.
33
34 ;;; Code:
35
36 ;;; Syntax of Japanese characters.
37 (modify-syntax-entry 'katakana-jisx0201 "w")
38 (modify-syntax-entry 'japanese-jisx0212 "w")
39
40 (modify-syntax-entry 'japanese-jisx0208 "w")
41 (loop for row in '(33 34 40)
42 do (modify-syntax-entry `[japanese-jisx0208 ,row] "_"))
43 (loop for char in '(?$B!<(B ?$B!+(B ?$B!,(B ?$B!3(B ?$B!4(B ?$B!5(B ?$B!6(B ?$B!7(B ?$B!8(B ?$B!9(B ?$B!:(B ?$B!;(B)
44 do (modify-syntax-entry char "w"))
45 (modify-syntax-entry ?\$B!J(B "($B!K(B")
46 (modify-syntax-entry ?\$B!N(B "($B!O(B")
47 (modify-syntax-entry ?\$B!P(B "($B!Q(B")
48 (modify-syntax-entry ?\$B!V(B "($B!W(B")
49 (modify-syntax-entry ?\$B!X(B "($B!Y(B")
50 (modify-syntax-entry ?\$B!K(B ")$B!J(B")
51 (modify-syntax-entry ?\$B!O(B ")$B!N(B")
52 (modify-syntax-entry ?\$B!Q(B ")$B!P(B")
53 (modify-syntax-entry ?\$B!W(B ")$B!V(B")
54 (modify-syntax-entry ?\$B!Y(B ")$B!X(B")
55
56 ;;; Character categories S, A, H, K, G, Y, and C
57 (define-category ?S "Japanese 2-byte symbol character.")
58 (modify-category-entry [japanese-jisx0208 33] ?S)
59 (modify-category-entry [japanese-jisx0208 34] ?S)
60 (modify-category-entry [japanese-jisx0208 40] ?S)
61 (define-category ?A "Japanese 2-byte Alphanumeric character.")
62 (modify-category-entry [japanese-jisx0208 35] ?A)
63 (define-category ?H "Japanese 2-byte Hiragana character.")
64 (modify-category-entry [japanese-jisx0208 36] ?H)
65 (define-category ?K "Japanese 2-byte Katakana character.")
66 (modify-category-entry [japanese-jisx0208 37] ?K)
67 (define-category ?G "Japanese 2-byte Greek character.")
68 (modify-category-entry [japanese-jisx0208 38] ?G)
69 (define-category ?Y "Japanese 2-byte Cyrillic character.")
70 (modify-category-entry [japanese-jisx0208 39] ?Y)
71 (define-category ?C "Japanese 2-byte Kanji characters.")
72 (loop for row from 48 to 126
73 do (modify-category-entry `[japanese-jisx0208 ,row] ?C))
74 (loop for char in '(?$B!<(B ?$B!+(B ?$B!,(B)
75 do (modify-category-entry char ?K)
76 (modify-category-entry char ?H))
77 (loop for char in '(?$B!3(B ?$B!4(B ?$B!5(B ?$B!6(B ?$B!7(B ?$B!8(B ?$B!9(B ?$B!:(B ?$B!;(B)
78 do (modify-category-entry char ?C))
79 (modify-category-entry 'japanese-jisx0212 ?C)
80
81 (defvar japanese-word-regexp
82 "\\cA+\\cH*\\|\\cK+\\cH*\\|\\cC+\\cH*\\|\\cH+\\|\\ck+\\|\\sw+"
83 "Regular expression used to match a Japanese word.")
84
85 (set-word-regexp japanese-word-regexp)
86 (setq forward-word-regexp "\\w\\>")
87 (setq backward-word-regexp "\\<\\w")
88
89 ;;; Paragraph setting
90 (setq sentence-end
91 (concat
92 "\\("
93 "\\("
94 "[.?!][]\"')}]*"
95 "\\|"
96 "[$B!%!)!*(B][$B!O!I!G!K!Q!M!S!U!W!Y(B]*"
97 "\\)"
98 "\\($\\|\t\\| \\)"
99 "\\|"
100 "$B!#(B"
101 "\\)"
102 "[ \t\n]*"))
103 (setq paragraph-start "^[ $B!!(B\t\n\f]")
104 (setq paragraph-separate "^[ $B!!(B\t\f]*$")
105
106 ;; EGG specific setup
107 (define-egg-environment 'japanese
108 "Japanese settings for egg."
109 (lambda ()
771
110 (with-boundp '(its:*standard-modes* its:*current-map* wnn-server-type)
111 (with-fboundp 'its:get-mode-map
112 (when (not (featurep 'egg-jpn))
113 (load "its-hira")
114 (load "its-kata")
115 (load "its-hankaku")
116 (load "its-zenkaku")
117 (setq its:*standard-modes*
118 (append
119 (list (its:get-mode-map "roma-kana")
120 (its:get-mode-map "roma-kata")
121 (its:get-mode-map "downcase")
122 (its:get-mode-map "upcase")
123 (its:get-mode-map "zenkaku-downcase")
124 (its:get-mode-map "zenkaku-upcase"))
125 its:*standard-modes*))
126 (provide 'egg-jpn))
127 (setq wnn-server-type 'jserver)
128 ;; Can't do this here any more. Must do it when selecting egg-wnn
129 ;; or egg-sj3
130 ;; (setq egg-default-startup-file "eggrc-wnn")
131 (setq-default its:*current-map* (its:get-mode-map "roma-kana"))))))
428
132
450
98528da0b7fc Import from CVS: tag r21-2-40
cvs
parents: 428
diff changeset
133 ;; stuff for providing grammatic processing of Japanese text
428
134 ;; something like this should probably be created for all environments...
450
135 ;; #### Arrgh. This stuff should be defvar'd in either fill.el or kinsoku.el.
136 ;; Then the language environment should set these things, probably buffer-
137 ;; locally.
428
138
771
139 ;; #### will be moved to fill.el
140 (defvar space-insertable
141 (let* ((aletter (concat "\\(" ascii-char "\\|" kanji-char "\\)"))
142 (kanji-space-insertable
143 (concat
428
144 "$B!"(B" aletter "\\|"
145 "$B!#(B" aletter "\\|"
146 aletter "$B!J(B" "\\|"
147 "$B!K(B" aletter "\\|"
148 ascii-alphanumeric kanji-kanji-char "\\|"
771
149 kanji-kanji-char ascii-alphanumeric)))
150 (concat " " aletter "\\|" kanji-space-insertable))
151 "Regexp for finding points that can have spaces inserted into them for justification")
428
152
771
153 ;; Beginning of FSF synching with international/japanese.el.
154
428
155 ;; (make-coding-system
156 ;; 'iso-2022-jp 2 ?J
157 ;; "ISO 2022 based 7bit encoding for Japanese (MIME:ISO-2022-JP)"
158 ;; '((ascii japanese-jisx0208-1978 japanese-jisx0208
159 ;; latin-jisx0201 japanese-jisx0212 katakana-jisx0201) nil nil nil
160 ;; short ascii-eol ascii-cntl seven)
161 ;; '((safe-charsets ascii japanese-jisx0208-1978 japanese-jisx0208
162 ;; latin-jisx0201 japanese-jisx0212 katakana-jisx0201)
163 ;; (mime-charset . iso-2022-jp)))
164
165 (make-coding-system
166 'iso-2022-jp 'iso2022
771
167 "ISO-2022-JP (Japanese mail)"
428
168 '(charset-g0 ascii
169 short t
170 seven t
171 input-charset-conversion ((latin-jisx0201 ascii)
172 (japanese-jisx0208-1978 japanese-jisx0208))
173 mnemonic "MULE/7bit"
771
174 documentation
175 "Coding system used for communication with mail and news in Japan."
176 ))
177
178 (make-coding-system
179 'jis7 'iso2022
180 "JIS7 (old Japanese 7-bit encoding)"
181 '(charset-g0 ascii
182 charset-g1 katakana-jisx0201
183 short t
184 seven t
185 lock-shift t
186 input-charset-conversion ((latin-jisx0201 ascii)
187 (japanese-jisx0208-1978 japanese-jisx0208))
188 mnemonic "JIS7"
189 documentation
190 "Old JIS 7-bit encoding; mostly superseded by ISO-2022-JP.
191 Uses locking-shift (SI/SO) to select half-width katakana."
192 ))
193
194 (make-coding-system
195 'jis8 'iso2022
196 "JIS8 (old Japanese 8-bit encoding)"
197 '(charset-g0 ascii
198 charset-g1 katakana-jisx0201
199 short t
200 input-charset-conversion ((latin-jisx0201 ascii)
201 (japanese-jisx0208-1978 japanese-jisx0208))
202 mnemonic "JIS8"
203 documentation
204 "Old JIS 8-bit encoding; mostly superseded by ISO-2022-JP.
205 Uses high bytes for half-width katakana."
428
206 ))
207
208 (define-coding-system-alias 'junet 'iso-2022-jp)
209
210 ;; (make-coding-system
211 ;; 'iso-2022-jp-2 2 ?J
212 ;; "ISO 2022 based 7bit encoding for CJK, Latin-1, and Greek (MIME:ISO-2022-JP-2)"
213 ;; '((ascii japanese-jisx0208-1978 japanese-jisx0208
214 ;; latin-jisx0201 japanese-jisx0212 katakana-jisx0201
215 ;; chinese-gb2312 korean-ksc5601) nil
216 ;; (nil latin-iso8859-1 greek-iso8859-7) nil
217 ;; short ascii-eol ascii-cntl seven nil single-shift)
218 ;; '((safe-charsets ascii japanese-jisx0208-1978 japanese-jisx0208
219 ;; latin-jisx0201 japanese-jisx0212 katakana-jisx0201
220 ;; chinese-gb2312 korean-ksc5601
221 ;; latin-iso8859-1 greek-iso8859-7)
222 ;; (mime-charset . iso-2022-jp-2)))
223
224 ;; (make-coding-system
225 ;; 'japanese-shift-jis 1 ?S
226 ;; "Shift-JIS 8-bit encoding for Japanese (MIME:SHIFT_JIS)"
227 ;; nil
228 ;; '((safe-charsets ascii japanese-jisx0208 japanese-jisx0208-1978
229 ;; latin-jisx0201 katakana-jisx0201)
771
230 ;; (mime-charset . shift-jis)
428
231 ;; (charset-origin-alist (japanese-jisx0208 "SJIS" encode-sjis-char)
232 ;; (katakana-jisx0201 "SJIS" encode-sjis-char))))
233
234 (make-coding-system
771
235 'shift-jis 'shift-jis
236 "Shift-JIS"
237 '(mnemonic "Ja/SJIS"
238 documentation "The standard Japanese encoding in MS Windows."
239 ))
428
240
771
241 ;; A former name?
242 (define-coding-system-alias 'shift_jis 'shift-jis)
243
244 ;; FSF:
245 ;; (define-coding-system-alias 'shift-jis 'japanese-shift-jis)
428
246 ;; (define-coding-system-alias 'sjis 'japanese-shift-jis)
247
248 ;; (make-coding-system
249 ;; 'japanese-iso-7bit-1978-irv 2 ?j
250 ;; "ISO 2022 based 7-bit encoding for Japanese JISX0208-1978 and JISX0201-Roman"
251 ;; '((ascii japanese-jisx0208-1978 japanese-jisx0208
252 ;; latin-jisx0201 japanese-jisx0212 katakana-jisx0201 t) nil nil nil
253 ;; short ascii-eol ascii-cntl seven nil nil use-roman use-oldjis)
254 ;; '(ascii japanese-jisx0208-1978 japanese-jisx0208 latin-jisx0201))
255
256 (make-coding-system
257 'iso-2022-jp-1978-irv 'iso2022
771
258 "ISO-2022-JP-1978-IRV (Old JIS)"
428
259 '(charset-g0 ascii
260 short t
261 seven t
262 output-charset-conversion ((ascii latin-jisx0201)
263 (japanese-jisx0208 japanese-jisx0208-1978))
771
264 documentation
265 "This is a coding system used for old JIS terminals. It's an ISO
266 2022 based 7-bit encoding for Japanese JISX0208-1978 and JISX0201-Roman."
428
267 mnemonic "Ja-78/7bit"
268 ))
269
771
270 ;; FSF:
428
271 ;; (define-coding-system-alias 'iso-2022-jp-1978-irv 'japanese-iso-7bit-1978-irv)
272 ;; (define-coding-system-alias 'old-jis 'japanese-iso-7bit-1978-irv)
273
274 (define-coding-system-alias 'old-jis 'iso-2022-jp-1978-irv)
275
276 ;; (make-coding-system
277 ;; 'japanese-iso-8bit 2 ?E
278 ;; "ISO 2022 based EUC encoding for Japanese (MIME:EUC-JP)"
279 ;; '(ascii japanese-jisx0208 katakana-jisx0201 japanese-jisx0212
280 ;; short ascii-eol ascii-cntl nil nil single-shift)
281 ;; '((safe-charsets ascii latin-jisx0201 japanese-jisx0208 japanese-jisx0208-1978
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
282 ;; katakana-jisx0201 japanese-jisx0212)
428
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
283 ;; (mime-charset . euc-jp)))
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
284 ;;
428
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
285 (make-coding-system
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
286 'euc-jp 'iso2022
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
287 "Japanese EUC"
428
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
288 '(charset-g0 ascii
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
289 charset-g1 japanese-jisx0208
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
290 charset-g2 katakana-jisx0201
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
291 charset-g3 japanese-jisx0212
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
292 short t
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
293 mnemonic "Ja/EUC"
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
294 documentation
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
295 "Japanese EUC (Extended Unix Code), the standard Japanese encoding in Unix.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
296 Equivalent MIME encoding: EUC-JP.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
297
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
298 Japanese EUC was the forefather of all the different EUCs, which all follow
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
299 a similar structure:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
300
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
301 1. Up to four character sets can be encoded.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
302
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
303 2. This is a non-modal encoding, i.e. it is impossible to set a global state
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
304 that affects anything more than the directly following character. [Modal
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
305 encodings typically have escape sequences to change global settings, which
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
306 affect all the following characters until the setting is turned off.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
307 Modal encodings are typically used when it's necessary to support text in
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
308 a wide variety of character sets and still keep basic ASCII compatibility,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
309 or in cases (e.g. sending email) where the set of characters that can
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
310 pass the gateway is small and (typically) no high-bit range is available.]
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
311
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
312 3. The first character set is always ASCII or some national variant of it,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
313 and encoded in the standard ASCII position. All characters in all other
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
314 character sets are encoded entirely using high-half bytes. Therefore,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
315 it is safe to scan for ASCII characters, such as '/' to separate path
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
316 components, in the obvious way.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
317
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
318 4. Each of the other three character sets can be of dimension 1, 2, or 3.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
319 A dimension-1 character set contains 96 characters; a dimension-2 character
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
320 set contains 96 x 96 characters; and a dimension-3 character set contains
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
321 96 x 96 x 96 characters. 94 instead of 96 as the number of characters per
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
322 dimension is also supported. Character sets of dimensions 1, 2, and 3
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
323 use 1-3 bytes, respectively, to encode a character, and each byte is
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
324 in the range A0-FF (or A1-FE for those with 94 characters per dimension).
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
325
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
326 5. The four character sets encoded in EUC are called G0, G1, G2, and G3.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
327 As mentioned earlier, G0 is ASCII or some variant, and encoded into
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
328 the ASCII positions 00 - 7F. G1 is encoded directly by laying out
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
329 its bytes. G2 is encoded using an 8E byte followed by the character's
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
330 bytes. G3 is encoded using an 8F byte followed by the character's bytes."
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
331
428
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
332 ))
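;; The EUC layout described in the `euc-jp' documentation string above can
;; be sketched as a small byte classifier.  This is only an illustration
;; and not part of japanese.el: the function name `euc-jp-sketch-classify'
;; is hypothetical, the input is assumed to be a list of integers 0-255,
;; and trailing bytes are not validated.

```elisp
;; Hypothetical helper, NOT part of japanese.el.
(defun euc-jp-sketch-classify (bytes)
  "Split BYTES, a list of integers 0-255, into EUC-JP units.
Each unit is a list (GN BYTE...) where GN is one of g0, g1, g2, g3.
Assumes the charset assignment of the `euc-jp' coding system:
G1 is two bytes, G2 is 8E plus one byte, G3 is 8F plus two bytes."
  (let ((units nil))
    (while bytes
      (let ((b (car bytes)))
        (cond
         ;; G0: ASCII, encoded in the standard positions 00-7F.
         ((< b #x80)
          (setq units (cons (list 'g0 b) units)
                bytes (cdr bytes)))
         ;; G2: an 8E byte, then one byte of katakana-jisx0201.
         ((= b #x8e)
          (setq units (cons (list 'g2 (nth 1 bytes)) units)
                bytes (nthcdr 2 bytes)))
         ;; G3: an 8F byte, then two bytes of japanese-jisx0212.
         ((= b #x8f)
          (setq units (cons (list 'g3 (nth 1 bytes) (nth 2 bytes)) units)
                bytes (nthcdr 3 bytes)))
         ;; G1: two high-half bytes of japanese-jisx0208 (A1-FE each).
         ((and (>= b #xa1) (<= b #xfe))
          (setq units (cons (list 'g1 b (nth 1 bytes)) units)
                bytes (nthcdr 2 bytes)))
         (t (error "Invalid EUC-JP lead byte: %x" b)))))
    (nreverse units)))

;; "/" (2F), then C6 FC (a G1 pair), then 8E BA (a G2 katakana byte):
;; (euc-jp-sketch-classify '(#x2f #xc6 #xfc #x8e #xba))
;; => ((g0 47) (g1 198 252) (g2 186))
```

;; Because every non-ASCII unit consists only of bytes 80 and above, the
;; g0 branch is the only one that can ever see a byte such as 2F ("/").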
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
333
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
334 ;; FSF:
428
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
335 ;; (define-coding-system-alias 'euc-japan-1990 'japanese-iso-8bit)
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
336 ;; (define-coding-system-alias 'euc-japan 'japanese-iso-8bit)
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
337 ;; (define-coding-system-alias 'euc-jp 'japanese-iso-8bit)
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
338
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
339 (define-coding-system-alias 'euc-japan 'euc-jp) ; only for w3
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
340 (define-coding-system-alias 'japanese-euc 'euc-jp)
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
341
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
342 (set-language-info-alist
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
343 "Japanese" '((setup-function . setup-japanese-environment-internal)
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
344 (exit-function . exit-japanese-environment)
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
345 (tutorial . "TUTORIAL.ja")
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
346 (charset japanese-jisx0208 japanese-jisx0208-1978
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
347 japanese-jisx0212 latin-jisx0201 katakana-jisx0201)
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
348 (coding-system iso-2022-jp euc-jp
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
349 shift-jis iso-2022-jp-2)
428
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
350 (coding-priority iso-2022-jp euc-jp
771
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
351 shift-jis iso-2022-jp-2)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
352 ;; These locale names come from the X11R6 locale.alias file.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
353 ;; What an incredible fucking mess!!!!!!!!!!!!!!!!!!!!!!!!!!
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
354 ;; What's worse is that typical Unix implementations of
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
355 ;; setlocale() return exactly what you passed them, even
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
356 ;; though it's perfectly allowed (and in fact done under
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
357 ;; Windows) to expand the locale to its full form (including
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
358 ;; encoding), so you have some hint as to the encoding!!!
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
359 ;;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
360 ;; We order them in such a way that we're maximally likely
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
361 ;; to get an encoding name.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
362 ;;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
363 (locale
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
364 ;; SunOS 5.7: ja ja_JP.PCK ja_JP.UTF-8 japanese
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
365 ;; RedHat Linux 6.2J: ja ja_JP ja_JP.eucJP ja_JP.ujis \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
366 ;; japanese japanese.euc
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
367 ;; HP-UX 10.20: ja_JP.SJIS ja_JP.eucJPput ja_JP.kana8
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
368 ;; Cygwin b20.1: ja_JP.EUC
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
369 ;; FreeBSD 2.2.8: ja_JP.EUC ja_JP.SJIS
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
370
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
371 ;; EUC locales
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
372 "ja_JP.EUC"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
373 "ja_JP.eucJP"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
374 "ja_JP.AJEC"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
375 "ja_JP.ujis"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
376 "Japanese-EUC"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
377 "japanese.euc"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
378
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
379 ;; Shift-JIS locales
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
380 "ja_JP.SJIS"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
381 "ja_JP.mscode"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
382 "ja.SJIS"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
383
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
384 ;; 7-bit locales
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
385 "ja_JP.ISO-2022-JP"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
386 "ja_JP.jis7"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
387 "ja_JP.pjis"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
388 "ja_JP.JIS"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
389 "ja.JIS"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
390
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
391 ;; 8-bit locales
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
392 "ja_JP.jis8"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
393
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
394 ;; encoding-unspecified locales
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
395 "ja_JP"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
396 "Ja_JP"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
397 "Jp_JP"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
398 "japanese"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
399 "japan"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
400 "ja"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
401 )
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
402
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
403 (native-coding-system
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
404 ;; first, see if an explicit encoding was given.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
405 #'(lambda (locale)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
406 (let ((case-fold-search t))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
407 (cond
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
408 ;; many unix versions
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
409 ((string-match "\\.euc" locale) 'euc-jp)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
410 ((string-match "\\.sjis" locale) 'shift-jis)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
411
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
412 ;; X11R6 (CJKV p. 471)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
413 ((string-match "\\.jis7" locale) 'jis7)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
414 ((string-match "\\.jis8" locale) 'jis8)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
415 ((string-match "\\.mscode" locale) 'shift-jis)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
416 ((string-match "\\.pjis" locale) 'iso-2022-jp)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
417 ((string-match "\\.ujis" locale) 'euc-jp)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
418
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
419 ;; other names in X11R6 locale.alias
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
420 ((string-match "\\.ajec" locale) 'euc-jp)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
421 ((string-match "-euc" locale) 'euc-jp)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
422 ((string-match "\\.iso-2022-jp" locale) 'iso-2022-jp)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
423 ((string-match "\\.jis" locale) 'jis7) ;; or just jis?
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
424 )))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
425
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
426 ;; aix (CJKV p. 465)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
427 #'(lambda (locale)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
428 (when (eq system-type 'aix)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
429 (cond
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
430 ((string-match "^Ja_JP" locale) 'shift-jis)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
431 ((string-match "^ja_JP" locale) 'euc-jp))))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
432
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
433 ;; other X11R6 locale.alias
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
434 #'(lambda (locale)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
435 (cond
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
436 ((string-match "^Jp_JP" locale) 'euc-jp)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
437 ((and (eq system-type 'hpux) (equal locale "japanese"))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
438 'shift-jis)))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
439
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
440 ;; fallback
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
441 'euc-jp)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben]
ben
parents: 450
diff changeset
442
428
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
443 ;; (input-method . "japanese")
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
444 (features japan-util)
450
98528da0b7fc Import from CVS: tag r21-2-40
cvs
parents: 428
diff changeset
445 (sample-text . "Japanese (日本語) こんにちは, ｺﾝﾆﾁﾊ")
428
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
446 (documentation . t)))
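;; The `native-coding-system' entry above lists several functions followed
;; by a fallback symbol.  One plausible way to consume such a list is to
;; try each entry in order and stop at the first non-nil result.  The
;; sketch below shows that idea only; how XEmacs actually evaluates the
;; property is not shown in this file, and the name
;; `japanese-sketch-resolve-native' is hypothetical.

```elisp
;; Hypothetical helper, NOT part of japanese.el.
(defun japanese-sketch-resolve-native (spec locale)
  "Return the first coding system that SPEC yields for LOCALE.
SPEC is assumed to be a list whose entries are either symbols, taken
as coding system names, or functions of one argument LOCALE that
return a coding system symbol or nil."
  (let ((result nil))
    (while (and spec (not result))
      (let ((entry (car spec)))
        (setq result (if (symbolp entry)
                         entry                    ; e.g. the fallback
                       (funcall entry locale))    ; a matcher function
              spec (cdr spec))))
    result))

;; A reduced spec in the same shape as the one above: one matcher that
;; looks for an explicit encoding suffix, then a fallback symbol.
(defvar japanese-sketch-spec
  (list (lambda (locale)
          (let ((case-fold-search t))
            (cond ((string-match "\\.euc" locale) 'euc-jp)
                  ((string-match "\\.sjis" locale) 'shift-jis))))
        'euc-jp)
  "Hypothetical example spec for `japanese-sketch-resolve-native'.")

;; (japanese-sketch-resolve-native japanese-sketch-spec "ja_JP.SJIS")
;; => shift-jis
;; (japanese-sketch-resolve-native japanese-sketch-spec "ja_JP")
;; => euc-jp
```

;; Ordering matters in this scheme: the explicit-suffix matcher has to run
;; before the fallback, since a symbol entry always ends the search.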
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
447
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
448 ;;; japanese.el ends here