annotate man/lispref/mule.texi @ 438:84b14dcb0985 r21-2-27

Import from CVS: tag r21-2-27
author cvs
date Mon, 13 Aug 2007 11:32:25 +0200
parents 3ecd8885ac67
children 8de8e3f6228a
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
@c -*-texinfo-*-
@c This is part of the XEmacs Lisp Reference Manual.
@c Copyright (C) 1996 Ben Wing.
@c See the file lispref.texi for copying conditions.
@setfilename ../../info/internationalization.info
@node MULE, Tips, Internationalization, top
@chapter MULE

@dfn{MULE} is the name originally given to the version of GNU Emacs
extended for multi-lingual (and in particular Asian-language) support.
``MULE'' is short for ``MUlti-Lingual Emacs''. It was originally called
Nemacs (``Nihon Emacs'' where ``Nihon'' is the Japanese word for
``Japan''), when it only provided support for Japanese. XEmacs
refers to its multi-lingual support as @dfn{MULE support} since it
is based on @dfn{MULE}.

@menu
* Internationalization Terminology::
                        Definition of various internationalization terms.
* Charsets::            Sets of related characters.
* MULE Characters::     Working with characters in XEmacs/MULE.
* Composite Characters:: Making new characters by overstriking other ones.
* ISO 2022::            An international standard for charsets and encodings.
* Coding Systems::      Ways of representing a string of chars using integers.
* CCL::                 A special language for writing fast converters.
* Category Tables::     Subdividing charsets into groups.
@end menu

@node Internationalization Terminology
@section Internationalization Terminology

In internationalization terminology, a string of text is divided up
into @dfn{characters}, which are the printable units that make up the
text. A single character is (for example) a capital @samp{A}, the
number @samp{2}, a Katakana character, a Kanji ideograph (an
@dfn{ideograph} is a ``picture'' character, such as is used in Japanese
Kanji, Chinese Hanzi, and Korean Hanja; typically there are thousands
of such ideographs in each language), etc. The basic property of a
character is its shape. Note that the same character may be drawn by
two different people (or in two different fonts) in slightly different
ways, although the basic shape will be the same.

In some cases, the differences will be significant enough that it is
actually possible to identify two or more distinct shapes that all
represent the same character. For example, the lowercase letters
@samp{a} and @samp{g} each have two distinct possible shapes -- the
@samp{a} can optionally have a curved tail projecting off the top, and
the @samp{g} can be formed either of two loops, or of one loop and a
tail hanging off the bottom. Such distinct possible shapes of a
character are called @dfn{glyphs}. The important characteristic of two
glyphs making up the same character is that the choice between one or
the other is purely stylistic and has no linguistic effect on a word.
(This is why a capital @samp{A} and lowercase @samp{a} are different
characters rather than different glyphs -- e.g. @samp{Aspen} is a city
while @samp{aspen} is a kind of tree.)

Note that @dfn{character} and @dfn{glyph} are used differently
here than elsewhere in XEmacs.

A @dfn{character set} is simply a set of related characters. ASCII,
for example, is a set of 94 characters (or 128, if you count
non-printing characters). Other character sets are ISO8859-1 (ASCII
plus various accented characters and other international symbols),
JISX0201 (ASCII, more or less, plus half-width Katakana), JISX0208
(Japanese Kanji), JISX0212 (a second set of less-used Japanese Kanji),
GB2312 (Mainland Chinese Hanzi), etc.

Every character set has one or more @dfn{orderings}, which can be
viewed as a way of assigning a number (or set of numbers) to each
character in the set. For most character sets, there is a standard
ordering, and in fact all of the character sets mentioned above define a
particular ordering. ASCII, for example, places letters in their
``natural'' order, puts uppercase letters before lowercase letters,
numbers before letters, etc. Note that for many of the Asian character
sets, there is no natural ordering of the characters. The actual
orderings are based on one or more salient characteristics, of which
there are many to choose from -- e.g. number of strokes, common
radicals, phonetic ordering, etc.

The numbers assigned to any particular character are called
the character's @dfn{position codes}. The number of position codes
required to index a particular character in a character set is called
the @dfn{dimension} of the character set. ASCII, being a relatively
small character set, is of dimension one, and each character in the
set is indexed using a single position code, in the range 0 through
127 (if non-printing characters are included) or 33 through 126
(if only the printing characters are considered). JISX0208, i.e.
Japanese Kanji, has thousands of characters, and is of dimension two --
every character is indexed by two position codes, each in the range
33 through 126. (Note that the choice of the range here is somewhat
arbitrary. Although a character set such as JISX0208 defines an
@emph{ordering} of all its characters, it does not define the actual
mapping between numbers and characters. You could just as easily
index the characters in JISX0208 using numbers in the range 0 through
93, 1 through 94, 2 through 95, etc. The reason for the actual range
chosen is so that the position codes match up with the actual values
used in the common encodings.)

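The arbitrariness of the range can be made concrete with a small Python
sketch (Python is used purely for illustration and is not part of
XEmacs). It assumes a 1-based pair of indices, here given the
hypothetical names @code{row} and @code{cell}, and shifts each by 32 to
obtain position codes in the range 33 through 126. The function names
are likewise made up for this example.

```python
OFFSET = 32    # shifts a 1..94 index into the range 33..126
SIZE = 94      # position codes available per dimension

def to_position_codes(row, cell):
    """Map a 1-based (row, cell) index to position codes in 33..126."""
    if not (1 <= row <= SIZE and 1 <= cell <= SIZE):
        raise ValueError("row and cell must be in 1..94")
    return (row + OFFSET, cell + OFFSET)

def to_row_cell(code1, code2):
    """Inverse mapping: position codes in 33..126 back to 1..94."""
    return (code1 - OFFSET, code2 - OFFSET)

print(SIZE ** 2)                  # 8836 slots in a dimension-2 set
print(to_position_codes(4, 2))    # (36, 34)
print(to_row_cell(36, 34))        # (4, 2)
```

Either numbering identifies the same character; only the offset differs.
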
An @dfn{encoding} is a way of numerically representing characters from
one or more character sets as a stream of like-sized numerical values
called @dfn{words}; typically these are 8-bit, 16-bit, or 32-bit
quantities. If an encoding encompasses only one character set, then the
position codes for the characters in that character set could be used
directly. (This is the case with ASCII, and as a result, most people do
not understand the difference between a character set and an encoding.)
This is not possible, however, if more than one character set is to be
used in the encoding. For example, printed Japanese text typically
requires characters from multiple character sets -- ASCII, JISX0208, and
JISX0212, to be specific. Each of these is indexed using one or more
position codes in the range 33 through 126, so the position codes could
not be used directly or there would be no way to tell which character
was meant. Different Japanese encodings handle this differently -- JIS
uses special escape characters to denote different character sets; EUC
sets the high bit of the position codes for JISX0208 and JISX0212, and
puts a special extra byte before each JISX0212 character; etc. (JIS,
EUC, and most of the other encodings you will encounter are 7-bit or
8-bit encodings. There is one common 16-bit encoding, which is Unicode;
this strives to represent all the world's characters in a single large
character set. 32-bit encodings are generally used internally in
programs to simplify the code that manipulates them; however, they are
not much used externally because they are not very space-efficient.)

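As a rough illustration, the following Python sketch encodes the same
two-character string in two ways. It assumes that Python's built-in
@code{iso2022_jp} and @code{euc_jp} codecs correspond to the JIS and EUC
encodings described above; the variable names are arbitrary.

```python
# One ASCII character followed by one JISX0208 character.
text = "A\u3042"   # LATIN CAPITAL LETTER A, HIRAGANA LETTER A

jis = text.encode("iso2022_jp")   # escape sequences select the charset
euc = text.encode("euc_jp")       # high bit marks the JISX0208 bytes

print(jis)   # b'A\x1b$B$"\x1b(B'
print(euc)   # b'A\xa4\xa2'
```

In the JIS output the position codes 0x24 and 0x22 appear unchanged
between two escape sequences; in the EUC output the same codes appear
with the high bit set (0xA4 and 0xA2).
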
Encodings are classified as either @dfn{modal} or @dfn{non-modal}. In
a @dfn{modal encoding}, there are multiple states that the encoding can be in,
and the interpretation of the values in the stream depends on the
current global state of the encoding. Special values in the encoding,
called @dfn{escape sequences}, are used to change the global state.
JIS, for example, is a modal encoding. The bytes @samp{ESC $ B}
indicate that, from then on, bytes are to be interpreted as position
codes for JISX0208, rather than as ASCII. This effect is cancelled
using the bytes @samp{ESC ( B}, which mean ``switch from whatever the
current state is to ASCII''. To switch to JISX0212, the escape sequence
@samp{ESC $ ( D} is used. (Note that here, as is common, the escape
sequences do in fact begin with @samp{ESC}. This is not necessarily the
case, however.)

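The state handling described above can be sketched as a small Python
decoder. This is a minimal sketch, not the XEmacs implementation: it
recognizes only the three escape sequences named in the text, assumes
the stream starts in the ASCII state, and returns position codes rather
than characters. The names @code{DESIGNATIONS} and @code{decode_jis} are
hypothetical.

```python
ESC = 0x1B

# Escape sequences named in the text, mapped to (charset, dimension).
DESIGNATIONS = (
    (b"\x1b$(D", ("JISX0212", 2)),
    (b"\x1b$B", ("JISX0208", 2)),
    (b"\x1b(B", ("ASCII", 1)),
)

def decode_jis(data):
    """Split a JIS byte string into (charset, position-codes) pairs."""
    charset, dimension = "ASCII", 1   # assumed initial state
    out = []
    i = 0
    while i < len(data):
        if data[i] == ESC:
            for seq, state in DESIGNATIONS:
                if data.startswith(seq, i):
                    charset, dimension = state   # change the global state
                    i += len(seq)
                    break
            else:
                raise ValueError("unknown escape sequence at offset %d" % i)
            continue
        # Consume as many bytes as the current charset has dimensions.
        out.append((charset, tuple(data[i:i + dimension])))
        i += dimension
    return out

print(decode_jis(b"A\x1b$BAA\x1b(BA"))
# [('ASCII', (65,)), ('JISX0208', (65, 65)), ('ASCII', (65,))]
```

The byte 0x41 occurs four times in the input, but only the first and
last occurrences are the letter @samp{A}.
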
A @dfn{non-modal encoding} has no global state that extends past the
character currently being interpreted. EUC, for example, is a
non-modal encoding. Characters in JISX0208 are encoded by setting
the high bit of the position codes, and characters in JISX0212 are
encoded by doing the same but also prefixing the character with the
byte 0x8F.

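The EUC rules just given amount to a few lines of arithmetic. The
following Python sketch applies them to a single character given as a
charset name plus position codes; the function name @code{euc_encode}
and the string labels for the charsets are assumptions made for this
example only.

```python
def euc_encode(charset, codes):
    """Encode one character, given its charset name and position codes."""
    if charset == "ASCII":
        return bytes(codes)
    high = bytes(code | 0x80 for code in codes)   # set the high bit
    if charset == "JISX0208":
        return high
    if charset == "JISX0212":
        return b"\x8f" + high                     # extra prefix byte
    raise ValueError("unsupported charset: " + charset)

print(euc_encode("ASCII", (0x41,)))            # b'A'
print(euc_encode("JISX0208", (0x24, 0x22)))    # b'\xa4\xa2'
print(euc_encode("JISX0212", (0x30, 0x21)))    # b'\x8f\xb0\xa1'
```

Because every byte of a JISX0208 or JISX0212 character has the high bit
set, no such byte can be mistaken for an ASCII character.
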
The advantage of a modal encoding is that it is generally more
space-efficient, and is easily extensible because an essentially
arbitrary number of escape sequences can be created. The
disadvantage, however, is that it is much more difficult to work with
if it is not being processed in a sequential manner. In the non-modal
EUC encoding, for example, the byte 0x41 always refers to the letter
@samp{A}; whereas in JIS, it could either be the letter @samp{A}, or
one of the two position codes in a JISX0208 character, or one of the
two position codes in a JISX0212 character. Determining exactly which
one is meant could be difficult and time-consuming if the previous
bytes in the string have not already been processed.

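The cost of random access can be seen in a short Python sketch. To learn
which charset governs a byte in the middle of a JIS stream, the sketch
searches backwards for the most recent escape sequence, whereas for EUC
it inspects the byte alone. Only the three escape sequences named
earlier are handled, the initial state is assumed to be ASCII, and both
function names are hypothetical. The sketch also does not determine
whether the byte is the first or second position code of a character,
which would require further scanning.

```python
ESCAPES = (
    (b"\x1b(B", "ASCII"),
    (b"\x1b$B", "JISX0208"),
    (b"\x1b$(D", "JISX0212"),
)

def jis_charset_at(data, offset):
    """Return the charset in effect for the byte at OFFSET in JIS data."""
    best_pos, best_name = -1, "ASCII"   # assumed initial state
    for seq, name in ESCAPES:
        # Latest occurrence that lies entirely before OFFSET.
        pos = data.rfind(seq, 0, offset)
        if pos > best_pos:
            best_pos, best_name = pos, name
    return best_name

def euc_is_ascii_at(data, offset):
    """In EUC the byte alone decides: ASCII bytes have the high bit clear."""
    return data[offset] < 0x80

jis = b"A\x1b$BAA\x1b(BA"
print(jis_charset_at(jis, 0))   # ASCII
print(jis_charset_at(jis, 4))   # JISX0208
print(jis_charset_at(jis, 9))   # ASCII
print(euc_is_ascii_at(b"A\xa4\xa2", 0))   # True
print(euc_is_ascii_at(b"A\xa4\xa2", 1))   # False
```
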
Non-modal encodings are further divided into @dfn{fixed-width} and
@dfn{variable-width} formats. A fixed-width encoding always uses
the same number of words per character, whereas a variable-width
encoding does not. EUC is a good example of a variable-width
encoding: one to three bytes are used per character, depending on
the character set. 16-bit and 32-bit encodings are nearly always
fixed-width, and this is in fact one of the main reasons for using
an encoding with a larger word size. The main advantage of a
fixed-width encoding is that any character can be located directly
from its index, without scanning the text that precedes it. The
advantages of variable-width encodings are that they are generally
more space-efficient and allow for compatibility with existing 8-bit
encodings such as ASCII.

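The difference in width can be observed directly with a short Python
sketch. It uses Python's built-in @code{euc_jp} codec for the
variable-width case and, as an assumption not drawn from the text above,
@code{utf_32_be} as a representative fixed-width 32-bit encoding. The
helper name @code{widths} is hypothetical.

```python
# One sample character each from ASCII, JISX0208, and JISX0212.
samples = ("A", "\u3042", "\u4e02")

def widths(codec):
    """Return the encoded length in bytes of each sample character."""
    return [len(ch.encode(codec)) for ch in samples]

print(widths("euc_jp"))      # variable width: depends on the charset
print(widths("utf_32_be"))   # fixed width: same for every character
```

With a fixed-width encoding, the byte offset of the Nth character is
simply N times the width; with EUC it can only be found by scanning.
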
Note that the bytes in an 8-bit encoding are often referred to
as @dfn{octets} rather than simply as bytes. This terminology
dates back to the days before 8-bit bytes were universal, when
some computers had 9-bit bytes, others had 10-bit bytes, etc.

@node Charsets
@section Charsets

A @dfn{charset} in MULE is an object that encapsulates a
particular character set as well as an ordering of those characters.
Charsets are permanent objects and are named using symbols, like
faces.

@defun charsetp object
This function returns non-@code{nil} if @var{object} is a charset.
@end defun

@menu
* Charset Properties::          Properties of a charset.
* Basic Charset Functions::     Functions for working with charsets.
* Charset Property Functions::  Functions for accessing charset properties.
* Predefined Charsets::         Predefined charset objects.
@end menu

@node Charset Properties
@subsection Charset Properties

Charsets have the following properties:

@table @code
@item name
A symbol naming the charset. Every charset must have a different name;
this allows a charset to be referred to using its name rather than
the actual charset object.
@item doc-string
A documentation string describing the charset.
@item registry
A regular expression matching the font registry field for this character
set. For example, both the @code{ascii} and @code{latin-iso8859-1}
charsets use the registry @code{"ISO8859-1"}. This field is used to
choose an appropriate font when the user gives a general font
specification such as @samp{-*-courier-medium-r-*-140-*}, i.e. a
14-point upright medium-weight Courier font.
@item dimension
Number of position codes used to index a character in the character set.
XEmacs/MULE can only handle character sets of dimension 1 or 2.
This property defaults to 1.
@item chars
Number of characters in each dimension. In XEmacs/MULE, the only
allowed values are 94 or 96. (There are a couple of pre-defined
character sets, such as ASCII, that do not follow this, but you cannot
define new ones like this.) Defaults to 94. Note that if the dimension
is 2, the character set thus described is 94x94 or 96x96.
@item columns
Number of columns used to display a character in this charset.
Only used in TTY mode. (Under X, the actual width of a character
can be derived from the font used to display the characters.)
If unspecified, defaults to the dimension. (This is almost
always the correct value, because character sets with dimension 2
are usually ideograph character sets, which need two columns to
display the intricate ideographs.)
@item direction
A symbol, either @code{l2r} (left-to-right) or @code{r2l}
(right-to-left). Defaults to @code{l2r}. This specifies the
direction that the text should be displayed in, and will be
left-to-right for most charsets but right-to-left for Hebrew
and Arabic. (Right-to-left display is not currently implemented.)
@item final
Final byte of the standard ISO 2022 escape sequence designating this
charset. Must be supplied. Each combination of (@var{dimension},
@var{chars}) defines a separate namespace for final bytes, and each
charset within a particular namespace must have a different final byte.
Note that ISO 2022 restricts the final byte to the range 0x30 - 0x7E if
dimension == 1, and 0x30 - 0x5F if dimension == 2. Note also that final
bytes in the range 0x30 - 0x3F are reserved for user-defined (not
official) character sets. For more information on ISO 2022, see @ref{Coding
Systems}.
@item graphic
0 (use left half of font on output) or 1 (use right half of font on
output). Defaults to 0. This specifies how to convert the position
codes that index a character in a character set into an index into the
font used to display the character set. With @code{graphic} set to 0,
position codes 33 through 126 map to font indices 33 through 126; with
it set to 1, position codes 33 through 126 map to font indices 161
through 254 (i.e. the same number but with the high bit set). For
example, for a font whose registry is ISO8859-1, the left half of the
font (octets 0x20 - 0x7F) is the @code{ascii} charset, while the right
half (octets 0xA0 - 0xFF) is the @code{latin-iso8859-1} charset.
@item ccl-program
A compiled CCL program used to convert a character in this charset into
an index into the font. This is in addition to the @code{graphic}
property. If a CCL program is defined, the position codes of a
character will first be processed according to @code{graphic} and
then passed through the CCL program, with the resulting values used
to index the font.

This is used, for example, in the Big5 character set (used in Taiwan).
This character set is not ISO-2022-compliant, and its size (94x157) does
not fit within the maximum 96x96 size of ISO-2022-compliant character
sets. As a result, XEmacs/MULE splits it (in a rather complex fashion,
so as to group the most commonly used characters together) into two
charset objects (@code{big5-1} and @code{big5-2}), each of size 94x94,
and each charset object uses a CCL program to convert the modified
position codes back into standard Big5 indices to retrieve a character
from a Big5 font.
@end table
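
The @code{graphic} mapping described in the table above amounts to
setting the high bit of each position code. The following is a minimal
sketch of that arithmetic only; the helper name
@code{my-graphic-font-index} is hypothetical and is not part of XEmacs,
and no CCL program is assumed to be involved.

@example
;; Hypothetical helper: map POSITION-CODE to a font index according
;; to GRAPHIC (0 = left half of the font, 1 = right half).
(defun my-graphic-font-index (position-code graphic)
  (if (= graphic 1)
      (logior position-code 128)        ; set the high bit
    position-code))

(my-graphic-font-index 33 0)            ; @result{} 33
(my-graphic-font-index 33 1)            ; @result{} 161
(my-graphic-font-index 126 1)           ; @result{} 254
@end example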

Most of the above properties can only be changed when the charset
is created. @xref{Charset Property Functions}.

@node Basic Charset Functions
@subsection Basic Charset Functions

@defun find-charset charset-or-name
This function retrieves the charset of the given name. If
@var{charset-or-name} is a charset object, it is simply returned.
Otherwise, @var{charset-or-name} should be a symbol. If there is no
such charset, @code{nil} is returned. Otherwise the associated charset
object is returned.
@end defun

@defun get-charset name
This function retrieves the charset of the given name. It is the same as
@code{find-charset} except that an error is signalled if there is no such
charset, instead of @code{nil} being returned.
@end defun
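
As a brief illustration of the difference between the two lookup
functions, consider the following sketch. The name
@code{no-such-charset} is a made-up symbol, assumed not to name any
defined charset.

@example
;; A symbol is looked up; a charset object is returned unchanged.
(find-charset 'ascii)
(find-charset (find-charset 'ascii))

;; For an unknown name, find-charset returns nil ...
(find-charset 'no-such-charset)

;; ... whereas get-charset signals an error, which is caught here.
(condition-case nil
    (get-charset 'no-such-charset)
  (error 'not-found))
@end example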

@defun charset-list
This function returns a list of the names of all defined charsets.
@end defun

@defun make-charset name doc-string props
This function defines a new character set. This function is for use
with Mule support. @var{name} is a symbol, the name by which the
character set is normally referred to. @var{doc-string} is a string
describing the character set. @var{props} is a property list
describing the specific nature of the character set. The recognized
properties are @code{registry}, @code{dimension}, @code{columns},
@code{chars}, @code{final}, @code{graphic}, @code{direction}, and
@code{ccl-program}, as previously described.
@end defun
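
A call to @code{make-charset} might look like the sketch below. The
charset name @code{my-test-charset}, its doc string, and the registry
@code{"MyTest-0"} are all invented for illustration. The final byte
@code{?5} is chosen from the user-defined range 0x30 - 0x3F and is
assumed not to be in use already by another one-dimensional,
94-character charset in your XEmacs.

@example
;; Hypothetical charset; every value here is an assumption.
(make-charset
 'my-test-charset
 "Hypothetical 94-character charset, for illustration only"
 '(registry "MyTest-0"
   dimension 1
   chars 94
   columns 1
   final ?5                  ; user-defined range 0x30 - 0x3F
   graphic 0
   direction l2r))
@end example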

@defun make-reverse-direction-charset charset new-name
This function makes a charset equivalent to @var{charset} but which goes
in the opposite direction. @var{new-name} is the name of the new
charset. The new charset is returned.
@end defun

@defun charset-from-attributes dimension chars final &optional direction
This function returns a charset with the given @var{dimension},
@var{chars}, @var{final}, and @var{direction}. If @var{direction} is
omitted, both directions will be checked (left-to-right will be returned
if character sets exist for both directions).
@end defun

@defun charset-reverse-direction-charset charset
This function returns the charset (if any) with the same dimension,
number of characters, and final byte as @var{charset}, but which is
displayed in the opposite direction.
@end defun
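
The lookups below sketch how these functions fit together. Going by
the table of predefined charsets in this chapter, the first call would
be expected to find @code{japanese-jisx0208} and the second
@code{ascii-r2l}, but the results depend on which charsets your XEmacs
actually defines.

@example
;; The two-dimensional, 94x94 charset whose final byte is `B'.
(charset-from-attributes 2 94 ?B)

;; Restrict the search to right-to-left charsets.
(charset-from-attributes 1 94 ?B 'r2l)

;; The opposite-direction counterpart of a charset, if there is one.
(charset-reverse-direction-charset (get-charset 'ascii))
@end example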

@node Charset Property Functions
@subsection Charset Property Functions

All of these functions accept either a charset name or charset object.

@defun charset-property charset prop
This function returns property @var{prop} of @var{charset}.
@xref{Charset Properties}.
@end defun

Convenience functions are also provided for retrieving individual
properties of a charset.

@defun charset-name charset
This function returns the name of @var{charset}. This will be a symbol.
@end defun

@defun charset-doc-string charset
This function returns the doc string of @var{charset}.
@end defun

@defun charset-registry charset
This function returns the registry of @var{charset}.
@end defun

@defun charset-dimension charset
This function returns the dimension of @var{charset}.
@end defun

@defun charset-chars charset
This function returns the number of characters per dimension of
@var{charset}.
@end defun

@defun charset-columns charset
This function returns the number of display columns per character (in
TTY mode) of @var{charset}.
@end defun

@defun charset-direction charset
This function returns the display direction of @var{charset} -- either
@code{l2r} or @code{r2l}.
@end defun

@defun charset-final charset
This function returns the final byte of the ISO 2022 escape sequence
designating @var{charset}.
@end defun

@defun charset-graphic charset
This function returns either 0 or 1, depending on whether the position
codes of characters in @var{charset} map to the left or right half
of their font, respectively.
@end defun

@defun charset-ccl-program charset
This function returns the CCL program, if any, for converting
position codes of characters in @var{charset} into font indices.
@end defun

The only property of a charset that can currently be set after
the charset has been created is the CCL program.

@defun set-charset-ccl-program charset ccl-program
This function sets the @code{ccl-program} property of @var{charset} to
@var{ccl-program}.
@end defun
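
The following sketch shows the property accessors being called both
with a charset name and with a charset object. The charsets used are
taken from the list of predefined charsets and are assumed to exist in
your XEmacs.

@example
;; A charset may be given by name ...
(charset-dimension 'japanese-jisx0208)
(charset-chars 'japanese-jisx0208)
(charset-final 'japanese-jisx0208)

;; ... or as a charset object.
(let ((cs (get-charset 'latin-iso8859-2)))
  (list (charset-name cs)
        (charset-registry cs)
        (charset-graphic cs)
        (charset-direction cs)))

;; charset-property takes the property name as a symbol.
(charset-property 'japanese-jisx0208 'dimension)
@end example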

@node Predefined Charsets
@subsection Predefined Charsets

The following charsets are predefined in the C code.

@example
Name                    Type  Fi Gr Dir Registry
--------------------------------------------------------------
ascii                    94    B  0 l2r ISO8859-1
control-1                94       0 l2r ---
latin-iso8859-1          96    A  1 l2r ISO8859-1
latin-iso8859-2          96    B  1 l2r ISO8859-2
latin-iso8859-3          96    C  1 l2r ISO8859-3
latin-iso8859-4          96    D  1 l2r ISO8859-4
cyrillic-iso8859-5       96    L  1 l2r ISO8859-5
arabic-iso8859-6         96    G  1 r2l ISO8859-6
greek-iso8859-7          96    F  1 l2r ISO8859-7
hebrew-iso8859-8         96    H  1 r2l ISO8859-8
latin-iso8859-9          96    M  1 l2r ISO8859-9
thai-tis620              96    T  1 l2r TIS620
katakana-jisx0201        94    I  1 l2r JISX0201.1976
latin-jisx0201           94    J  0 l2r JISX0201.1976
japanese-jisx0208-1978  94x94  @@  0 l2r JISX0208.1978
japanese-jisx0208       94x94  B  0 l2r JISX0208.19(83|90)
japanese-jisx0212       94x94  D  0 l2r JISX0212
chinese-gb2312          94x94  A  0 l2r GB2312
chinese-cns11643-1      94x94  G  0 l2r CNS11643.1
chinese-cns11643-2      94x94  H  0 l2r CNS11643.2
chinese-big5-1          94x94  0  0 l2r Big5
chinese-big5-2          94x94  1  0 l2r Big5
korean-ksc5601          94x94  C  0 l2r KSC5601
composite               96x96     0 l2r ---
@end example

The following charsets are predefined in the Lisp code.

@example
Name                    Type  Fi Gr Dir Registry
--------------------------------------------------------------
arabic-digit             94    2  0 l2r MuleArabic-0
arabic-1-column          94    3  0 r2l MuleArabic-1
arabic-2-column          94    4  0 r2l MuleArabic-2
sisheng                  94    0  0 l2r sisheng_cwnn\|OMRON_UDC_ZH
chinese-cns11643-3      94x94  I  0 l2r CNS11643.1
chinese-cns11643-4      94x94  J  0 l2r CNS11643.1
chinese-cns11643-5      94x94  K  0 l2r CNS11643.1
chinese-cns11643-6      94x94  L  0 l2r CNS11643.1
chinese-cns11643-7      94x94  M  0 l2r CNS11643.1
ethiopic                94x94  2  0 l2r Ethio
ascii-r2l                94    B  0 r2l ISO8859-1
ipa                      96    0  1 l2r MuleIPA
vietnamese-lower         96    1  1 l2r VISCII1.1
vietnamese-upper         96    2  1 l2r VISCII1.1
@end example

For all of the above charsets, the dimension and number of columns are
the same.

Note that ASCII, Control-1, and Composite are handled specially.
This is why some of the fields are blank, and why some of the filled-in
fields (e.g. the type) are not really accurate.
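
A listing comparable to the tables above can be gathered at run time
from the property functions. The following is only a sketch; the
values it returns depend on the charsets defined in your XEmacs, and
the specially handled charsets may report values that differ from the
tables.

@example
;; Collect (NAME DIMENSION CHARS FINAL GRAPHIC DIRECTION REGISTRY)
;; for every defined charset.
(mapcar (lambda (name)
          (list name
                (charset-dimension name)
                (charset-chars name)
                (charset-final name)
                (charset-graphic name)
                (charset-direction name)
                (charset-registry name)))
        (charset-list))
@end example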

@node MULE Characters
@section MULE Characters

@defun make-char charset arg1 &optional arg2
This function makes a multi-byte character from @var{charset} and octets
@var{arg1} and @var{arg2}. @var{arg2} is needed only for charsets of
dimension 2.
@end defun

@defun char-charset ch
This function returns the character set of the character @var{ch}.
@end defun

@defun char-octet ch &optional n
This function returns the octet (i.e. position code) numbered @var{n}
(should be 0 or 1) of the character @var{ch}. @var{n} defaults to 0 if
omitted.
@end defun

@defun find-charset-region start end &optional buffer
This function returns a list of the charsets in the region between
@var{start} and @var{end}. @var{buffer} defaults to the current buffer
if omitted.
@end defun

@defun find-charset-string string
This function returns a list of the charsets in @var{string}.
@end defun
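
The character functions can be combined as in the sketch below. The
position codes 36 and 34 are simply an in-range pair chosen for
illustration and are assumed to denote a valid character of
@code{japanese-jisx0208}. The two @code{char-octet} calls should give
back the position codes that were passed to @code{make-char}.

@example
;; Build a two-octet character, then take it apart again.
(let ((ch (make-char 'japanese-jisx0208 36 34)))
  (list (charset-name (char-charset ch))
        (char-octet ch 0)
        (char-octet ch 1)))

;; List the charsets used by the characters of a string.
(find-charset-string "abc")
@end example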

@node Composite Characters
@section Composite Characters

Composite characters are not yet completely implemented.

@defun make-composite-char string
This function converts a string into a single composite character. The
character is the result of overstriking all the characters in the
string.
@end defun

@defun composite-char-string ch
This function returns a string of the characters comprising a composite
character.
@end defun

@defun compose-region start end &optional buffer
This function composes the characters in the region from @var{start} to
@var{end} in @var{buffer} into one composite character. The composite
character replaces the composed characters. @var{buffer} defaults to
the current buffer if omitted.
@end defun

@defun decompose-region start end &optional buffer
This function decomposes any composite characters in the region from
@var{start} to @var{end} in @var{buffer}. This converts each composite
character into one or more characters, the individual characters out of
which the composite character was formed. Non-composite characters are
left as-is. @var{buffer} defaults to the current buffer if omitted.
@end defun
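
Since composite characters are not yet completely implemented, the
following is only a sketch of the intended usage and may not behave as
described in every build. The strings used are arbitrary.

@example
;; Overstrike `a' and `^' into one composite character, then
;; recover the characters it was formed from.
(let ((ch (make-composite-char "a^")))
  (composite-char-string ch))

;; Compose the first two characters of a scratch buffer, then
;; decompose everything again.
(with-temp-buffer
  (insert "a^ plain text")
  (compose-region 1 3)
  (decompose-region (point-min) (point-max))
  (buffer-string))
@end example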

@node ISO 2022
@section ISO 2022

This section briefly describes the ISO 2022 encoding standard. For a more
thorough understanding, please refer to the ISO 2022 standard itself.

Character sets (@dfn{charsets}) are classified into the following four
categories, according to the number of characters in the charset:
94-charset, 96-charset, 94x94-charset, and 96x96-charset.

@need 1000
@table @asis
@item 94-charset
ASCII(B), left(J) and right(I) half of JISX0201, ...
@item 96-charset
Latin-1(A), Latin-2(B), Latin-3(C), ...
@item 94x94-charset
GB2312(A), JISX0208(B), KSC5601(C), ...
@item 96x96-charset
none for the moment
@end table
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
539
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
540 The character in parentheses after the name of each charset
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
541 is the @dfn{final character} @var{F}, which can be regarded as
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
542 the identifier of the charset. ECMA allocates @var{F} to each
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
543 charset. @var{F} is in the range 0x30..0x7E, but 0x30..0x3F
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
544 are only for private use.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
545
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
546 Note: @dfn{ECMA} stands for the European Computer Manufacturers Association.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
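
The classification of final characters can be sketched in Lisp as
follows.  This is only an illustration: @code{iso2022-final-char-class}
is a hypothetical helper, not an XEmacs function, and it takes 0x7E as
the upper bound (as in ECMA-35), since 0x7F is the DEL character.

@example
@group
;; Hypothetical helper: classify BYTE (an integer) as a final character.
;; Returns nil if BYTE cannot be a final character at all.
(defun iso2022-final-char-class (byte)
  (cond ((or (< byte 48) (> byte 126)) nil)  ; outside 0x30..0x7E
        ((< byte 64) 'private)               ; 0x30..0x3F, private use
        (t 'registered)))                    ; 0x40..0x7E

(iso2022-final-char-class 66)      ; B, the final character of ASCII
     @result{} registered
(iso2022-final-char-class 48)      ; 0x30
     @result{} private
@end group
@end example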
547
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
548 There are four @dfn{registers of charsets}, called G0 through G3.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
549 You can designate (or assign) any charset to one of these
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
550 registers.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
551
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
552 The code space contained within one octet (of size 256) is divided into
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
553 4 areas: C0, GL, C1, and GR. GL and GR are the areas into which a
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
554 register of charset can be invoked.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
555
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
556 @example
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
557 @group
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
558 C0: 0x00 - 0x1F
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
559 GL: 0x20 - 0x7F
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
560 C1: 0x80 - 0x9F
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
561 GR: 0xA0 - 0xFF
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
562 @end group
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
563 @end example
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
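
The division of the code space shown above can be written down directly
in Lisp.  The following is a minimal sketch; @code{iso2022-octet-area}
is a hypothetical helper introduced here for illustration, not a
function provided by XEmacs.

@example
@group
;; Hypothetical helper: return the code area that OCTET (an integer
;; in the range 0..255) belongs to.
(defun iso2022-octet-area (octet)
  (cond ((< octet 32) 'c0)      ; 0x00 - 0x1F
        ((< octet 128) 'gl)     ; 0x20 - 0x7F
        ((< octet 160) 'c1)     ; 0x80 - 0x9F
        (t 'gr)))               ; 0xA0 - 0xFF

(iso2022-octet-area 27)         ; ESC (0x1B) lies in C0
     @result{} c0
(iso2022-octet-area 164)        ; 0xA4 lies in GR
     @result{} gr
@end group
@end example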
564
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
565 Usually, in the initial state, G0 is invoked into GL, and G1
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
566 is invoked into GR.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
567
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
568 ISO 2022 distinguishes between 7-bit and 8-bit environments. In
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
569 7-bit environments, only C0 and GL are used.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
570
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
571 Charset designation is done by escape sequences of the form:
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
572
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
573 @example
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
574 ESC [@var{I}] @var{I} @var{F}
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
575 @end example
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
576
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
577 where @var{I} is an intermediate character in the range 0x20 - 0x2F, and
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
578 @var{F} is the final character identifying this charset.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
579
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
580 The meanings of the intermediate characters are:
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
581
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
582 @example
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
583 @group
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
584 $ [0x24]: indicate charset of dimension 2 (94x94 or 96x96).
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
585 ( [0x28]: designate to G0 a 94-charset whose final byte is @var{F}.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
586 ) [0x29]: designate to G1 a 94-charset whose final byte is @var{F}.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
587 * [0x2A]: designate to G2 a 94-charset whose final byte is @var{F}.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
588 + [0x2B]: designate to G3 a 94-charset whose final byte is @var{F}.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
589 - [0x2D]: designate to G1 a 96-charset whose final byte is @var{F}.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
590 . [0x2E]: designate to G2 a 96-charset whose final byte is @var{F}.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
591 / [0x2F]: designate to G3 a 96-charset whose final byte is @var{F}.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
592 @end group
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
593 @end example
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
594
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
595 The following rule is not allowed in ISO 2022 but can be used in Mule.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
596
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
597 @example
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
598 , [0x2C]: designate to G0 a 96-charset whose final byte is @var{F}.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
599 @end example
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
600
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
601 Here are examples of designations:
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
602
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
603 @example
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
604 @group
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
605 ESC ( B : designate to G0 ASCII
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
606 ESC - A : designate to G1 Latin-1
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
607 ESC $ ( A or ESC $ A : designate to G0 GB2312
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
608 ESC $ ( B or ESC $ B : designate to G0 JISX0208
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
609 ESC $ ) C : designate to G1 KSC5601
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
610 @end group
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
611 @end example
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
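
The designation rules can be summarized in a few lines of Lisp.  This is
a sketch under stated assumptions: @code{iso2022-designation} and its
argument conventions are hypothetical and exist only for this
illustration.  It always produces the long form (for example
@samp{ESC $ ( B}, never the short form @samp{ESC $ B}), and it uses the
Mule-specific @samp{,} when a 96-charset is designated to G0.

@example
@group
;; Hypothetical helper: build the escape sequence that designates the
;; charset with final character FINAL (a one-character string) to
;; register REG (0..3).  SIZE is 94 or 96; DIMENSION is 1 or 2.
(defun iso2022-designation (reg size dimension final)
  (concat "\e"
          (if (= dimension 2) "$" "")
          (nth reg (if (= size 94)
                       '("(" ")" "*" "+")
                     '("," "-" "." "/")))
          final))

(iso2022-designation 0 94 1 "B")    ; ESC ( B     ASCII to G0
(iso2022-designation 1 96 1 "A")    ; ESC - A     Latin-1 to G1
(iso2022-designation 1 94 2 "C")    ; ESC $ ) C   KSC5601 to G1
@end group
@end example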
612
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
613 To use a charset designated to G2 or G3, and to use a charset designated
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
614 to G1 in a 7-bit environment, you must explicitly invoke G1, G2, or G3
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
615 into GL. There are two types of invocation: Locking Shift (persistent) and
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
616 Single Shift (one character only).
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
617
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
618 Locking Shift is done as follows:
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
619
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
620 @example
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
621 LS0 or SI (0x0F): invoke G0 into GL
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
622 LS1 or SO (0x0E): invoke G1 into GL
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
623 LS2: invoke G2 into GL
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
624 LS3: invoke G3 into GL
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
625 LS1R: invoke G1 into GR
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
626 LS2R: invoke G2 into GR
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
627 LS3R: invoke G3 into GR
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
628 @end example
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
629
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
630 Single Shift is done as follows:
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
631
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
632 @example
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
633 @group
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
634 SS2 or ESC N: invoke G2 into GL
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
635 SS3 or ESC O: invoke G3 into GL
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
636 @end group
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
637 @end example
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
638
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
639 (#### Ben says: I think the above is slightly incorrect. It appears that
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
640 SS2 invokes G2 into GR and SS3 invokes G3 into GR, whereas ESC N and
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
641 ESC O behave as indicated. The above definitions will not parse
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
642 EUC-encoded text correctly, and it looks like the code in mule-coding.c
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
643 has similar problems.)
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
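
For reference, the coded representations of the shift functions named
above can be collected in an association list.  The variable name
@code{iso2022-shift-sequences} is hypothetical; the byte sequences are
those assigned by ISO 2022 (ECMA-35).  The single shifts are given in
their 7-bit escape-sequence form; in an 8-bit environment SS2 and SS3
may also be sent as the single C1 bytes 0x8E and 0x8F.

@example
@group
;; Hypothetical table: shift function -> string that represents it.
(defvar iso2022-shift-sequences
  '((ls0  . "\C-o")     ; SI, 0x0F
    (ls1  . "\C-n")     ; SO, 0x0E
    (ls2  . "\en")      ; ESC n
    (ls3  . "\eo")      ; ESC o
    (ls1r . "\e~")      ; ESC ~
    (ls2r . "\e@}")      ; ESC @}
    (ls3r . "\e|")      ; ESC |
    (ss2  . "\eN")      ; ESC N
    (ss3  . "\eO"))     ; ESC O
  "Alist mapping ISO 2022 shift functions to their coded form.")

;; Look up the sequence for a given shift function:
(cdr (assq 'ls2 iso2022-shift-sequences))
@end group
@end example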
644
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
645 You may realize that there are a lot of ISO-2022-compliant ways of
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
646 encoding multilingual text. Indeed, there exist many coding
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
647 systems such as X11's Compound Text, Japanese JUNET code, and so-called
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
648 EUC (Extended UNIX Code); all of these are variants of ISO 2022.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
649
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
650 In Mule, we characterize ISO 2022 by the following attributes:
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
651
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
652 @enumerate
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
653 @item
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
654 Initial designation to G0 through G3.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
655 @item
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
656 Allow the short form of designation for Japanese and Chinese.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
657 @item
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
658 Should we designate ASCII to G0 before control characters?
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
659 @item
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
660 Should we designate ASCII to G0 at the end of line?
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
661 @item
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
662 7-bit environment or 8-bit environment.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
663 @item
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
664 Use Locking Shift or not.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
665 @item
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
666 Use ASCII or JISX0201-1976-Roman.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
667 @item
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
668 Use JISX0208-1983 or JISX0208-1976.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
669 @end enumerate
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
670
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
671 (The last two are only for Japanese.)
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
672
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
673 By specifying these attributes, you can create any variant
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
674 of ISO 2022.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
675
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
676 Here are several examples:
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
677
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
678 @example
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
679 @group
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
680 junet -- Coding system used in JUNET.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
681 1. G0 <- ASCII, G1..3 <- never used
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
682 2. Yes.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
683 3. Yes.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
684 4. Yes.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
685 5. 7-bit environment
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
686 6. No.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
687 7. Use ASCII
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
688 8. Use JISX0208-1983
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
689 @end group
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
690
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
691 @group
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
692 ctext -- Compound Text
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
693 1. G0 <- ASCII, G1 <- Latin-1, G2,3 <- never used
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
694 2. No.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
695 3. No.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
696 4. Yes.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
697 5. 8-bit environment
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
698 6. No.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
699 7. Use ASCII
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
700 8. Use JISX0208-1983
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
701 @end group
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
702
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
703 @group
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
704 euc-china -- Chinese EUC. Although many people call this
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
705 "GB encoding", the name may cause misunderstanding.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
706 1. G0 <- ASCII, G1 <- GB2312, G2,3 <- never used
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
707 2. No.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
708 3. Yes.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
709 4. Yes.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
710 5. 8-bit environment
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
711 6. No.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
712 7. Use ASCII
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
713 8. Use JISX0208-1983
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
714 @end group
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
715
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
716 @group
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
717 korean-mail -- Coding system used on Korean networks.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
718 1. G0 <- ASCII, G1 <- KSC5601, G2,3 <- never used
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
719 2. No.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
720 3. Yes.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
721 4. Yes.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
722 5. 7-bit environment
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
723 6. Yes.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
724 7. No.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
725 8. No.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
726 @end group
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
727 @end example
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
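
As an illustration of how such attributes might be written down, here is
a sketch of a JUNET-like definition.  It rests on assumptions: the name
@code{my-junet} is hypothetical, and the property names
@code{charset-g0}, @code{short} and @code{seven} are assumed to be the
ones that @code{make-coding-system} accepts for coding systems of type
@code{iso2022}.  Attributes 3, 4 and 6 are left at their assumed
defaults.

@example
@group
;; Sketch only; `my-junet' is a hypothetical name.
(make-coding-system
 'my-junet 'iso2022
 "JUNET-style 7-bit encoding (illustrative only)."
 '(charset-g0 ascii     ; attribute 1: G0 <- ASCII initially
   short t              ; attribute 2: allow the short form
   seven t              ; attribute 5: 7-bit environment
   mnemonic "Junet"))   ; string shown in the modeline
@end group
@end example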
728
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
729 Mule creates all these coding systems by default.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
730
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
731 @node Coding Systems
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
732 @section Coding Systems
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
733
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
734 A coding system is an object that defines how text containing multiple
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
735 character sets is encoded into a stream of (typically 8-bit) bytes. The
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
736 coding system is used to decode the stream into a series of characters
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
737 (which may be from multiple charsets) when the text is read from a file
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
738 or process, and is used to encode the text back into the same format
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
739 when it is written out to a file or process.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
740
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
741 For example, many ISO-2022-compliant coding systems (such as Compound
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
742 Text, which is used for inter-client data under the X Window System) use
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
743 escape sequences to switch between different charsets -- Japanese Kanji,
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
744 for instance, is designated with @samp{ESC $ ( B}; ASCII is designated with
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
745 @samp{ESC ( B}; and Cyrillic is designated with @samp{ESC - L}. See
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
746 @code{make-coding-system} for more information.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
747
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
748 Coding systems are normally identified using a symbol, and the symbol is
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
749 accepted in place of the actual coding system object whenever a coding
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
750 system is called for. (This is similar to how faces and charsets work.)
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
751
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
752 @defun coding-system-p object
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
753 This function returns non-@code{nil} if @var{object} is a coding system.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
754 @end defun
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
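
The distinction between a coding system object and the things that
merely name it can be seen in a short sketch.  It assumes that a coding
system named @code{binary} is defined and that
@code{find-coding-system} returns the object for a given name (or
@code{nil} if there is none).

@example
@group
;; A coding system object satisfies the predicate ...
(coding-system-p (find-coding-system 'binary))

;; ... whereas an arbitrary Lisp object, such as a string, does not.
(coding-system-p "binary")
     @result{} nil
@end group
@end example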
755
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
756 @menu
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
757 * Coding System Types:: Classifying coding systems.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
758 * EOL Conversion:: Dealing with different ways of denoting
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
759 the end of a line.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
760 * Coding System Properties:: Properties of a coding system.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
761 * Basic Coding System Functions:: Working with coding systems.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
762 * Coding System Property Functions:: Retrieving a coding system's properties.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
763 * Encoding and Decoding Text:: Encoding and decoding text.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
764 * Detection of Textual Encoding:: Determining how text is encoded.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
765 * Big5 and Shift-JIS Functions:: Special functions for these non-standard
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
766 encodings.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
767 @end menu
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
768
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
769 @node Coding System Types
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
770 @subsection Coding System Types
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
771
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
772 @table @code
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
773 @item nil
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
774 @itemx autodetect
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
775 Automatic conversion. XEmacs attempts to detect the coding system used
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
776 in the file.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
777 @item no-conversion
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
778 No conversion. Use this for binary files and such. On output, graphic
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
779 characters that are not in ASCII or Latin-1 will be replaced by a
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
780 @samp{?}. (For a no-conversion-encoded buffer, these characters will
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
781 only be present if you explicitly insert them.)
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
782 @item shift-jis
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
783 Shift-JIS (a Japanese encoding commonly used in PC operating systems).
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
784 @item iso2022
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
785 Any ISO-2022-compliant encoding. Among other things, this includes JIS
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
786 (the Japanese encoding commonly used for e-mail), national variants of
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
787 EUC (the standard Unix encoding for Japanese and other languages), and
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
788 Compound Text (an encoding used in X11). You can specify more specific
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
789 information about the conversion with the @var{flags} argument.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
790 @item big5
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
791 Big5 (the encoding commonly used for Traditional Chinese, as in Taiwan).
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
792 @item ccl
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
793 The conversion is performed using a user-written pseudo-code program.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
794 CCL (Code Conversion Language) is the name of this pseudo-code.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
795 @item internal
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
796 Write out or read in the raw contents of the memory representing the
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
797 buffer's text. This is primarily useful for debugging purposes, and is
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
798 only enabled when XEmacs has been compiled with @code{DEBUG_XEMACS} set
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
799 (the @samp{--debug} configure option). @strong{Warning}: Reading in a
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
800 file using @code{internal} conversion can result in an internal
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
801 inconsistency in the memory representing a buffer's text, which will
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
802 produce unpredictable results and may cause XEmacs to crash. Under
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
803 normal circumstances you should never use @code{internal} conversion.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
804 @end table
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
805
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
806 @node EOL Conversion
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
807 @subsection EOL Conversion
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
808
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
809 @table @code
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
810 @item nil
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
811 Automatically detect the end-of-line type (LF, CRLF, or CR). Also
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
812 generate subsidiary coding systems named @code{@var{name}-unix},
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
813 @code{@var{name}-dos}, and @code{@var{name}-mac}, that are identical to
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
814 this coding system but have an @code{eol-type} value of @code{lf}, @code{crlf},
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
815 and @code{cr}, respectively.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
816 @item lf
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
817 The end of a line is marked externally using ASCII LF. Since this is
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
818 also the way that XEmacs represents an end-of-line internally,
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
819 specifying this option results in no end-of-line conversion. This is
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
820 the standard format for Unix text files.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
821 @item crlf
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
822 The end of a line is marked externally using ASCII CRLF. This is the
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
823 standard format for MS-DOS text files.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
824 @item cr
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
825 The end of a line is marked externally using ASCII CR. This is the
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
826 standard format for Macintosh text files.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
827 @item t
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
828 Automatically detect the end-of-line type but do not generate subsidiary
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
829 coding systems. (This value is converted to @code{nil} when stored
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
830 internally, and @code{coding-system-property} will return @code{nil}.)
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
831 @end table
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
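
The naming convention for subsidiary coding systems can be sketched as
follows.  @code{my-subsidiary-name} is a hypothetical helper that only
computes the name; it does not create or look up any coding system.

@example
@group
;; Hypothetical helper: return the symbol naming the subsidiary of
;; NAME (a symbol) for EOL-TYPE (one of `lf', `crlf' or `cr').
(defun my-subsidiary-name (name eol-type)
  (intern (concat (symbol-name name)
                  (cdr (assq eol-type
                             '((lf   . "-unix")
                               (crlf . "-dos")
                               (cr   . "-mac")))))))

(my-subsidiary-name 'iso-2022-jp 'crlf)
     @result{} iso-2022-jp-dos
@end group
@end example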
832
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
@node Coding System Properties
@subsection Coding System Properties

@table @code
@item mnemonic
String to be displayed in the modeline when this coding system is
active.

@item eol-type
End-of-line conversion to be used. It should be one of the types
listed in @ref{EOL Conversion}.

@item post-read-conversion
Function called after a file has been read in, to perform the decoding.
Called with two arguments, @var{beg} and @var{end}, denoting a region of
the current buffer to be decoded.

@item pre-write-conversion
Function called before a file is written out, to perform the encoding.
Called with two arguments, @var{beg} and @var{end}, denoting a region of
the current buffer to be encoded.
@end table

The following additional properties are recognized if @var{type} is
@code{iso2022}:

@table @code
@item charset-g0
@itemx charset-g1
@itemx charset-g2
@itemx charset-g3
The character set initially designated to the G0 through G3 registers,
respectively. The value should be one of the following:

@itemize @bullet
@item
A charset object (designate that character set)
@item
@code{nil} (do not ever use this register)
@item
@code{t} (no character set is initially designated to the register, but
one may be designated later on; this automatically sets the
corresponding @code{force-g*-on-output} property)
@end itemize

@item force-g0-on-output
@itemx force-g1-on-output
@itemx force-g2-on-output
@itemx force-g3-on-output
If non-@code{nil}, send an explicit designation sequence on output
before using the specified register.

@item short
If non-@code{nil}, use the short forms @samp{ESC $ @@}, @samp{ESC $ A},
and @samp{ESC $ B} on output in place of the full designation sequences
@samp{ESC $ ( @@}, @samp{ESC $ ( A}, and @samp{ESC $ ( B}.

@item no-ascii-eol
If non-@code{nil}, don't designate ASCII to G0 at each end of line on
output. Setting this to non-@code{nil} also suppresses other
state-resetting that normally happens at the end of a line.

@item no-ascii-cntl
If non-@code{nil}, don't designate ASCII to G0 before control chars on
output.

@item seven
If non-@code{nil}, use 7-bit environment on output. Otherwise, use 8-bit
environment.

@item lock-shift
If non-@code{nil}, use locking-shift (SO/SI) instead of single-shift or
designation by escape sequence.

@item no-iso6429
If non-@code{nil}, don't use ISO6429's direction specification.

@item escape-quoted
If non-@code{nil}, literal control characters that are the same as the
beginning of a recognized ISO 2022 or ISO 6429 escape sequence (in
particular, ESC (0x1B), SO (0x0E), SI (0x0F), SS2 (0x8E), SS3 (0x8F),
and CSI (0x9B)) are ``quoted'' with an escape character so that they can
be properly distinguished from an escape sequence. (Note that doing
this results in a non-portable encoding.) This encoding flag is used for
byte-compiled files. Note that ESC is a good choice for a quoting
character because there are no escape sequences whose second byte is a
character from the Control-0 or Control-1 character sets; this is
explicitly disallowed by the ISO 2022 standard.

@item input-charset-conversion
A list of conversion specifications, specifying conversion of characters
in one charset to another when decoding is performed. Each
specification is a list of two elements: the source charset, and the
destination charset.

@item output-charset-conversion
A list of conversion specifications, specifying conversion of characters
in one charset to another when encoding is performed. The form of each
specification is the same as for @code{input-charset-conversion}.
@end table

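As a rough illustration of how these properties fit together, the
following sketch builds a property list for an ISO 2022 coding system
and registers it with @code{make-coding-system} (@pxref{Basic Coding
System Functions}). The name @code{my-iso2022-7bit}, the mnemonic, and
the use of a charset named @code{ascii} are assumptions made for this
example only; none of them comes from the tables above.

@example
;; Sketch only: `my-iso2022-7bit' is a made-up name.
(make-coding-system
 'my-iso2022-7bit 'iso2022
 "Hypothetical 7-bit ISO 2022 coding system (example only)."
 (list 'mnemonic "My7"                   ; shown in the modeline
       'charset-g0 (get-charset 'ascii)  ; designated to G0 at start
       'charset-g1 nil                   ; never use the G1 register
       'short t                          ; short designation forms
       'seven t))                        ; 7-bit environment on output
@end example

The property list is built with @code{list} so that @code{charset-g0}
receives a charset object, as the table above asks for.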
The following additional properties are recognized (and required) if
@var{type} is @code{ccl}:

@table @code
@item decode
CCL program used for decoding (converting to internal format).

@item encode
CCL program used for encoding (converting to external format).
@end table

@node Basic Coding System Functions
@subsection Basic Coding System Functions

@defun find-coding-system coding-system-or-name
This function retrieves the coding system of the given name.

If @var{coding-system-or-name} is a coding-system object, it is simply
returned. Otherwise, @var{coding-system-or-name} should be a symbol.
If there is no such coding system, @code{nil} is returned. Otherwise
the associated coding system object is returned.
@end defun

@defun get-coding-system name
This function retrieves the coding system of the given name. It is the
same as @code{find-coding-system} except that an error is signalled if
there is no such coding system, instead of @code{nil} being returned.
@end defun

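The difference between the two lookup functions matters mainly when a
name may be undefined. A minimal sketch follows, assuming that
@code{no-such-coding-system} is not defined and that the @code{binary}
coding system is:

@example
;; Returns nil rather than signalling an error.
(find-coding-system 'no-such-coding-system)

;; Signals an error for the same name; trap it with `condition-case'.
(condition-case nil
    (get-coding-system 'no-such-coding-system)
  (error 'undefined))

;; A coding-system object passed to `find-coding-system' is simply
;; returned.
(let ((cs (find-coding-system 'binary)))
  (eq cs (find-coding-system cs)))
@end example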
@defun coding-system-list
This function returns a list of the names of all defined coding systems.
@end defun

@defun coding-system-name coding-system
This function returns the name of the given coding system.
@end defun

@defun make-coding-system name type &optional doc-string props
This function registers symbol @var{name} as a coding system.

@var{type} describes the conversion method used and should be one of
the types listed in @ref{Coding System Types}.

@var{doc-string} is a string describing the coding system.

@var{props} is a property list, describing the specific nature of the
coding system. Recognized properties are as in @ref{Coding System
Properties}.
@end defun

@defun copy-coding-system old-coding-system new-name
This function copies @var{old-coding-system} to @var{new-name}. If
@var{new-name} does not name an existing coding system, a new one will
be created.
@end defun

@defun subsidiary-coding-system coding-system eol-type
This function returns the subsidiary coding system of
@var{coding-system} with eol type @var{eol-type}.
@end defun

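For example, a coding system can be copied under a new name, and an
end-of-line-specific variant can be retrieved, as sketched below. The
sketch assumes that @code{iso-2022-jp} is defined in the running XEmacs
and that its subsidiary coding systems have been generated;
@code{my-jis} is a hypothetical new name, and @code{crlf} is one of the
types listed in @ref{EOL Conversion}.

@example
;; Copy an existing coding system to a new (hypothetical) name.
(copy-coding-system (get-coding-system 'iso-2022-jp) 'my-jis)

;; Fetch the subsidiary coding system that uses CRLF line endings.
(subsidiary-coding-system (get-coding-system 'iso-2022-jp) 'crlf)
@end example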
@node Coding System Property Functions
@subsection Coding System Property Functions

@defun coding-system-doc-string coding-system
This function returns the doc string for @var{coding-system}.
@end defun

@defun coding-system-type coding-system
This function returns the type of @var{coding-system}.
@end defun

@defun coding-system-property coding-system prop
This function returns the @var{prop} property of @var{coding-system}.
@end defun

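These accessors combine naturally with @code{coding-system-list}. The
following sketch collects the names of all coding systems of a given
type. @code{my-coding-systems-of-type} is a hypothetical helper, not
part of XEmacs, and @code{iso-2022-jp} is assumed to be defined.

@example
(defun my-coding-systems-of-type (type)
  "Return the names of all defined coding systems of type TYPE."
  (let ((names (coding-system-list))
        (result nil))
    (while names
      ;; Keep the name if the coding system it denotes has TYPE.
      (if (eq (coding-system-type (get-coding-system (car names)))
              type)
          (setq result (cons (car names) result)))
      (setq names (cdr names)))
    (nreverse result)))

;; Names of all ISO 2022 coding systems:
(my-coding-systems-of-type 'iso2022)

;; Look up one property of one coding system:
(coding-system-property (get-coding-system 'iso-2022-jp) 'mnemonic)
@end example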
@node Encoding and Decoding Text
@subsection Encoding and Decoding Text

@defun decode-coding-region start end coding-system &optional buffer
This function decodes the text between @var{start} and @var{end} which
is encoded in @var{coding-system}. This is useful if you've read in
encoded text from a file without decoding it (e.g. you read in a
JIS-formatted file but used the @code{binary} or @code{no-conversion} coding
system, so that it shows up as @samp{^[$B!<!+^[(B}). The length of the
decoded text is returned. @var{buffer} defaults to the current buffer
if unspecified.
@end defun

@defun encode-coding-region start end coding-system &optional buffer
This function encodes the text between @var{start} and @var{end} using
@var{coding-system}. This will, for example, convert Japanese
characters into byte sequences such as @samp{^[$B!<!+^[(B} if you use
the JIS encoding. The length of the encoded text is returned.
@var{buffer} defaults to the current buffer if unspecified.
@end defun

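A minimal sketch of applying these two functions to a whole buffer
follows. The helper names @code{my-decode-buffer} and
@code{my-encode-buffer} are hypothetical, and @code{iso-2022-jp} is
assumed to name a defined coding system.

@example
(defun my-decode-buffer (name)
  "Decode the current buffer using the coding system called NAME.
The buffer is assumed to hold text that was read without decoding."
  (decode-coding-region (point-min) (point-max)
                        (get-coding-system name)))

(defun my-encode-buffer (name)
  "Encode the current buffer using the coding system called NAME."
  (encode-coding-region (point-min) (point-max)
                        (get-coding-system name)))

;; After reading a JIS file with the `binary' coding system:
(my-decode-buffer 'iso-2022-jp)
@end example

Going through @code{get-coding-system} means that a misspelled name
signals an error before any text is converted.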
@node Detection of Textual Encoding
@subsection Detection of Textual Encoding

@defun coding-category-list
This function returns a list of all recognized coding categories.
@end defun

@defun set-coding-priority-list list
This function changes the priority order of the coding categories.
@var{list} should be a list of coding categories, in descending order of
priority. Unspecified coding categories will be lower in priority than
all specified ones, in the same relative order they were in previously.
@end defun

@defun coding-priority-list
This function returns a list of coding categories in descending order of
priority.
@end defun

@defun set-coding-category-system coding-category coding-system
This function changes the coding system associated with a coding category.
@end defun

@defun coding-category-system coding-category
This function returns the coding system associated with a coding category.
@end defun

@defun detect-coding-region start end &optional buffer
This function detects the coding system of the text in the region
between @var{start} and @var{end}. The returned value is a list of
possible coding systems ordered by priority. If only ASCII characters
are found, it returns @code{autodetect} or one of its subsidiary coding
systems, according to the detected end-of-line type. The optional
argument @var{buffer} defaults to the current buffer.
@end defun

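Because unspecified categories keep their relative order, a single
category can be moved to the front of the priority list by passing a
one-element list. The helper @code{my-prefer-coding-category} below is
a hypothetical sketch of this. It takes its argument from whatever
@code{coding-category-list} reports rather than assuming particular
category names.

@example
(defun my-prefer-coding-category (category)
  "Give CATEGORY the highest priority for encoding detection."
  (if (memq category (coding-category-list))
      (set-coding-priority-list (list category))
    (error "Unknown coding category: %s" category)))

;; Promote the category that currently has the lowest priority,
;; then detect the coding system of the current buffer.
(my-prefer-coding-category (car (reverse (coding-priority-list))))
(detect-coding-region (point-min) (point-max))
@end example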
@node Big5 and Shift-JIS Functions
@subsection Big5 and Shift-JIS Functions

These are special functions for working with the non-standard
Shift-JIS and Big5 encodings.

@defun decode-shift-jis-char code
This function decodes a JISX0208 character from the Shift-JIS coding
system. @var{code} is the character code in Shift-JIS, as a cons of two
bytes. The corresponding character is returned.
@end defun

@defun encode-shift-jis-char ch
This function encodes the JISX0208 character @var{ch} in the Shift-JIS
coding system. The corresponding character code in Shift-JIS is
returned as a cons of two bytes.
@end defun

@defun decode-big5-char code
This function decodes a Big5 character from the Big5 coding system.
@var{code} is the character code in Big5. The corresponding character
is returned.
@end defun

@defun encode-big5-char ch
This function encodes the Big5 character @var{ch} in the Big5 coding
system. The corresponding character code in Big5 is returned.
@end defun

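As a sketch of how the Shift-JIS pair fits together, the following
decodes a two-byte code and encodes the resulting character again. The
bytes 0x82 0xA0 (decimal 130 and 160) are the Shift-JIS code for
HIRAGANA LETTER A. The round trip is expected to give back an equal
cons, although the descriptions above do not state this explicitly.

@example
(let* ((code (cons 130 160))              ; 0x82 . 0xA0 in Shift-JIS
       (ch (decode-shift-jis-char code))  ; the corresponding character
       (back (encode-shift-jis-char ch))) ; a cons of two bytes again
  (equal code back))
@end example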
@node CCL, Category Tables, Coding Systems, MULE
@section CCL

CCL (Code Conversion Language) is a simple structured programming
language designed for character coding conversions. A CCL program is
compiled to CCL code (represented by a vector of integers) and executed
by the CCL interpreter embedded in Emacs. The CCL interpreter
implements a virtual machine with 8 registers called @code{r0}, ...,
@code{r7}, a number of control structures, and some I/O operators. Take
care when using registers @code{r0} (used in implicit @dfn{set}
statements) and especially @code{r7} (used internally by several
statements and operations, especially for multiple return values and I/O
operations).

CCL is used for code conversion during process I/O and file I/O for
non-ISO2022 coding systems. (It is the only way for a user to specify a
code conversion function.) It is also used for calculating the code
point of an X11 font from a character code. However, since CCL is
designed as a powerful programming language, it can be used for more
generic calculation where efficiency is demanded. A combination of
three or more arithmetic operations can be calculated faster by CCL than
by Emacs Lisp.

1119 @strong{Warning:} The code in @file{src/mule-ccl.c} and
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1120 @file{$packages/lisp/mule-base/mule-ccl.el} is the definitive
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1121 description of CCL's semantics. The previous version of this section
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1122 contained several typos and obsolete names left from earlier versions of
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1123 MULE, and many may remain. (I am not an experienced CCL programmer; the
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1124 few who know CCL well find writing English painful.)
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1125
1126 A CCL program transforms an input data stream into an output data
1127 stream. The input stream, held in a buffer of constant bytes, is left
1128 unchanged. The buffer may be filled by an external input operation,
1129 taken from an Emacs buffer, or taken from a Lisp string. The output
1130 buffer is a dynamic array of bytes, which can be written by an external
1131 output operation, inserted into an Emacs buffer, or returned as a Lisp
1132 string.
1133
1134 A CCL program is a (Lisp) list containing two or three members. The
1135 first member is the @dfn{buffer magnification}, which indicates the
1136 required minimum size of the output buffer as a multiple of the input
1137 buffer. It is followed by the @dfn{main block} which executes while
1138 there is input remaining, and an optional @dfn{EOF block} which is
1139 executed when the input is exhausted. Both the main block and the EOF
1140 block are CCL blocks.
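As a minimal sketch of this layout, the following Lisp form binds a
three-member CCL program to a variable. The name
@code{ccl-sketch-copy} is hypothetical and not part of XEmacs; the
program is meant to copy each input byte to the output and to append a
newline once the input is exhausted.

@example
(defvar ccl-sketch-copy
  '(2                      ; buffer magnification
    ((loop                 ; main block
      (read r0)
      (write r0)
      (repeat)))
    ((write "\n")))        ; EOF block
  "Hypothetical CCL program: copy the input, then add a newline.")
@end example

The magnification of 2 is only an assumed, generous value; apart from
the final newline, a plain copy needs no more than 1.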
1141
1142 A @dfn{CCL block} is either a CCL statement or a list of CCL statements.
1143 A @dfn{CCL statement} is either a @dfn{set statement} (either an integer
1144 or an @dfn{assignment}, which is a list of a register to receive the
1145 assignment, an assignment operator, and an expression) or a @dfn{control
1146 statement} (a list starting with a keyword, whose allowable syntax
1147 depends on the keyword).
1148
1149 @menu
1150 * CCL Syntax:: CCL program syntax in BNF notation.
1151 * CCL Statements:: Semantics of CCL statements.
1152 * CCL Expressions:: Operators and expressions in CCL.
1153 * Calling CCL:: Running CCL programs.
1154 * CCL Examples:: The encoding functions for Big5 and KOI-8.
1155 @end menu
1156
1157 @node CCL Syntax, CCL Statements, CCL, CCL
1158 @comment Node, Next, Previous, Up
1159 @subsection CCL Syntax
1160
1161 The full syntax of a CCL program in BNF notation:
1162
1163 @format
1164 CCL_PROGRAM :=
1165         (BUFFER_MAGNIFICATION
1166          CCL_MAIN_BLOCK
1167          [ CCL_EOF_BLOCK ])
1168
1169 BUFFER_MAGNIFICATION := integer
1170 CCL_MAIN_BLOCK := CCL_BLOCK
1171 CCL_EOF_BLOCK := CCL_BLOCK
1172
1173 CCL_BLOCK :=
1174         STATEMENT | (STATEMENT [STATEMENT ...])
1175 STATEMENT :=
1176         SET | IF | BRANCH | LOOP | REPEAT | BREAK | READ | WRITE
1177         | CALL | END
1178
1179 SET :=
1180         (REG = EXPRESSION)
1181         | (REG ASSIGNMENT_OPERATOR EXPRESSION)
1182         | integer
1183
1184 EXPRESSION := ARG | (EXPRESSION OPERATOR ARG)
1185
1186 IF := (if EXPRESSION CCL_BLOCK [CCL_BLOCK])
1187 BRANCH := (branch EXPRESSION CCL_BLOCK [CCL_BLOCK ...])
1188 LOOP := (loop STATEMENT [STATEMENT ...])
1189 BREAK := (break)
1190 REPEAT :=
1191         (repeat)
1192         | (write-repeat [REG | integer | string])
1193         | (write-read-repeat REG [integer | ARRAY])
1194 READ :=
1195         (read REG ...)
1196         | (read-if (REG OPERATOR ARG) CCL_BLOCK [CCL_BLOCK])
1197         | (read-branch REG CCL_BLOCK [CCL_BLOCK ...])
1198 WRITE :=
1199         (write REG ...)
1200         | (write EXPRESSION)
1201         | (write integer) | (write string) | (write REG ARRAY)
1202         | string
1203 CALL := (call ccl-program-name)
1204 END := (end)
1205
1206 REG := r0 | r1 | r2 | r3 | r4 | r5 | r6 | r7
1207 ARG := REG | integer
1208 OPERATOR :=
1209         + | - | * | / | % | & | '|' | ^ | << | >> | <8 | >8 | //
1210         | < | > | == | <= | >= | != | de-sjis | en-sjis
1211 ASSIGNMENT_OPERATOR :=
1212         += | -= | *= | /= | %= | &= | '|=' | ^= | <<= | >>=
1213 ARRAY := '[' integer ... ']'
1214 @end format
1215
1216 @node CCL Statements, CCL Expressions, CCL Syntax, CCL
1217 @comment Node, Next, Previous, Up
1218 @subsection CCL Statements
1219
1220 The Emacs Code Conversion Language provides the following statement
1221 types: @dfn{set}, @dfn{if}, @dfn{branch}, @dfn{loop}, @dfn{repeat},
1222 @dfn{break}, @dfn{read}, @dfn{write}, @dfn{call}, and @dfn{end}.
1223
1224 @heading Set statement:
1225
1226 The @dfn{set} statement has three variants with the syntaxes
1227 @samp{(@var{reg} = @var{expression})},
1228 @samp{(@var{reg} @var{assignment_operator} @var{expression})}, and
1229 @samp{@var{integer}}. The assignment operator variant of the
1230 @dfn{set} statement works the same way as the corresponding C expression
1231 statement does. The assignment operators are @code{+=}, @code{-=},
1232 @code{*=}, @code{/=}, @code{%=}, @code{&=}, @code{|=}, @code{^=},
1233 @code{<<=}, and @code{>>=}. A
1234 ``naked integer'' @var{integer} is equivalent to a @code{set} statement of
1235 the form @code{(r0 = @var{integer})}.
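The three variants can be sketched together in one block. The variable
name @code{ccl-sketch-set-forms} is hypothetical, and the values in the
comments simply follow from the C-like semantics described above.

@example
(defvar ccl-sketch-set-forms
  '(1
    ((r1 = 10)            ; plain assignment: r1 is 10
     (r1 += 5)            ; as in C: r1 is 15
     (r2 = (r1 << 2))     ; expression on the right: r2 is 60
     (r2 |= 1)            ; as in C: r2 is 61
     65))                 ; naked integer: same as (r0 = 65)
  "Hypothetical CCL program exercising the three set variants.")
@end example

Because this block does no I/O, it is the kind of program that could be
run for its register values alone.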
1236
1237 @heading I/O statements:
1238
1239 The @dfn{read} statement takes one or more registers as arguments. It
1240 reads one byte (a C @code{char}) from the input into each register in turn.
1241
1242 The @dfn{write} statement takes several forms. In the form @samp{(write @var{reg}
1243 ...)} it takes one or more registers as arguments and writes each in
1244 turn to the output. The integer in a register (interpreted as an
1245 Emchar) is encoded to multibyte form (i.e., Bufbytes) and written to the
1246 current output buffer. If it is less than 256, it is written as is.
1247 The forms @samp{(write @var{expression})} and @samp{(write
1248 @var{integer})} are treated analogously. The form @samp{(write
1249 @var{string})} writes the constant string to the output. A
1250 ``naked string'' @samp{@var{string}} is equivalent to the statement @samp{(write
1251 @var{string})}. The form @samp{(write @var{reg} @var{array})} writes
1252 the @var{reg}th element of the @var{array} to the output.
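The forms of the @code{write} statement described above can be lined up
in a single sketch. The variable name and the buffer magnification of
10 are assumptions chosen for illustration only.

@example
(defvar ccl-sketch-write-forms
  '(10                        ; assumed, generous buffer magnification
    ((read r0)
     (write r0)               ; (write REG ...)
     (write (r0 + 1))         ; (write EXPRESSION)
     (write 10)               ; (write integer)
     (write "-- ")            ; (write string)
     "end"                    ; naked string, same as (write "end")
     (r1 = 2)
     (write r1 [97 98 99])))  ; (write REG ARRAY), indexed by r1
  "Hypothetical CCL program showing each form of the write statement.")
@end example

The array is written with integer elements, as the grammar requires,
rather than with character syntax.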
1253
1254 @heading Conditional statements:
1255
1256 The @dfn{if} statement takes an @var{expression}, a @var{CCL block}, and
1257 an optional @var{second CCL block} as arguments. If the
1258 @var{expression} evaluates to non-zero, the first @var{CCL block} is
1259 executed. Otherwise, if there is a @var{second CCL block}, it is
1260 executed.
1261
1262 The @dfn{read-if} variant of the @dfn{if} statement takes an
1263 @var{expression}, a @var{CCL block}, and an optional @var{second CCL
1264 block} as arguments. The @var{expression} must have the form
1265 @code{(@var{reg} @var{operator} @var{operand})} (where @var{operand} is
1266 a register or an integer). The @code{read-if} statement first reads
1267 from the input into the first register operand in the @var{expression},
1268 then conditionally executes a CCL block just as the @code{if} statement
1269 does.
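A small sketch of @code{read-if}, under the semantics just described:
the hypothetical program below reads a byte into @code{r0}, writes a
linefeed (10) in place of a carriage return (13), and copies every other
byte unchanged. The variable name is an assumption for illustration.

@example
(defvar ccl-sketch-read-if
  '(1
    ((loop
      ;; Read a byte into r0, then test it.
      (read-if (r0 == 13)
               ((write 10))     ; true branch: r0 was a carriage return
               ((write r0)))    ; false branch: copy the byte
      (repeat))))
  "Hypothetical CCL program: replace each CR byte with LF.")
@end example

By the description above, the @code{read-if} form should behave like
@code{(read r0)} followed by an @code{if} statement with the same
condition and blocks.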
1270
1271 The @dfn{branch} statement takes an @var{expression} and one or more CCL
1272 blocks as arguments. The CCL blocks are treated as a zero-indexed
1273 array, and the @code{branch} statement uses the @var{expression} as the
1274 index of the CCL block to execute. Null CCL blocks may be used as
1275 no-ops, continuing execution with the statement following the
1276 @code{branch} statement in the containing CCL block. Out-of-range
1277 values for the @var{expression} are also treated as no-ops.
1278
1279 The @dfn{read-branch} variant of the @dfn{branch} statement takes a
1280 @var{register} and one or more CCL blocks as arguments.
1281 The @code{read-branch} statement first reads from
1282 the input into the @var{register}, then uses its value to select the CCL
1283 block to execute, just as the @code{branch} statement does.
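The dispatch performed by @code{branch} can be sketched as follows. The
variable name and the magnification of 8 are assumptions; the program
selects a block by the low two bits of the byte it has read.

@example
(defvar ccl-sketch-branch
  '(8                           ; assumed, generous buffer magnification
    ((read r0)
     (r1 = (r0 & 3))            ; r1 is now 0, 1, 2, or 3
     (branch r1
             ((write "zero"))   ; block 0
             ((write "one"))    ; block 1
             nil                ; block 2: null block, a no-op
             ((write "three"))) ; block 3
     (write r0)))               ; execution continues here in every case
  "Hypothetical CCL program dispatching on the low two bits of a byte.")
@end example

Replacing the @code{read} and @code{branch} pair with a single
@code{(read-branch r0 ...)} would dispatch on the byte value itself.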
1284
1285 @heading Loop control statements:
1286
1287 The @dfn{loop} statement creates a block with an implied jump from the
1288 end of the block back to its head. The loop is exited on a @code{break}
1289 statement, and continued without executing the tail by a @code{repeat}
1290 statement.
1291
1292 The @dfn{break} statement, written @samp{(break)}, terminates the
1293 current loop and continues with the next statement in the current
1294 block.
1295
1296 The @dfn{repeat} statement has three variants, @code{repeat},
1297 @code{write-repeat}, and @code{write-read-repeat}. Each continues the
1298 current loop from its head, possibly after performing I/O.
1299 @code{repeat} takes no arguments and does no I/O before jumping.
1300 @code{write-repeat} takes a single argument (a register, an
1301 integer, or a string), writes it to the output, then jumps.
1302 @code{write-read-repeat} takes one or two arguments. The first must
1303 be a register. The second may be an integer or an array; if absent, it
1304 is implicitly set to the first (register) argument.
1305 @code{write-read-repeat} writes its second argument to the output, then
1306 reads from the input into the register, and finally jumps. See the
1307 @code{write} and @code{read} statements for the semantics of the I/O
1308 operations for each type of argument.
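To compare the three variants, here is the same byte-copying loop
written three ways. All three variable names are hypothetical; under
the semantics described above the programs should produce the same
output.

@example
;; Separate read, write, and jump.
(defvar ccl-sketch-copy-repeat
  '(1 ((loop (read r0) (write r0) (repeat)))))

;; The write and the jump are fused.
(defvar ccl-sketch-copy-write-repeat
  '(1 ((loop (read r0) (write-repeat r0)))))

;; Write, read, and jump are fused; the first read is done up front.
(defvar ccl-sketch-copy-write-read-repeat
  '(1 ((read r0) (loop (write-read-repeat r0)))))
@end example

The third form needs the initial @code{read} outside the loop because
@code{write-read-repeat} writes before it reads.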
1309
1310 @heading Other control statements:
1311
1312 The @dfn{call} statement, written @samp{(call @var{ccl-program-name})},
1313 executes a CCL program as a subroutine. It does not return a value to
1314 the caller, but can modify the register status.
1315
1316 The @dfn{end} statement, written @samp{(end)}, terminates the CCL
1317 program successfully, and returns to the caller (which may be a CCL
1318 program). It does not alter the status of the registers.
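A subroutine call might look like the sketch below. It assumes that the
subroutine is first registered under a name with the
@code{define-ccl-program} macro from @file{mule-ccl.el}; both program
names are hypothetical. The subroutine passes its result back through
@code{r0}, since @code{call} returns no value.

@example
;; Hypothetical subroutine: fold an ASCII upper-case letter in r0
;; to lower case.  It communicates through r0 only.
(define-ccl-program ccl-sketch-downcase
  '(1
    ((if (r0 >= 65)
         ((if (r0 <= 90)
              ((r0 += 32))))))))

;; Hypothetical caller: downcase every byte of the input.
(defvar ccl-sketch-downcase-stream
  '(1
    ((loop
      (read r0)
      (call ccl-sketch-downcase)
      (write r0)
      (repeat)))))
@end example

The caller is left as quoted data here; it would have to be compiled
after the subroutine has been defined.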
1319
1320 @node CCL Expressions, Calling CCL, CCL Statements, CCL
1321 @comment Node, Next, Previous, Up
1322 @subsection CCL Expressions
1323
1324 CCL, unlike Lisp, uses infix expressions. The simplest CCL expressions
1325 consist of a single @var{operand}, either a register (one of @code{r0},
1326 ..., @code{r7}) or an integer. Complex expressions are lists of the
1327 form @code{( @var{expression} @var{operator} @var{operand} )}. Unlike
1328 C, assignments are not expressions.
1329
1330 In the following table, @var{X} is the target register for a @dfn{set}.
1331 In subexpressions, this is implicitly @code{r7}. This means that
1332 @code{>8}, @code{//}, @code{de-sjis}, and @code{en-sjis} cannot be used
1333 freely in subexpressions, since they return parts of their values in
1334 @code{r7}. @var{Y} may be an expression, register, or integer, while
1335 @var{Z} must be a register or an integer.
1336
1337 @multitable @columnfractions .22 .14 .09 .55
1338 @item Name @tab Operator @tab Code @tab C-like Description
1339 @item CCL_PLUS @tab @code{+} @tab 0x00 @tab X = Y + Z
1340 @item CCL_MINUS @tab @code{-} @tab 0x01 @tab X = Y - Z
1341 @item CCL_MUL @tab @code{*} @tab 0x02 @tab X = Y * Z
1342 @item CCL_DIV @tab @code{/} @tab 0x03 @tab X = Y / Z
1343 @item CCL_MOD @tab @code{%} @tab 0x04 @tab X = Y % Z
1344 @item CCL_AND @tab @code{&} @tab 0x05 @tab X = Y & Z
1345 @item CCL_OR @tab @code{|} @tab 0x06 @tab X = Y | Z
1346 @item CCL_XOR @tab @code{^} @tab 0x07 @tab X = Y ^ Z
1347 @item CCL_LSH @tab @code{<<} @tab 0x08 @tab X = Y << Z
1348 @item CCL_RSH @tab @code{>>} @tab 0x09 @tab X = Y >> Z
1349 @item CCL_LSH8 @tab @code{<8} @tab 0x0A @tab X = (Y << 8) | Z
1350 @item CCL_RSH8 @tab @code{>8} @tab 0x0B @tab X = Y >> 8, r[7] = Y & 0xFF
1351 @item CCL_DIVMOD @tab @code{//} @tab 0x0C @tab X = Y / Z, r[7] = Y % Z
1352 @item CCL_LS @tab @code{<} @tab 0x10 @tab X = (Y < Z)
1353 @item CCL_GT @tab @code{>} @tab 0x11 @tab X = (Y > Z)
1354 @item CCL_EQ @tab @code{==} @tab 0x12 @tab X = (Y == Z)
1355 @item CCL_LE @tab @code{<=} @tab 0x13 @tab X = (Y <= Z)
1356 @item CCL_GE @tab @code{>=} @tab 0x14 @tab X = (Y >= Z)
1357 @item CCL_NE @tab @code{!=} @tab 0x15 @tab X = (Y != Z)
1358 @item CCL_ENCODE_SJIS @tab @code{en-sjis} @tab 0x16 @tab X = HIGHER_BYTE (SJIS (Y, Z))
1359 @item @tab @tab @tab r[7] = LOWER_BYTE (SJIS (Y, Z))
1360 @item CCL_DECODE_SJIS @tab @code{de-sjis} @tab 0x17 @tab X = HIGHER_BYTE (DE-SJIS (Y, Z))
1361 @item @tab @tab @tab r[7] = LOWER_BYTE (DE-SJIS (Y, Z))
1362 @end multitable
1363
1364 The CCL operators are as in C, with the addition of CCL_LSH8, CCL_RSH8,
1365 CCL_DIVMOD, CCL_ENCODE_SJIS, and CCL_DECODE_SJIS. CCL_ENCODE_SJIS
1366 and CCL_DECODE_SJIS treat their first and second operands as the high and
1367 low bytes of a two-byte character code. (SJIS stands for Shift JIS, an
1368 encoding of Japanese characters used by Microsoft. CCL_ENCODE_SJIS is a
1369 complicated transformation of the Japanese standard JIS encoding to
1370 Shift JIS. CCL_DECODE_SJIS is its inverse.) It is somewhat odd to
1371 represent the SJIS operations in infix form.
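The CCL-specific operators and the role of @code{r7} can be worked
through in a short sketch. The variable name is hypothetical, and the
values in the comments are computed by hand from the table above.

@example
(defvar ccl-sketch-expressions
  '(1
    ((r1 = 18) (r2 = 52)
     (r0 = (r1 <8 r2))          ; r0 = (18 << 8) | 52 = 4660
     (r3 = (r0 >8 0))           ; r3 = 4660 >> 8 = 18, r7 = 52
     (r4 = (r0 // 100))         ; r4 = 46, r7 = 60
     (r5 = ((r1 + r2) * 2))))   ; r7 = 70 first, then r5 = 140
  "Hypothetical CCL program using the CCL-specific operators.")
@end example

The last statement shows why @code{r7} should not hold live data across
a nested expression: the inner sum is computed into @code{r7} before
the multiplication. The right operand of @code{>8} is required by the
syntax but, going by the table, does not affect the result.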
1372
1373 @node Calling CCL, CCL Examples, CCL Expressions, CCL
1374 @comment Node, Next, Previous, Up
1375 @subsection Calling CCL
1376
1377 CCL programs are called automatically during Emacs buffer I/O when the
1378 external representation has a coding system type of @code{shift-jis},
1379 @code{big5}, or @code{ccl}. The program is specified by the coding
1380 system (@pxref{Coding Systems}). You can also call CCL programs from
1381 other CCL programs, and from Lisp using these functions:
1382
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1383 @defun ccl-execute ccl-program status
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1384 Execute @var{ccl-program} with registers initialized by
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1385 @var{status}. @var{ccl-program} is a vector of compiled CCL code
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1386 created by @code{ccl-compile}. It is an error for the program to try to
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1387 execute a CCL I/O command. @var{status} must be a vector of nine
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1388 values, specifying the initial value for the R0, R1 .. R7 registers and
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1389 for the instruction counter IC. A @code{nil} value for a register
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1390 initializer causes the register to be set to 0. A @code{nil} value for
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1391 the IC initializer causes execution to start at the beginning of the
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1392 program. When the program is done, @var{status} is modified (by
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1393 side-effect) to contain the ending values for the corresponding
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1394 registers and IC.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1395 @end defun
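
As a minimal sketch of the calling convention described above, the
following adds two registers. The CCL source form and the variable names
are illustrative assumptions, not taken from this manual, and no result
is shown because the form has not been checked against a running XEmacs.

@example
;; Hypothetical one-statement program: add R1 into R0.
(let ((program (ccl-compile '(1 ((r0 += r1)))))
      ;; Nine slots: R0 .. R7 and the IC.  A nil slot means 0 for a
      ;; register and "start at the beginning" for the IC.
      (status (make-vector 9 nil)))
  (aset status 0 2)              ; initial R0
  (aset status 1 3)              ; initial R1
  (ccl-execute program status)
  ;; status was modified by side effect; read back the final R0.
  (aref status 0))
@end example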
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1396
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1397 @defun ccl-execute-on-string ccl-program status string &optional continue
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1398 Execute @var{ccl-program} with initial @var{status} on
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1399 @var{string}. @var{ccl-program} is a vector of compiled CCL code
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1400 created by @code{ccl-compile}. @var{status} must be a vector of nine
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1401 values, specifying the initial value for the R0, R1 .. R7 registers and
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1402 for the instruction counter IC. A @code{nil} value for a register
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1403 initializer causes the register to be set to 0. A @code{nil} value for
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1404 the IC initializer causes execution to start at the beginning of the
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1405 program. The optional fourth argument @var{continue}, if non-@code{nil}, causes
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1406 the IC to
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1407 remain on the unsatisfied read operation if the program terminates due
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1408 to exhaustion of the input buffer. Otherwise the IC is set to the end
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1409 of the program. When the program is done, @var{status} is modified (by
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1410 side-effect) to contain the ending values for the corresponding
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1411 registers and IC. Returns the resulting string.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1412 @end defun
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1413
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1414 To call a CCL program from another CCL program, it must first be
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1415 registered:
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1416
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1417 @defun register-ccl-program name ccl-program
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1418 Register @var{name} for CCL program @var{ccl-program} in
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1419 @code{ccl-program-table}. @var{ccl-program} should be the compiled form of
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1420 a CCL program, or @code{nil}. Returns the index number of the registered CCL
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1421 program.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1422 @end defun
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1423
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1424 Information about the processor time used by the CCL interpreter can be
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1425 obtained using these functions:
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1426
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1427 @defun ccl-elapsed-time
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1428 Returns the elapsed processor time of the CCL interpreter as a cons of
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1429 user and system time, as
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1430 floating point numbers measured in seconds. If only one
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1431 overall value can be determined, the return value will be a cons of that
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1432 value and 0.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1433 @end defun
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1434
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1435 @defun ccl-reset-elapsed-time
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1436 Resets the CCL interpreter's internal elapsed time registers.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1437 @end defun
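
A minimal sketch of how the two timing functions might be combined,
assuming some CCL work happens between the calls. The message text is
illustrative only.

@example
;; Clear the interpreter's accumulated time before measuring.
(ccl-reset-elapsed-time)
;; ... run the CCL program(s) to be measured here ...
(let ((times (ccl-elapsed-time)))
  ;; The car is the user time and the cdr the system time, in
  ;; seconds.  The cdr is 0 if only one overall value is available.
  (message "CCL user time: %s s, system time: %s s"
           (car times) (cdr times)))
@end example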
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1438
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1439 @node CCL Examples, , Calling CCL, CCL
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1440 @comment Node, Next, Previous, Up
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1441 @subsection CCL Examples
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1442
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1443 This section is not yet written.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1444
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1445 @node Category Tables, , CCL, MULE
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1446 @section Category Tables
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1447
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1448 A category table is a type of char table used for keeping track of
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1449 categories. Categories are used for classifying characters for use in
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1450 regexps -- you can refer to a category rather than having to use a
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1451 complicated [] expression (and category lookups are significantly
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1452 faster).
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1453
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1454 There are 95 different categories available, one for each printable
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1455 character (including space) in the ASCII charset. Each category is
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1456 designated by one such character, called a @dfn{category designator}.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1457 They are specified in a regexp using the syntax @samp{\cX}, where @var{X} is a
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1458 category designator. (This is not yet implemented.)
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1459
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1460 A category table specifies, for each character, the categories that
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1461 the character is in. Note that a character can be in more than one
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1462 category. More specifically, a category table maps from a character to
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1463 either the value @code{nil} (meaning the character is in no categories)
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1464 or a 95-element bit vector, specifying for each of the 95 categories
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1465 whether the character is in that category.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1466
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1467 Special Lisp functions are provided that abstract this, so you do not
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1468 have to directly manipulate bit vectors.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1469
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1470 @defun category-table-p obj
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1471 This function returns @code{t} if @var{obj} is a category table.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1472 @end defun
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1473
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1474 @defun category-table &optional buffer
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1475 This function returns the current category table. This is the one
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1476 specified by the current buffer, or by @var{buffer} if it is
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1477 non-@code{nil}.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1478 @end defun
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1479
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1480 @defun standard-category-table
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1481 This function returns the standard category table. This is the one used
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1482 for new buffers.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1483 @end defun
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1484
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1485 @defun copy-category-table &optional table
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1486 This function constructs a new category table and returns it. It is a
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1487 copy of @var{table}, which defaults to the standard category table.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1488 @end defun
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1489
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1490 @defun set-category-table table &optional buffer
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1491 This function selects a new category table for @var{buffer}.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1492 @var{table} should be a category table. @var{buffer} defaults to the current buffer
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1493 if omitted.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1494 @end defun
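
The functions above can be combined as in the following sketch, which
gives a scratch buffer its own copy of the standard category table. The
buffer name is a hypothetical choice made for illustration.

@example
(let ((buffer (get-buffer-create " *category-demo*"))
      ;; With no argument, this copies the standard category table.
      (table (copy-category-table)))
  ;; Install the copy in the scratch buffer only.
  (set-category-table table buffer)
  ;; The buffer should now report the copy as its category table.
  (eq table (category-table buffer)))
@end example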
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1495
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1496 @defun category-designator-p obj
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1497 This function returns @code{t} if @var{obj} is a category designator (a
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1498 char in the range @samp{' '} to @samp{'~'}).
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1499 @end defun
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1500
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1501 @defun category-table-value-p obj
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1502 This function returns @code{t} if @var{obj} is a category table value.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1503 Valid values are @code{nil} or a bit vector of size 95.
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1504 @end defun
3ecd8885ac67 Import from CVS: tag r21-2-22
cvs
parents:
diff changeset
1505