annotate man/lispref/mule.texi @ 410:de805c49cfc1 r21-2-35

Import from CVS: tag r21-2-35
author cvs
date Mon, 13 Aug 2007 11:19:21 +0200
parents 2f8bb876ab1d
children 697ef44129c6
@c -*-texinfo-*-
@c This is part of the XEmacs Lisp Reference Manual.
@c Copyright (C) 1996 Ben Wing.
@c See the file lispref.texi for copying conditions.
@setfilename ../../info/internationalization.info
@node MULE, Tips, Internationalization, top
@chapter MULE

@dfn{MULE} is the name originally given to the version of GNU Emacs
extended for multi-lingual (and in particular Asian-language) support.
``MULE'' is short for ``MUlti-Lingual Emacs''. It is an extension and
complete rewrite of Nemacs (``Nihon Emacs'' where ``Nihon'' is the
Japanese word for ``Japan''), which only provided support for Japanese.
XEmacs refers to its multi-lingual support as @dfn{MULE support} since
it is based on @dfn{MULE}.

@menu
* Internationalization Terminology::
                        Definition of various internationalization terms.
* Charsets::            Sets of related characters.
* MULE Characters::     Working with characters in XEmacs/MULE.
* Composite Characters:: Making new characters by overstriking other ones.
* Coding Systems::      Ways of representing a string of chars using integers.
* CCL::                 A special language for writing fast converters.
* Category Tables::     Subdividing charsets into groups.
@end menu

@node Internationalization Terminology, Charsets, , MULE
@section Internationalization Terminology

In internationalization terminology, a string of text is divided up
into @dfn{characters}, which are the printable units that make up the
text. A single character is (for example) a capital @samp{A}, the
digit @samp{2}, a Katakana character, a Hangul character, a Kanji
ideograph (an @dfn{ideograph} is a ``picture'' character, such as is
used in Japanese Kanji, Chinese Hanzi, and Korean Hanja; typically there
are thousands of such ideographs in each language), etc. The basic
property of a character is that it is the smallest unit of text with
semantic significance in text processing.

Human beings normally process text visually, so to a first approximation
a character may be identified with its shape. Note that the same
character may be drawn by two different people (or in two different
fonts) in slightly different ways, although the ``basic shape'' will be the
same. But consider the works of Scott Kim; human beings can recognize
hugely variant shapes as the ``same'' character. Sometimes, especially
where characters are extremely complicated to write, completely
different shapes may be defined as the ``same'' character in national
standards. The Taiwanese variant of Hanzi is generally the most
complicated; over the centuries, the Japanese, Koreans, and the People's
Republic of China have adopted simplifications of the shape, but the
line of descent from the original shape is recorded, and the meanings
and pronunciation of different forms of the same character are
considered to be identical within each language. (Of course, it may
take a specialist to recognize the related form; the point is that the
relations are standardized, despite the differing shapes.)

In some cases, the differences will be significant enough that it is
actually possible to identify two or more distinct shapes that all
represent the same character. For example, the lowercase letters
@samp{a} and @samp{g} each have two distinct possible shapes: the
@samp{a} can optionally have a curved tail projecting off the top, and
the @samp{g} can be formed either of two loops, or of one loop and a
tail hanging off the bottom. Such distinct possible shapes of a
character are called @dfn{glyphs}. The important characteristic of two
glyphs making up the same character is that the choice between one or
the other is purely stylistic and has no linguistic effect on a word.
(This is the reason why a capital @samp{A} and lowercase @samp{a}
are different characters rather than different glyphs: for example,
@samp{Aspen} is a city while @samp{aspen} is a kind of tree.)

Note that @dfn{character} and @dfn{glyph} are used differently
here than elsewhere in XEmacs.

A @dfn{character set} is essentially a set of related characters. ASCII,
for example, is a set of 94 characters (or 128, if you count
non-printing characters). Other character sets are ISO 8859-1 (ASCII
plus various accented characters and other international symbols),
JIS X 0201 (ASCII, more or less, plus half-width Katakana), JIS X 0208
(Japanese Kanji), JIS X 0212 (a second set of less-used Japanese Kanji),
GB2312 (Mainland Chinese Hanzi), etc.

The definition of a character set will implicitly or explicitly give
it an @dfn{ordering}, a way of assigning a number to each character in
the set. For many character sets, there is a natural ordering, for
example the ``ABC'' ordering of the Roman letters. But it is not clear
whether digits should come before or after the letters, and in fact
different European languages treat the ordering of accented characters
differently. It is useful to use the natural order where available, of
course. The number assigned to any particular character is called the
character's @dfn{code point}. (Within a given character set, each
character has a unique code point. Thus the word ``set'' is ill-chosen;
different orderings of the same characters are different character sets.
Identifying characters is simple enough for alphabetic character sets,
but the difference in ordering can cause great headaches when the same
thousands of characters are used by different cultures as in the Hanzi.)

A code point may be broken into a number of @dfn{position codes}. The
number of position codes required to index a particular character in a
character set is called the @dfn{dimension} of the character set. For
practical purposes, a position code may be thought of as a byte-sized
index. The set of printing characters of ASCII, being relatively
small, is of dimension one, and each character in the set is
indexed using a single position code, in the range 1 through 94. Use of
this unusual range, rather than the familiar 33 through 126, is an
intentional abstraction; to understand the programming issues you must
break the equation between character sets and encodings.

JIS X 0208, i.e. Japanese Kanji, has thousands of characters, and is
of dimension two: every character is indexed by two position codes,
each in the range 1 through 94. (This number ``94'' is not a
coincidence; we shall see that the JIS position codes were chosen so
that JIS kanji could be encoded without using codes that in ASCII are
associated with device control functions.) Note that the choice of the
range here is somewhat arbitrary. You could just as easily index the
printing characters in ASCII using numbers in the range 0 through 93, 2
through 95, 3 through 96, etc. In fact, the standardized
@emph{encoding} for the ASCII @emph{character set} uses the range 33
through 126.

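As a rough illustration of position codes and dimension, the following
Python sketch flattens the position codes of one character into a single
zero-based index and splits the index back again. It is not part of
XEmacs: the names @code{to_index} and @code{to_position_codes} are
invented for this example, and the row-major numbering is an assumption
made only to keep the arithmetic simple.

```python
# Hypothetical helpers for illustration only; this is not an XEmacs API.
CODES_PER_POSITION = 94  # each position code lies in the range 1 through 94


def to_index(position_codes):
    """Flatten a tuple of position codes into one zero-based index."""
    index = 0
    for code in position_codes:
        if not 1 <= code <= CODES_PER_POSITION:
            raise ValueError("position code out of range: %d" % code)
        index = index * CODES_PER_POSITION + (code - 1)
    return index


def to_position_codes(index, dimension):
    """Split a zero-based index back into `dimension` position codes."""
    codes = []
    for _ in range(dimension):
        index, offset = divmod(index, CODES_PER_POSITION)
        codes.append(offset + 1)
    return tuple(reversed(codes))
```

With these definitions a character set of dimension one has room for 94
characters and a character set of dimension two has room for 94 * 94 =
8836; for instance, @code{to_index((16, 1))} is 15 * 94 + 0 = 1410.
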
An @dfn{encoding} is a way of numerically representing characters from
one or more character sets as a stream of like-sized numerical values
called @dfn{words}; typically these are 8-bit, 16-bit, or 32-bit
quantities. If an encoding encompasses only one character set, then the
position codes for the characters in that character set could be used
directly. (This is the case with the trivial cipher used by children,
assigning 1 to @samp{A}, 2 to @samp{B}, and so on.) However, even with ASCII,
other considerations intrude. For example, why are the upper- and
lowercase alphabets separated by 6 characters (so that corresponding
letters differ by exactly 32)? Why do the digits start
with @samp{0} being assigned the code 48? In both cases, it is because
semantically interesting operations (case conversion and numerical value
extraction) become convenient masking operations. Other artificial aspects (the
control characters being assigned to codes 0--31 and 127) are historical
accidents. (The use of 127 for @samp{DEL} is an artifact of the ``punch
once'' nature of paper tape, for example.)

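The masking operations mentioned above can be sketched in a few lines of
Python. The function and constant names are invented for this
illustration; only the ASCII code values themselves are standard.

```python
# Illustrative names only; the ASCII code values are the standard ones.
CASE_BIT = 0x20    # 'A' is 0x41 and 'a' is 0x61: they differ in this one bit
DIGIT_MASK = 0x0F  # '0' is 0x30, so the low four bits hold the digit's value


def ascii_upcase(code):
    """Clear the case bit of a lowercase ASCII letter; leave other codes alone."""
    if ord("a") <= code <= ord("z"):
        return code & ~CASE_BIT
    return code


def ascii_digit_value(code):
    """Return the numeric value of an ASCII digit by masking."""
    if not ord("0") <= code <= ord("9"):
        raise ValueError("not an ASCII digit: %d" % code)
    return code & DIGIT_MASK
```

Neither operation needs a lookup table, which is the convenience the
layout of ASCII was designed to provide.
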
Naive use of the position code is not possible, however, if more than
one character set is to be used in the encoding. For example, printed
Japanese text typically requires characters from multiple character sets:
ASCII, JIS X 0208, and JIS X 0212, to be specific. Each of these is
indexed using one or more position codes in the range 1 through 94, so
the position codes could not be used directly or there would be no way
to tell which character was meant. Different Japanese encodings handle
this differently: JIS uses special escape characters to denote
different character sets; EUC sets the high bit of the position codes
for JIS X 0208 and JIS X 0212, and puts a special extra byte before each
JIS X 0212 character; etc. (JIS, EUC, and most of the other encodings
you will encounter in files are 7-bit or 8-bit encodings. There is one
common 16-bit encoding, which is Unicode; this strives to represent all
the world's characters in a single large character set. 32-bit
encodings are often used internally in programs, such as XEmacs with
MULE support, to simplify the code that manipulates them; however, they
are not used externally because they are not very space-efficient.)

A general method of handling text using multiple character sets
(whether for multilingual text, or simply text in an extremely
complicated single language like Japanese) is defined in the
international standard ISO 2022. ISO 2022 will be discussed in more
detail later (@pxref{ISO 2022}), but for now suffice it to say that text
needs control functions (at least spacing), and if escape sequences are
to be used, an escape sequence introducer. It was decided to make all
text streams compatible with ASCII in the sense that the codes 0--31
(and 128--159) would always be control codes, never graphic characters,
and where defined by the character set the @samp{SPC} character would be
assigned code 32, and @samp{DEL} would be assigned 127. Thus there are
94 code points remaining if 7 bits are used. This is the reason that
most character sets are defined using position codes in the range 1
through 94. Then ISO 2022 compatible encodings are produced by shifting
the position codes 1 to 94 into character codes 33 to 126, or (if 8-bit
codes are available) into character codes 161 to 254.

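The shift described above is a fixed offset, as the following Python
sketch shows. The function name and the two constants are invented for
this example (``GL'' and ``GR'' are the names ISO 2022 uses for the two
ranges, a detail not covered in this section); treat it as an
illustration of the arithmetic rather than a complete ISO 2022
implementation.

```python
# Illustrative only: offsets that move position codes 1..94 into byte values.
GL_OFFSET = 32    # 7-bit form: 1..94 becomes 33..126
GR_OFFSET = 160   # 8-bit form: 1..94 becomes 161..254


def shift_position_codes(position_codes, use_high_half=False):
    """Turn position codes into the bytes of an ISO 2022 style encoding."""
    offset = GR_OFFSET if use_high_half else GL_OFFSET
    for code in position_codes:
        if not 1 <= code <= 94:
            raise ValueError("position code out of range: %d" % code)
    return bytes(code + offset for code in position_codes)
```

For example, the position codes (16, 1) become the bytes 0x30 0x21 in
the 7-bit form and 0xB0 0xA1 in the 8-bit form.
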
Encodings are classified as either @dfn{modal} or @dfn{non-modal}. In
a @dfn{modal encoding}, there are multiple states that the encoding can
be in, and the interpretation of the values in the stream depends on the
current global state of the encoding. Special values in the encoding,
called @dfn{escape sequences}, are used to change the global state.
JIS, for example, is a modal encoding. The bytes @samp{ESC $ B}
indicate that, from then on, bytes are to be interpreted as position
codes for JIS X 0208, rather than as ASCII. This effect is cancelled
using the bytes @samp{ESC ( B}, which mean ``switch from whatever the
current state is to ASCII''. To switch to JIS X 0212, the escape
sequence @samp{ESC $ ( D} is used. (Note that here, as is common, the escape
sequences do in fact begin with @samp{ESC}. This is not necessarily the
case, however. Some encodings use control characters called ``locking
shifts'' (effect persists until cancelled) to switch character sets.)

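A minimal Python sketch of a modal decoder follows. It recognizes only
the three escape sequences named above, handles only printing
characters, and assumes the stream starts in the ASCII state; the name
@code{split_jis} and the charset labels are invented for this example.

```python
# A sketch of modal decoding; not a complete JIS or ISO 2022 decoder.
ESC = 0x1B

# (escape sequence, charset it selects, position codes per character)
ESCAPES = (
    (b"\x1b$B", "jisx0208", 2),
    (b"\x1b(B", "ascii", 1),
    (b"\x1b$(D", "jisx0212", 2),
)


def split_jis(data):
    """Return (charset, position_codes) pairs for a 7-bit JIS byte string."""
    charset, dimension = "ascii", 1  # assumed initial state
    result = []
    i = 0
    while i < len(data):
        if data[i] == ESC:
            for sequence, new_charset, new_dimension in ESCAPES:
                if data.startswith(sequence, i):
                    charset, dimension = new_charset, new_dimension
                    i += len(sequence)
                    break
            else:
                raise ValueError("unsupported escape at offset %d" % i)
            continue
        chunk = data[i:i + dimension]
        if len(chunk) < dimension or not all(33 <= b <= 126 for b in chunk):
            raise ValueError("bad character at offset %d" % i)
        # Undo the shift from position codes 1..94 to byte values 33..126.
        result.append((charset, tuple(b - 32 for b in chunk)))
        i += dimension
    return result
```

The variables @code{charset} and @code{dimension} are the global state:
the same two bytes 0x30 0x21 come out as one JIS X 0208 character after
@samp{ESC $ B}, but as two ASCII characters otherwise.
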
A @dfn{non-modal encoding} has no global state that extends past the
character currently being interpreted. EUC, for example, is a
non-modal encoding. Characters in JIS X 0208 are encoded by setting
the high bit of the position codes, and characters in JIS X 0212 are
encoded by doing the same but also prefixing the character with the
byte 0x8F.

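By contrast, a non-modal decoder can decide how to read each character
from that character's own bytes. The Python sketch below covers only
the three cases described above (ASCII, JIS X 0208, and JIS X 0212 with
its 0x8F prefix) and rejects every other lead byte; the name
@code{split_euc} and the charset labels are invented for this example.

```python
# A sketch of non-modal decoding; other EUC lead bytes are simply rejected.
JISX0212_PREFIX = 0x8F  # the prefix byte given in the text


def split_euc(data):
    """Return (charset, position_codes) pairs for an EUC byte string."""
    result = []
    i = 0
    while i < len(data):
        lead = data[i]
        if 33 <= lead <= 126:
            charset, start, width = "ascii", i, 1
        elif lead == JISX0212_PREFIX:
            charset, start, width = "jisx0212", i + 1, 2
        elif 0xA1 <= lead <= 0xFE:
            charset, start, width = "jisx0208", i, 2
        else:
            raise ValueError("unsupported byte at offset %d" % i)
        chunk = data[start:start + width]
        if len(chunk) < width:
            raise ValueError("truncated character at offset %d" % i)
        if width == 2 and not all(0xA1 <= b <= 0xFE for b in chunk):
            raise ValueError("bad trailing byte at offset %d" % i)
        # Clear the high bit, then undo the shift of 32.
        result.append((charset, tuple((b & 0x7F) - 32 for b in chunk)))
        i = start + width
    return result
```

No variable in this loop survives from one character to the next except
the offset @code{i}, which is what makes the encoding non-modal.
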
The advantage of a modal encoding is that it is generally more
space-efficient, and is easily extensible because an essentially
arbitrary number of escape sequences can be created. The
disadvantage, however, is that it is much more difficult to work with
if it is not being processed in a sequential manner. In the non-modal
EUC encoding, for example, the byte 0x41 always refers to the letter
@samp{A}; whereas in JIS, it could either be the letter @samp{A}, or
one of the two position codes in a JIS X 0208 character, or one of the
two position codes in a JIS X 0212 character. Determining exactly which
one is meant could be difficult and time-consuming if the previous
bytes in the string have not already been processed, or impossible if
they are drawn from an external stream that cannot be rewound.

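The ambiguity of the byte 0x41 can be observed with any library that
implements both kinds of encoding. The Python sketch below uses the
@code{iso2022_jp} and @code{euc_jp} codecs from Python's standard
library as stand-ins for the JIS and EUC encodings described here; that
substitution, and the variable names, are assumptions of this example.

```python
# Python's standard codecs stand in for the encodings discussed in the text.
jis = b"\x1b$BAA\x1b(B"            # ESC $ B, bytes 0x41 0x41, ESC ( B

# Read from the start, the two 0x41 bytes form a single Kanji.
kanji = jis.decode("iso2022_jp")

# The same two bytes read without the preceding escape are the letters "AA".
plain = jis[3:5].decode("iso2022_jp")

# In EUC both bytes of the Kanji have the high bit set, so a 0x41 byte
# can only ever be the letter A.
euc = ("A" + kanji).encode("euc_jp")
letters_a = euc.count(b"A")
```

Here @code{kanji} is one character long while @code{plain} is the
two-character string @samp{AA}, and @code{letters_a} is 1 even though
the EUC string contains the same Kanji.
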
Non-modal encodings are further divided into @dfn{fixed-width} and
@dfn{variable-width} formats. A fixed-width encoding always uses
the same number of words per character, whereas a variable-width
encoding does not. EUC is a good example of a variable-width
encoding: one to three bytes are used per character, depending on
the character set. 16-bit and 32-bit encodings are nearly always
fixed-width, and this is in fact one of the main reasons for using
an encoding with a larger word size. The main advantage of a
fixed-width encoding is that any character can be located by simple
arithmetic, without scanning the text. The advantages of variable-width
encodings are that they are generally more space-efficient and allow
for compatibility with existing 8-bit encodings such as ASCII. (For
example, in Unicode, ASCII characters are simply promoted to a 16-bit
representation. That means that every ASCII character contains a
@samp{NUL} byte; evidently all of the standard string manipulation
functions will lose badly in a fixed-width Unicode environment.)

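The @samp{NUL} byte problem is easy to reproduce. In the Python sketch
below, @code{c_strlen} is an invented stand-in for the C library's
@code{strlen}, and big-endian UTF-16 is used as an example of a 16-bit
Unicode representation; both choices are assumptions of this
illustration.

```python
def c_strlen(buffer):
    """Mimic C's strlen: count bytes up to, but not including, the first NUL."""
    end = buffer.find(b"\x00")
    return len(buffer) if end < 0 else end


narrow = "Hi".encode("ascii")      # one byte per character
wide = "Hi".encode("utf-16-be")    # one 16-bit word per character

narrow_length = c_strlen(narrow)   # sees both characters
wide_length = c_strlen(wide)       # stops at the NUL inside the first word
```

The wide string occupies four bytes, yet a byte-oriented routine that
stops at the first @samp{NUL} reports a length of zero.
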
The bytes in an 8-bit encoding are often referred to as @dfn{octets}
rather than simply as bytes. This terminology dates back to the days
before 8-bit bytes were universal, when some computers had 9-bit bytes,
others had 10-bit bytes, etc.

@node Charsets, MULE Characters, Internationalization Terminology, MULE
@section Charsets

A @dfn{charset} in MULE is an object that encapsulates a
particular character set as well as an ordering of those characters.
Charsets are permanent objects and are named using symbols, like
faces.

@defun charsetp object
This function returns non-@code{nil} if @var{object} is a charset.
@end defun

376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
240 @menu
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
241 * Charset Properties:: Properties of a charset.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
242 * Basic Charset Functions:: Functions for working with charsets.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
243 * Charset Property Functions:: Functions for accessing charset properties.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
244 * Predefined Charsets:: Predefined charset objects.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
245 @end menu
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
246
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
247 @node Charset Properties, Basic Charset Functions, , Charsets
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
248 @subsection Charset Properties
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
249
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
250 Charsets have the following properties:
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
251
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
252 @table @code
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
253 @item name
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
254 A symbol naming the charset. Every charset must have a different name;
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
255 this allows a charset to be referred to using its name rather than
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
256 the actual charset object.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
257 @item doc-string
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
258 A documentation string describing the charset.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
259 @item registry
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
260 A regular expression matching the font registry field for this character
114
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
261 set. For example, both the @code{ascii} and @code{latin-iso8859-1}
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
262 charsets use the registry @code{"ISO8859-1"}. This field is used to
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
263 choose an appropriate font when the user gives a general font
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
264 specification such as @samp{-*-courier-medium-r-*-140-*}, i.e. a
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
265 14-point upright medium-weight Courier font.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
266 @item dimension
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
267 Number of position codes used to index a character in the character set.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
268 XEmacs/MULE can only handle character sets of dimension 1 or 2.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
269 This property defaults to 1.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
270 @item chars
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
271 Number of characters in each dimension. In XEmacs/MULE, the only
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
272 allowed values are 94 or 96. (There are a couple of pre-defined
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
273 character sets, such as ASCII, that do not follow this, but you cannot
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
274 define new ones like this.) Defaults to 94. Note that if the dimension
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
275 is 2, the character set thus described is 94x94 or 96x96.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
276 @item columns
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
277 Number of columns used to display a character in this charset.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
278 Only used in TTY mode. (Under X, the actual width of a character
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
279 can be derived from the font used to display the characters.)
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
280 If unspecified, defaults to the dimension. (This is almost
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
281 always the correct value, because character sets with dimension 2
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
282 are usually ideograph character sets, which need two columns to
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
283 display the intricate ideographs.)
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
284 @item direction
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
285 A symbol, either @code{l2r} (left-to-right) or @code{r2l}
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
286 (right-to-left). Defaults to @code{l2r}. This specifies the
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
287 direction that the text should be displayed in, and will be
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
288 left-to-right for most charsets but right-to-left for Hebrew
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
289 and Arabic. (Right-to-left display is not currently implemented.)
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
290 @item final
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
291 Final byte of the standard ISO 2022 escape sequence designating this
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
292 charset. Must be supplied. Each combination of (@var{dimension},
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
293 @var{chars}) defines a separate namespace for final bytes, and each
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
294 charset within a particular namespace must have a different final byte.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
295 Note that ISO 2022 restricts the final byte to the range 0x30 - 0x7E if
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
296 dimension == 1, and 0x30 - 0x5F if dimension == 2. Note also that final
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
297 bytes in the range 0x30 - 0x3F are reserved for user-defined (not
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
298 official) character sets. For more information on ISO 2022, see @ref{Coding
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
299 Systems}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
300 @item graphic
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
301 0 (use left half of font on output) or 1 (use right half of font on
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
302 output). Defaults to 0. This specifies how to convert the position
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
303 codes that index a character in a character set into an index into the
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
304 font used to display the character set. With @code{graphic} set to 0,
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
305 position codes 33 through 126 map to font indices 33 through 126; with
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
306 it set to 1, position codes 33 through 126 map to font indices 161
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
307 through 254 (i.e. the same number but with the high bit set). For
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
308 example, for a font whose registry is ISO8859-1, the left half of the
114
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
309 font (octets 0x20 - 0x7F) is the @code{ascii} charset, while the right
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
310 half (octets 0xA0 - 0xFF) is the @code{latin-iso8859-1} charset.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
311 @item ccl-program
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
312 A compiled CCL program used to convert a character in this charset into
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
313 an index into the font. This is in addition to the @code{graphic}
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
314 property. If a CCL program is defined, the position codes of a
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
315 character will first be processed according to @code{graphic} and
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
316 then passed through the CCL program, with the resulting values used
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
317 to index the font.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
318
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
319 This is used, for example, in the Big5 character set (used in Taiwan).
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
320 This character set is not ISO-2022-compliant, and its size (94x157) does
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
321 not fit within the maximum 96x96 size of ISO-2022-compliant character
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
322 sets. As a result, XEmacs/MULE splits it (in a rather complex fashion,
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
323 so as to group the most commonly used characters together) into two
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
324 charset objects (@code{chinese-big5-1} and @code{chinese-big5-2}), each of size 94x94,
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
325 and each charset object uses a CCL program to convert the modified
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
326 position codes back into standard Big5 indices to retrieve a character
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
327 from a Big5 font.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
328 @end table
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
329
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
330 Most of the above properties can only be set when the charset is
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
331 initialized, and cannot be changed later.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
332 @xref{Charset Property Functions}.
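
As a rough illustration of the @code{graphic} property described above,
the mapping from position codes to font indices can be written as simple
arithmetic. This is a sketch of the arithmetic only, not the actual
display code, and the helper name @code{my-position-code-to-font-index}
is hypothetical.

@example
;; Hypothetical helper, for illustration only.
(defun my-position-code-to-font-index (code graphic)
  "Map position CODE to a font index for GRAPHIC (0 or 1)."
  (if (= graphic 1)
      (logior code #x80)   ; set the high bit: right half of the font
    code))                 ; left half: the code is used unchanged

(my-position-code-to-font-index 33 0)    ; 33
(my-position-code-to-font-index 33 1)    ; 161
(my-position-code-to-font-index 126 1)   ; 254
@end example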
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
333
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
334 @node Basic Charset Functions, Charset Property Functions, Charset Properties, Charsets
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
335 @subsection Basic Charset Functions
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
336
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
337 @defun find-charset charset-or-name
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
338 This function retrieves the charset of the given name. If
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
339 @var{charset-or-name} is a charset object, it is simply returned.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
340 Otherwise, @var{charset-or-name} should be a symbol. If there is no
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
341 such charset, @code{nil} is returned. Otherwise the associated charset
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
342 object is returned.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
343 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
344
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
345 @defun get-charset name
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
346 This function retrieves the charset of the given name. It behaves like
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
347 @code{find-charset} except an error is signalled if there is no such
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
348 charset instead of returning @code{nil}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
349 @end defun
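
The difference between the two lookup functions can be sketched as
follows. The name @code{no-such-charset} is a made-up symbol that is
assumed not to name any charset.

@example
;; Returns nil when the name is unknown.
(find-charset 'no-such-charset)

;; Signals an error for the same name; trap it to recover.
(condition-case nil
    (get-charset 'no-such-charset)
  (error nil))

;; Passing a charset object to find-charset simply returns it.
(let ((cs (find-charset 'ascii)))
  (eq cs (find-charset cs)))
@end example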
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
350
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
351 @defun charset-list
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
352 This function returns a list of the names of all defined charsets.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
353 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
354
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
355 @defun make-charset name doc-string props
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
356 This function defines a new character set. This function is for use
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
357 with MULE support. @var{name} is a symbol, the name by which the
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
358 character set is normally referred to. @var{doc-string} is a string
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
359 describing the character set. @var{props} is a property list,
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
360 describing the specific nature of the character set. The recognized
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
361 properties are @code{registry}, @code{dimension}, @code{columns},
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
362 @code{chars}, @code{final}, @code{graphic}, @code{direction}, and
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
363 @code{ccl-program}, as previously described.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
364 @end defun
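
A minimal sketch of a @code{make-charset} call, using only the
properties listed above. The charset name, doc string, and registry are
invented for illustration, and the final byte @code{?5} is assumed to be
unused among one-dimensional 94-character charsets; it lies in the
0x30 - 0x3F range reserved for user-defined character sets.

@example
;; Hypothetical charset; every value here is an illustrative assumption.
(make-charset 'my-example-charset
              "Example 94-character charset (illustration only)."
              '(registry "MyExample"
                dimension 1
                chars 94
                columns 1
                final ?5
                graphic 0
                direction l2r))
@end example

Because most properties cannot be changed after creation, it is worth
settling on @code{dimension}, @code{chars}, and @code{final} before
evaluating such a form.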
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
365
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
366 @defun make-reverse-direction-charset charset new-name
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
367 This function makes a charset equivalent to @var{charset} but which goes
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
368 in the opposite direction. @var{new-name} is the name of the new
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
369 charset. The new charset is returned.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
370 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
371
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
372 @defun charset-from-attributes dimension chars final &optional direction
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
373 This function returns a charset with the given @var{dimension},
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
374 @var{chars}, @var{final}, and @var{direction}. If @var{direction} is
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
375 omitted, both directions will be checked (left-to-right will be returned
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
376 if character sets exist for both directions).
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
377 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
378
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
379 @defun charset-reverse-direction-charset charset
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
380 This function returns the charset (if any) with the same dimension,
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
381 number of characters, and final byte as @var{charset}, but which is
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
382 displayed in the opposite direction.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
383 @end defun
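
The two attribute-based lookups might be used as follows. The expected
results mentioned in the comments are read off the table of predefined
charsets and have not been checked against a running XEmacs; passing a
charset name rather than a charset object to
@code{charset-reverse-direction-charset} is also an assumption.

@example
;; Look up the two-dimensional 94x94 charset whose final byte is B.
;; Going by the table of predefined charsets, this should be
;; japanese-jisx0208.
(charset-from-attributes 2 94 ?B)

;; Restrict the search to right-to-left charsets.
(charset-from-attributes 1 96 ?G 'r2l)

;; Find the charset that mirrors ascii in the opposite direction;
;; going by the table, this should be ascii-r2l.
(charset-reverse-direction-charset 'ascii)
@end example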
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
384
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
385 @node Charset Property Functions, Predefined Charsets, Basic Charset Functions, Charsets
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
386 @subsection Charset Property Functions
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
387
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
388 All of these functions accept either a charset name or charset object.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
389
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
390 @defun charset-property charset prop
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
391 This function returns property @var{prop} of @var{charset}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
392 @xref{Charset Properties}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
393 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
394
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
395 Convenience functions are also provided for retrieving individual
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
396 properties of a charset.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
397
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
398 @defun charset-name charset
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
399 This function returns the name of @var{charset}. This will be a symbol.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
400 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
401
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
402 @defun charset-doc-string charset
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
403 This function returns the doc string of @var{charset}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
404 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
405
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
406 @defun charset-registry charset
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
407 This function returns the registry of @var{charset}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
408 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
409
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
410 @defun charset-dimension charset
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
411 This function returns the dimension of @var{charset}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
412 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
413
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
414 @defun charset-chars charset
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
415 This function returns the number of characters per dimension of
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
416 @var{charset}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
417 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
418
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
419 @defun charset-columns charset
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
420 This function returns the number of display columns per character (in
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
421 TTY mode) of @var{charset}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
422 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
423
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
424 @defun charset-direction charset
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
425 This function returns the display direction of @var{charset}---either
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
426 @code{l2r} or @code{r2l}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
427 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
428
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
429 @defun charset-final charset
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
430 This function returns the final byte of the ISO 2022 escape sequence
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
431 designating @var{charset}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
432 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
433
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
434 @defun charset-graphic charset
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
435 This function returns either 0 or 1, depending on whether the position
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
436 codes of characters in @var{charset} map to the left or right half
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
437 of their font, respectively.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
438 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
439
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
440 @defun charset-ccl-program charset
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
441 This function returns the CCL program, if any, for converting
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
442 position codes of characters in @var{charset} into font indices.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
443 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
444
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
445 The only property of a charset that can currently be set after
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
446 the charset has been created is the CCL program.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
447
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
448 @defun set-charset-ccl-program charset ccl-program
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
449 This function sets the @code{ccl-program} property of @var{charset} to
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
450 @var{ccl-program}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
451 @end defun
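
As a short sketch of how these accessors relate, the generic
@code{charset-property} and the convenience functions should report the
same values. The charsets used here are taken from the table of
predefined charsets; no particular return values are claimed.

@example
;; The generic accessor and the convenience function should agree.
(= (charset-property 'japanese-jisx0208 'dimension)
   (charset-dimension 'japanese-jisx0208))

;; Either a charset name or a charset object is accepted.
(charset-final 'greek-iso8859-7)
(charset-final (find-charset 'greek-iso8859-7))
@end example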
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
452
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
453 @node Predefined Charsets, , Charset Property Functions, Charsets
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
454 @subsection Predefined Charsets
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
455
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
456 The following charsets are predefined in the C code.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
457
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
458 @example
54
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
459 Name                    Type  Fi Gr Dir Registry
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
460 --------------------------------------------------------------
54
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
461 ascii                   94    B  0  l2r ISO8859-1
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
462 control-1               94       0  l2r ---
114
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
463 latin-iso8859-1         96    A  1  l2r ISO8859-1
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
464 latin-iso8859-2         96    B  1  l2r ISO8859-2
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
465 latin-iso8859-3         96    C  1  l2r ISO8859-3
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
466 latin-iso8859-4         96    D  1  l2r ISO8859-4
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
467 cyrillic-iso8859-5      96    L  1  l2r ISO8859-5
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
468 arabic-iso8859-6        96    G  1  r2l ISO8859-6
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
469 greek-iso8859-7         96    F  1  l2r ISO8859-7
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
470 hebrew-iso8859-8        96    H  1  r2l ISO8859-8
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
471 latin-iso8859-9         96    M  1  l2r ISO8859-9
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
472 thai-tis620             96    T  1  l2r TIS620
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
473 katakana-jisx0201       94    I  1  l2r JISX0201.1976
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
474 latin-jisx0201          94    J  0  l2r JISX0201.1976
54
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
475 japanese-jisx0208-1978  94x94 @@ 0  l2r JISX0208.1978
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
476 japanese-jisx0208       94x94 B  0  l2r JISX0208.19(83|90)
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
477 japanese-jisx0212       94x94 D  0  l2r JISX0212
114
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
478 chinese-gb2312          94x94 A  0  l2r GB2312
54
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
479 chinese-cns11643-1      94x94 G  0  l2r CNS11643.1
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
480 chinese-cns11643-2      94x94 H  0  l2r CNS11643.2
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
481 chinese-big5-1          94x94 0  0  l2r Big5
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
482 chinese-big5-2          94x94 1  0  l2r Big5
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
483 korean-ksc5601          94x94 C  0  l2r KSC5601
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
484 composite               96x96    0  l2r ---
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
485 @end example
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
486
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
487 The following charsets are predefined in the Lisp code.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
488
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
489 @example
54
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
490 Name                    Type  Fi Gr Dir Registry
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
491 --------------------------------------------------------------
54
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
492 arabic-digit            94    2  0  l2r MuleArabic-0
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
493 arabic-1-column         94    3  0  r2l MuleArabic-1
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
494 arabic-2-column         94    4  0  r2l MuleArabic-2
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
495 sisheng                 94    0  0  l2r sisheng_cwnn\|OMRON_UDC_ZH
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
496 chinese-cns11643-3      94x94 I  0  l2r CNS11643.1
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
497 chinese-cns11643-4      94x94 J  0  l2r CNS11643.1
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
498 chinese-cns11643-5      94x94 K  0  l2r CNS11643.1
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
499 chinese-cns11643-6      94x94 L  0  l2r CNS11643.1
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
500 chinese-cns11643-7      94x94 M  0  l2r CNS11643.1
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
501 ethiopic                94x94 2  0  l2r Ethio
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
502 ascii-r2l               94    B  0  r2l ISO8859-1
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
503 ipa                     96    0  1  l2r MuleIPA
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
504 vietnamese-lower        96    1  1  l2r VISCII1.1
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
505 vietnamese-upper        96    2  1  l2r VISCII1.1
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
506 @end example
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
507
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
508 For all of the above charsets, the dimension and number of columns are
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
509 the same.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
510
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
511 Note that ASCII, Control-1, and Composite are handled specially.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
512 This is why some of the fields are blank, and why some of the filled-in
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
513 fields (e.g. the type) are not really accurate.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
514
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
515 @node MULE Characters, Composite Characters, Charsets, MULE
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
516 @section MULE Characters
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
517
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
518 @defun make-char charset arg1 &optional arg2
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
519 This function makes a multi-byte character from @var{charset} and octets
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
520 @var{arg1} and @var{arg2}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
521 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
522
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
523 @defun char-charset ch
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
524 This function returns the character set of char @var{ch}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
525 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
526
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
527 @defun char-octet ch &optional n
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
528 This function returns the octet (i.e. position code) numbered @var{n}
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
529 (should be 0 or 1) of char @var{ch}. @var{n} defaults to 0 if omitted.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
530 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
531
114
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
532 @defun find-charset-region start end &optional buffer
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
533 This function returns a list of the charsets in the region between
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
534 @var{start} and @var{end}. @var{buffer} defaults to the current buffer
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
535 if omitted.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
536 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
537
114
8619ce7e4c50 Import from CVS: tag r20-1b9
cvs
parents: 70
diff changeset
538 @defun find-charset-string string
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
539 This function returns a list of the charsets in @var{string}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
540 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
541
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
542 @node Composite Characters, Coding Systems, MULE Characters, MULE
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
543 @section Composite Characters
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
544
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
545 Composite characters are not yet completely implemented.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
546
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
547 @defun make-composite-char string
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
548 This function converts a string into a single composite character. The
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
549 character is the result of overstriking all the characters in the
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
550 string.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
551 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
552
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
553 @defun composite-char-string ch
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
554 This function returns a string of the characters comprising a composite
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
555 character.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
556 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
557
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
558 @defun compose-region start end &optional buffer
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
559 This function composes the characters in the region from @var{start} to
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
560 @var{end} in @var{buffer} into one composite character. The composite
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
561 character replaces the composed characters. @var{buffer} defaults to
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
562 the current buffer if omitted.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
563 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
564
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
565 @defun decompose-region start end &optional buffer
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
566 This function decomposes any composite characters in the region from
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
567 @var{start} to @var{end} in @var{buffer}. This converts each composite
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
568 character into one or more characters, the individual characters out of
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
569 which the composite character was formed. Non-composite characters are
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
570 left as-is. @var{buffer} defaults to the current buffer if omitted.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
571 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
572
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
573 @node Coding Systems, CCL, Composite Characters, MULE
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
574 @section Coding Systems
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
575
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
576 A coding system is an object that defines how text containing multiple
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
577 character sets is encoded into a stream of (typically 8-bit) bytes. The
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
578 coding system is used to decode the stream into a series of characters
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
579 (which may be from multiple charsets) when the text is read from a file
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
580 or process, and is used to encode the text back into the same format
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
581 when it is written out to a file or process.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
582
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
583 For example, many ISO-2022-compliant coding systems (such as Compound
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
584 Text, which is used for inter-client data under the X Window System) use
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
585 escape sequences to switch between different charsets -- Japanese Kanji,
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
586 for example, is invoked with @samp{ESC $ ( B}; ASCII is invoked with
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
587 @samp{ESC ( B}; and Cyrillic is invoked with @samp{ESC - L}. See
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
588 @code{make-coding-system} for more information.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
589
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
590 Coding systems are normally identified using a symbol, and the symbol is
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
591 accepted in place of the actual coding system object whenever a coding
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
592 system is called for. (This is similar to how faces and charsets work.)
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
593
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
594 @defun coding-system-p object
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
595 This function returns non-@code{nil} if @var{object} is a coding system.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
596 @end defun
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
597
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
598 @menu
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
599 * Coding System Types:: Classifying coding systems.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
600 * ISO 2022:: An international standard for
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
601 charsets and encodings.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
602 * EOL Conversion:: Dealing with different ways of denoting
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
603 the end of a line.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
604 * Coding System Properties:: Properties of a coding system.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
605 * Basic Coding System Functions:: Working with coding systems.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
606 * Coding System Property Functions:: Retrieving a coding system's properties.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
607 * Encoding and Decoding Text:: Encoding and decoding text.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
608 * Detection of Textual Encoding:: Determining how text is encoded.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
609 * Big5 and Shift-JIS Functions:: Special functions for these non-standard
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
610 encodings.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
611 * Predefined Coding Systems:: Coding systems implemented by MULE.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
612 @end menu
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
613
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
614 @node Coding System Types, ISO 2022, , Coding Systems
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
615 @subsection Coding System Types
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
616
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
617 The coding system type determines the basic algorithm XEmacs will use to
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
618 decode or encode a data stream. Character encodings will be converted
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
619 to the MULE encoding, escape sequences processed, and newline sequences
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
620 converted to XEmacs's internal representation. There are three basic
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
621 classes of coding system type: no-conversion, ISO-2022, and special.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
622
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
623 No conversion allows you to look at the file's internal representation.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
624 Since XEmacs is basically a text editor, "no conversion" does convert
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
625 newline conventions by default. (Use the 'binary coding-system if this
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
626 is not desired.)
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
627
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
628 ISO 2022 (@pxref{ISO 2022}) is the basic international standard regulating
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
629 use of "coded character sets for the exchange of data", i.e. text
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
630 streams. ISO 2022 contains functions that make it possible to encode
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
631 text streams to comply with restrictions of the Internet mail system and
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
632 de facto restrictions of most file systems (e.g. use of the separator
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
633 character in file names). Coding systems which are not ISO 2022
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
634 conformant can be difficult to handle. Perhaps more important, they are
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
635 not adaptable to multilingual information interchange, with the obvious
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
636 exception of ISO 10646 (Unicode). (Unicode is partially supported by
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
637 XEmacs with the addition of the Lisp package ucs-conv.)
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
638
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
639 The special class of coding systems includes automatic detection, CCL (a
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
640 "little language" embedded as an interpreter, useful for translating
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
641 between variants of a single character set), non-ISO-2022-conformant
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
642 encodings like Unicode, Shift JIS, and Big5, and MULE internal coding.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
643 (NB: this list is based on XEmacs 21.2. Terminology may vary slightly
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
644 for other versions of XEmacs and for GNU Emacs 20.)
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
645
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
646 @table @code
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
647 @item no-conversion
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
648 No conversion, for binary files, and a few special cases of non-ISO-2022
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
649 coding systems where conversion is done by hook functions (usually
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
650 implemented in CCL). On output, graphic characters that are not in
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
651 ASCII or Latin-1 will be replaced by a @samp{?}. (For a
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
652 no-conversion-encoded buffer, these characters will only be present if
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
653 you explicitly insert them.)
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
654 @item iso2022
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
655 Any ISO-2022-compliant encoding. Among others, this includes JIS (the
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
656 Japanese encoding commonly used for e-mail), national variants of EUC
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
657 (the standard Unix encoding for Japanese and other languages), and
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
658 Compound Text (an encoding used in X11). You can specify more specific
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
659 information about the conversion with the @var{flags} argument.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
660 @item ucs-4
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
661 ISO 10646 UCS-4 encoding. A 31-bit fixed-width superset of Unicode.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
662 @item utf-8
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
663 ISO 10646 UTF-8 encoding. A ``file system safe'' transformation format
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
664 that can be used with both UCS-4 and Unicode.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
665 @item undecided
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
666 Automatic conversion. XEmacs attempts to detect the coding system used
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
667 in the file.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
668 @item shift-jis
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
669 Shift-JIS (a Japanese encoding commonly used in PC operating systems).
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
670 @item big5
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
671 Big5 (an encoding commonly used for Traditional Chinese, especially in Taiwan).
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
672 @item ccl
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
673 The conversion is performed using a user-written pseudo-code program.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
674 CCL (Code Conversion Language) is the name of this pseudo-code. For
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
675 example, CCL is used to map KOI8-R characters (an encoding for Russian
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
676 Cyrillic) to ISO8859-5 (the form used internally by MULE).
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
677 @item internal
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
678 Write out or read in the raw contents of the memory representing the
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
679 buffer's text. This is primarily useful for debugging purposes, and is
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
680 only enabled when XEmacs has been compiled with @code{DEBUG_XEMACS} set
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
681 (the @samp{--debug} configure option). @strong{Warning}: Reading in a
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
682 file using @code{internal} conversion can result in an internal
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
683 inconsistency in the memory representing a buffer's text, which will
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
684 produce unpredictable results and may cause XEmacs to crash. Under
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
685 normal circumstances you should never use @code{internal} conversion.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
686 @end table
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
687
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
688 @node ISO 2022, EOL Conversion, Coding System Types, Coding Systems
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
689 @subsection ISO 2022
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
690
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
691 This section briefly describes the ISO 2022 encoding standard. A more
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
692 thorough treatment is available in the original document of ISO
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
693 2022 as well as various national standards (such as JIS X 0202).
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
694
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
695 Character sets (@dfn{charsets}) are classified into the following four
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
696 categories, according to the number of characters in the charset:
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
697 94-charset, 96-charset, 94x94-charset, and 96x96-charset. This means
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
698 that although an ISO 2022 coding system may have variable width
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
699 characters, each charset used is fixed-width (in contrast to the MULE
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
700 character set and UTF-8, for example).
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
701
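The four charset categories can be made concrete with a little arithmetic. The following is a minimal sketch in Python (not part of XEmacs); the table and the name @code{charset_capacity} are illustrative assumptions. It relies only on the fact that a 94-charset uses the byte positions 0x21-0x7E, while a 96-charset also uses 0x20 and 0x7F.

```python
# Illustrative table (hypothetical names): for each ISO 2022 charset type,
# the number of bytes per character and the number of positions per byte.
CHARSET_TYPES = (
    ("94", 1, 94),
    ("96", 1, 96),
    ("94x94", 2, 94),
    ("96x96", 2, 96),
)


def charset_capacity(name):
    """Return how many characters a charset of the given type can hold."""
    for type_name, dimension, chars_per_byte in CHARSET_TYPES:
        if type_name == name:
            # Every character uses exactly `dimension` bytes, which is why
            # each individual charset is fixed-width.
            return chars_per_byte ** dimension
    raise KeyError(name)


print(charset_capacity("94x94"))
```

A 94x94 charset therefore has room for 94 * 94 characters, which is why the two-byte national standards fit comfortably in this category.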
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
702 ISO 2022 provides for switching between character sets via escape
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
703 sequences. This switching is somewhat complicated, because ISO 2022
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
704 provides for both legacy applications like Internet mail that accept
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
705 only 7 significant bits in some contexts (RFC 822 headers, for example),
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
706 and more modern "8-bit clean" applications. It also provides for
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
707 compact and transparent representation of languages like Japanese which
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
708 mix ASCII and a national script (even outside of computer programs).
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
709
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
710 First, ISO 2022 codified prevailing practice by dividing the code space
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
711 into "control" and "graphic" regions. The code points 0x00-0x1F and
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
712 0x80-0x9F are reserved for "control characters", while "graphic
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
713 characters" must be assigned to code points in the regions 0x20-0x7F and
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
714 0xA0-0xFF. The positions 0x20 and 0x7F are special, and under some
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
715 circumstances must be assigned the graphic character "ASCII SPACE" and
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
716 the control character "ASCII DEL" respectively.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
717
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
718 The various regions are given the name C0 (0x00-0x1F), GL (0x20-0x7F),
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
719 C1 (0x80-0x9F), and GR (0xA0-0xFF). GL and GR stand for "graphic left"
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
720 and "graphic right", respectively, because of the standard method of
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
721 displaying graphic character sets in tables with the high byte indexing
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
722 columns and the low byte indexing rows. I don't find it very intuitive,
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
723 but these are called "registers".
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
724
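The division of the code space described above can be sketched as a small classifier. This is an illustration in Python rather than XEmacs code, and the function name @code{iso2022_region} is an assumption made for the example.

```python
def iso2022_region(byte):
    """Return the ISO 2022 code-space region (C0, GL, C1 or GR) of a byte."""
    if not 0x00 <= byte <= 0xFF:
        raise ValueError("not a byte value: %r" % (byte,))
    if byte <= 0x1F:
        return "C0"  # control characters, 0x00-0x1F
    if byte <= 0x7F:
        return "GL"  # "graphic left", 0x20-0x7F
    if byte <= 0x9F:
        return "C1"  # control characters, 0x80-0x9F
    return "GR"      # "graphic right", 0xA0-0xFF


# A newline is a C0 control, "A" lies in GL, and 0xC6 lies in GR.
print([iso2022_region(b) for b in (0x0A, 0x41, 0xC6)])
```

Note that the sketch treats 0x20 and 0x7F simply as members of GL; it does not model the special status of SPACE and DEL.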
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
725 An ISO 2022-conformant encoding for a graphic character set must use a
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
726 fixed number of bytes per character, and the values must fit into a
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
727 single register; that is, each byte must range over either 0x20-0x7F, or
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
728 0xA0-0xFF. It is not allowed to extend the range of the repertoire of a
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
729 character set by using both ranges at the same time. This is why a standard
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
730 character set such as ISO 8859-1 is actually considered by ISO 2022 to
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
731 be an aggregation of two character sets, ASCII and LATIN-1, and why it
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
732 is technically incorrect to refer to ISO 8859-1 as "Latin 1". Also, a
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
733 single character's bytes must all be drawn from the same register; this
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
734 is why Shift JIS (for Japanese) and Big 5 (for Chinese) are not ISO
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
735 2022-compatible encodings.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
736
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
737 The reason for this restriction becomes clear when you attempt to define
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
738 an efficient, robust encoding for a language like Japanese. Like ISO
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
739 8859, Japanese encodings are aggregations of several character sets. In
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
740 practice, the vast majority of characters are drawn from the "JIS Roman"
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
741 character set (a derivative of ASCII; it won't hurt to think of it as
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
742 ASCII) and the JIS X 0208 standard "basic Japanese" character set
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
743 including not only ideographic characters ("kanji") but syllabic
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
744 Japanese characters ("kana"), a wide variety of symbols, and many
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
745 alphabetic characters (Roman, Greek, and Cyrillic) as well. Although
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
746 JIS X 0208 includes the whole Roman alphabet, as a 2-byte code it is not
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
747 suited to programming; thus the inclusion of ASCII in the standard
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
748 Japanese encodings.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
749
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
750 For normal Japanese text such as in newspapers, a broad repertoire of
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
751 approximately 3000 characters is used. Evidently this won't fit into
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
752 one byte; two must be used. But much of the text processed by Japanese
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
753 computers is computer source code, nearly all of which is ASCII. A not
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
754 insignificant portion of ordinary text is English (as such or as
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
755 borrowed Japanese vocabulary) or other languages which can be represented
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
756 at least approximately in ASCII, as well. It seems reasonable then to
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
757 represent ASCII in one byte, and JIS X 0208 in two. And this is exactly
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
758 what the Extended Unix Code for Japanese (EUC-JP) does. ASCII is
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
759 invoked to the GL register, and JIS X 0208 is invoked to the GR
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
760 register. Thus, each byte can be tested for its character set by
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
761 looking at the high bit; if set, it is Japanese, if clear, it is ASCII.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
762 Furthermore, since control characters like newline can never be part of
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
763 a graphic character, even in the case of corruption in transmission the
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
764 stream will be resynchronized at every line break, on the order of 60-80
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
765 bytes. This coding system requires no escape sequences or special
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
766 control codes to represent 99.9% of all Japanese text.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
767
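The high-bit test described above can be sketched as follows. This is a simplified illustration in Python, not the XEmacs decoder: the function name @code{split_euc_jp} is hypothetical, Python's standard @code{euc_jp} codec is used only to produce sample bytes, and the sketch deliberately handles only ASCII and JIS X 0208, rejecting everything else (including the single-shift codes that real EUC-JP uses for the subsidiary character sets).

```python
def split_euc_jp(data):
    """Split EUC-JP bytes into (charset, bytes) pairs using the high bit.

    Simplified sketch: only ASCII (in GL) and JIS X 0208 (in GR) are
    recognized; any other byte raises ValueError.
    """
    parts = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte < 0x80:
            # High bit clear: a one-byte ASCII or control character.
            parts.append(("ascii", data[i:i + 1]))
            i += 1
        elif 0xA1 <= byte <= 0xFE:
            # High bit set: first byte of a two-byte JIS X 0208 character.
            if i + 1 >= len(data) or not 0xA1 <= data[i + 1] <= 0xFE:
                raise ValueError("truncated or invalid character at %d" % i)
            parts.append(("jisx0208", data[i:i + 2]))
            i += 2
        else:
            raise ValueError("byte 0x%02X not handled by this sketch" % byte)
    return parts


# "A" followed by two kanji (U+65E5, U+672C).
sample = "A\u65e5\u672c".encode("euc_jp")
print(split_euc_jp(sample))
```

Because no graphic character contains a byte below 0x80, a scanner like this one can restart cleanly at any ASCII byte or line break, which is the resynchronization property mentioned above.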
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
768 Note carefully the distinction between the character sets (ASCII and JIS
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
769 X 0208), the encoding (EUC-JP), and the coding system (ISO 2022). The
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
770 JIS X 0208 character set is used in three different encodings for
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
771 Japanese, but in ISO-2022-JP it is invoked into GL (so the high bit is
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
772 always clear), in EUC-JP it is invoked into GR (setting the high bit in
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
773 the process), and in Shift JIS the high bit may be set or reset, and the
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
774 significant bits are shifted within the 16-bit character so that the two
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
775 main character sets can coexist with a third (the "halfwidth katakana"
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
776 of JIS X 0201). As the name implies, the ISO-2022-JP encoding is also a
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
777 version of the ISO-2022 coding system.
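
The difference between the three encodings of JIS X 0208 can be seen by
inspecting raw bytes.  The following is a minimal sketch in Python (not
XEmacs Lisp), assuming that Python's standard @code{euc_jp},
@code{iso2022_jp}, and @code{shift_jis} codecs are acceptable stand-ins
for the encodings described above.  The helper name @code{high_bits} is
invented for this illustration.

```python
# Encode the same two kanji (U+65E5 U+672C, "Nihon") three ways and
# report, for each byte, whether its high bit is set.
TEXT = "\u65e5\u672c"

def high_bits(data):
    """Return a list of booleans: True where the byte's high bit is set."""
    return [bool(b & 0x80) for b in data]

euc = TEXT.encode("euc_jp")        # JIS X 0208 invoked into GR
jis = TEXT.encode("iso2022_jp")    # JIS X 0208 invoked into GL, with escapes
sjis = TEXT.encode("shift_jis")    # bits rearranged; high bit varies

for name, data in (("EUC-JP", euc), ("ISO-2022-JP", jis), ("Shift JIS", sjis)):
    print(name, data.hex(), high_bits(data))
```

In the EUC-JP result every byte has the high bit set, in the ISO-2022-JP
result none does (the escape sequences are themselves 7-bit), and in the
Shift JIS result the high bit is set on some bytes and clear on others.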
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
778
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
779 In order to systematically treat subsidiary character sets (like the
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
780 "halfwidth katakana" already mentioned, and the "supplementary kanji" of
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
781 JIS X 0212), four further registers are defined: G0, G1, G2, and G3.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
782 Unlike GL and GR, they are not logically distinguished by internal
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
783 format. Instead, the process of "invocation" mentioned earlier is
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
784 broken into two steps: first, a character set is @dfn{designated} to one
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
785 of the registers G0-G3 by use of an @dfn{escape sequence} of the form:
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
786
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
787 @example
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
788 ESC [@var{I}] @var{I} @var{F}
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
789 @end example
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
790
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
791 where @var{I} is an intermediate character or characters in the range
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
792 0x20 - 0x3F, and @var{F}, from the range 0x30-0x7F, is the final
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
793 character identifying this charset. (Final characters in the range
410
de805c49cfc1 Import from CVS: tag r21-2-35
cvs
parents: 404
diff changeset
794 0x30-0x3F are reserved for private use and will never have a publicly
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
795 registered meaning.)
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
796
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
797 Then that register is @dfn{invoked} to either GL or GR, either
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
798 automatically (designations to G0 normally involve invocation to GL as
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
799 well), or by use of shifting (affecting only the following character in
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
800 the data stream) or locking (effective until the next designation or
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
801 locking) control sequences. An encoding conformant to ISO 2022 is
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
802 typically defined by designating the initial contents of the G0-G3
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
803 registers, specifying a 7- or 8-bit environment, and specifying whether
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
804 further designations will be recognized.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
805
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
806 Some examples of character sets and the registered final characters
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
807 @var{F} used to designate them:
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
808
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
809 @need 1000
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
810 @table @asis
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
811 @item 94-charset
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
812 ASCII (B), left (J) and right (I) half of JIS X 0201, ...
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
813 @item 96-charset
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
814 Latin-1 (A), Latin-2 (B), Latin-3 (C), ...
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
815 @item 94x94-charset
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
816 GB2312 (A), JIS X 0208 (B), KSC5601 (C), ...
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
817 @item 96x96-charset
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
818 none for the moment
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
819 @end table
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
820
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
821 The meanings of the various characters in these sequences, where not
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
822 specified by the ISO 2022 standard (such as the ESC character), are
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
823 assigned by @dfn{ECMA}, the European Computer Manufacturers Association.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
824
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
825 The meanings of the intermediate characters are:
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
826
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
827 @example
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
828 @group
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
829 $ [0x24]: indicate charset of dimension 2 (94x94 or 96x96).
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
830 ( [0x28]: designate to G0 a 94-charset whose final byte is @var{F}.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
831 ) [0x29]: designate to G1 a 94-charset whose final byte is @var{F}.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
832 * [0x2A]: designate to G2 a 94-charset whose final byte is @var{F}.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
833 + [0x2B]: designate to G3 a 94-charset whose final byte is @var{F}.
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
834 , [0x2C]: designate to G0 a 96-charset whose final byte is @var{F}.
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
835 - [0x2D]: designate to G1 a 96-charset whose final byte is @var{F}.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
836 . [0x2E]: designate to G2 a 96-charset whose final byte is @var{F}.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
837 / [0x2F]: designate to G3 a 96-charset whose final byte is @var{F}.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
838 @end group
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
839 @end example
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
840
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
841 The comma may be used in files read and written only by MULE, as a MULE
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
842 extension, but this is illegal in ISO 2022. (The reason is that in ISO
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
843 2022 G0 must be a 94-member character set, with 0x20 assigned the value
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
844 SPACE, and 0x7F assigned the value DEL.)
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
845
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
846 Here are examples of designations:
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
847
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
848 @example
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
849 @group
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
850 ESC ( B : designate to G0 ASCII
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
851 ESC - A : designate to G1 Latin-1
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
852 ESC $ ( A or ESC $ A : designate to G0 GB2312
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
853 ESC $ ( B or ESC $ B : designate to G0 JISX0208
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
854 ESC $ ) C : designate to G1 KSC5601
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
855 @end group
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
856 @end example
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
857
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
858 (The short forms used to designate GB2312 and JIS X 0208 are for
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
859 backwards compatibility; the long forms are preferred.)
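
The designation rules above can be sketched as a small parser.  This is
an illustrative Python sketch, not the code MULE uses: the function name
@code{parse_designation} and the returned tuple layout are assumptions
made for this example.  It accepts the short form for any final
character, whereas the text above mentions it only for GB2312 and
JIS X 0208.

```python
ESC = 0x1B

# Intermediate bytes from the table above: register and charset size.
# Note that 0x2C (comma) is the MULE-only extension.
REGISTERS_94 = (0x28, 0x29, 0x2A, 0x2B)   # ( ) * +  -> G0..G3
REGISTERS_96 = (0x2C, 0x2D, 0x2E, 0x2F)   # , - . /  -> G0..G3

def parse_designation(seq):
    """Parse one designation sequence given as bytes.

    Returns a tuple (register, charset_kind, final), for example
    ("G0", "94x94", "B").  Raises ValueError on anything else.
    """
    if len(seq) < 3 or seq[0] != ESC:
        raise ValueError("not a designation sequence")
    body = seq[1:]
    two_dim = body[0] == 0x24              # '$' marks a dimension-2 charset
    if two_dim:
        body = body[1:]
    if len(body) == 1 and two_dim:
        # Short form, e.g. ESC $ B: implicitly designates to G0.
        register, size = 0, 94
    elif len(body) == 2 and body[0] in REGISTERS_94:
        register, size = REGISTERS_94.index(body[0]), 94
    elif len(body) == 2 and body[0] in REGISTERS_96:
        register, size = REGISTERS_96.index(body[0]), 96
    else:
        raise ValueError("unrecognized designation")
    final = body[-1]
    if not 0x30 <= final <= 0x7F:
        raise ValueError("final character out of range")
    kind = "%dx%d" % (size, size) if two_dim else str(size)
    return ("G%d" % register, kind, chr(final))

print(parse_designation(b"\x1b$)C"))
```

Applied to the five examples listed above, the parser reports G0 or G1
together with the charset type (94, 96, or 94x94) and the final
character.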
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
860
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
861 To use a charset designated to G2 or G3, and to use a charset designated
54
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
862 to G1 in a 7-bit environment, you must explicitly invoke G1, G2, or G3
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
863 into GL. There are two types of invocation, Locking Shift (forever) and
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
864 Single Shift (one character only).
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
865
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
866 Locking Shift is done as follows:
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
867
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
868 @example
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
869 LS0 or SI (0x0F): invoke G0 into GL
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
870 LS1 or SO (0x0E): invoke G1 into GL
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
871 LS2: invoke G2 into GL
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
872 LS3: invoke G3 into GL
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
873 LS1R: invoke G1 into GR
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
874 LS2R: invoke G2 into GR
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
875 LS3R: invoke G3 into GR
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
876 @end example
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
877
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
878 Single Shift is done as follows:
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
879
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
880 @example
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
881 @group
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
882 SS2 or ESC N: invoke G2 into GL
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
883 SS3 or ESC O: invoke G3 into GL
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
884 @end group
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
885 @end example
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
886
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
887 The shift functions (such as LS1R and SS3) are represented by control
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
888 characters (from C1) in 8 bit environments and by escape sequences in 7
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
889 bit environments.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
890
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
891 (#### Ben says: I think the above is slightly incorrect. It appears that
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
892 SS2 invokes G2 into GR and SS3 invokes G3 into GR, whereas ESC N and
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
893 ESC O behave as indicated. The above definitions will not parse
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
894 EUC-encoded text correctly, and it looks like the code in mule-coding.c
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
895 has similar problems.)
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
896
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
897 Evidently there are a lot of ISO-2022-compliant ways of encoding
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
898 multilingual text. Many coding systems are in use around the world,
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
899 such as X11's Compound Text, Japanese JUNET code, and so-called EUC
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
900 (Extended UNIX Code); all of these are variants of ISO 2022.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
901
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
902 In MULE, we characterize a version of ISO 2022 by the following
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
903 attributes:
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
904
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
905 @enumerate
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
906 @item
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
907 The character sets initially designated to G0 through G3.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
908 @item
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
909 Whether short form designations are allowed for Japanese and Chinese.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
910 @item
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
911 Whether ASCII should be designated to G0 before control characters.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
912 @item
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
913 Whether ASCII should be designated to G0 at the end of line.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
914 @item
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
915 7-bit environment or 8-bit environment.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
916 @item
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
917 Whether Locking Shifts are used or not.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
918 @item
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
919 Whether to use ASCII or the variant JIS X 0201-1976-Roman.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
920 @item
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
921 Whether to use JIS X 0208-1983 or the older version JIS X 0208-1976.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
922 @end enumerate
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
923
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
924 (The last two are only for Japanese.)
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
925
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
926 By specifying these attributes, you can create any variant
54
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
927 of ISO 2022.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
928
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
929 Here are several examples:
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
930
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
931 @example
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
932 @group
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
933 ISO-2022-JP -- Coding system used in Japanese email (RFC 1468).
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
934 1. G0 <- ASCII, G1..3 <- never used
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
935 2. Yes.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
936 3. Yes.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
937 4. Yes.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
938 5. 7-bit environment
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
939 6. No.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
940 7. Use ASCII
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
941 8. Use JIS X 0208-1983
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
942 @end group
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
943
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
944 @group
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
945 ctext -- X11 Compound Text
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
946 1. G0 <- ASCII, G1 <- Latin-1, G2,3 <- never used.
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
947 2. No.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
948 3. No.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
949 4. Yes.
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
950 5. 8-bit environment.
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
951 6. No.
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
952 7. Use ASCII.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
953 8. Use JIS X 0208-1983.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
954 @end group
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
955
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
956 @group
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
957 euc-china -- Chinese EUC. Often called the "GB encoding", but that is
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
958 technically incorrect.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
959 1. G0 <- ASCII, G1 <- GB 2312, G2,3 <- never used.
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
960 2. No.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
961 3. Yes.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
962 4. Yes.
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
963 5. 8-bit environment.
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
964 6. No.
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
965 7. Use ASCII.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
966 8. Use JIS X 0208-1983.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
967 @end group
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
968
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
969 @group
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
970 ISO-2022-KR -- Coding system used in Korean email.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
971 1. G0 <- ASCII, G1 <- KSC 5601, G2,3 <- never used.
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
972 2. No.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
973 3. Yes.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
974 4. Yes.
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
975 5. 7-bit environment.
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
976 6. Yes.
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
977 7. Use ASCII.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
978 8. Use JIS X 0208-1983.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
979 @end group
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
980 @end example
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
981
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
982 MULE creates all of these coding systems by default.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
983
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
984 @node EOL Conversion, Coding System Properties, ISO 2022, Coding Systems
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
985 @subsection EOL Conversion
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
986
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
987 @table @code
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
988 @item nil
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
989 Automatically detect the end-of-line type (LF, CRLF, or CR). Also
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
990 generate subsidiary coding systems named @code{@var{name}-unix},
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
991 @code{@var{name}-dos}, and @code{@var{name}-mac}, that are identical to
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
992 this coding system but have an EOL-TYPE value of @code{lf}, @code{crlf},
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
993 and @code{cr}, respectively.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
994 @item lf
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
995 The end of a line is marked externally using ASCII LF. Since this is
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
996 also the way that XEmacs represents an end-of-line internally,
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
997 specifying this option results in no end-of-line conversion. This is
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
998 the standard format for Unix text files.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
999 @item crlf
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1000 The end of a line is marked externally using ASCII CRLF. This is the
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1001 standard format for MS-DOS text files.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1002 @item cr
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1003 The end of a line is marked externally using ASCII CR. This is the
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1004 standard format for Macintosh text files.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1005 @item t
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1006 Automatically detect the end-of-line type but do not generate subsidiary
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1007 coding systems. (This value is converted to @code{nil} when stored
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1008 internally, and @code{coding-system-property} will return @code{nil}.)
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1009 @end table
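
The automatic detection and the conversion to the internal LF
convention can be sketched as follows.  This is a hypothetical Python
illustration, not the algorithm XEmacs implements: the function names
@code{detect_eol} and @code{to_internal} are invented, and the detection
rule (pick whichever convention occurs most often) is an assumption.

```python
def detect_eol(data):
    """Guess the end-of-line type of raw bytes: "crlf", "cr", "lf" or None."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf      # bare CR only
    lf = data.count(b"\n") - crlf      # bare LF only
    counts = (("crlf", crlf), ("cr", cr), ("lf", lf))
    best = max(counts, key=lambda item: item[1])
    return best[0] if best[1] > 0 else None

def to_internal(data, eol_type=None):
    """Convert external line endings to the internal LF convention."""
    eol_type = eol_type or detect_eol(data)
    if eol_type == "crlf":
        return data.replace(b"\r\n", b"\n")
    if eol_type == "cr":
        return data.replace(b"\r", b"\n")
    return data                        # "lf" or undetected: no conversion

print(detect_eol(b"one\r\ntwo\r\n"), to_internal(b"one\r\ntwo\r\n"))
```

Passing an explicit @code{eol_type} corresponds to the @code{lf},
@code{crlf}, and @code{cr} values in the table; omitting it corresponds
to automatic detection.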
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1010
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1011 @node Coding System Properties, Basic Coding System Functions, EOL Conversion, Coding Systems
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1012 @subsection Coding System Properties
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1013
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1014 @table @code
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1015 @item mnemonic
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1016 String to be displayed in the modeline when this coding system is
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1017 active.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1018
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1019 @item eol-type
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1020 End-of-line conversion to be used. It should be one of the types
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1021 listed in @ref{EOL Conversion}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1022
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1023 @item eol-lf
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1024 The coding system which is the same as this one, except that it uses the
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1025 Unix line-breaking convention.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1026
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1027 @item eol-crlf
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1028 The coding system which is the same as this one, except that it uses the
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1029 DOS line-breaking convention.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1030
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1031 @item eol-cr
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1032 The coding system which is the same as this one, except that it uses the
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1033 Macintosh line-breaking convention.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1034
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1035 @item post-read-conversion
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1036 Function called after a file has been read in, to perform the decoding.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1037 Called with two arguments, @var{beg} and @var{end}, denoting a region of
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1038 the current buffer to be decoded.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1039
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1040 @item pre-write-conversion
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1041 Function called before a file is written out, to perform the encoding.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1042 Called with two arguments, @var{beg} and @var{end}, denoting a region of
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1043 the current buffer to be encoded.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1044 @end table
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1045
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1046 The following additional properties are recognized if @var{type} is
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1047 @code{iso2022}:
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1048
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1049 @table @code
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1050 @item charset-g0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1051 @itemx charset-g1
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1052 @itemx charset-g2
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1053 @itemx charset-g3
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1054 The character set initially designated to the G0 - G3 registers.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1055 The value should be one of
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1056
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1057 @itemize @bullet
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1058 @item
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1059 A charset object (designate that character set)
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1060 @item
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1061 @code{nil} (do not ever use this register)
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1062 @item
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1063 @code{t} (no character set is initially designated to the register, but
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1064 may be later on; this automatically sets the corresponding
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1065 @code{force-g*-on-output} property)
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1066 @end itemize
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1067
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1068 @item force-g0-on-output
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1069 @itemx force-g1-on-output
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1070 @itemx force-g2-on-output
54
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
1071 @itemx force-g3-on-output
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1072 If non-@code{nil}, send an explicit designation sequence on output
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1073 before using the specified register.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1074
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1075 @item short
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1076 If non-@code{nil}, use the short forms @samp{ESC $ @@}, @samp{ESC $ A},
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1077 and @samp{ESC $ B} on output in place of the full designation sequences
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1078 @samp{ESC $ ( @@}, @samp{ESC $ ( A}, and @samp{ESC $ ( B}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1079
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1080 @item no-ascii-eol
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1081 If non-@code{nil}, don't designate ASCII to G0 at each end of line on
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1082 output. Setting this to non-@code{nil} also suppresses other
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1083 state-resetting that normally happens at the end of a line.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1084
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1085 @item no-ascii-cntl
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1086 If non-@code{nil}, don't designate ASCII to G0 before control chars on
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1087 output.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1088
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1089 @item seven
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1090 If non-@code{nil}, use 7-bit environment on output. Otherwise, use 8-bit
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1091 environment.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1092
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1093 @item lock-shift
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1094 If non-@code{nil}, use locking-shift (SO/SI) instead of single-shift or
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1095 designation by escape sequence.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1096
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1097 @item no-iso6429
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1098 If non-@code{nil}, don't use ISO6429's direction specification.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1099
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1100 @item escape-quoted
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1101 If non-@code{nil}, literal control characters that are the same as the
54
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
1102 beginning of a recognized ISO 2022 or ISO 6429 escape sequence (in
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1103 particular, ESC (0x1B), SO (0x0E), SI (0x0F), SS2 (0x8E), SS3 (0x8F),
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1104 and CSI (0x9B)) are ``quoted'' with an escape character so that they can
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1105 be properly distinguished from an escape sequence. (Note that doing
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1106 this results in a non-portable encoding.) This encoding flag is used for
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1107 byte-compiled files. Note that ESC is a good choice for a quoting
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1108 character because there are no escape sequences whose second byte is a
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1109 character from the Control-0 or Control-1 character sets; this is
54
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
1110 explicitly disallowed by the ISO 2022 standard.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1111
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1112 @item input-charset-conversion
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1113 A list of conversion specifications, specifying conversion of characters
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1114 in one charset to another when decoding is performed. Each
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1115 specification is a list of two elements: the source charset, and the
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1116 destination charset.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1117
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1118 @item output-charset-conversion
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1119 A list of conversion specifications, specifying conversion of characters
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1120 in one charset to another when encoding is performed. The form of each
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1121 specification is the same as for @code{input-charset-conversion}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1122 @end table
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1123
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1124 The following additional properties are recognized (and required) if
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1125 @var{type} is @code{ccl}:
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1126
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1127 @table @code
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1128 @item decode
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1129 CCL program used for decoding (converting to internal format).
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1130
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1131 @item encode
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1132 CCL program used for encoding (converting to external format).
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1133 @end table
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1134
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1135 The following properties are used internally: @code{eol-cr},
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1136 @code{eol-crlf}, @code{eol-lf}, and @code{base}.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1137
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1138 @node Basic Coding System Functions, Coding System Property Functions, Coding System Properties, Coding Systems
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1139 @subsection Basic Coding System Functions
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1140
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1141 @defun find-coding-system coding-system-or-name
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1142 This function retrieves the coding system of the given name.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1143
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1144 If @var{coding-system-or-name} is a coding-system object, it is simply
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1145 returned. Otherwise, @var{coding-system-or-name} should be a symbol.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1146 If there is no such coding system, @code{nil} is returned. Otherwise
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1147 the associated coding system object is returned.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1148 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1149
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1150 @defun get-coding-system name
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1151 This function retrieves the coding system of the given name. Same as
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1152 @code{find-coding-system} except an error is signalled if there is no
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1153 such coding system instead of returning @code{nil}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1154 @end defun
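
The difference between the two lookup functions can be sketched as
follows.  This is only an illustration: the symbol
@code{no-such-coding-system} is a made-up name that is assumed not to be
defined in your XEmacs.

@example
;; Per the description above, an unknown name yields nil.
(find-coding-system 'no-such-coding-system)

;; The same lookup through get-coding-system signals an error,
;; so the call is guarded here.
(condition-case err
    (get-coding-system 'no-such-coding-system)
  (error (message "Lookup failed: %S" err)))
@end example

@code{find-coding-system} is the more convenient choice when a missing
coding system is an expected case; @code{get-coding-system} suits code
that cannot proceed without one.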
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1155
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1156 @defun coding-system-list
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1157 This function returns a list of the names of all defined coding systems.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1158 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1159
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1160 @defun coding-system-name coding-system
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1161 This function returns the name of the given coding system.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1162 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1163
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1164 @defun coding-system-base coding-system
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1165 This function returns the base coding system (the one with an undecided
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1166 EOL convention) of @var{coding-system}.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1167 @end defun
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1168
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1169 @defun make-coding-system name type &optional doc-string props
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1170 This function registers symbol @var{name} as a coding system.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1171
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1172 @var{type} describes the conversion method used and should be one of
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1173 the types listed in @ref{Coding System Types}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1174
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1175 @var{doc-string} is a string describing the coding system.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1176
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1177 @var{props} is a property list, describing the specific nature of the
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1178 coding system. Recognized properties are as in @ref{Coding System
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1179 Properties}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1180 @end defun
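
A minimal sketch of a call to @code{make-coding-system} might look like
the following.  The name @code{my-iso-7bit} and its doc string are
hypothetical, and the example assumes a MULE-enabled XEmacs in which the
@code{ascii} charset can be fetched with @code{get-charset}.  It is not
taken from the XEmacs sources.

@example
;; Register a hypothetical 7-bit ISO 2022 coding system.
;; The property list uses only properties described in
;; "Coding System Properties".
(make-coding-system
 'my-iso-7bit 'iso2022
 "Hypothetical 7-bit ISO 2022 coding system (example only)."
 (list 'charset-g0 (get-charset 'ascii) ; ASCII designated to G0
       'short t                         ; use short designation forms
       'seven t))                       ; 7-bit environment on output
@end example

Building the property list with @code{list} lets the charset object be
computed at call time, which matches the statement that the value of
@code{charset-g0} should be a charset object.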
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1181
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1182 @defun copy-coding-system old-coding-system new-name
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1183 This function copies @var{old-coding-system} to @var{new-name}. If
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1184 @var{new-name} does not name an existing coding system, a new one will
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1185 be created.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1186 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1187
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1188 @defun subsidiary-coding-system coding-system eol-type
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1189 This function returns the subsidiary coding system of
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1190 @var{coding-system} with eol type @var{eol-type}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1191 @end defun
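
The relationship between a base coding system and its subsidiaries can
be explored along these lines.  The sketch assumes that
@code{iso-2022-jp} is defined in your XEmacs and that @code{crlf} is an
acceptable value for @var{eol-type}; both are assumptions, not
guarantees.

@example
(let* ((base (get-coding-system 'iso-2022-jp))
       ;; Ask for the variant that uses CRLF line endings.
       (dos  (subsidiary-coding-system base 'crlf)))
  ;; Report the subsidiary's name and the name of the base
  ;; coding system it was derived from.
  (list (coding-system-name dos)
        (coding-system-name (coding-system-base dos))))
@end example

If the assumptions hold, the second element should name the base coding
system that the first was derived from.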
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1192
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1193 @node Coding System Property Functions, Encoding and Decoding Text, Basic Coding System Functions, Coding Systems
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1194 @subsection Coding System Property Functions
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1195
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1196 @defun coding-system-doc-string coding-system
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1197 This function returns the doc string for @var{coding-system}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1198 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1199
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1200 @defun coding-system-type coding-system
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1201 This function returns the type of @var{coding-system}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1202 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1203
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1204 @defun coding-system-property coding-system prop
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1205 This function returns the @var{prop} property of @var{coding-system}.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1206 @end defun
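
As a brief illustration, the three accessors above can be combined to
inspect an existing coding system.  The choice of @code{iso-2022-jp}
and of the @code{charset-g0} property is an assumption made for the
example; substitute any coding system and property available in your
XEmacs.

@example
(let ((cs (get-coding-system 'iso-2022-jp)))
  (list (coding-system-type cs)          ; the conversion type
        (coding-system-doc-string cs)    ; its description
        ;; One of the properties listed for type iso2022.
        (coding-system-property cs 'charset-g0)))
@end example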
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1207
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1208 @node Encoding and Decoding Text, Detection of Textual Encoding, Coding System Property Functions, Coding Systems
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1209 @subsection Encoding and Decoding Text
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1210
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1211 @defun decode-coding-region start end coding-system &optional buffer
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1212 This function decodes the text between @var{start} and @var{end} which
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1213 is encoded in @var{coding-system}. This is useful if you've read in
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1214 encoded text from a file without decoding it (e.g. you read in a
54
05472e90ae02 Import from CVS: tag r19-16-pre2
cvs
parents: 0
diff changeset
1215 JIS-formatted file but used the @code{binary} or @code{no-conversion} coding
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1216 system, so that it shows up as @samp{^[$B!<!+^[(B}). The length of the
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1217 decoded text is returned. @var{buffer} defaults to the current buffer
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1218 if unspecified.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1219 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1220
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1221 @defun encode-coding-region start end coding-system &optional buffer
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1222 This function encodes the text between @var{start} and @var{end} using
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1223 @var{coding-system}. This will, for example, convert Japanese
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1224 characters into stuff such as @samp{^[$B!<!+^[(B} if you use the JIS
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1225 encoding. The length of the encoded text is returned. @var{buffer}
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1226 defaults to the current buffer if unspecified.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1227 @end defun
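
The two functions are inverses of each other, which the following
sketch tries to show with a round trip in a temporary buffer.  It
assumes a MULE-enabled XEmacs in which the @code{japanese-jisx0208}
charset and the @code{iso-2022-jp} coding system exist, and it uses
@code{make-char} with position codes chosen only for illustration.

@example
(with-temp-buffer
  ;; Insert a single JIS X 0208 character.
  (insert (make-char 'japanese-jisx0208 36 34))
  ;; Convert the buffer text to its external (encoded) form...
  (encode-coding-region (point-min) (point-max) 'iso-2022-jp)
  ;; ...and back to the internal form again.
  (decode-coding-region (point-min) (point-max) 'iso-2022-jp)
  (buffer-string))
@end example

Between the two calls the buffer holds the escape sequences of the
external representation rather than the character itself.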
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1228
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1229 @node Detection of Textual Encoding, Big5 and Shift-JIS Functions, Encoding and Decoding Text, Coding Systems
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1230 @subsection Detection of Textual Encoding
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1231
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1232 @defun coding-category-list
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1233 This function returns a list of all recognized coding categories.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1234 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1235
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1236 @defun set-coding-priority-list list
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1237 This function changes the priority order of the coding categories.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1238 @var{list} should be a list of coding categories, in descending order of
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1239 priority. Unspecified coding categories will be lower in priority than
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1240 all specified ones, in the same relative order they were in previously.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1241 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1242
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1243 @defun coding-priority-list
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1244 This function returns a list of coding categories in descending order of
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1245 priority.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1246 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1247
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1248 @defun set-coding-category-system coding-category coding-system
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1249 This function changes the coding system associated with a coding category.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1250 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1251
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1252 @defun coding-category-system coding-category
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1253 This function returns the coding system associated with a coding category.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1254 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1255
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1256 @defun detect-coding-region start end &optional buffer
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1257 This function detects the coding system of the text in the region between
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1258 @var{start} and @var{end}. The returned value is a list of possible coding
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1259 systems ordered by priority. If only ASCII characters are found, it
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1260 returns @code{autodetect} or one of its subsidiary coding systems
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1261 according to a detected end-of-line type. Optional arg @var{buffer}
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1262 defaults to the current buffer.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1263 @end defun
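
The detection functions can be tried out as sketched below.  No
particular coding category names are assumed; the example only uses
what @code{coding-priority-list} returns in the running XEmacs.

@example
;; Pair each coding category with its associated coding system,
;; in the current priority order.
(mapcar (lambda (category)
          (cons category (coding-category-system category)))
        (coding-priority-list))

;; Guess the encoding of the text in the current buffer.
(detect-coding-region (point-min) (point-max))
@end example

Listing the category-to-system mapping first makes it easier to
interpret the result of @code{detect-coding-region}.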
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1264
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1265 @node Big5 and Shift-JIS Functions, Predefined Coding Systems, Detection of Textual Encoding, Coding Systems
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1266 @subsection Big5 and Shift-JIS Functions
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1267
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1268 These are special functions for working with the non-standard
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1269 Shift-JIS and Big5 encodings.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1270
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1271 @defun decode-shift-jis-char code
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1272 This function decodes a JIS X 0208 character encoded in the Shift-JIS coding system.
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1273 @var{code} is the character code in Shift-JIS as a cons of two bytes.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1274 The corresponding character is returned.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1275 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1276
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1277 @defun encode-shift-jis-char ch
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1278 This function encodes a JIS X 0208 character @var{ch} to SHIFT-JIS
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1279 coding-system. The corresponding character code in SHIFT-JIS is
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1280 returned as a cons of two bytes.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1281 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1282
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1283 @defun decode-big5-char code
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1284 This function decodes a character encoded in the Big5 coding system.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1285 @var{code} is the character code in BIG5. The corresponding character
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1286 is returned.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1287 @end defun
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1288
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1289 @defun encode-big5-char ch
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1290 This function encodes the Big5 character @var{ch} to BIG5
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1291 coding-system. The corresponding character code in Big5 is returned.
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1292 @end defun
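
A round trip through the Shift-JIS functions might be written as
follows.  The byte pair 0x82 0xA0 is chosen only as an illustration of
a two-byte Shift-JIS code, and the example assumes a MULE-enabled
XEmacs with Japanese charsets available.

@example
(let* ((sjis '(130 . 160))               ; bytes 0x82 and 0xA0
       (ch   (decode-shift-jis-char sjis)))
  ;; Encoding the decoded character should give back a cons
  ;; of two bytes.
  (encode-shift-jis-char ch))
@end example

If the conversion is lossless for this character, the result should be
@code{equal} to the original cons.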
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1293
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1294 @node Predefined Coding Systems, , Big5 and Shift-JIS Functions, Coding Systems
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1295 @subsection Coding Systems Implemented
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1296
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1297 MULE initializes most of the commonly used coding systems at XEmacs's
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1298 startup. A few others are initialized only when the relevant language
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1299 environment is selected and support libraries are loaded. (NB: The
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1300 following list is based on XEmacs 21.2.19, the development branch at the
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1301 time of writing. The list may be somewhat different for other
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1302 versions. Recent versions of GNU Emacs 20 implement a few more rare
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1303 coding systems; work is being done to port these to XEmacs.)
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1304
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1305 Unfortunately, there is not a consistent naming convention for character
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1306 sets, and for practical purposes coding systems often take their name
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1307 from their principal character sets (ASCII, KOI8-R, Shift JIS). Others
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1308 take their names from the coding system (ISO-2022-JP, EUC-KR), and a few
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1309 from their non-text usages (internal, binary). To provide for this, and
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1310 for the fact that many coding systems have several common names, an
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1311 aliasing system is provided. Finally, some effort has been made to use
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1312 names that are registered as MIME charsets (this is why the name
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1313 @code{shift_jis} contains that un-Lisp-y underscore).
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1314
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1315 There is a systematic naming convention regarding end-of-line (EOL)
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1316 conventions for different systems. A coding system whose name ends in
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1317 "-unix" forces the assumption that lines are broken by newlines (0x0A).
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1318 A coding system whose name ends in "-mac" forces the assumption that
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1319 lines are broken by ASCII CRs (0x0D). A coding system whose name ends
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1320 in "-dos" forces the assumption that lines are broken by CRLF sequences
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1321 (0x0D 0x0A). These subsidiary coding systems are automatically derived
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1322 from a base coding system. Use of the base coding system implies
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1323 autodetection of the text file convention. (The fact that the -unix,
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1324 -mac, and -dos are derived from a base system results in them showing up
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1325 as ``aliases'' in @code{list-coding-systems}.) These subsidiaries have a
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1326 consistent modeline indicator as well. "-dos" coding systems have ":T"
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1327 appended to their modeline indicator, while "-mac" coding systems have
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1328 ":t" appended (e.g., "ISO8:t" for @code{iso-2022-8-mac}).
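
The naming convention can be checked interactively with a sketch like
the one below.  It assumes that @code{iso-2022-jp} is among the coding
systems defined in your XEmacs; names that are not defined simply show
up as @code{nil} in the result.

@example
;; For each candidate name, keep the name if a coding system of
;; that name exists, and nil otherwise.
(mapcar (lambda (name)
          (and (find-coding-system name) name))
        '(iso-2022-jp
          iso-2022-jp-unix
          iso-2022-jp-mac
          iso-2022-jp-dos))
@end example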

In the following table, each coding system is listed with its modeline
indicator. Non-textual coding systems are listed first, followed by
textual coding systems and their aliases. (The subsidiary modeline
indicators ":T" and ":t" are omitted from the table.)

### SJT 1999-08-23 Maybe should order these by language? Definitely
need language usage for the ISO-8859 family.

Note that although true coding system aliases have been implemented for
XEmacs 21.2, the coding system initialization has not yet been converted
as of 21.2.19. So coding systems described as aliases have the same
properties as the aliased coding system, but will not be equal as Lisp
objects.
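
The caveat above can be checked with a short sketch like the following,
which compares an alias and its target first by identity and then by a
shared property. It assumes only the functions @code{find-coding-system}
and @code{coding-system-type}; results are not shown, since they depend
on whether the initialization has been converted in your version.

@example
;; Sketch only: assumes a MULE-enabled XEmacs build.
;; Identity comparison; per the note above, this may be nil for aliases.
(eq (find-coding-system 'binary)
    (find-coding-system 'raw-text-unix))

;; Comparing a property such as the type does not rely on identity.
(eq (coding-system-type 'binary)
    (coding-system-type 'raw-text-unix))
@end example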

@table @code

@item automatic-conversion
@itemx undecided
@itemx undecided-dos
@itemx undecided-mac
@itemx undecided-unix

Modeline indicator: @code{Auto}. A type @code{undecided} coding system.
Attempts to determine an appropriate coding system from file contents or
the environment.

@item raw-text
@itemx no-conversion
@itemx raw-text-dos
@itemx raw-text-mac
@itemx raw-text-unix
@itemx no-conversion-dos
@itemx no-conversion-mac
@itemx no-conversion-unix

Modeline indicator: @code{Raw}. A type @code{no-conversion} coding system,
which converts only line-break codes. An implementation quirk means
that this coding system is also used for ISO 8859-1.

@item binary
Modeline indicator: @code{Binary}. A type @code{no-conversion} coding
system which does no character coding or EOL conversions. An alias for
@code{raw-text-unix}.

@item alternativnyj
@itemx alternativnyj-dos
@itemx alternativnyj-mac
@itemx alternativnyj-unix

Modeline indicator: @code{Cy.Alt}. A type @code{ccl} coding system used for
Alternativnyj, an encoding of the Cyrillic alphabet.

@item big5
@itemx big5-dos
@itemx big5-mac
@itemx big5-unix

Modeline indicator: @code{Zh/Big5}. A type @code{big5} coding system used for
BIG5, the most common encoding of traditional Chinese as used in Taiwan.

@item cn-gb-2312
@itemx cn-gb-2312-dos
@itemx cn-gb-2312-mac
@itemx cn-gb-2312-unix

Modeline indicator: @code{Zh-GB/EUC}. A type @code{iso2022} coding system used
for simplified Chinese (as used in the People's Republic of China), with
the @code{ascii} (G0), @code{chinese-gb2312} (G1), and @code{sisheng}
(G2) character sets initially designated. Chinese EUC (Extended Unix
Code).

@item ctext-hebrew
@itemx ctext-hebrew-dos
@itemx ctext-hebrew-mac
@itemx ctext-hebrew-unix

Modeline indicator: @code{CText/Hbrw}. A type @code{iso2022} coding system
with the @code{ascii} (G0) and @code{hebrew-iso8859-8} (G1) character
sets initially designated for Hebrew.

@item ctext
@itemx ctext-dos
@itemx ctext-mac
@itemx ctext-unix

Modeline indicator: @code{CText}. A type @code{iso2022} 8-bit coding system
with the @code{ascii} (G0) and @code{latin-iso8859-1} (G1) character
sets initially designated. X11 Compound Text Encoding. Often
mistakenly recognized instead of EUC encodings; the usual cause is an
inappropriate setting of @code{coding-priority-list}.

@item escape-quoted

Modeline indicator: @code{ESC/Quot}. A type @code{iso2022} 8-bit coding
system with the @code{ascii} (G0) and @code{latin-iso8859-1} (G1)
character sets initially designated and escape quoting. Unix EOL
conversion (i.e., no conversion). It is used for @file{.elc} files.

@item euc-jp
@itemx euc-jp-dos
@itemx euc-jp-mac
@itemx euc-jp-unix

Modeline indicator: @code{Ja/EUC}. A type @code{iso2022} 8-bit coding system
with @code{ascii} (G0), @code{japanese-jisx0208} (G1),
@code{katakana-jisx0201} (G2), and @code{japanese-jisx0212} (G3)
initially designated. Japanese EUC (Extended Unix Code).

@item euc-kr
@itemx euc-kr-dos
@itemx euc-kr-mac
@itemx euc-kr-unix

Modeline indicator: @code{ko/EUC}. A type @code{iso2022} 8-bit coding system
with @code{ascii} (G0) and @code{korean-ksc5601} (G1) initially
designated. Korean EUC (Extended Unix Code).

@item hz-gb-2312
Modeline indicator: @code{Zh-GB/Hz}. A type @code{no-conversion} coding
system with Unix EOL convention (i.e., no conversion) using
post-read-decode and pre-write-encode functions to translate the Hz/ZW
coding system used for Chinese.

@item iso-2022-7bit
@itemx iso-2022-7bit-unix
@itemx iso-2022-7bit-dos
@itemx iso-2022-7bit-mac
@itemx iso-2022-7

Modeline indicator: @code{ISO7}. A type @code{iso2022} 7-bit coding system
with @code{ascii} (G0) initially designated. Other character sets must
be explicitly designated to be used.

@item iso-2022-7bit-ss2
@itemx iso-2022-7bit-ss2-dos
@itemx iso-2022-7bit-ss2-mac
@itemx iso-2022-7bit-ss2-unix

Modeline indicator: @code{ISO7/SS}. A type @code{iso2022} 7-bit coding system
with @code{ascii} (G0) initially designated. Other character sets must
be explicitly designated to be used. SS2 is used to invoke a
96-charset, one character at a time.

@item iso-2022-8
@itemx iso-2022-8-dos
@itemx iso-2022-8-mac
@itemx iso-2022-8-unix

Modeline indicator: @code{ISO8}. A type @code{iso2022} 8-bit coding system
with @code{ascii} (G0) and @code{latin-iso8859-1} (G1) initially
designated. Other character sets must be explicitly designated to be
used. No single-shift or locking-shift.

@item iso-2022-8bit-ss2
@itemx iso-2022-8bit-ss2-dos
@itemx iso-2022-8bit-ss2-mac
@itemx iso-2022-8bit-ss2-unix

Modeline indicator: @code{ISO8/SS}. A type @code{iso2022} 8-bit coding system
with @code{ascii} (G0) and @code{latin-iso8859-1} (G1) initially
designated. Other character sets must be explicitly designated to be
used. SS2 is used to invoke a 96-charset, one character at a time.

@item iso-2022-int-1
@itemx iso-2022-int-1-dos
@itemx iso-2022-int-1-mac
@itemx iso-2022-int-1-unix

Modeline indicator: @code{INT-1}. A type @code{iso2022} 7-bit coding system
with @code{ascii} (G0) and @code{korean-ksc5601} (G1) initially
designated. ISO-2022-INT-1.

@item iso-2022-jp-1978-irv
@itemx iso-2022-jp-1978-irv-dos
@itemx iso-2022-jp-1978-irv-mac
@itemx iso-2022-jp-1978-irv-unix

Modeline indicator: @code{Ja-78/7bit}. A type @code{iso2022} 7-bit coding
system. For compatibility with old Japanese terminals; if you need to
know, look at the source.

@item iso-2022-jp
@itemx iso-2022-jp-2
@itemx iso-2022-jp-dos
@itemx iso-2022-jp-mac
@itemx iso-2022-jp-unix
@itemx iso-2022-jp-2-dos
@itemx iso-2022-jp-2-mac
@itemx iso-2022-jp-2-unix

Modeline indicator: @code{MULE/7bit}. A type @code{iso2022} 7-bit coding
system with @code{ascii} (G0) initially designated, and complex
specifications to ensure backward compatibility with old Japanese
systems. Used for communication with mail and news in Japan. The "-2"
versions (modeline indicator @code{ISO7/SS}) also use SS2 to invoke a
96-charset one character at a time.

@item iso-2022-kr
Modeline indicator: @code{Ko/7bit}. A type @code{iso2022} 7-bit coding
system with @code{ascii} (G0) and @code{korean-ksc5601} (G1) initially
designated. Used for e-mail in Korea.

@item iso-2022-lock
@itemx iso-2022-lock-dos
@itemx iso-2022-lock-mac
@itemx iso-2022-lock-unix

Modeline indicator: @code{ISO7/Lock}. A type @code{iso2022} 7-bit coding
system with @code{ascii} (G0) initially designated, using Locking-Shift
to invoke a 96-charset.

@item iso-8859-1
@itemx iso-8859-1-dos
@itemx iso-8859-1-mac
@itemx iso-8859-1-unix

Due to an implementation quirk, this is not a type @code{iso2022} coding
system, but rather an alias for the @code{raw-text} coding system.

@item iso-8859-2
@itemx iso-8859-2-dos
@itemx iso-8859-2-mac
@itemx iso-8859-2-unix

Modeline indicator: @code{MIME/Ltn-2}. A type @code{iso2022} coding
system with @code{ascii} (G0) and @code{latin-iso8859-2} (G1) initially
invoked.

@item iso-8859-3
@itemx iso-8859-3-dos
@itemx iso-8859-3-mac
@itemx iso-8859-3-unix

Modeline indicator: @code{MIME/Ltn-3}. A type @code{iso2022} coding system
with @code{ascii} (G0) and @code{latin-iso8859-3} (G1) initially
invoked.

@item iso-8859-4
@itemx iso-8859-4-dos
@itemx iso-8859-4-mac
@itemx iso-8859-4-unix

Modeline indicator: @code{MIME/Ltn-4}. A type @code{iso2022} coding system
with @code{ascii} (G0) and @code{latin-iso8859-4} (G1) initially
invoked.

@item iso-8859-5
@itemx iso-8859-5-dos
@itemx iso-8859-5-mac
@itemx iso-8859-5-unix

Modeline indicator: @code{ISO8/Cyr}. A type @code{iso2022} coding system with
@code{ascii} (G0) and @code{cyrillic-iso8859-5} (G1) initially invoked.

@item iso-8859-7
@itemx iso-8859-7-dos
@itemx iso-8859-7-mac
@itemx iso-8859-7-unix

Modeline indicator: @code{Grk}. A type @code{iso2022} coding system with
@code{ascii} (G0) and @code{greek-iso8859-7} (G1) initially invoked.

@item iso-8859-8
@itemx iso-8859-8-dos
@itemx iso-8859-8-mac
@itemx iso-8859-8-unix

Modeline indicator: @code{MIME/Hbrw}. A type @code{iso2022} coding system with
@code{ascii} (G0) and @code{hebrew-iso8859-8} (G1) initially invoked.

@item iso-8859-9
@itemx iso-8859-9-dos
@itemx iso-8859-9-mac
@itemx iso-8859-9-unix

Modeline indicator: @code{MIME/Ltn-5}. A type @code{iso2022} coding system
with @code{ascii} (G0) and @code{latin-iso8859-9} (G1) initially
invoked.

@item koi8-r
@itemx koi8-r-dos
@itemx koi8-r-mac
@itemx koi8-r-unix

Modeline indicator: @code{KOI8}. A type @code{ccl} coding system used for
KOI8-R, an encoding of the Cyrillic alphabet.

@item shift_jis
@itemx shift_jis-dos
@itemx shift_jis-mac
@itemx shift_jis-unix

Modeline indicator: @code{Ja/SJIS}. A type @code{shift-jis} coding system
implementing the Shift-JIS encoding for Japanese. The underscore is used
to conform to the name of the MIME charset implementing this encoding.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1625
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1626 @item tis-620
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1627 @itemx tis-620-dos
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1628 @itemx tis-620-mac
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1629 @itemx tis-620-unix
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1630
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1631 Modeline indicator: @code{TIS620}. A type @code{ccl} encoding for Thai. The
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1632 external encoding is defined by TIS620; the internal encoding is
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1633 peculiar to MULE, and called @code{thai-xtis}.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1634
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1635 @item viqr
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1636
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1637 Modeline indicator: @code{VIQR}. A type @code{no-conversion} coding
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1638 system with Unix EOL convention (i.e., no conversion) using
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1639 post-read-decode and pre-write-encode functions to translate the VIQR
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1640 coding system for Vietnamese.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1641
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1642 @item viscii
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1643 @itemx viscii-dos
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1644 @itemx viscii-mac
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1645 @itemx viscii-unix
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1646
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1647 Modeline indicator: @code{VISCII}. A type @code{ccl} coding-system used
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1648 for VISCII 1.1 for Vietnamese. Differs slightly from VSCII; VISCII is
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1649 given priority by XEmacs.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1650
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1651 @item vscii
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1652 @itemx vscii-dos
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1653 @itemx vscii-mac
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1654 @itemx vscii-unix
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1655
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1656 Modeline indicator: @code{VSCII}. A type @code{ccl} coding-system used
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1657 for VSCII 1.1 for Vietnamese. Differs slightly from VISCII, which is
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1658 given priority by XEmacs. Use
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1659 @code{(prefer-coding-system 'vietnamese-vscii)} to give priority to VSCII.
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1660
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1661 @end table
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1662
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1663 @node CCL, Category Tables, Coding Systems, MULE
0
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1664 @section CCL
376386a54a3c Import from CVS: tag r19-14
cvs
parents:
diff changeset
1665
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1666 CCL (Code Conversion Language) is a simple structured programming
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1667 language designed for character coding conversions. A CCL program is
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1668 compiled to CCL code (represented by a vector of integers) and executed
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1669 by the CCL interpreter embedded in Emacs. The CCL interpreter
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1670 implements a virtual machine with 8 registers called @code{r0}, ...,
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1671 @code{r7}, a number of control structures, and some I/O operators. Take
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1672 care when using registers @code{r0} (used in implicit @dfn{set}
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1673 statements) and especially @code{r7} (used internally by several
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1674 statements and operations, especially for multiple return values and I/O
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1675 operations).
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1676
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1677 CCL is used for code conversion during process I/O and file I/O for
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1678 non-ISO2022 coding systems. (It is the only way for a user to specify a
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1679 code conversion function.) It is also used for calculating the code
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1680 point of an X11 font from a character code. However, since CCL is
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1681 designed as a powerful programming language, it can be used for more
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1682 generic calculation where efficiency is demanded. A combination of
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1683 three or more arithmetic operations can be calculated faster by CCL than
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1684 by Emacs Lisp.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1685
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1686 @strong{Warning:} The code in @file{src/mule-ccl.c} and
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1687 @file{$packages/lisp/mule-base/mule-ccl.el} is the definitive
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1688 description of CCL's semantics. The previous version of this section
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1689 contained several typos and obsolete names left from earlier versions of
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1690 MULE, and many may remain. (I am not an experienced CCL programmer; the
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1691 few who know CCL well find writing English painful.)
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1692
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1693 A CCL program transforms an input data stream into an output data
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1694 stream. The input stream, held in a buffer of constant bytes, is left
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1695 unchanged. The buffer may be filled by an external input operation,
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1696 taken from an Emacs buffer, or taken from a Lisp string. The output
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1697 buffer is a dynamic array of bytes, which can be written by an external
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1698 output operation, inserted into an Emacs buffer, or returned as a Lisp
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1699 string.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1700
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1701 A CCL program is a (Lisp) list containing two or three members. The
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1702 first member is the @dfn{buffer magnification}, which indicates the
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1703 required minimum size of the output buffer as a multiple of the input
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1704 buffer. It is followed by the @dfn{main block} which executes while
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1705 there is input remaining, and an optional @dfn{EOF block} which is
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1706 executed when the input is exhausted. Both the main block and the EOF
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1707 block are CCL blocks.
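
As an illustration of this layout, the following untested sketch shows a
complete CCL program that copies its input to its output unchanged.  The
buffer magnification of 1 and the trailing newline written by the EOF
block are choices made for this example only.

@example
(1                    ; buffer magnification: output <= 1 x input
 (loop                ; main block (a single loop statement)
  (read r0)           ; fetch one byte; at end of input the main
                      ;   block stops and the EOF block runs
  (write r0)          ; copy the byte to the output
  (repeat))           ; jump back to the head of the loop
 (write "\n"))        ; optional EOF block
@end example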
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1708
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1709 A @dfn{CCL block} is either a CCL statement or list of CCL statements.
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1710 A @dfn{CCL statement} is either a @dfn{set statement} (either an integer
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1711 or an @dfn{assignment}, which is a list of a register to receive the
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1712 assignment, an assignment operator, and an expression) or a @dfn{control
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1713 statement} (a list starting with a keyword, whose allowable syntax
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1714 depends on the keyword).
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1715
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1716 @menu
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1717 * CCL Syntax:: CCL program syntax in BNF notation.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1718 * CCL Statements:: Semantics of CCL statements.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1719 * CCL Expressions:: Operators and expressions in CCL.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1720 * Calling CCL:: Running CCL programs.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1721 * CCL Examples:: The encoding functions for Big5 and KOI-8.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1722 @end menu
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1723
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1724 @node CCL Syntax, CCL Statements, , CCL
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1725 @comment Node, Next, Previous, Up
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1726 @subsection CCL Syntax
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1727
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1728 The full syntax of a CCL program in BNF notation:
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1729
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1730 @format
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1731 CCL_PROGRAM :=
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1732 (BUFFER_MAGNIFICATION
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1733 CCL_MAIN_BLOCK
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1734 [ CCL_EOF_BLOCK ])
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1735
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1736 BUFFER_MAGNIFICATION := integer
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1737 CCL_MAIN_BLOCK := CCL_BLOCK
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1738 CCL_EOF_BLOCK := CCL_BLOCK
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1739
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1740 CCL_BLOCK :=
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1741 STATEMENT | (STATEMENT [STATEMENT ...])
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1742 STATEMENT :=
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1743 SET | IF | BRANCH | LOOP | REPEAT | BREAK | READ | WRITE
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1744 | CALL | END
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1745
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1746 SET :=
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1747 (REG = EXPRESSION)
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1748 | (REG ASSIGNMENT_OPERATOR EXPRESSION)
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1749 | integer
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1750
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1751 EXPRESSION := ARG | (EXPRESSION OPERATOR ARG)
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1752
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1753 IF := (if EXPRESSION CCL_BLOCK [CCL_BLOCK])
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1754 BRANCH := (branch EXPRESSION CCL_BLOCK [CCL_BLOCK ...])
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1755 LOOP := (loop STATEMENT [STATEMENT ...])
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1756 BREAK := (break)
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1757 REPEAT :=
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1758 (repeat)
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1759 | (write-repeat [REG | integer | string])
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1760 | (write-read-repeat REG [integer | ARRAY])
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1761 READ :=
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1762 (read REG ...)
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1763 | (read-if (REG OPERATOR ARG) CCL_BLOCK [CCL_BLOCK])
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1764 | (read-branch REG CCL_BLOCK [CCL_BLOCK ...])
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1765 WRITE :=
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1766 (write REG ...)
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1767 | (write EXPRESSION)
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1768 | (write integer) | (write string) | (write REG ARRAY)
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1769 | string
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1770 CALL := (call ccl-program-name)
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1771 END := (end)
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1772
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1773 REG := r0 | r1 | r2 | r3 | r4 | r5 | r6 | r7
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1774 ARG := REG | integer
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1775 OPERATOR :=
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1776 + | - | * | / | % | & | '|' | ^ | << | >> | <8 | >8 | //
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1777 | < | > | == | <= | >= | != | de-sjis | en-sjis
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1778 ASSIGNMENT_OPERATOR :=
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1779 += | -= | *= | /= | %= | &= | '|=' | ^= | <<= | >>=
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1780 ARRAY := '[' integer ... ']'
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1781 @end format
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1782
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1783 @node CCL Statements, CCL Expressions, CCL Syntax, CCL
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1784 @comment Node, Next, Previous, Up
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1785 @subsection CCL Statements
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1786
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1787 The Emacs Code Conversion Language provides the following statement
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1788 types: @dfn{set}, @dfn{if}, @dfn{branch}, @dfn{loop}, @dfn{repeat},
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1789 @dfn{break}, @dfn{read}, @dfn{write}, @dfn{call}, and @dfn{end}.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1790
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1791 @heading Set statement:
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1792
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1793 The @dfn{set} statement has three variants with the syntaxes
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1794 @samp{(@var{reg} = @var{expression})},
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1795 @samp{(@var{reg} @var{assignment_operator} @var{expression})}, and
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1796 @samp{@var{integer}}. The assignment operator variation of the
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1797 @dfn{set} statement works the same way as the corresponding C expression
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1798 statement does. The assignment operators are @code{+=}, @code{-=},
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1799 @code{*=}, @code{/=}, @code{%=}, @code{&=}, @code{|=}, @code{^=},
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1800 @code{<<=}, and @code{>>=}, and they have the same meanings as in C. A
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1801 ``naked integer'' @var{integer} is equivalent to a @dfn{set} statement of
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1802 the form @code{(r0 = @var{integer})}.
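
The three variants can be sketched as a CCL block like the one below.
The register choices and constants are arbitrary and the fragment is
untested; it is meant only to show the surface syntax.

@example
((r1 = 65)            ; plain assignment
 (r1 += 1)            ; like r1 += 1 in C
 (r2 = (r1 & 127))    ; assignment from a compound expression
 0)                   ; naked integer, same as (r0 = 0)
@end example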
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1803
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1804 @heading I/O statements:
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1805
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1806 The @dfn{read} statement takes one or more registers as arguments. It
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1807 reads one byte (a C char) from the input into each register in turn.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1808
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1809 The @dfn{write} statement takes several forms. In the form @samp{(write @var{reg}
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1810 ...)} it takes one or more registers as arguments and writes each in
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1811 turn to the output. The integer in a register (interpreted as an
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1812 Emchar) is encoded to multibyte form (i.e., Bufbytes) and written to the
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1813 current output buffer. If it is less than 256, it is written as is.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1814 The forms @samp{(write @var{expression})} and @samp{(write
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1815 @var{integer})} are treated analogously. The form @samp{(write
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1816 @var{string})} writes the constant string to the output. A
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1817 ``naked string'' @samp{@var{string}} is equivalent to the statement @samp{(write
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1818 @var{string})}. The form @samp{(write @var{reg} @var{array})} writes
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1819 the @var{reg}th element of the @var{array} to the output.
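
A short, untested sketch of the I/O forms described above follows.  The
register assignments and the array contents are assumptions made for
illustration.

@example
((read r0 r1)            ; r0 gets the next byte, r1 the one after
 (write r1 r0)           ; emit the two values, swapped
 (write (r0 & 127))      ; emit the value of an expression
 (write "--")            ; emit a constant string
 "--"                    ; naked string, same as the line above
 (r2 = 1)
 (write r2 [97 98 99]))  ; emit the array element selected by r2
@end example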
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1820
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1821 @heading Conditional statements:
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1822
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1823 The @dfn{if} statement takes an @var{expression}, a @var{CCL block}, and
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1824 an optional @var{second CCL block} as arguments. If the
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1825 @var{expression} evaluates to non-zero, the first @var{CCL block} is
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1826 executed. Otherwise, if there is a @var{second CCL block}, it is
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1827 executed.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1828
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1829 The @dfn{read-if} variant of the @dfn{if} statement takes an
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1830 @var{expression}, a @var{CCL block}, and an optional @var{second CCL
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1831 block} as arguments. The @var{expression} must have the form
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1832 @code{(@var{reg} @var{operator} @var{operand})} (where @var{operand} is
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1833 a register or an integer). The @code{read-if} statement first reads
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1834 from the input into the first register operand in the @var{expression},
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1835 then conditionally executes a CCL block just as the @code{if} statement
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1836 does.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1837
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1838 The @dfn{branch} statement takes an @var{expression} and one or more CCL
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1839 blocks as arguments. The CCL blocks are treated as a zero-indexed
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1840 array, and the @code{branch} statement uses the @var{expression} as the
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1841 index of the CCL block to execute. Null CCL blocks may be used as
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1842 no-ops, continuing execution with the statement following the
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1843 @code{branch} statement in the containing CCL block. Out-of-range
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1844 values for the @var{expression} are also treated as no-ops.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1845
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1846 The @dfn{read-branch} variant of the @dfn{branch} statement takes a
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1847 @var{register} and one or more CCL blocks as arguments. The
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1848 @code{read-branch} statement first reads from
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1849 the input into the @var{register}, then conditionally executes a CCL
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1850 block just as the @code{branch} statement does.
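
The conditional statements might be combined as in the following
untested sketch.  The thresholds and output strings are invented for the
example, and the use of @code{nil} to write a null CCL block is an
assumption about how the compiler accepts empty blocks.

@example
((read r0)
 (if (r0 < 128)
     (write r0)          ; run when the test is non-zero
   (write 63))           ; optional second block: a question mark
 (r1 = (r0 >> 6))        ; r1 is 0..3 for a one-byte value
 (branch r1
   (write "0")           ; chosen when r1 is 0
   (write "1")           ; chosen when r1 is 1
   nil                   ; r1 is 2: null block, a no-op
   (write "3")))         ; chosen when r1 is 3
@end example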
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1851
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1852 @heading Loop control statements:
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1853
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1854 The @dfn{loop} statement creates a block with an implied jump from the
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1855 end of the block back to its head. The loop is exited on a @code{break}
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1856 statement, and continued without executing the tail by a @code{repeat}
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1857 statement.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1858
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1859 The @dfn{break} statement, written @samp{(break)}, terminates the
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1860 current loop and continues with the next statement in the current
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1861 block.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1862
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1863 The @dfn{repeat} statement has three variants, @code{repeat},
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1864 @code{write-repeat}, and @code{write-read-repeat}. Each continues the
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1865 current loop from its head, possibly after performing I/O.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1866 @code{repeat} takes no arguments and does no I/O before jumping.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1867 @code{write-repeat} takes a single argument (a register, an
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1868 integer, or a string), writes it to the output, then jumps.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1869 @code{write-read-repeat} takes one or two arguments. The first must
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1870 be a register. The second may be an integer or an array; if absent, it
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1871 is implicitly set to the first (register) argument.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1872 @code{write-read-repeat} writes its second argument to the output, then
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1873 reads from the input into the register, and finally jumps. See the
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1874 @code{write} and @code{read} statements for the semantics of the I/O
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1875 operations for each type of argument.
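
As an untested sketch of these loop controls, the block below copies
input to output while dropping every byte equal to 13 (a carriage
return).  The filtering rule is chosen only for illustration.

@example
((read r0)                  ; prime r0 with the first byte
 (loop
  (if (r0 == 13)
      ((read r0)            ; skip this byte: fetch the next one
       (repeat)))           ;   and restart the loop
  (write-read-repeat r0)))  ; write r0, read the next byte, jump
@end example

When the input runs out, the pending read ends the main block, so no
explicit @code{break} is needed here.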
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1876
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1877 @heading Other control statements:
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1878
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1879 The @dfn{call} statement, written @samp{(call @var{ccl-program-name})},
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1880 executes a CCL program as a subroutine. It does not return a value to
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1881 the caller, but can modify the register status.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1882
404
2f8bb876ab1d Import from CVS: tag r21-2-32
cvs
parents: 398
diff changeset
1883 The @dfn{end} statement, written @samp{(end)}, terminates the CCL
398
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1884 program successfully, and returns to the caller (which may be a CCL
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1885 program). It does not alter the status of the registers.
74fd4e045ea6 Import from CVS: tag r21-2-29
cvs
parents: 371
diff changeset
1886
@node CCL Expressions, Calling CCL, CCL Statements, CCL
@comment Node, Next, Previous, Up
@subsection CCL Expressions

CCL, unlike Lisp, uses infix expressions. The simplest CCL expressions
consist of a single @var{operand}, either a register (one of @code{r0},
..., @code{r7}) or an integer. Complex expressions are lists of the
form @code{( @var{expression} @var{operator} @var{operand} )}. Unlike
C, assignments are not expressions.

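As an illustration only, the following Python sketch models how a nested
infix expression such as @code{((r1 << 8) | r2)} is evaluated: the left
side may itself be an expression, while the right side is always a single
operand. The function names, the tuple encoding, and the register list are
assumptions made for this sketch and are not part of CCL or XEmacs; only a
few operators are modeled, and division is left out because its rounding
may differ between C and Python.

```python
import operator

# Hypothetical model: registers r0..r7 are held in a list of eight integers.
OPS = dict([
    ("+", operator.add),
    ("-", operator.sub),
    ("*", operator.mul),
    ("&", operator.and_),
    ("|", operator.or_),
    ("^", operator.xor),
    ("<<", operator.lshift),
    (">>", operator.rshift),
])


def operand_value(operand, regs):
    """Return the value of a single operand: a register name or an integer."""
    if isinstance(operand, int):
        return operand
    # Register names are modeled as the strings "r0" .. "r7".
    return regs[int(operand[1:])]


def eval_ccl_expr(expr, regs):
    """Evaluate (expression operator operand); the left side may nest."""
    if not isinstance(expr, tuple):
        return operand_value(expr, regs)
    left, op, right = expr
    return OPS[op](eval_ccl_expr(left, regs), operand_value(right, regs))


regs = [0, 0x12, 0x34, 0, 0, 0, 0, 0]
# Models the CCL expression ((r1 << 8) | r2).
result = eval_ccl_expr((("r1", "<<", 8), "|", "r2"), regs)
print(hex(result))  # 0x1234
```

Because the right-hand side is never an expression, deeper nesting is only
possible on the left, which is what the recursive call reflects.
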
In the following table, @var{X} is the target register for a @dfn{set}.
In subexpressions, this is implicitly @code{r7}. This means that
@code{>8}, @code{//}, @code{de-sjis}, and @code{en-sjis} cannot be used
freely in subexpressions, since they return parts of their values in
@code{r7}. @var{Y} may be an expression, register, or integer, while
@var{Z} must be a register or an integer.

@multitable @columnfractions .22 .14 .09 .55
@item Name @tab Operator @tab Code @tab C-like Description
@item CCL_PLUS @tab @code{+} @tab 0x00 @tab X = Y + Z
@item CCL_MINUS @tab @code{-} @tab 0x01 @tab X = Y - Z
@item CCL_MUL @tab @code{*} @tab 0x02 @tab X = Y * Z
@item CCL_DIV @tab @code{/} @tab 0x03 @tab X = Y / Z
@item CCL_MOD @tab @code{%} @tab 0x04 @tab X = Y % Z
@item CCL_AND @tab @code{&} @tab 0x05 @tab X = Y & Z
@item CCL_OR @tab @code{|} @tab 0x06 @tab X = Y | Z
@item CCL_XOR @tab @code{^} @tab 0x07 @tab X = Y ^ Z
@item CCL_LSH @tab @code{<<} @tab 0x08 @tab X = Y << Z
@item CCL_RSH @tab @code{>>} @tab 0x09 @tab X = Y >> Z
@item CCL_LSH8 @tab @code{<8} @tab 0x0A @tab X = (Y << 8) | Z
@item CCL_RSH8 @tab @code{>8} @tab 0x0B @tab X = Y >> 8, r[7] = Y & 0xFF
@item CCL_DIVMOD @tab @code{//} @tab 0x0C @tab X = Y / Z, r[7] = Y % Z
@item CCL_LS @tab @code{<} @tab 0x10 @tab X = (Y < Z)
@item CCL_GT @tab @code{>} @tab 0x11 @tab X = (Y > Z)
@item CCL_EQ @tab @code{==} @tab 0x12 @tab X = (Y == Z)
@item CCL_LE @tab @code{<=} @tab 0x13 @tab X = (Y <= Z)
@item CCL_GE @tab @code{>=} @tab 0x14 @tab X = (Y >= Z)
@item CCL_NE @tab @code{!=} @tab 0x15 @tab X = (Y != Z)
@item CCL_ENCODE_SJIS @tab @code{en-sjis} @tab 0x16 @tab X = HIGHER_BYTE (SJIS (Y, Z))
@item @tab @tab @tab r[7] = LOWER_BYTE (SJIS (Y, Z))
@item CCL_DECODE_SJIS @tab @code{de-sjis} @tab 0x17 @tab X = HIGHER_BYTE (DE-SJIS (Y, Z))
@item @tab @tab @tab r[7] = LOWER_BYTE (DE-SJIS (Y, Z))
@end multitable

The CCL operators are as in C, with the addition of CCL_LSH8, CCL_RSH8,
CCL_DIVMOD, CCL_ENCODE_SJIS, and CCL_DECODE_SJIS. CCL_ENCODE_SJIS
and CCL_DECODE_SJIS treat their first and second operands as the high and
low bytes of a two-byte character code. (SJIS stands for Shift JIS, an
encoding of Japanese characters used by Microsoft. CCL_ENCODE_SJIS is a
complicated transformation of the Japanese standard JIS encoding to
Shift JIS. CCL_DECODE_SJIS is its inverse.) It is somewhat odd to
represent the SJIS operations in infix form.

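The three arithmetic additions can be modeled directly from their C-like
descriptions in the table. The Python sketch below is only an illustration:
the function names and the register list are assumptions, operands are
assumed to be non-negative, and the SJIS operators are omitted because
their transformation is not specified here.

```python
def lsh8(y, z):
    """Model of <8: X = (Y << 8) | Z."""
    return (y << 8) | z


def rsh8(regs, y):
    """Model of >8: X = Y >> 8, with the low byte left in r7."""
    regs[7] = y & 0xFF
    return y >> 8


def divmod_op(regs, y, z):
    """Model of //: X = Y / Z, with the remainder left in r7."""
    quotient, remainder = divmod(y, z)
    regs[7] = remainder
    return quotient


regs = [0] * 8
regs[0] = lsh8(0x12, 0x34)        # r0 = 0x1234
regs[1] = rsh8(regs, regs[0])     # r1 = 0x12, r7 = 0x34
low_byte = regs[7]                # save r7 before it is overwritten
regs[2] = divmod_op(regs, 17, 5)  # r2 = 3, r7 = 2
print(regs[0], regs[1], low_byte, regs[2], regs[7])
```

The second half of each result lands in @code{r7}, so a later use of
@code{>8} or @code{//} silently overwrites the earlier one unless the value
is copied out first.
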
@node Calling CCL, CCL Examples, CCL Expressions, CCL
@comment Node, Next, Previous, Up
@subsection Calling CCL

CCL programs are called automatically during Emacs buffer I/O when the
external representation has a coding system type of @code{shift-jis},
@code{big5}, or @code{ccl}. The program is specified by the coding
system (@pxref{Coding Systems}). You can also call CCL programs from
other CCL programs, and from Lisp using these functions:

@defun ccl-execute ccl-program status
Execute @var{ccl-program} with registers initialized by
@var{status}. @var{ccl-program} is a vector of compiled CCL code
created by @code{ccl-compile}. It is an error for the program to try to
execute a CCL I/O command. @var{status} must be a vector of nine
values, specifying the initial value for the R0, R1 .. R7 registers and
for the instruction counter IC. A @code{nil} value for a register
initializer causes the register to be set to 0. A @code{nil} value for
the IC initializer causes execution to start at the beginning of the
program. When the program is done, @var{status} is modified (by
side-effect) to contain the ending values for the corresponding
registers and IC.
@end defun

@defun ccl-execute-on-string ccl-program status str &optional continue
Execute @var{ccl-program} with initial @var{status} on
@var{str}. @var{ccl-program} is a vector of compiled CCL code
created by @code{ccl-compile}. @var{status} must be a vector of nine
values, specifying the initial value for the R0, R1 .. R7 registers and
for the instruction counter IC. A @code{nil} value for a register
initializer causes the register to be set to 0. A @code{nil} value for
the IC initializer causes execution to start at the beginning of the
program. An optional fourth argument @var{continue}, if
non-@code{nil}, causes the IC to remain on the unsatisfied read
operation if the program terminates due to exhaustion of the input
buffer. Otherwise the IC is set to the end of the program. When the
program is done, @var{status} is modified (by side-effect) to contain
the ending values for the corresponding registers and IC. Returns the
resulting string.
@end defun

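The effect of @var{continue} on the instruction counter can be pictured
with a toy model. The Python sketch below is not the CCL interpreter: the
step list, the function name, and the @code{keep_position} parameter
(standing in for @var{continue}, which is a reserved word in Python) are
all assumptions. It also assumes that a later call with the same status
vector resumes at the saved IC, which the description above suggests but
does not state.

```python
def run_on_string(program, status, text, keep_position=False):
    """Toy model of running PROGRAM on TEXT; returns the output string."""
    # None models nil: 0 for a register, start of program for the IC.
    regs = [0 if value is None else value for value in status[:8]]
    ic = 0 if status[8] is None else status[8]
    pending = list(text)
    output = []
    while ic < len(program):
        step = program[ic]
        if step == "read":
            if not pending:
                # Input exhausted: stay on this read, or jump to the end.
                if not keep_position:
                    ic = len(program)
                break
            regs[0] = ord(pending.pop(0))
        elif step == "write":
            output.append(chr(regs[0]))
        ic += 1
    # The caller's vector is updated in place with the ending values.
    status[:8] = regs
    status[8] = ic
    return "".join(output)


program = ["read", "write", "read", "write"]

status = [None] * 9
first = run_on_string(program, status, "a", keep_position=True)
saved_ic = status[8]   # still on the second, unsatisfied read
second = run_on_string(program, status, "b", keep_position=True)

other = [None] * 9
run_on_string(program, other, "a")   # without it, the IC ends at the end
print(first, saved_ic, second, status[8], other[8])
```
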
To call a CCL program from another CCL program, it must first be
registered:

@defun register-ccl-program name ccl-program
Register @var{name} for CCL program @var{ccl-program} in
@code{ccl-program-table}. @var{ccl-program} should be the compiled form
of a CCL program, or @code{nil}. Returns the index number of the
registered CCL program.
@end defun

Information about the processor time used by the CCL interpreter can be
obtained using these functions:

@defun ccl-elapsed-time
Returns the elapsed processor time of the CCL interpreter as a cons of
user and system time, as floating point numbers measured in seconds.
If only one overall value can be determined, the return value will be a
cons of that value and 0.
@end defun

@defun ccl-reset-elapsed-time
Resets the CCL interpreter's internal elapsed time registers.
@end defun

@node CCL Examples, , Calling CCL, CCL
@comment Node, Next, Previous, Up
@subsection CCL Examples

This section is not yet written.

@node Category Tables, , CCL, MULE
@section Category Tables

A category table is a type of char table used for keeping track of
categories. Categories are used for classifying characters for use in
regexps---you can refer to a category rather than having to use a
complicated [] expression (and category lookups are significantly
faster).

There are 95 different categories available, one for each printable
character (including space) in the ASCII charset. Each category is
designated by one such character, called a @dfn{category designator}.
They are specified in a regexp using the syntax @samp{\cX}, where X is a
category designator. (This is not yet implemented.)

A category table specifies, for each character, the categories that
the character is in. Note that a character can be in more than one
category. More specifically, a category table maps from a character to
either the value @code{nil} (meaning the character is in no categories)
or a 95-element bit vector, specifying for each of the 95 categories
whether the character is in that category.

Special Lisp functions are provided that abstract this, so you do not
have to directly manipulate bit vectors.

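To make the representation concrete, here is a small Python model of the
mapping described above. It is a sketch only: the function names are
hypothetical, a Python dictionary stands in for the char table, and the
layout of the bit vector (designator code minus the code of space) is an
assumption, not something this manual specifies.

```python
NUM_CATEGORIES = 95            # one per printable ASCII character
FIRST_DESIGNATOR = ord(" ")    # space is the first designator


def bit_index(designator):
    """Map a category designator to a bit position (assumed layout)."""
    index = ord(designator) - FIRST_DESIGNATOR
    if not 0 <= index < NUM_CATEGORIES:
        raise ValueError("not a category designator")
    return index


def add_to_category(table, char, designator):
    """Record that CHAR belongs to the category named by DESIGNATOR."""
    bits = table.get(char)
    if bits is None:
        # None models nil: the character was in no categories so far.
        bits = [0] * NUM_CATEGORIES
        table[char] = bits
    bits[bit_index(designator)] = 1


def in_category(table, char, designator):
    """Return True if CHAR is in the category named by DESIGNATOR."""
    bits = table.get(char)
    return bits is not None and bits[bit_index(designator)] == 1


table = dict()
add_to_category(table, "e", "l")   # one character ...
add_to_category(table, "e", "v")   # ... in two categories
print(in_category(table, "e", "v"), in_category(table, "x", "v"))
```
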
@defun category-table-p obj
This function returns @code{t} if @var{obj} is a category table.
@end defun

@defun category-table &optional buffer
This function returns the current category table. This is the one
specified by the current buffer, or by @var{buffer} if it is
non-@code{nil}.
@end defun

@defun standard-category-table
This function returns the standard category table. This is the one used
for new buffers.
@end defun

@defun copy-category-table &optional table
This function constructs a new category table and returns it. It is a
copy of @var{table}, which defaults to the standard category table.
@end defun

@defun set-category-table table &optional buffer
This function selects @var{table} as the new category table for
@var{buffer}. @var{table} must be a category table. @var{buffer}
defaults to the current buffer if omitted.
@end defun

@defun category-designator-p obj
This function returns @code{t} if @var{obj} is a category designator (a
char in the range @samp{' '} to @samp{'~'}).
@end defun

@defun category-table-value-p obj
This function returns @code{t} if @var{obj} is a category table value.
Valid values are @code{nil} or a bit vector of size 95.
@end defun

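The two predicates above amount to simple range and shape checks. The
Python sketch below mirrors their stated conditions; the function names
are hypothetical, @code{None} stands in for @code{nil}, and a list of 0s
and 1s stands in for a bit vector.

```python
def is_category_designator(obj):
    """Model of the designator test: one character from space to tilde."""
    return isinstance(obj, str) and len(obj) == 1 and " " <= obj <= "~"


def is_category_table_value(obj):
    """Model of the value test: None (for nil) or a 95-element bit vector."""
    if obj is None:
        return True
    return (
        isinstance(obj, list)
        and len(obj) == 95
        and all(bit in (0, 1) for bit in obj)
    )


print(is_category_designator("a"), is_category_designator("\n"))
print(is_category_table_value(None), is_category_table_value([0] * 94))
```
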