xemacs-beta: man/lispref/mule.texi annotate

annotate man/lispref/mule.texi @ 70:131b0175ea99 r20-0b30

Import from CVS: tag r20-0b30

author	cvs
date	Mon, 13 Aug 2007 09:02:59 +0200
parents	05472e90ae02
children	8619ce7e4c50

rev	line source
0 376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1 @c --texinfo--
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	2 @c This is part of the XEmacs Lisp Reference Manual.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	3 @c Copyright (C) 1996 Ben Wing.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	4 @c See the file lispref.texi for copying conditions.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	5 @setfilename ../../info/internationalization.info
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	6 @node MULE, Tips, Internationalization, top
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	7 @chapter MULE
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	8
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	9 @dfn{MULE} is the name originally given to the version of GNU Emacs
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	10 extended for multi-lingual (and in particular Asian-language) support.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	11 ``MULE'' is short for ``MUlti-Lingual Emacs''. It was originally called
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	12 Nemacs (``Nihon Emacs'' where ``Nihon'' is the Japanese word for
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	13 ``Japan''), when it only provided support for Japanese. XEmacs
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	14 refers to its multi-lingual support as @dfn{MULE support} since it
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	15 is based on @dfn{MULE}.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	16
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	17 @menu
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	18 * Internationalization Terminology::
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	19 Definition of various internationalization terms.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	20 * Charsets:: Sets of related characters.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	21 * MULE Characters:: Working with characters in XEmacs/MULE.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	22 * Composite Characters:: Making new characters by overstriking other ones.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	23 * ISO 2022:: An international standard for charsets and encodings.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	24 * Coding Systems:: Ways of representing a string of chars using integers.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	25 * CCL:: A special language for writing fast converters.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	26 * Category Tables:: Subdividing charsets into groups.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	27 @end menu
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	28
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	29 @node Internationalization Terminology
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	30 @section Internationalization Terminology
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	31
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	32 In internationalization terminology, a string of text is divided up
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	33 into @dfn{characters}, which are the printable units that make up the
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	34 text. A single character is (for example) a capital @samp{A}, the
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	35 number @samp{2}, a Katakana character, a Kanji ideograph (an
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	36 @dfn{ideograph} is a ``picture'' character, such as is used in Japanese
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	37 Kanji, Chinese Hanzi, and Korean Hanja; typically there are thousands
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	38 of such ideographs in each language), etc. The basic property of a
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	39 character is its shape. Note that the same character may be drawn by
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	40 two different people (or in two different fonts) in slightly different
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	41 ways, although the basic shape will be the same.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	42
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	43 In some cases, the differences will be significant enough that it is
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	44 actually possible to identify two or more distinct shapes that both
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	45 represent the same character. For example, the lowercase letters
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	46 @samp{a} and @samp{g} each have two distinct possible shapes -- the
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	47 @samp{a} can optionally have a curved tail projecting off the top, and
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	48 the @samp{g} can be formed either of two loops, or of one loop and a
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	49 tail hanging off the bottom. Such distinct possible shapes of a
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	50 character are called @dfn{glyphs}. The important characteristic of two
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	51 glyphs making up the same character is that the choice between one or
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	52 the other is purely stylistic and has no linguistic effect on a word
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	53 (this is the reason why a capital @samp{A} and lowercase @samp{a}
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	54 are different characters rather than different glyphs -- e.g.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	55 @samp{Aspen} is a city while @samp{aspen} is a kind of tree).
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	56
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	57 Note that @dfn{character} and @dfn{glyph} are used differently
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	58 here than elsewhere in XEmacs.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	59
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	60 A @dfn{character set} is simply a set of related characters. ASCII,
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	61 for example, is a set of 94 characters (or 128, if you count
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	62 non-printing characters). Other character sets are ISO8859-1 (ASCII
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	63 plus various accented characters and other international symbols),
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	64 JISX0201 (ASCII, more or less, plus half-width Katakana), JISX0208
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	65 (Japanese Kanji), JISX0212 (a second set of less-used Japanese Kanji),
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	66 GB2312 (Mainland Chinese Hanzi), etc.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	67
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	68 Every character set has one or more @dfn{orderings}, which can be
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	69 viewed as a way of assigning a number (or set of numbers) to each
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	70 character in the set. For most character sets, there is a standard
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	71 ordering, and in fact all of the character sets mentioned above define a
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	72 particular ordering. ASCII, for example, places letters in their
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	73 ``natural'' order, puts uppercase letters before lowercase letters,
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	74 numbers before letters, etc. Note that for many of the Asian character
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	75 sets, there is no natural ordering of the characters. The actual
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	76 orderings are based on one or more salient characteristics, of which
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	77 there are many to choose from -- e.g. number of strokes, common
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	78 radicals, phonetic ordering, etc.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	79
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	80 The numbers assigned to any particular character are called
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	81 the character's @dfn{position codes}. The number of position codes
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	82 required to index a particular character in a character set is called
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	83 the @dfn{dimension} of the character set. ASCII, being a relatively
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	84 small character set, is of dimension one, and each character in the
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	85 set is indexed using a single position code, in the range 0 through
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	86 127 (if non-printing characters are included) or 33 through 126
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	87 (if only the printing characters are considered). JISX0208, i.e.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	88 Japanese Kanji, has thousands of characters, and is of dimension two --
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	89 every character is indexed by two position codes, each in the range
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	90 33 through 126. (Note that the choice of the range here is somewhat
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	91 arbitrary. Although a character set such as JISX0208 defines an
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	92 @emph{ordering} of all its characters, it does not define the actual
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	93 mapping between numbers and characters. You could just as easily
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	94 index the characters in JISX0208 using numbers in the range 0 through
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	95 93, 1 through 94, 2 through 95, etc. The reason for the actual range
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	96 chosen is so that the position codes match up with the actual values
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	97 used in the common encodings.)
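
The relationship between position codes and dimension can be illustrated outside of XEmacs. The following is a minimal sketch in Python, not part of MULE; it assumes Python's built-in euc_jp codec, and the helper name jisx0208_position_codes is hypothetical. It recovers the two position codes of a JISX0208 character by clearing the high bit of each EUC byte.

```python
# Minimal sketch (assumption: Python's built-in "euc_jp" codec).
# The helper name below is hypothetical, chosen for illustration only.

def jisx0208_position_codes(char):
    """Return the two position codes of a JISX0208 character (each 33..126)."""
    encoded = char.encode("euc_jp")
    # In EUC-JP a JISX0208 character is exactly two bytes, both in 0xA1..0xFE.
    if len(encoded) != 2 or not all(0xA1 <= b <= 0xFE for b in encoded):
        raise ValueError("not a JISX0208 character: %r" % char)
    # Each byte is a position code with the high bit set; clear it.
    return tuple(b & 0x7F for b in encoded)

# ASCII is of dimension one: a single position code per character.
ascii_code = ord("A")

# JISX0208 is of dimension two: HIRAGANA LETTER A needs two position codes.
hiragana_a_codes = jisx0208_position_codes("\u3042")

print(ascii_code)
print(hiragana_a_codes)
```

Both position codes fall in the range 33 through 126, matching the convention described above.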
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	98
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	99 An @dfn{encoding} is a way of numerically representing characters from
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	100 one or more character sets as a stream of like-sized numerical values
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	101 called @dfn{words}; typically these are 8-bit, 16-bit, or 32-bit
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	102 quantities. If an encoding encompasses only one character set, then the
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	103 position codes for the characters in that character set could be used
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	104 directly. (This is the case with ASCII, and as a result, most people do
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	105 not understand the difference between a character set and an encoding.)
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	106 This is not possible, however, if more than one character set is to be
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	107 used in the encoding. For example, printed Japanese text typically
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	108 requires characters from multiple character sets -- ASCII, JISX0208, and
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	109 JISX0212, to be specific. Each of these is indexed using one or more
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	110 position codes in the range 33 through 126, so the position codes could
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	111 not be used directly or there would be no way to tell which character
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	112 was meant. Different Japanese encodings handle this differently -- JIS
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	113 uses special escape characters to denote different character sets; EUC
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	114 sets the high bit of the position codes for JISX0208 and JISX0212, and
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	115 puts a special extra byte before each JISX0212 character; etc. (JIS,
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	116 EUC, and most of the other encodings you will encounter are 7-bit or
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	117 8-bit encodings. There is one common 16-bit encoding, which is Unicode;
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	118 this strives to represent all the world's characters in a single large
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	119 character set. 32-bit encodings are generally used internally in
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	120 programs to simplify the code that manipulates them; however, they are
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	121 not much used externally because they are not very space-efficient.)
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	122
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	123 Encodings are classified as either @dfn{modal} or @dfn{non-modal}. In
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	124 a @dfn{modal encoding}, there are multiple states that the encoding can be in,
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	125 and the interpretation of the values in the stream depends on the
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	126 current global state of the encoding. Special values in the encoding,
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	127 called @dfn{escape sequences}, are used to change the global state.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	128 JIS, for example, is a modal encoding. The bytes @samp{ESC $ B}
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	129 indicate that, from then on, bytes are to be interpreted as position
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	130 codes for JISX0208, rather than as ASCII. This effect is cancelled
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	131 using the bytes @samp{ESC ( B}, which mean ``switch from whatever the
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	132 current state is to ASCII''. To switch to JISX0212, the escape sequence
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	133 @samp{ESC $ ( D} is used. (Note that here, as is common, the escape sequences do
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	134 in fact begin with @samp{ESC}. This is not necessarily the case,
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	135 however.)
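
The modal behavior of JIS can be observed with a short Python sketch. This is only an illustration, assuming Python's built-in iso2022_jp codec as a stand-in for the JIS encoding discussed here; the function name segments is hypothetical, and the sketch recognizes only the two escape sequences for ASCII and JISX0208.

```python
# Minimal sketch (assumption: Python's "iso2022_jp" codec stands in for JIS).

ESC = b"\x1b"
TO_JISX0208 = ESC + b"$B"   # ESC $ B: following bytes are JISX0208 codes
TO_ASCII = ESC + b"(B"      # ESC ( B: following bytes are ASCII again

def segments(data):
    """Split a JIS byte string into (state, payload) runs.

    Hypothetical helper: it tracks the global state and handles only
    the two escape sequences defined above.
    """
    state = "ascii"
    runs = []
    i = 0
    while i < len(data):
        if data.startswith(TO_JISX0208, i):
            state = "jisx0208"
            i += len(TO_JISX0208)
        elif data.startswith(TO_ASCII, i):
            state = "ascii"
            i += len(TO_ASCII)
        else:
            # Ordinary byte: its meaning depends on the current state.
            if runs and runs[-1][0] == state:
                runs[-1] = (state, runs[-1][1] + data[i:i + 1])
            else:
                runs.append((state, data[i:i + 1]))
            i += 1
    return runs

# "A", HIRAGANA LETTER A, "B"
encoded = "A\u3042B".encode("iso2022_jp")
print(encoded)
print(segments(encoded))
```

The two bytes between the escape sequences are the position codes of the JISX0208 character; read in the ASCII state, the same bytes would be the characters $ and ".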
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	136
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	137 A @dfn{non-modal encoding} has no global state that extends past the
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	138 character currently being interpreted. EUC, for example, is a
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	139 non-modal encoding. Characters in JISX0208 are encoded by setting
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	140 the high bit of the position codes, and characters in JISX0212 are
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	141 encoded by doing the same but also prefixing the character with the
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	142 byte 0x8F.
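
The EUC rules just described can be checked with a small Python sketch. It assumes Python's built-in euc_jp codec, and the sample characters are chosen for illustration: U+3042 is in JISX0208, and U+4E02 is assumed to be available only through JISX0212.

```python
# Minimal sketch (assumption: Python's built-in "euc_jp" codec).

ascii_bytes = "A".encode("euc_jp")            # ASCII: unchanged single byte
jisx0208_bytes = "\u3042".encode("euc_jp")    # JISX0208: two bytes, high bit set
jisx0212_bytes = "\u4e02".encode("euc_jp")    # JISX0212: 0x8F prefix + two bytes

for label, data in [("ASCII", ascii_bytes),
                    ("JISX0208", jisx0208_bytes),
                    ("JISX0212", jisx0212_bytes)]:
    print(label, data.hex())

# No state is carried between characters: every non-ASCII byte has its
# high bit set, so an ASCII byte can always be recognized on its own.
high_bit_set = all(b & 0x80 for b in jisx0208_bytes + jisx0212_bytes)
print(high_bit_set)
```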
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	143
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	144 The advantage of a modal encoding is that it is generally more
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	145 space-efficient and is easily extensible, because an essentially
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	146 arbitrary number of escape sequences can be created. The
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	147 disadvantage, however, is that it is much more difficult to work with
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	148 if it is not being processed in a sequential manner. In the non-modal
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	149 EUC encoding, for example, the byte 0x41 always refers to the letter
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	150 @samp{A}; whereas in JIS, it could either be the letter @samp{A}, or
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	151 one of the two position codes in a JISX0208 character, or one of the
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	152 two position codes in a JISX0212 character. Determining exactly which
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	153 one is meant could be difficult and time-consuming if the previous
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	154 bytes in the string have not already been processed.
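
The ambiguity of the byte 0x41 can be reproduced in Python. This is a minimal sketch, assuming the built-in iso2022_jp and euc_jp codecs; it relies on the full-width capital A (U+FF21) having 0x41 as its second JISX0208 position code.

```python
# Minimal sketch (assumption: Python's "iso2022_jp" and "euc_jp" codecs).

fullwidth_a = "\uff21"   # FULLWIDTH LATIN CAPITAL LETTER A, a JISX0208 character

jis = fullwidth_a.encode("iso2022_jp")   # ESC $ B, two position codes, ESC ( B
euc = fullwidth_a.encode("euc_jp")       # the same position codes, high bit set

# In the modal encoding, the byte 0x41 occurs inside a two-byte character,
# so a byte 0x41 seen in isolation is not necessarily the letter A.
print(0x41 in jis)

# In the non-modal encoding, 0x41 never occurs inside a multi-byte character.
print(0x41 in euc)
```

Deciding what a given 0x41 means in the JIS stream requires scanning backward (or from the start) for the most recent escape sequence; in the EUC stream the byte can be interpreted on its own.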
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	155
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	156 Non-modal encodings are further divided into @dfn{fixed-width} and
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	157 @dfn{variable-width} formats. A fixed-width encoding always uses
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	158 the same number of words per character, whereas a variable-width
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	159 encoding does not. EUC is a good example of a variable-width
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	160 encoding: one to three bytes are used per character, depending on
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	161 the character set. 16-bit and 32-bit encodings are nearly always
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	162 fixed-width, and this is in fact one of the main reasons for using
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	163 an encoding with a larger word size. Fixed-width encodings allow any
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	164 character to be located directly by offset. The advantages of variable-width
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	165 encodings are that they are generally more space-efficient and allow
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	166 for compatibility with existing 8-bit encodings such as ASCII.
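
The difference between the two formats can be sketched in Python. The example assumes the built-in euc_jp codec for the variable-width case and uses UTF-32 (not discussed in this section) purely as a convenient stand-in for a fixed-width 32-bit encoding; widths are counted in bytes.

```python
# Minimal sketch (assumptions: Python's "euc_jp" codec; UTF-32 as a
# stand-in for a fixed-width 32-bit encoding).

text = "A\u3042\u4e02"   # ASCII, JISX0208, and (assumed) JISX0212 characters

# Variable width: the byte count depends on the character set.
euc_widths = [len(ch.encode("euc_jp")) for ch in text]

# Fixed width: every character occupies one 32-bit word (four bytes).
utf32_widths = [len(ch.encode("utf-32-be")) for ch in text]

print(euc_widths)
print(utf32_widths)

# With a fixed width, character number n starts at byte offset 4 * n,
# so it can be fetched without examining the preceding characters.
fixed = text.encode("utf-32-be")
second_char = fixed[4 * 1:4 * 2].decode("utf-32-be")
print(second_char == "\u3042")
```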
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	167
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	168 Note that the bytes in an 8-bit encoding are often referred to
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	169 as @dfn{octets} rather than simply as bytes. This terminology
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	170 dates back to the days before 8-bit bytes were universal, when
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	171 some computers had 9-bit bytes, others had 10-bit bytes, etc.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	172
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	173 @node Charsets
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	174 @section Charsets
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	175
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	176 A @dfn{charset} in MULE is an object that encapsulates a
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	177 particular character set as well as an ordering of those characters.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	178 Charsets are permanent objects and are named using symbols, like
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	179 faces.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	180
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	181 @defun charsetp object
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	182 This function returns non-@code{nil} if @var{object} is a charset.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	183 @end defun
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	184
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	185 @menu
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	186 * Charset Properties:: Properties of a charset.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	187 * Basic Charset Functions:: Functions for working with charsets.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	188 * Charset Property Functions:: Functions for accessing charset properties.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	189 * Predefined Charsets:: Predefined charset objects.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	190 @end menu
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	191
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	192 @node Charset Properties
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	193 @subsection Charset Properties
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	194
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	195 Charsets have the following properties:
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	196
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	197 @table @code
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	198 @item name
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	199 A symbol naming the charset. Every charset must have a different name;
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	200 this allows a charset to be referred to using its name rather than
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	201 the actual charset object.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	202 @item doc-string
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	203 A documentation string describing the charset.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	204 @item registry
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	205 A regular expression matching the font registry field for this character
70 131b0175ea99 Import from CVS: tag r20-0b30 cvs parents: 54 diff changeset	206 set. For example, both the @code{ascii} and @code{latin-1} charsets
131b0175ea99 Import from CVS: tag r20-0b30 cvs parents: 54 diff changeset	207 use the registry @code{"ISO8859-1"}. This field is used to choose
131b0175ea99 Import from CVS: tag r20-0b30 cvs parents: 54 diff changeset	208 an appropriate font when the user gives a general font specification
131b0175ea99 Import from CVS: tag r20-0b30 cvs parents: 54 diff changeset	209 such as @samp{-*-courier-medium-r-*-140-*}, i.e. a 14-point upright
131b0175ea99 Import from CVS: tag r20-0b30 cvs parents: 54 diff changeset	210 medium-weight Courier font.
0 376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	211 @item dimension
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	212 Number of position codes used to index a character in the character set.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	213 XEmacs/MULE can only handle character sets of dimension 1 or 2.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	214 This property defaults to 1.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	215 @item chars
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	216 Number of characters in each dimension. In XEmacs/MULE, the only
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	217 allowed values are 94 or 96. (There are a couple of pre-defined
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	218 character sets, such as ASCII, that do not follow this, but you cannot
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	219 define new ones like this.) Defaults to 94. Note that if the dimension
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	220 is 2, the character set thus described is 94x94 or 96x96.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	221 @item columns
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	222 Number of columns used to display a character in this charset.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	223 Only used in TTY mode. (Under X, the actual width of a character
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	224 can be derived from the font used to display the characters.)
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	225 If unspecified, defaults to the dimension. (This is almost
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	226 always the correct value, because character sets with dimension 2
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	227 are usually ideograph character sets, which need two columns to
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	228 display the intricate ideographs.)
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	229 @item direction
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	230 A symbol, either @code{l2r} (left-to-right) or @code{r2l}
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	231 (right-to-left). Defaults to @code{l2r}. This specifies the
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	232 direction that the text should be displayed in, and will be
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	233 left-to-right for most charsets but right-to-left for Hebrew
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	234 and Arabic. (Right-to-left display is not currently implemented.)
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	235 @item final
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	236 Final byte of the standard ISO 2022 escape sequence designating this
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	237 charset. Must be supplied. Each combination of (@var{dimension},
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	238 @var{chars}) defines a separate namespace for final bytes, and each
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	239 charset within a particular namespace must have a different final byte.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	240 Note that ISO 2022 restricts the final byte to the range 0x30 - 0x7E if
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	241 dimension == 1, and 0x30 - 0x5F if dimension == 2. Note also that final
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	242 bytes in the range 0x30 - 0x3F are reserved for user-defined (not
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	243 official) character sets. For more information on ISO 2022, see @ref{Coding
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	244 Systems}.
@item graphic
0 (use left half of font on output) or 1 (use right half of font on
output). Defaults to 0. This specifies how to convert the position
codes that index a character in a character set into an index into the
font used to display the character set. With @code{graphic} set to 0,
position codes 33 through 126 map to font indices 33 through 126; with
it set to 1, position codes 33 through 126 map to font indices 161
through 254 (i.e. the same number but with the high bit set). For
example, for a font whose registry is ISO8859-1, the left half of the
font (octets 0x20 - 0x7F) is the @code{ascii} charset, while the
right half (octets 0xA0 - 0xFF) is the @code{latin-1} charset.
@item ccl-program
A compiled CCL program used to convert a character in this charset into
an index into the font. This is in addition to the @code{graphic}
property. If a CCL program is defined, the position codes of a
character will first be processed according to @code{graphic} and
then passed through the CCL program, with the resulting values used
to index the font.

This is used, for example, by the Big5 character set (used in Taiwan).
This character set is not ISO-2022-compliant, and its size (94x157)
exceeds the maximum 96x96 size of ISO-2022-compliant character
sets. As a result, XEmacs/MULE splits it (in a rather complex fashion,
so as to group the most commonly used characters together) into two
charset objects (@code{chinese-big5-1} and @code{chinese-big5-2}), each
of size 94x94, and each charset object uses a CCL program to convert the
modified position codes back into standard Big5 indices to retrieve a
character from a Big5 font.
@end table
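
As a rough illustration of the @code{graphic} property, the following
sketch computes a font index from a single position code. The helper
name @code{my-position-code-to-font-index} is hypothetical and is not
part of XEmacs; the real conversion is done internally, and any
@code{ccl-program} would be applied after this step.

@example
(defun my-position-code-to-font-index (position-code graphic)
  "Return the font index for POSITION-CODE under GRAPHIC (0 or 1).
Hypothetical helper, for illustration only."
  (if (= graphic 1)
      ;; Right half of the font: set the high bit.
      (logior position-code 128)
    ;; Left half of the font: use the position code unchanged.
    position-code))

(my-position-code-to-font-index 33 0)
     @result{} 33
(my-position-code-to-font-index 33 1)
     @result{} 161
(my-position-code-to-font-index 126 1)
     @result{} 254
@end example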

Most of the above properties can be set only when the charset
is created. @xref{Charset Property Functions}.

@node Basic Charset Functions
@subsection Basic Charset Functions

@defun find-charset charset-or-name
This function retrieves the charset of the given name. If
@var{charset-or-name} is a charset object, it is simply returned.
Otherwise, @var{charset-or-name} should be a symbol. If there is no
such charset, @code{nil} is returned. Otherwise the associated charset
object is returned.
@end defun

@defun get-charset name
This function retrieves the charset of the given name. Same as
@code{find-charset} except an error is signalled if there is no such
charset instead of returning @code{nil}.
@end defun
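
The difference between the two functions can be sketched as follows.
The name @code{no-such-charset} is made up and is assumed not to be
defined.

@example
;; Returns the charset object for `ascii', a predefined charset.
(find-charset 'ascii)

;; An unknown name yields nil rather than an error.
(find-charset 'no-such-charset)
     @result{} nil

;; `get-charset' signals an error instead, so trap it if the
;; name may be unknown.
(condition-case nil
    (get-charset 'no-such-charset)
  (error nil))
     @result{} nil
@end example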

@defun charset-list
This function returns a list of the names of all defined charsets.
@end defun

@defun make-charset name doc-string props
This function defines a new character set. This function is for use
with Mule support. @var{name} is a symbol, the name by which the
character set is normally referred to. @var{doc-string} is a string
describing the character set. @var{props} is a property list,
describing the specific nature of the character set. The recognized
properties are @code{registry}, @code{dimension}, @code{columns},
@code{chars}, @code{final}, @code{graphic}, @code{direction}, and
@code{ccl-program}, as previously described.
@end defun
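
A call to @code{make-charset} might look like the following sketch.
The charset name @code{my-example-charset}, its doc string, and the
registry string are invented for illustration. The final byte
@code{?9} is taken from the range 0x30 - 0x3F reserved for user-defined
charsets, on the assumption that no other one-dimensional 94-character
charset already uses it.

@example
(make-charset 'my-example-charset
              "Hypothetical private 94-character charset."
              '(registry "MyExample-0"   ; invented font registry
                dimension 1
                chars 94
                columns 1
                final ?9                 ; private-use final byte
                graphic 0
                direction l2r))
@end example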

@defun make-reverse-direction-charset charset new-name
This function makes a charset equivalent to @var{charset} but which goes
in the opposite direction. @var{new-name} is the name of the new
charset. The new charset is returned.
@end defun

@defun charset-from-attributes dimension chars final &optional direction
This function returns a charset with the given @var{dimension},
@var{chars}, @var{final}, and @var{direction}. If @var{direction} is
omitted, both directions will be checked (left-to-right will be returned
if character sets exist for both directions).
@end defun

@defun charset-reverse-direction-charset charset
This function returns the charset (if any) with the same dimension,
number of characters, and final byte as @var{charset}, but which is
displayed in the opposite direction.
@end defun
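
The following sketch looks up charsets by their attributes. The
comments state what the tables in @ref{Predefined Charsets} lead one to
expect, not verified output; the actual results depend on which
charsets are defined in the running XEmacs.

@example
;; Two-dimensional, 94 characters per dimension, final byte `B'.
;; Per the table, this should be `japanese-jisx0208'.
(charset-from-attributes 2 94 ?B)

;; Ask explicitly for the right-to-left charset with ASCII's
;; attributes.  Per the table, this should be `ascii-r2l'.
(charset-from-attributes 1 94 ?B 'r2l)

;; The same lookup, starting from the left-to-right charset.
(charset-reverse-direction-charset (get-charset 'ascii))
@end example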

@node Charset Property Functions
@subsection Charset Property Functions

All of these functions accept either a charset name or charset object.

@defun charset-property charset prop
This function returns property @var{prop} of @var{charset}.
@xref{Charset Properties}.
@end defun

Convenience functions are also provided for retrieving individual
properties of a charset.

@defun charset-name charset
This function returns the name of @var{charset}. This will be a symbol.
@end defun

@defun charset-doc-string charset
This function returns the doc string of @var{charset}.
@end defun

@defun charset-registry charset
This function returns the registry of @var{charset}.
@end defun

@defun charset-dimension charset
This function returns the dimension of @var{charset}.
@end defun

@defun charset-chars charset
This function returns the number of characters per dimension of
@var{charset}.
@end defun

@defun charset-columns charset
This function returns the number of display columns per character (in
TTY mode) of @var{charset}.
@end defun

@defun charset-direction charset
This function returns the display direction of @var{charset} -- either
@code{l2r} or @code{r2l}.
@end defun

@defun charset-final charset
This function returns the final byte of the ISO 2022 escape sequence
designating @var{charset}.
@end defun

@defun charset-graphic charset
This function returns either 0 or 1, depending on whether the position
codes of characters in @var{charset} map to the left or right half
of their font, respectively.
@end defun

@defun charset-ccl-program charset
This function returns the CCL program, if any, for converting
position codes of characters in @var{charset} into font indices.
@end defun

The only property of a charset that can currently be set after
the charset has been created is the CCL program.

@defun set-charset-ccl-program charset ccl-program
This function sets the @code{ccl-program} property of @var{charset} to
@var{ccl-program}.
@end defun
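
As a sketch of how these accessors fit together, the calls below query
@code{japanese-jisx0208}. The values in the comments are what the
table in @ref{Predefined Charsets} leads one to expect; they are not
verified output.

@example
(charset-dimension 'japanese-jisx0208)   ; 2 (a 94x94 charset)
(charset-chars 'japanese-jisx0208)       ; 94
(charset-final 'japanese-jisx0208)       ; ?B
(charset-graphic 'japanese-jisx0208)     ; 0
(charset-direction 'japanese-jisx0208)   ; l2r

;; The generic accessor takes the property name as a symbol.
(charset-property 'japanese-jisx0208 'registry)
@end example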

@node Predefined Charsets
@subsection Predefined Charsets

The following charsets are predefined in the C code.

@example
Name                    Type  Fi Gr Dir Registry
--------------------------------------------------------------
ascii                    94    B  0  l2r ISO8859-1
control-1                94       0  l2r ---
latin-1                  96    A  1  l2r ISO8859-1
latin-2                  96    B  1  l2r ISO8859-2
latin-3                  96    C  1  l2r ISO8859-3
latin-4                  96    D  1  l2r ISO8859-4
cyrillic                 96    L  1  l2r ISO8859-5
arabic                   96    G  1  r2l ISO8859-6
greek                    96    F  1  l2r ISO8859-7
hebrew                   96    H  1  r2l ISO8859-8
latin-5                  96    M  1  l2r ISO8859-9
thai                     96    T  1  l2r TIS620
japanese-jisx0201-kana   94    I  1  l2r JISX0201.1976
japanese-jisx0201-roman  94    J  0  l2r JISX0201.1976
japanese-jisx0208-1978   94x94 @@ 0  l2r JISX0208.1978
japanese-jisx0208        94x94 B  0  l2r JISX0208.19(83\|90)
japanese-jisx0212        94x94 D  0  l2r JISX0212
chinese-gb               94x94 A  0  l2r GB2312
chinese-cns11643-1       94x94 G  0  l2r CNS11643.1
chinese-cns11643-2       94x94 H  0  l2r CNS11643.2
chinese-big5-1           94x94 0  0  l2r Big5
chinese-big5-2           94x94 1  0  l2r Big5
korean-ksc5601           94x94 C  0  l2r KSC5601
composite                96x96    0  l2r ---
@end example

The following charsets are predefined in the Lisp code.

@example
Name                    Type  Fi Gr Dir Registry
--------------------------------------------------------------
arabic-digit             94    2  0  l2r MuleArabic-0
arabic-1-column          94    3  0  r2l MuleArabic-1
arabic-2-column          94    4  0  r2l MuleArabic-2
sisheng                  94    0  0  l2r sisheng_cwnn\\|OMRON_UDC_ZH
chinese-cns11643-3       94x94 I  0  l2r CNS11643.1
chinese-cns11643-4       94x94 J  0  l2r CNS11643.1
chinese-cns11643-5       94x94 K  0  l2r CNS11643.1
chinese-cns11643-6       94x94 L  0  l2r CNS11643.1
chinese-cns11643-7       94x94 M  0  l2r CNS11643.1
ethiopic                 94x94 2  0  l2r Ethio
ascii-r2l                94    B  0  r2l ISO8859-1
ipa                      96    0  1  l2r MuleIPA
vietnamese-lower         96    1  1  l2r VISCII1.1
vietnamese-upper         96    2  1  l2r VISCII1.1
@end example

For all of the above charsets, the dimension and number of columns are
the same.

Note that ASCII, Control-1, and Composite are handled specially.
This is why some of the fields are blank, and why some of the filled-in
fields (e.g. the type) are not really accurate.

@node MULE Characters
@section MULE Characters

@defun make-char charset arg1 &optional arg2
This function makes a multi-byte character from @var{charset} and octets
@var{arg1} and @var{arg2}.
@end defun

@defun char-charset ch
This function returns the character set of char @var{ch}.
@end defun

@defun char-octet ch &optional n
This function returns the octet (i.e. position code) numbered @var{n}
(should be 0 or 1) of char @var{ch}. @var{n} defaults to 0 if omitted.
@end defun

@defun charsets-in-region start end &optional buffer
This function returns a list of the charsets in the region between
@var{start} and @var{end}. @var{buffer} defaults to the current buffer
if omitted.
@end defun

@defun charsets-in-string string
This function returns a list of the charsets in @var{string}.
@end defun
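
The following sketch builds a character and then takes it apart again.
The position codes 36 and 34 are arbitrary values chosen for
illustration, and the expected result in the comment is an assumption
based on the descriptions above, not verified output.

@example
(let ((ch (make-char 'japanese-jisx0208 36 34)))
  (list (charset-name (char-charset ch))  ; charset of CH, as a symbol
        (char-octet ch 0)                 ; first position code
        (char-octet ch 1)))               ; second position code
;; Expected, if the round trip works as described:
;;   (japanese-jisx0208 36 34)
@end example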

@node Composite Characters
@section Composite Characters

Composite characters are not yet completely implemented.

@defun make-composite-char string
This function converts a string into a single composite character. The
character is the result of overstriking all the characters in the
string.
@end defun

@defun composite-char-string ch
This function returns a string of the characters comprising a composite
character.
@end defun

@defun compose-region start end &optional buffer
This function composes the characters in the region from @var{start} to
@var{end} in @var{buffer} into one composite character. The composite
character replaces the composed characters. @var{buffer} defaults to
the current buffer if omitted.
@end defun

@defun decompose-region start end &optional buffer
This function decomposes any composite characters in the region from
@var{start} to @var{end} in @var{buffer}. This converts each composite
character into one or more characters, the individual characters out of
which the composite character was formed. Non-composite characters are
left as-is. @var{buffer} defaults to the current buffer if omitted.
@end defun

@node ISO 2022
@section ISO 2022

This section briefly describes the ISO 2022 encoding standard. For a
more thorough understanding, refer to the ISO 2022 standard itself.

Character sets (@dfn{charsets}) are classified into the following four
categories, according to the number of characters in the charset:
94-charset, 96-charset, 94x94-charset, and 96x96-charset.

@need 1000
@table @asis
@item 94-charset
ASCII(B), left(J) and right(I) half of JISX0201, ...
@item 96-charset
Latin-1(A), Latin-2(B), Latin-3(C), ...
@item 94x94-charset
GB2312(A), JISX0208(B), KSC5601(C), ...
@item 96x96-charset
none for the moment
@end table

376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	540 The character in parentheses after the name of each charset
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	541 is the @dfn{final character} @var{F}, which can be regarded as
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	542 the identifier of the charset. ECMA allocates @var{F} to each
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	543 charset. @var{F} is in the range of 0x30..0x7F, but 0x30..0x3F
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	544 are only for private use.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	545
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	546 Note: @dfn{ECMA} = European Computer Manufacturers Association
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	547
There are four @dfn{registers of charsets}, called G0 through G3.
You can designate (or assign) any charset to one of these
registers.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	551
The code space contained within one octet (of size 256) is divided into
4 areas: C0, GL, C1, and GR. GL and GR are the areas into which a
register of charsets can be invoked.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	555
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	556 @example
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	557 @group
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	558 C0: 0x00 - 0x1F
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	559 GL: 0x20 - 0x7F
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	560 C1: 0x80 - 0x9F
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	561 GR: 0xA0 - 0xFF
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	562 @end group
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	563 @end example
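
As a minimal sketch of the division above (written in Python purely for
illustration; the name @code{code_area} is hypothetical and is not part
of XEmacs), an octet can be classified with simple range checks:

```python
def code_area(byte):
    """Return the ISO 2022 code-space area that an octet value falls in."""
    if not 0x00 <= byte <= 0xFF:
        raise ValueError("not an octet: %r" % (byte,))
    if byte <= 0x1F:
        return "C0"   # control characters
    if byte <= 0x7F:
        return "GL"   # graphic area, left half
    if byte <= 0x9F:
        return "C1"   # control characters, 8-bit environments only
    return "GR"       # graphic area, right half


print([code_area(b) for b in (0x1B, 0x41, 0x85, 0xE9)])
```

In a 7-bit environment only the first two branches can ever be taken.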
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	564
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	565 Usually, in the initial state, G0 is invoked into GL, and G1
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	566 is invoked into GR.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	567
54 05472e90ae02 Import from CVS: tag r19-16-pre2 cvs parents: 0 diff changeset	568 ISO 2022 distinguishes 7-bit environments and 8-bit environments. In
05472e90ae02 Import from CVS: tag r19-16-pre2 cvs parents: 0 diff changeset	569 7-bit environments, only C0 and GL are used.
0 376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	570
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	571 Charset designation is done by escape sequences of the form:
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	572
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	573 @example
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	574 ESC [@var{I}] @var{I} @var{F}
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	575 @end example
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	576
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	577 where @var{I} is an intermediate character in the range 0x20 - 0x2F, and
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	578 @var{F} is the final character identifying this charset.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	579
The meanings of the intermediate characters are:
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	581
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	582 @example
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	583 @group
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	584 $ [0x24]: indicate charset of dimension 2 (94x94 or 96x96).
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	585 ( [0x28]: designate to G0 a 94-charset whose final byte is @var{F}.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	586 ) [0x29]: designate to G1 a 94-charset whose final byte is @var{F}.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	587 * [0x2A]: designate to G2 a 94-charset whose final byte is @var{F}.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	588 + [0x2B]: designate to G3 a 94-charset whose final byte is @var{F}.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	589 - [0x2D]: designate to G1 a 96-charset whose final byte is @var{F}.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	590 . [0x2E]: designate to G2 a 96-charset whose final byte is @var{F}.
54 05472e90ae02 Import from CVS: tag r19-16-pre2 cvs parents: 0 diff changeset	591 / [0x2F]: designate to G3 a 96-charset whose final byte is @var{F}.
0 376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	592 @end group
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	593 @end example
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	594
54 05472e90ae02 Import from CVS: tag r19-16-pre2 cvs parents: 0 diff changeset	595 The following rule is not allowed in ISO 2022 but can be used in Mule.
0 376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	596
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	597 @example
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	598 , [0x2C]: designate to G0 a 96-charset whose final byte is @var{F}.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	599 @end example
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	600
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	601 Here are examples of designations:
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	602
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	603 @example
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	604 @group
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	605 ESC ( B : designate to G0 ASCII
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	606 ESC - A : designate to G1 Latin-1
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	607 ESC $ ( A or ESC $ A : designate to G0 GB2312
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	608 ESC $ ( B or ESC $ B : designate to G0 JISX0208
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	609 ESC $ ) C : designate to G1 KSC5601
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	610 @end group
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	611 @end example
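
The designation rules above can be sketched as a small helper. This is
an illustration in Python under stated assumptions, not code taken from
Mule: the function name @code{designate} and its argument layout are
hypothetical, it always emits the long form (so it produces
@samp{ESC $ ( B} rather than the short @samp{ESC $ B}), and it accepts
the Mule-only @samp{,} intermediate for a 96-charset in G0.

```python
ESC = b"\x1b"

# Intermediate characters indexed by register number G0..G3.
INTERMEDIATE_94 = b"()*+"
INTERMEDIATE_96 = b",-./"   # "," (96-charset to G0) is the Mule extension


def designate(register, size, dimension, final):
    """Build the escape sequence designating a charset to G0..G3.

    size is 94 or 96, dimension is 1 or 2, and final is the
    one-character string F identifying the charset.
    """
    if register not in (0, 1, 2, 3):
        raise ValueError("register must be 0..3")
    if size not in (94, 96) or dimension not in (1, 2):
        raise ValueError("unsupported charset shape")
    # ISO 2022 final bytes end at 0x7E; 0x7F is DEL.
    if len(final) != 1 or not 0x30 <= ord(final) <= 0x7E:
        raise ValueError("final character out of range")
    table = INTERMEDIATE_94 if size == 94 else INTERMEDIATE_96
    seq = ESC
    if dimension == 2:
        seq += b"$"           # marks a charset of dimension 2
    seq += table[register:register + 1]
    return seq + final.encode("ascii")


print(designate(0, 94, 1, "B"))   # ASCII to G0
print(designate(1, 96, 1, "A"))   # Latin-1 to G1
print(designate(1, 94, 2, "C"))   # KSC5601 to G1
```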
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	612
To use a charset designated to G2 or G3, or to use a charset designated
to G1 in a 7-bit environment, you must explicitly invoke G1, G2, or G3
into GL. There are two types of invocation: Locking Shift (which stays
in effect until another locking shift) and Single Shift (which affects
one character only).
0 376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	617
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	618 Locking Shift is done as follows:
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	619
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	620 @example
54 05472e90ae02 Import from CVS: tag r19-16-pre2 cvs parents: 0 diff changeset	621 LS0 or SI (0x0F): invoke G0 into GL
05472e90ae02 Import from CVS: tag r19-16-pre2 cvs parents: 0 diff changeset	622 LS1 or SO (0x0E): invoke G1 into GL
0 376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	623 LS2: invoke G2 into GL
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	624 LS3: invoke G3 into GL
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	625 LS1R: invoke G1 into GR
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	626 LS2R: invoke G2 into GR
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	627 LS3R: invoke G3 into GR
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	628 @end example
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	629
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	630 Single Shift is done as follows:
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	631
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	632 @example
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	633 @group
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	634 SS2 or ESC N: invoke G2 into GL
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	635 SS3 or ESC O: invoke G3 into GL
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	636 @end group
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	637 @end example
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	638
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	639 (#### Ben says: I think the above is slightly incorrect. It appears that
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	640 SS2 invokes G2 into GR and SS3 invokes G3 into GR, whereas ESC N and
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	641 ESC O behave as indicated. The above definitions will not parse
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	642 EUC-encoded text correctly, and it looks like the code in mule-coding.c
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	643 has similar problems.)
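
To make the difference between the two kinds of shift concrete, here is
a rough Python sketch of a 7-bit reader that records which register
each graphic byte is taken from. It is an illustration only: the name
@code{trace_gl} is hypothetical, only SI, SO, @samp{ESC N} and
@samp{ESC O} are handled, designation sequences are not parsed, and
every character is assumed to occupy a single byte.

```python
SI, SO, ESC = 0x0F, 0x0E, 0x1B


def trace_gl(data):
    """Return (byte, register) pairs for the graphic bytes of a 7-bit stream."""
    locked = 0      # G0 is invoked into GL in the initial state
    single = None   # register that applies to the next character only
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b == SI:
            locked = 0                       # LS0: lock G0 into GL
        elif b == SO:
            locked = 1                       # LS1: lock G1 into GL
        elif b == ESC and data[i + 1:i + 2] in (b"N", b"O"):
            single = 2 if data[i + 1:i + 2] == b"N" else 3
            i += 1                           # skip the N or O byte
        elif 0x20 <= b <= 0x7F:
            out.append((b, locked if single is None else single))
            single = None                    # a single shift is used up
        i += 1
    return out


print(trace_gl(b"a\x0eb\x0fc"))   # locking shifts around "b"
print(trace_gl(b"a\x1bNbc"))      # single shift affects only "b"
```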
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	644
As you can see, there are many ISO-2022-compliant ways of
encoding multilingual text. Many coding systems in actual use,
such as X11's Compound Text, Japanese JUNET code, and so-called
EUC (Extended UNIX Code), are variants of ISO 2022.
0 376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	649
54 05472e90ae02 Import from CVS: tag r19-16-pre2 cvs parents: 0 diff changeset	650 In Mule, we characterize ISO 2022 by the following attributes:
0 376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	651
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	652 @enumerate
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	653 @item
Initial designation to G0 through G3.
@item
Allow the short form of designation for Japanese and Chinese.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	657 @item
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	658 Should we designate ASCII to G0 before control characters?
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	659 @item
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	660 Should we designate ASCII to G0 at the end of line?
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	661 @item
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	662 7-bit environment or 8-bit environment.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	663 @item
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	664 Use Locking Shift or not.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	665 @item
Use ASCII or JISX0201-1976-Roman.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	667 @item
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	668 Use JISX0208-1983 or JISX0208-1976.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	669 @end enumerate
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	670
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	671 (The last two are only for Japanese.)
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	672
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	673 By specifying these attributes, you can create any variant
54 05472e90ae02 Import from CVS: tag r19-16-pre2 cvs parents: 0 diff changeset	674 of ISO 2022.
0 376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	675
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	676 Here are several examples:
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	677
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	678 @example
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	679 @group
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	680 junet -- Coding system used in JUNET.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	681 1. G0 <- ASCII, G1..3 <- never used
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	682 2. Yes.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	683 3. Yes.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	684 4. Yes.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	685 5. 7-bit environment
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	686 6. No.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	687 7. Use ASCII
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	688 8. Use JISX0208-1983
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	689 @end group
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	690
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	691 @group
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	692 ctext -- Compound Text
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	693 1. G0 <- ASCII, G1 <- Latin-1, G2,3 <- never used
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	694 2. No.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	695 3. No.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	696 4. Yes.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	697 5. 8-bit environment
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	698 6. No.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	699 7. Use ASCII
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	700 8. Use JISX0208-1983
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	701 @end group
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	702
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	703 @group
euc-china -- Chinese EUC. Although many people call this
"GB encoding", that name may cause misunderstanding.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	706 1. G0 <- ASCII, G1 <- GB2312, G2,3 <- never used
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	707 2. No.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	708 3. Yes.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	709 4. Yes.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	710 5. 8-bit environment
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	711 6. No.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	712 7. Use ASCII
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	713 8. Use JISX0208-1983
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	714 @end group
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	715
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	716 @group
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	717 korean-mail -- Coding system used in Korean network.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	718 1. G0 <- ASCII, G1 <- KSC5601, G2,3 <- never used
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	719 2. No.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	720 3. Yes.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	721 4. Yes.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	722 5. 7-bit environment
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	723 6. Yes.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	724 7. No.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	725 8. No.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	726 @end group
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	727 @end example
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	728
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	729 Mule creates all these coding systems by default.
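
The effect of these attributes can be observed outside of Mule as well.
The sketch below uses Python's standard codecs as stand-ins, which is an
assumption made for illustration: @code{iso2022_jp} behaves much like
the junet entry (7-bit, short-form designation, ASCII restored at the
end), and @code{euc_jp} has the same shape as the euc-china entry but
for Japanese (8-bit, two-byte charset in G1 invoked into GR).

```python
text = "\u65e5\u672c"   # two kanji from JISX0208

# 7-bit environment: the charset is designated to G0 by escape sequences.
junet_like = text.encode("iso2022_jp")
print(junet_like)
print(all(b < 0x80 for b in junet_like))

# 8-bit environment: no escape sequences, G1 is read from GR.
euc = text.encode("euc_jp")
print(euc)

# Strip the leading "ESC $ B" and trailing "ESC ( B" (three bytes each).
# Setting the high bit of what remains moves the same code points
# from GL to GR.
payload = junet_like[3:-3]
print(bytes(b | 0x80 for b in payload) == euc)
```

Both encodings carry the same JISX0208 code points; they differ only in
how the charset is designated and which area it is invoked into.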
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	730
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	731 @node Coding Systems
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	732 @section Coding Systems
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	733
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	734 A coding system is an object that defines how text containing multiple
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	735 character sets is encoded into a stream of (typically 8-bit) bytes. The
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	736 coding system is used to decode the stream into a series of characters
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	737 (which may be from multiple charsets) when the text is read from a file
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	738 or process, and is used to encode the text back into the same format
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	739 when it is written out to a file or process.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	740
54 05472e90ae02 Import from CVS: tag r19-16-pre2 cvs parents: 0 diff changeset	741 For example, many ISO-2022-compliant coding systems (such as Compound
0 376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	742 Text, which is used for inter-client data under the X Window System) use
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	743 escape sequences to switch between different charsets -- Japanese Kanji,
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	744 for example, is invoked with @samp{ESC $ ( B}; ASCII is invoked with
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	745 @samp{ESC ( B}; and Cyrillic is invoked with @samp{ESC - L}. See
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	746 @code{make-coding-system} for more information.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	747
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	748 Coding systems are normally identified using a symbol, and the symbol is
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	749 accepted in place of the actual coding system object whenever a coding
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	750 system is called for. (This is similar to how faces and charsets work.)
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	751
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	752 @defun coding-system-p object
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	753 This function returns non-@code{nil} if @var{object} is a coding system.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	754 @end defun
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	755
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	756 @menu
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	757 * Coding System Types:: Classifying coding systems.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	758 * EOL Conversion:: Dealing with different ways of denoting
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	759 the end of a line.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	760 * Coding System Properties:: Properties of a coding system.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	761 * Basic Coding System Functions:: Working with coding systems.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	762 * Coding System Property Functions:: Retrieving a coding system's properties.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	763 * Encoding and Decoding Text:: Encoding and decoding text.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	764 * Detection of Textual Encoding:: Determining how text is encoded.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	765 * Big5 and Shift-JIS Functions:: Special functions for these non-standard
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	766 encodings.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	767 @end menu
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	768
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	769 @node Coding System Types
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	770 @subsection Coding System Types
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	771
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	772 @table @code
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	773 @item nil
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	774 @itemx autodetect
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	775 Automatic conversion. XEmacs attempts to detect the coding system used
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	776 in the file.
54 05472e90ae02 Import from CVS: tag r19-16-pre2 cvs parents: 0 diff changeset	777 @item no-conversion
0 376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	778 No conversion. Use this for binary files and such. On output, graphic
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	779 characters that are not in ASCII or Latin-1 will be replaced by a
70 131b0175ea99 Import from CVS: tag r20-0b30 cvs parents: 54 diff changeset	780 @samp{?}. (For a no-conversion-encoded buffer, these characters will only be
131b0175ea99 Import from CVS: tag r20-0b30 cvs parents: 54 diff changeset	781 present if you explicitly insert them.)
0 376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	782 @item shift-jis
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	783 Shift-JIS (a Japanese encoding commonly used in PC operating systems).
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	784 @item iso2022
54 05472e90ae02 Import from CVS: tag r19-16-pre2 cvs parents: 0 diff changeset	785 Any ISO-2022-compliant encoding. Among other things, this includes JIS
05472e90ae02 Import from CVS: tag r19-16-pre2 cvs parents: 0 diff changeset	786 (the Japanese encoding commonly used for e-mail), national variants of
05472e90ae02 Import from CVS: tag r19-16-pre2 cvs parents: 0 diff changeset	787 EUC (the standard Unix encoding for Japanese and other languages), and
05472e90ae02 Import from CVS: tag r19-16-pre2 cvs parents: 0 diff changeset	788 Compound Text (an encoding used in X11). You can specify more specific
05472e90ae02 Import from CVS: tag r19-16-pre2 cvs parents: 0 diff changeset	789 information about the conversion with the @var{flags} argument.
0 376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	790 @item big5
Big5 (the encoding commonly used for Traditional Chinese, particularly in Taiwan).
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	792 @item ccl
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	793 The conversion is performed using a user-written pseudo-code program.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	794 CCL (Code Conversion Language) is the name of this pseudo-code.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	795 @item internal
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	796 Write out or read in the raw contents of the memory representing the
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	797 buffer's text. This is primarily useful for debugging purposes, and is
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	798 only enabled when XEmacs has been compiled with @code{DEBUG_XEMACS} set
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	799 (the @samp{--debug} configure option). @strong{Warning}: Reading in a
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	800 file using @code{internal} conversion can result in an internal
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	801 inconsistency in the memory representing a buffer's text, which will
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	802 produce unpredictable results and may cause XEmacs to crash. Under
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	803 normal circumstances you should never use @code{internal} conversion.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	804 @end table
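
The output behavior described for @code{no-conversion} can be imitated
in a few lines of Python. This is only an analogy, using Python's
Latin-1 codec with replacement rather than anything from XEmacs, and
the name @code{write_no_conversion} is hypothetical:

```python
def write_no_conversion(text):
    """Encode one byte per character; anything outside ASCII and
    Latin-1 cannot be represented and is written as "?"."""
    return text.encode("latin-1", errors="replace")


# An accented Latin-1 letter survives; the kanji does not.
print(write_no_conversion("caf\u00e9 \u65e5"))
```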
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	805
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	806 @node EOL Conversion
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	807 @subsection EOL Conversion
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	808
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	809 @table @code
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	810 @item nil
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	811 Automatically detect the end-of-line type (LF, CRLF, or CR). Also
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	812 generate subsidiary coding systems named @code{@var{name}-unix},
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	813 @code{@var{name}-dos}, and @code{@var{name}-mac}, that are identical to
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	814 this coding system but have an EOL-TYPE value of @code{lf}, @code{crlf},
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	815 and @code{cr}, respectively.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	816 @item lf
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	817 The end of a line is marked externally using ASCII LF. Since this is
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	818 also the way that XEmacs represents an end-of-line internally,
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	819 specifying this option results in no end-of-line conversion. This is
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	820 the standard format for Unix text files.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	821 @item crlf
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	822 The end of a line is marked externally using ASCII CRLF. This is the
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	823 standard format for MS-DOS text files.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	824 @item cr
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	825 The end of a line is marked externally using ASCII CR. This is the
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	826 standard format for Macintosh text files.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	827 @item t
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	828 Automatically detect the end-of-line type but do not generate subsidiary
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	829 coding systems. (This value is converted to @code{nil} when stored
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	830 internally, and @code{coding-system-property} will return @code{nil}.)
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	831 @end table
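
The conversions listed above amount to replacing one byte sequence with
another. The following Python sketch shows the idea under stated
assumptions: the function names are hypothetical, and the detection
heuristic (look at the first line ending found) is a guess made for
illustration, not a description of how XEmacs detects the type.

```python
EOL_BYTES = dict(lf=b"\n", crlf=b"\r\n", cr=b"\r")


def detect_eol(data):
    """Guess the end-of-line type from the first line ending found."""
    for i, b in enumerate(data):
        if b == 0x0A:
            return "lf"
        if b == 0x0D:
            return "crlf" if data[i + 1:i + 2] == b"\n" else "cr"
    return None   # no line ending seen


def decode_eol(data, eol_type=None):
    """Convert external line endings to the internal LF form."""
    eol_type = eol_type or detect_eol(data) or "lf"
    return data.replace(EOL_BYTES[eol_type], b"\n")


def encode_eol(data, eol_type):
    """Convert internal LF line endings to the external form."""
    return data.replace(b"\n", EOL_BYTES[eol_type])


dos_text = b"one\r\ntwo\r\n"
internal = decode_eol(dos_text)          # type is detected as crlf
print(internal)
print(encode_eol(internal, "cr"))        # written back in Macintosh form
```

With @code{lf} both directions leave the data unchanged, which matches
the remark that this type results in no end-of-line conversion.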
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	832
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	833 @node Coding System Properties
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	834 @subsection Coding System Properties
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	835
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	836 @table @code
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	837 @item mnemonic
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	838 String to be displayed in the modeline when this coding system is
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	839 active.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	840
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	841 @item eol-type
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	842 End-of-line conversion to be used. It should be one of the types
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	843 listed in @ref{EOL Conversion}.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	844
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	845 @item post-read-conversion
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	846 Function called after a file has been read in, to perform the decoding.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	847 Called with two arguments, @var{beg} and @var{end}, denoting a region of
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	848 the current buffer to be decoded.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	849
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	850 @item pre-write-conversion
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	851 Function called before a file is written out, to perform the encoding.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	852 Called with two arguments, @var{beg} and @var{end}, denoting a region of
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	853 the current buffer to be encoded.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	854 @end table
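As a rough sketch of the calling convention for the two conversion
properties, a conversion function takes the two region boundaries and
operates on the current buffer. The function name
@code{my-strip-carriage-returns} is hypothetical, and deleting carriage
returns is only an illustrative choice of conversion, not something
this manual prescribes.

@example
;; Hypothetical conversion function: receives BEG and END, as described
;; for `post-read-conversion', and edits the current buffer in place.
(defun my-strip-carriage-returns (beg end)
  "Delete every carriage return between BEG and END in the current buffer."
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      (while (search-forward "\r" nil t)
        ;; Point is just after the carriage return; remove it.
        (delete-char -1)))))
@end example

A function of this shape could presumably be supplied as the value of
the @code{post-read-conversion} property when the coding system is
created.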

The following additional properties are recognized if @var{type} is
@code{iso2022}:

@table @code
@item charset-g0
@itemx charset-g1
@itemx charset-g2
@itemx charset-g3
The character set initially designated to each of the G0 through G3
registers. The value should be one of

@itemize @bullet
@item
A charset object (designate that character set)
@item
@code{nil} (do not ever use this register)
@item
@code{t} (no character set is initially designated to the register, but
may be later on; this automatically sets the corresponding
@code{force-g*-on-output} property)
@end itemize

@item force-g0-on-output
@itemx force-g1-on-output
@itemx force-g2-on-output
@itemx force-g3-on-output
If non-@code{nil}, send an explicit designation sequence on output
before using the specified register.

@item short
If non-@code{nil}, use the short forms @samp{ESC $ @@}, @samp{ESC $ A},
and @samp{ESC $ B} on output in place of the full designation sequences
@samp{ESC $ ( @@}, @samp{ESC $ ( A}, and @samp{ESC $ ( B}.

@item no-ascii-eol
If non-@code{nil}, don't designate ASCII to G0 at each end of line on
output. Setting this to non-@code{nil} also suppresses other
state-resetting that normally happens at the end of a line.

@item no-ascii-cntl
If non-@code{nil}, don't designate ASCII to G0 before control characters
on output.

@item seven
If non-@code{nil}, use a 7-bit environment on output. Otherwise, use an
8-bit environment.

@item lock-shift
If non-@code{nil}, use locking-shift (SO/SI) instead of single-shift or
designation by escape sequence.

@item no-iso6429
If non-@code{nil}, don't use the direction specification of ISO 6429.

@item escape-quoted
If non-@code{nil}, literal control characters that are the same as the
beginning of a recognized ISO 2022 or ISO 6429 escape sequence (in
particular, ESC (0x1B), SO (0x0E), SI (0x0F), SS2 (0x8E), SS3 (0x8F),
and CSI (0x9B)) are ``quoted'' with an escape character so that they can
be properly distinguished from an escape sequence. (Note that doing
this results in a non-portable encoding.) This encoding flag is used for
byte-compiled files. Note that ESC is a good choice for a quoting
character because there are no escape sequences whose second byte is a
character from the Control-0 or Control-1 character sets; this is
explicitly disallowed by the ISO 2022 standard.

@item input-charset-conversion
A list of conversion specifications, specifying conversion of characters
in one charset to another when decoding is performed. Each
specification is a list of two elements: the source charset, and the
destination charset.

@item output-charset-conversion
A list of conversion specifications, specifying conversion of characters
in one charset to another when encoding is performed. The form of each
specification is the same as for @code{input-charset-conversion}.
@end table
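To make the shape of these properties concrete, here is a minimal
sketch of a property list for an @code{iso2022} coding system. The
variable name @code{my-iso2022-props} and the charset names
@code{my-source-charset} and @code{my-dest-charset} are hypothetical
placeholders. The sketch also assumes that the symbol @code{ascii} is
accepted wherever a charset object is expected.

@example
;; Hypothetical property list; only the property names come from the
;; table above.
(defconst my-iso2022-props
  '(charset-g0 ascii       ; designate ASCII to G0 initially (assumed name)
    charset-g1 nil         ; never use the G1 register
    seven t                ; 7-bit environment on output
    short t                ; use the short designation sequences
    ;; Each conversion specification is (SOURCE-CHARSET DEST-CHARSET).
    input-charset-conversion ((my-source-charset my-dest-charset)))
  "Hypothetical property list for an ISO 2022 coding system.")
@end example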

The following additional properties are recognized (and required) if
@var{type} is @code{ccl}:

@table @code
@item decode
CCL program used for decoding (converting to internal format).

@item encode
CCL program used for encoding (converting to external format).
@end table

@node Basic Coding System Functions
@subsection Basic Coding System Functions

@defun find-coding-system coding-system-or-name
This function retrieves the coding system of the given name.

If @var{coding-system-or-name} is a coding-system object, it is simply
returned. Otherwise, @var{coding-system-or-name} should be a symbol.
If there is no such coding system, @code{nil} is returned. Otherwise
the associated coding system object is returned.
@end defun

@defun get-coding-system name
This function retrieves the coding system of the given name. It behaves
like @code{find-coding-system}, except that an error is signalled
instead of @code{nil} being returned if there is no such coding system.
@end defun
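The difference between the two lookup functions can be sketched as
follows. The helper name @code{my-coding-system-or-default} is
hypothetical. It relies only on the behavior described above:
@code{find-coding-system} returns @code{nil} for an unknown name, while
@code{get-coding-system} signals an error.

@example
;; Hypothetical helper built on the two lookup functions.
(defun my-coding-system-or-default (name default)
  "Return the coding system named NAME, or else the one named DEFAULT.
Signal an error if DEFAULT does not name a coding system either."
  (or (find-coding-system name)        ; nil if NAME is unknown
      (get-coding-system default)))    ; error if DEFAULT is unknown
@end example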

@defun coding-system-list
This function returns a list of the names of all defined coding systems.
@end defun

@defun coding-system-name coding-system
This function returns the name of the given coding system.
@end defun

@defun make-coding-system name type &optional doc-string props
This function registers symbol @var{name} as a coding system.

@var{type} describes the conversion method used and should be one of
the types listed in @ref{Coding System Types}.

@var{doc-string} is a string describing the coding system.

@var{props} is a property list, describing the specific nature of the
coding system. Recognized properties are as in @ref{Coding System
Properties}.
@end defun
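A call to @code{make-coding-system} might look like the following
sketch. The name @code{my-iso2022-example}, its doc string, and the
mnemonic are hypothetical. The particular combination of properties is
an illustrative assumption, not a recommended configuration.

@example
;; Register a hypothetical coding system of type `iso2022'.
(make-coding-system
 'my-iso2022-example 'iso2022
 "Hypothetical 7-bit ISO 2022 coding system, for illustration only."
 '(seven t                  ; 7-bit environment on output
   short t                  ; short designation sequences
   mnemonic "MyISO"))       ; modeline string

;; The registered coding system can then be looked up by name.
(coding-system-name (get-coding-system 'my-iso2022-example))
@end example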

@defun copy-coding-system old-coding-system new-name
This function copies @var{old-coding-system} to @var{new-name}. If
@var{new-name} does not name an existing coding system, a new one will
be created.
@end defun

@defun subsidiary-coding-system coding-system eol-type
This function returns the subsidiary coding system of
@var{coding-system} with eol type @var{eol-type}.
@end defun

@node Coding System Property Functions
@subsection Coding System Property Functions

@defun coding-system-doc-string coding-system
This function returns the doc string for @var{coding-system}.
@end defun

@defun coding-system-type coding-system
This function returns the type of @var{coding-system}.
@end defun

@defun coding-system-property coding-system prop
This function returns the @var{prop} property of @var{coding-system}.
@end defun
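The property functions can be combined to summarize a coding system, as
in this minimal sketch. The function name
@code{my-describe-coding-system} is hypothetical. The name is resolved
with @code{get-coding-system} first, so the accessors always receive a
coding system object.

@example
;; Hypothetical helper that collects a few properties in one list.
(defun my-describe-coding-system (name)
  "Return a list of the type, mnemonic, and doc string of coding system NAME."
  (let ((cs (get-coding-system name)))   ; signals an error if unknown
    (list (coding-system-type cs)
          (coding-system-property cs 'mnemonic)
          (coding-system-doc-string cs))))
@end example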

@node Encoding and Decoding Text
@subsection Encoding and Decoding Text

@defun decode-coding-region start end coding-system &optional buffer
This function decodes the text between @var{start} and @var{end} which
is encoded in @var{coding-system}. This is useful if you've read in
encoded text from a file without decoding it (e.g. you read in a
JIS-formatted file but used the @code{binary} or @code{no-conversion} coding
system, so that it shows up as @samp{^[$B!<!+^[(B}). The length of the
decoded text is returned. @var{buffer} defaults to the current buffer
if unspecified.
@end defun

@defun encode-coding-region start end coding-system &optional buffer
This function encodes the text between @var{start} and @var{end} using
@var{coding-system}. This will, for example, convert Japanese
characters into sequences such as @samp{^[$B!<!+^[(B} if you use the JIS
encoding. The length of the encoded text is returned. @var{buffer}
defaults to the current buffer if unspecified.
@end defun
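A typical use of @code{decode-coding-region} is to decode a whole buffer
that was read in without conversion. The following is a sketch only.
The function name @code{my-decode-whole-buffer} is hypothetical, and the
caller has to know the right coding system.

@example
;; Hypothetical wrapper around `decode-coding-region'.
(defun my-decode-whole-buffer (coding-system)
  "Decode the accessible part of the current buffer from CODING-SYSTEM.
Return the value of `decode-coding-region'."
  (decode-coding-region (point-min) (point-max)
                        (get-coding-system coding-system)))
@end example

The opposite direction would use @code{encode-coding-region} with the
same arguments.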

@node Detection of Textual Encoding
@subsection Detection of Textual Encoding

@defun coding-category-list
This function returns a list of all recognized coding categories.
@end defun

@defun set-coding-priority-list list
This function changes the priority order of the coding categories.
@var{list} should be a list of coding categories, in descending order of
priority. Unspecified coding categories will be lower in priority than
all specified ones, in the same relative order they were in previously.
@end defun

@defun coding-priority-list
This function returns a list of coding categories in descending order of
priority.
@end defun
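Unspecified categories keep their relative order below the specified
ones, so passing a one-element list is enough to promote a single
category. A minimal sketch, with the hypothetical function name
@code{my-prefer-coding-category}:

@example
;; Hypothetical helper: promote one category, remember the old order.
(defun my-prefer-coding-category (category)
  "Give CATEGORY the highest detection priority.
Return the previous priority list so that it can be restored later."
  (let ((old (coding-priority-list)))
    ;; All other categories follow CATEGORY in their previous order.
    (set-coding-priority-list (list category))
    old))
@end example

The returned list can later be handed back to
@code{set-coding-priority-list} to restore the earlier order.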

@defun set-coding-category-system coding-category coding-system
This function changes the coding system associated with a coding category.
@end defun

@defun coding-category-system coding-category
This function returns the coding system associated with a coding category.
@end defun

@defun detect-coding-region start end &optional buffer
This function detects the coding system of the text in the region
between @var{start} and @var{end}. The return value is a list of
possible coding systems ordered by priority. If only ASCII characters
are found, it returns @code{autodetect} or one of its subsidiary coding
systems according to the detected end-of-line type. The optional
argument @var{buffer} defaults to the current buffer.
@end defun
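Because @code{detect-coding-region} may return either a list of
candidates or a single coding system, code that wants one answer has to
handle both shapes. The following sketch does so. The function name
@code{my-best-guess-coding-system} is hypothetical.

@example
;; Hypothetical helper that reduces the detection result to one value.
(defun my-best-guess-coding-system ()
  "Return the highest-priority coding system detected in the current buffer."
  (let ((detected (detect-coding-region (point-min) (point-max))))
    (if (consp detected)
        (car detected)      ; list of candidates: take the first
      detected)))           ; single coding system (ASCII-only text)
@end example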

@node Big5 and Shift-JIS Functions
@subsection Big5 and Shift-JIS Functions

These are special functions for working with the non-standard
Shift-JIS and Big5 encodings.

@defun decode-shift-jis-char code
This function decodes a JISX0208 character encoded in the Shift-JIS
coding system. @var{code} is the character code in Shift-JIS as a cons
of two bytes. The corresponding character is returned.
@end defun

@defun encode-shift-jis-char ch
This function encodes a JISX0208 character @var{ch} in the Shift-JIS
coding system. The corresponding character code in Shift-JIS is
returned as a cons of two bytes.
@end defun

@defun decode-big5-char code
This function decodes a Big5 character encoded in the Big5 coding
system. @var{code} is the character code in Big5. The corresponding
character is returned.
@end defun

@defun encode-big5-char ch
This function encodes the Big5 character @var{ch} in the Big5
coding system. The corresponding character code in Big5 is returned.
@end defun
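The encoding and decoding functions are inverses of each other, which
the following sketch checks for the Shift-JIS pair. The function name
@code{my-shift-jis-round-trip-p} is hypothetical, and the argument is
assumed to be a JISX0208 character.

@example
;; Hypothetical check that encoding and then decoding gives back CH.
(defun my-shift-jis-round-trip-p (ch)
  "Return non-nil if CH survives encoding to and decoding from Shift-JIS."
  (equal ch (decode-shift-jis-char (encode-shift-jis-char ch))))
@end example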

@node CCL
@section CCL

@defun execute-ccl-program ccl-program status
This function executes @var{ccl-program} with registers initialized by
@var{status}. @var{ccl-program} is a vector of compiled CCL code
created by @code{ccl-compile}. @var{status} must be a vector of nine
values, specifying the initial values for the registers R0 through R7
and for the instruction counter IC. A @code{nil} value for a register
initializer causes the register to be set to 0. A @code{nil} value for
the IC initializer causes execution to start at the beginning of the
program. When the program is done, @var{status} is modified (by
side-effect) to contain the ending values for the corresponding
registers and IC.
@end defun

@defun execute-ccl-program-string ccl-program status str
This function executes @var{ccl-program} with initial @var{status} on
the string @var{str}. @var{ccl-program} is a vector of compiled CCL
code created by @code{ccl-compile}. @var{status} must be a vector of
nine values, specifying the initial values for the registers R0 through
R7 and for the instruction counter IC. A @code{nil} value for a register
initializer causes the register to be set to 0. A @code{nil} value for
the IC initializer causes execution to start at the beginning of the
program. When the program is done, @var{status} is modified (by
side-effect) to contain the ending values for the corresponding
registers and IC. Returns the resulting string.
@end defun
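The nine-element status vector can be built with @code{make-vector}.
Filling it with @code{nil} zeroes every register and starts execution
at the beginning of the program. A minimal sketch follows. The function
name @code{my-run-ccl-on-string} is hypothetical, and the compiled
program is assumed to come from @code{ccl-compile}.

@example
;; Hypothetical wrapper around `execute-ccl-program-string'.
(defun my-run-ccl-on-string (compiled-program str)
  "Run COMPILED-PROGRAM on STR with default registers and IC.
Return a cons of the resulting string and the final status vector."
  (let ((status (make-vector 9 nil)))   ; R0 through R7, then IC
    (cons (execute-ccl-program-string compiled-program status str)
          status)))                     ; STATUS was updated in place
@end example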

@defun ccl-reset-elapsed-time
This function resets the internal value that holds the time consumed by
the CCL interpreter.
@end defun

@defun ccl-elapsed-time
This function returns the time consumed by the CCL interpreter as a cons
of user and system time. This measures processor time, not real time.
Both values are floating point numbers measured in seconds. If only one
overall value can be determined, the return value will be a cons of that
value and 0.
@end defun
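The two timing functions can bracket a run of the interpreter, as in
this sketch. The function name @code{my-time-ccl-program} is
hypothetical, and the compiled program is assumed to come from
@code{ccl-compile}.

@example
;; Hypothetical helper that times one run of a compiled CCL program.
(defun my-time-ccl-program (compiled-program)
  "Return the processor time, in seconds, used to run COMPILED-PROGRAM."
  (ccl-reset-elapsed-time)
  (execute-ccl-program compiled-program (make-vector 9 nil))
  (let ((elapsed (ccl-elapsed-time)))
    ;; Add user and system time; the second value is 0 when only one
    ;; overall value could be determined.
    (+ (car elapsed) (cdr elapsed))))
@end example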
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1137
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1138 @node Category Tables
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1139 @section Category Tables
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1140
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1141 A category table is a type of char table used for keeping track of
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1142 categories. Categories are used for classifying characters for use in
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1143 regexps --- you can refer to a category rather than having to use a
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1144 complicated @samp{[]} expression (and category lookups are significantly
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1145 faster).
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1146
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1147 There are 95 different categories available, one for each printable
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1148 character (including space) in the ASCII charset. Each category is
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1149 designated by one such character, called a @dfn{category designator}.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1150 They are specified in a regexp using the syntax @samp{\c@var{X}}, where @var{X} is a
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1151 category designator. (This is not yet implemented.)
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1152
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1153 A category table specifies, for each character, the categories that
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1154 the character is in. Note that a character can be in more than one
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1155 category. More specifically, a category table maps from a character to
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1156 either the value @code{nil} (meaning the character is in no categories)
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1157 or a 95-element bit vector, specifying for each of the 95 categories
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1158 whether the character is in that category.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1159
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1160 Special Lisp functions are provided that abstract this, so you do not
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1161 have to directly manipulate bit vectors.
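
To make the representation concrete, the following sketch builds a
category table value by hand.  It assumes that a designator's bit index
is its character code minus the code for space, so that space maps to
bit 0 and @samp{~} to bit 94; the text above does not state this
ordering, and the helper name @code{my-category-bit-index} is
hypothetical.

@example
;; Hypothetical helper; assumes bits are ordered by designator code.
(defun my-category-bit-index (designator)
  "Return the assumed bit index for category DESIGNATOR."
  ;; 32 is the character code of space, the first designator.
  (- (char-int designator) 32))

;; Build a value placing a character in categories `a' and `j' only.
(let ((value (make-bit-vector 95 0)))
  (aset value (my-category-bit-index ?a) 1)
  (aset value (my-category-bit-index ?j) 1)
  value)
@end example

In ordinary code, prefer the functions described in this section over
building such values directly.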
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1162
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1163 @defun category-table-p obj
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1164 This function returns @code{t} if @var{obj} is a category table.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1165 @end defun
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1166
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1167 @defun category-table &optional buffer
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1168 This function returns the current category table. This is the one
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1169 specified by the current buffer, or by @var{buffer} if it is
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1170 non-@code{nil}.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1171 @end defun
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1172
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1173 @defun standard-category-table
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1174 This function returns the standard category table. This is the one used
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1175 for new buffers.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1176 @end defun
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1177
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1178 @defun copy-category-table &optional table
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1179 This function constructs a new category table and returns it. It is a
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1180 copy of @var{table}, which defaults to the standard category table.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1181 @end defun
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1182
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1183 @defun set-category-table table &optional buffer
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1184 This function selects @var{table} as the new category table for
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1185 @var{buffer}. @var{table} must be a category table. @var{buffer}
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1186 defaults to the current buffer if omitted.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1187 @end defun
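
The functions above are typically used together: copy a table, then
install the copy in one buffer so that later changes need not affect
other buffers.  A minimal sketch, using only functions documented in
this section:

@example
;; Give the current buffer a private copy of the standard table.
(let ((table (copy-category-table (standard-category-table))))
  (set-category-table table)
  ;; Check that the buffer now uses the copy.
  (eq (category-table) table))
@end example

Copying first is a design choice in this sketch: modifying the standard
category table itself would be visible in every buffer that uses it.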
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1188
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1189 @defun category-designator-p obj
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1190 This function returns @code{t} if @var{obj} is a category designator (a
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1191 character in the range @samp{' '} to @samp{'~'}).
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1192 @end defun
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1193
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1194 @defun category-table-value-p obj
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1195 This function returns @code{t} if @var{obj} is a category table value.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1196 Valid values are @code{nil} or a bit vector of size 95.
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1197 @end defun
376386a54a3c Import from CVS: tag r19-14 cvs parents: diff changeset	1198
