comparison src/file-coding.h @ 2297:13a418960a88

[xemacs-hg @ 2004-09-22 02:05:42 by stephent] various doc patches <87isa7awrh.fsf@tleepslib.sk.tsukuba.ac.jp>
author stephent
date Wed, 22 Sep 2004 02:06:52 +0000
parents 34ca43a57692
children ecf1ebac70d8
2296:a58ea4d0d0cd 2297:13a418960a88
39 #define INCLUDED_file_coding_h_ 39 #define INCLUDED_file_coding_h_
40 40
41 /* Capsule description of the different structures, what their purpose is, 41 /* Capsule description of the different structures, what their purpose is,
42 how they fit together, and where various bits of data are stored. 42 how they fit together, and where various bits of data are stored.
43 43
44 A "coding system" is an algorithm for converting data in one format into 44 A "coding system" is an algorithm for converting stream data in one format
45 data in another format. Currently most of the coding systems we have 45 into stream data in another format. Currently most of the coding systems
46 created concern internationalized text, and convert between the XEmacs 46 we have created concern internationalized text, and convert between the
47 internal format for multilingual text, and various external 47 XEmacs internal format for multilingual text, and various external
48 representations of such text. However, any such conversion is possible, 48 representations of such text. However, any such conversion is possible,
49 for example, compressing or uncompressing text using the gzip algorithm. 49 for example, compressing or uncompressing text using the gzip algorithm.
50 All coding systems provide both encode and decode routines, so that the 50 All coding systems provide both encode and decode routines, so that the
51 conversion can go both ways. 51 conversion can go both ways. Unfortunately encoding and decoding may not
52 be exact inverses, even for a specific instance of a coding system. Care
53 must be taken when this is not the case.
52 54
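The caveat that encoding and decoding may not be exact inverses can be illustrated with a small sketch. This is not XEmacs code: `decode_crlf`, `encode_crlf`, and `round_trips` are hypothetical names, and the end-of-line conversion is an assumed stand-in for a real coding system. The decoder accepts both CRLF and bare LF, while the encoder always emits CRLF, so external text with mixed line endings does not survive a decode/encode round trip.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical decoder: external CRLF -> internal LF.
   A bare LF in the input is passed through unchanged. */
static size_t decode_crlf(const char *src, size_t n, char *dst)
{
  size_t i, o = 0;
  for (i = 0; i < n; i++)
    {
      if (src[i] == '\r' && i + 1 < n && src[i + 1] == '\n')
        continue;               /* drop the CR of a CRLF pair */
      dst[o++] = src[i];
    }
  return o;
}

/* Hypothetical encoder: internal LF -> external CRLF.
   DST must have room for 2 * N bytes. */
static size_t encode_crlf(const char *src, size_t n, char *dst)
{
  size_t i, o = 0;
  for (i = 0; i < n; i++)
    {
      if (src[i] == '\n')
        dst[o++] = '\r';
      dst[o++] = src[i];
    }
  return o;
}

/* Return 1 if encode (decode (EXT)) reproduces EXT byte for byte,
   0 otherwise (including when EXT is too long for the scratch buffers). */
static int round_trips(const char *ext, size_t n)
{
  char internal[256], external[512];
  size_t m, k;

  if (n > sizeof internal)
    return 0;
  m = decode_crlf(ext, n, internal);
  k = encode_crlf(internal, m, external);
  return k == n && memcmp(ext, external, n) == 0;
}
```

With these definitions, `round_trips("a\r\nb\r\n", 6)` succeeds, but `round_trips("a\nb\r\n", 5)` fails because the bare LF comes back as CRLF.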
53 The way we handle this is by dividing the various potential coding 55 The way we handle this is by dividing the various potential coding
54 systems into types, analogous to classes in C++. Each coding system 56 systems into types, analogous to classes in C++. Each coding system
55 type encompasses a series of related coding systems that it can 57 type encompasses a series of related coding systems that it can
56 implement, and it has properties which control how exactly the encoding 58 implement, and it has properties which control how exactly the encoding
119 coding lstream can be changed at any point during the lifetime of the 121 coding lstream can be changed at any point during the lifetime of the
120 lstream, and possibly multiple times. (For example, it can be set using 122 lstream, and possibly multiple times. (For example, it can be set using
121 the Lisp primitives `set-process-input-coding-system' and 123 the Lisp primitives `set-process-input-coding-system' and
122 `set-console-tty-input-coding-system', as well as getting set when a 124 `set-console-tty-input-coding-system', as well as getting set when a
123 conversion operation was started with coding system `undecided' and the 125 conversion operation was started with coding system `undecided' and the
124 correct coding system was then detected.) 126 correct coding system was then detected.) #### This suggests implementing
127 compound text extended segments by saving the state of the ctext stream,
128 and installing an appropriate for the duration of the segment.
125 129
126 IMPORTANT NOTE: There are at least two ancillary data structures 130 IMPORTANT NOTE: There are at least two ancillary data structures
127 associated with a coding system type. (There may also be detection data; 131 associated with a coding system type. (There may also be detection data;
128 see elsewhere.) It's important, when writing a coding system type, to 132 see elsewhere.) It's important, when writing a coding system type, to
129 keep straight which type of data goes where. In particular, `struct 133 keep straight which type of data goes where. In particular, `struct
866 we phrase the conversion methods like write methods -- we can 870 we phrase the conversion methods like write methods -- we can
867 implement reading in terms of a write method but not vice-versa, 871 implement reading in terms of a write method but not vice-versa,
868 because the write method is forced to take only what it's given but 872 because the write method is forced to take only what it's given but
869 the read method can read more data from the other end if necessary. 873 the read method can read more data from the other end if necessary.
870 On the other hand, the write method is free to generate all the data 874 On the other hand, the write method is free to generate all the data
871 it wants (and just write it to the other end), but the the read method 875 it wants (and just write it to the other end), but the read method
872 can return only as much as was asked for, so we need to implement our 876 can return only as much as was asked for, so we need to implement our
873 own buffering. */ 877 own buffering. */
874 878
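The asymmetry described in the comment above (a write-style converter may emit as much as it likes, while a read may return only what was asked for) can be sketched as follows. This is a minimal illustration under stated assumptions, not the lstream implementation: `struct conv_reader`, `convert_write`, `conv_read`, and the byte-doubling conversion are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define CHUNK 16                  /* input bytes pulled per conversion step */
#define PENDING_MAX (2 * CHUNK)   /* worst-case output of one step */

/* Hypothetical reader state: the unconverted source ("the other end")
   plus converted bytes the caller has not yet asked for. */
struct conv_reader
{
  const char *src;
  size_t src_left;
  char pending[PENDING_MAX];
  size_t pending_len;
};

/* Write-style converter (assumed for illustration): it must consume
   exactly the N bytes it is given, but may emit any amount of output.
   Here it doubles every byte, so OUT needs room for 2 * N bytes. */
static size_t convert_write(const char *in, size_t n, char *out)
{
  size_t i, o = 0;
  for (i = 0; i < n; i++)
    {
      out[o++] = in[i];
      out[o++] = in[i];
    }
  return o;
}

/* Read built on top of the write-style converter.  Returns at most WANT
   bytes; surplus converter output stays in R->pending for the next call. */
static size_t conv_read(struct conv_reader *r, char *dst, size_t want)
{
  size_t got = 0;

  while (got < want)
    {
      size_t take;

      if (r->pending_len == 0)
        {
          /* Buffer drained: pull more input from the other end. */
          size_t n = r->src_left < CHUNK ? r->src_left : CHUNK;
          if (n == 0)
            break;              /* source exhausted */
          r->pending_len = convert_write(r->src, n, r->pending);
          r->src += n;
          r->src_left -= n;
        }

      take = want - got < r->pending_len ? want - got : r->pending_len;
      memcpy(dst + got, r->pending, take);
      memmove(r->pending, r->pending + take, r->pending_len - take);
      r->pending_len -= take;
      got += take;
    }
  return got;
}
```

Reading 4 bytes at a time from a reader over the 3-byte source `"abc"` yields `"aabb"`, then `"cc"`, then nothing: the converter produced 6 bytes in one step, and the reader's own buffer carried the surplus across calls.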
875 /* If we are reading, then we can return only a fixed amount of data, but 879 /* If we are reading, then we can return only a fixed amount of data, but
876 the converter is free to return as much as it wants, so we direct it 880 the converter is free to return as much as it wants, so we direct it