comparison src/file-coding.h @ 2297:13a418960a88
[xemacs-hg @ 2004-09-22 02:05:42 by stephent]
various doc patches <87isa7awrh.fsf@tleepslib.sk.tsukuba.ac.jp>
author | stephent |
---|---|
date | Wed, 22 Sep 2004 02:06:52 +0000 |
parents | 34ca43a57692 |
children | ecf1ebac70d8 |
2296:a58ea4d0d0cd | 2297:13a418960a88 |
---|---|
39 #define INCLUDED_file_coding_h_ | 39 #define INCLUDED_file_coding_h_ |
40 | 40 |
41 /* Capsule description of the different structures, what their purpose is, | 41 /* Capsule description of the different structures, what their purpose is, |
42 how they fit together, and where various bits of data are stored. | 42 how they fit together, and where various bits of data are stored. |
43 | 43 |
44 A "coding system" is an algorithm for converting data in one format into | 44 A "coding system" is an algorithm for converting stream data in one format |
45 data in another format. Currently most of the coding systems we have | 45 into stream data in another format. Currently most of the coding systems |
46 created concern internationalized text, and convert between the XEmacs | 46 we have created concern internationalized text, and convert between the |
47 internal format for multilingual text, and various external | 47 XEmacs internal format for multilingual text, and various external |
48 representations of such text. However, any such conversion is possible, | 48 representations of such text. However, any such conversion is possible, |
49 for example, compressing or uncompressing text using the gzip algorithm. | 49 for example, compressing or uncompressing text using the gzip algorithm. |
50 All coding systems provide both encode and decode routines, so that the | 50 All coding systems provide both encode and decode routines, so that the |
51 conversion can go both ways. | 51 conversion can go both ways. Unfortunately encoding and decoding may not |
| | 52 be exact inverses, even for a specific instance of a coding system. Care |
| | 53 must be taken when this is not the case. |
52 | 54 |
53 The way we handle this is by dividing the various potential coding | 55 The way we handle this is by dividing the various potential coding |
54 systems into types, analogous to classes in C++. Each coding system | 56 systems into types, analogous to classes in C++. Each coding system |
55 type encompasses a series of related coding systems that it can | 57 type encompasses a series of related coding systems that it can |
56 implement, and it has properties which control how exactly the encoding | 58 implement, and it has properties which control how exactly the encoding |
119 coding lstream can be changed at any point during the lifetime of the | 121 coding lstream can be changed at any point during the lifetime of the |
120 lstream, and possibly multiple times. (For example, it can be set using | 122 lstream, and possibly multiple times. (For example, it can be set using |
121 the Lisp primitives `set-process-input-coding-system' and | 123 the Lisp primitives `set-process-input-coding-system' and |
122 `set-console-tty-input-coding-system', as well as getting set when a | 124 `set-console-tty-input-coding-system', as well as getting set when a |
123 conversion operation was started with coding system `undecided' and the | 125 conversion operation was started with coding system `undecided' and the |
124 correct coding system was then detected.) | 126 correct coding system was then detected.) #### This suggests implementing |
| | 127 compound text extended segments by saving the state of the ctext stream, |
| | 128 and installing an appropriate for the duration of the segment. |
125 | 129 |
126 IMPORTANT NOTE: There are at least two ancillary data structures | 130 IMPORTANT NOTE: There are at least two ancillary data structures |
127 associated with a coding system type. (There may also be detection data; | 131 associated with a coding system type. (There may also be detection data; |
128 see elsewhere.) It's important, when writing a coding system type, to | 132 see elsewhere.) It's important, when writing a coding system type, to |
129 keep straight which type of data goes where. In particular, `struct | 133 keep straight which type of data goes where. In particular, `struct |
866 we phrase the conversion methods like write methods -- we can | 870 we phrase the conversion methods like write methods -- we can |
867 implement reading in terms of a write method but not vice-versa, | 871 implement reading in terms of a write method but not vice-versa, |
868 because the write method is forced to take only what it's given but | 872 because the write method is forced to take only what it's given but |
869 the read method can read more data from the other end if necessary. | 873 the read method can read more data from the other end if necessary. |
870 On the other hand, the write method is free to generate all the data | 874 On the other hand, the write method is free to generate all the data |
871 it wants (and just write it to the other end), but the the read method | 875 it wants (and just write it to the other end), but the read method |
872 can return only as much as was asked for, so we need to implement our | 876 can return only as much as was asked for, so we need to implement our |
873 own buffering. */ | 877 own buffering. */ |
874 | 878 |
875 /* If we are reading, then we can return only a fixed amount of data, but | 879 /* If we are reading, then we can return only a fixed amount of data, but |
876 the converter is free to return as much as it wants, so we direct it | 880 the converter is free to return as much as it wants, so we direct it |