annotate src/file-coding.h @ 5043:d0c14ea98592
various frame-geometry fixes
-------------------- ChangeLog entries follow: --------------------
src/ChangeLog addition:
2010-02-15 Ben Wing <ben@xemacs.org>
* EmacsFrame.c:
* EmacsFrame.c (EmacsFrameResize):
* console-msw-impl.h:
* console-msw-impl.h (struct mswindows_frame):
* console-msw-impl.h (FRAME_MSWINDOWS_TARGET_RECT):
* device-tty.c:
* device-tty.c (tty_asynch_device_change):
* event-msw.c:
* event-msw.c (mswindows_wnd_proc):
* faces.c (Fface_list):
* faces.h:
* frame-gtk.c:
* frame-gtk.c (gtk_set_initial_frame_size):
* frame-gtk.c (gtk_set_frame_size):
* frame-msw.c:
* frame-msw.c (mswindows_init_frame_1):
* frame-msw.c (mswindows_set_frame_size):
* frame-msw.c (mswindows_size_frame_internal):
* frame-msw.c (msprinter_init_frame_3):
* frame.c:
* frame.c (enum):
* frame.c (Fmake_frame):
* frame.c (adjust_frame_size):
* frame.c (store_minibuf_frame_prop):
* frame.c (Fframe_property):
* frame.c (Fframe_properties):
* frame.c (Fframe_displayable_pixel_height):
* frame.c (Fframe_displayable_pixel_width):
* frame.c (internal_set_frame_size):
* frame.c (Fset_frame_height):
* frame.c (Fset_frame_pixel_height):
* frame.c (Fset_frame_displayable_pixel_height):
* frame.c (Fset_frame_width):
* frame.c (Fset_frame_pixel_width):
* frame.c (Fset_frame_displayable_pixel_width):
* frame.c (Fset_frame_size):
* frame.c (Fset_frame_pixel_size):
* frame.c (Fset_frame_displayable_pixel_size):
* frame.c (frame_conversion_internal_1):
* frame.c (get_frame_displayable_pixel_size):
* frame.c (change_frame_size_1):
* frame.c (change_frame_size):
* frame.c (generate_title_string):
* frame.h:
* gtk-xemacs.c:
* gtk-xemacs.c (gtk_xemacs_size_request):
* gtk-xemacs.c (gtk_xemacs_size_allocate):
* gtk-xemacs.c (gtk_xemacs_paint):
* gutter.c:
* gutter.c (update_gutter_geometry):
* redisplay.c (end_hold_frame_size_changes):
* redisplay.c (redisplay_frame):
* toolbar.c:
* toolbar.c (update_frame_toolbars_geometry):
* window.c:
* window.c (frame_pixsize_valid_p):
* window.c (check_frame_size):
Various fixes to frame geometry to make it a bit easier to understand
and fix some bugs.
1. IMPORTANT: Some renamings. Will need to be applied carefully to
the carbon repository, in the following order:
-- pixel_to_char_size -> pixel_to_frame_unit_size
-- char_to_pixel_size -> frame_unit_to_pixel_size
-- pixel_to_real_char_size -> pixel_to_char_size
-- char_to_real_pixel_size -> char_to_pixel_size
-- Reverse second and third arguments of change_frame_size() and
change_frame_size_1() to try to make functions consistent in
putting width before height.
-- Eliminate old round_size_to_char, because it didn't really
do anything differently from round_size_to_real_char()
-- round_size_to_real_char -> round_size_to_char; any places that
called the old round_size_to_char should just call the new one.
2. IMPORTANT FOR CARBON: The set_frame_size() method is now passed
sizes in "frame units", like all other frame-sizing functions,
rather than some hacked-up combination of char-cell units and
total pixel size. This only affects window systems that use
"pixelated geometry", and I'm not sure if Carbon is one of them.
MS Windows is pixelated, X and GTK are not. For pixelated-geometry
systems, the size in set_frame_size() is in displayable pixels
rather than total pixels and needs to be converted appropriately;
take a look at the changes made to mswindows_set_frame_size()
method if necessary.
3. Add a big long comment in frame.c describing how frame geometry
works.
4. Remove MS Windows-specific character height and width fields,
duplicative and unused.
5. frame-displayable-pixel-* and set-frame-displayable-pixel-*
didn't use to work on MS Windows, but they do now.
6. In general, clean up the handling of "pixelated geometry" so
that fewer functions have to worry about this. This is really
an abomination that should be removed entirely but that will
have to happen later. Fix some buggy code in
frame_conversion_internal() that happened to "work" because it
was countered by oppositely buggy code in change_frame_size().
7. Clean up some frame-size code in toolbar.c and use functions
already provided in frame.c instead of rolling its own.
8. Fix check_frame_size() in window.c, which formerly didn't take
pixelated geometry into account.
author | Ben Wing <ben@xemacs.org> |
---|---|
date | Mon, 15 Feb 2010 22:14:11 -0600 |
parents | 257b468bf2ca |
children | e0db3c197671 |
rev | line source |
---|---|
771 | 1 /* Header for encoding conversion functions; coding-system object. |
2 #### rename me to coding-system.h | |
428 | 3 Copyright (C) 1991, 1995 Free Software Foundation, Inc. |
4 Copyright (C) 1995 Sun Microsystems, Inc. | |
793 | 5 Copyright (C) 2000, 2001, 2002 Ben Wing. |
428 | 6 |
7 This file is part of XEmacs. | |
8 | |
9 XEmacs is free software; you can redistribute it and/or modify it | |
10 under the terms of the GNU General Public License as published by the | |
11 Free Software Foundation; either version 2, or (at your option) any | |
12 later version. | |
13 | |
14 XEmacs is distributed in the hope that it will be useful, but WITHOUT | |
15 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or | |
16 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License | |
17 for more details. | |
18 | |
19 You should have received a copy of the GNU General Public License | |
20 along with XEmacs; see the file COPYING. If not, write to | |
21 the Free Software Foundation, Inc., 59 Temple Place - Suite 330, | |
22 Boston, MA 02111-1307, USA. */ | |
23 | |
24 /* Synched up with: Mule 2.3. Not in FSF. */ | |
25 | |
771 | 26 /* Authorship: |
27 | |
28 Current primary author: Ben Wing <ben@xemacs.org> | |
29 | |
30 Written by Ben Wing <ben@xemacs.org> for XEmacs, 1995, loosely based | |
31 on code written 91.10.09 by K.Handa <handa@etl.go.jp>. | |
32 Rewritten again 2000-2001 by Ben Wing to support properly | |
33 abstracted coding systems. | |
34 September 2001: Finished last part of abstraction, the detection | |
35 mechanism. | |
36 */ | |
428 | 37 |
440 | 38 #ifndef INCLUDED_file_coding_h_ |
39 #define INCLUDED_file_coding_h_ | |
428 | 40 |
771 | 41 /* Capsule description of the different structures, what their purpose is, |
42 how they fit together, and where various bits of data are stored. | |
43 | |
2297 | 44 A "coding system" is an algorithm for converting stream data in one format |
45 into stream data in another format. Currently most of the coding systems | |
46 we have created concern internationalized text, and convert between the | |
47 XEmacs internal format for multilingual text, and various external | |
771 | 48 representations of such text. However, any such conversion is possible, |
49 for example, compressing or uncompressing text using the gzip algorithm. | |
50 All coding systems provide both encode and decode routines, so that the | |
2297 | 51 conversion can go both ways. Unfortunately encoding and decoding may not |
52 be exact inverses, even for a specific instance of a coding system. Care | |
53 must be taken when this is not the case. | |
771 | 54 |
55 The way we handle this is by dividing the various potential coding | |
56 systems into types, analogous to classes in C++. Each coding system | |
57 type encompasses a series of related coding systems that it can | |
58 implement, and it has properties which control how exactly the encoding | |
59 works. A particular set of values for each of the properties makes up a | |
60 "coding system", and specifies one particular encoding. A `struct | |
61 Lisp_Coding_System' object encapsulates those settings -- its type, the | |
62 values chosen for all properties of that type, a name for the coding | |
63 system, some documentation. | |
64 | |
65 In addition, there are of course methods associated with a coding system | |
66 type, implementing the encoding, decoding, etc. These are stored in a | |
67 `struct coding_system_methods' object, one per coding-system type, which | |
68 contains mostly function pointers. This is retrievable from the | |
69 coding-system object (i.e. the struct Lisp_Coding_System), which has a | |
70 pointer to it. | |
71 | |
72 In order to actually use a coding system to do an encoding or decoding | |
73 operation, you need to use a coding Lstream. | |
74 | |
75 Now let's look more at attached data. All coding systems have certain | |
76 common data fields -- name, type, documentation, etc. -- as well as a | |
77 bunch more that are defined by the coding system type. To handle this | |
78 cleanly, each coding system type defines a structure that holds just the | |
79 fields of data particular to it, and calls it e.g. `struct | |
80 iso2022_coding_system' for coding system type `iso2022'. When the | |
81 memory block holding the coding system object is created, it is sized | |
82 such that it can hold both the struct Lisp_Coding_System and the struct | |
83 iso2022_coding_system (or whatever) directly following it. (This is a | |
84 common trick; another possibility is to have a void * pointer in the | |
85 struct Lisp_Coding_System, which points to another memory block holding | |
86 the struct iso2022_coding_system.) A macro is provided | |
87 (CODING_SYSTEM_TYPE_DATA) to retrieve a pointer of the right type to the | |
88 type-specific data contained within the overall `struct | |
89 Lisp_Coding_System' block. | |
90 | |
91 Lstreams, similarly, are objects of type `struct lstream' holding data | |
92 about the stream operation (how much data has been read or written, any | |
93 buffered data, any error conditions, etc.), and like coding systems have | |
94 different types. They have a structure called `Lstream_implementation', | |
95 one per lstream type, exactly analogous to `struct | |
96 coding_system_methods'. In addition, they have type-specific data | |
97 (specifying, e.g., the file number, FILE *, memory location, other | |
98 lstream, etc. to read the data from or write it to, and for conversion | |
99 processes, the current state of the process -- are we decoding ASCII or | |
100 Kanji characters? are we in the middle of processing an escape | |
101 sequence? etc.). This type-specific data is stored in a structure | |
102 named `struct coding_stream'. Just like for coding systems, the | |
103 type-independent data in the `struct lstream' and the type-dependent | |
104 data in the `struct coding_stream' are stored together in the same | |
105 memory block. | |
428 | 106 |
771 | 107 Now things get a bit tricky. The `struct coding_stream' is |
108 type-specific from the point of view of an lstream, but not from the | |
109 point of view of a coding system. It contains only general data about | |
110 the conversion process, e.g. the name of the coding system used for | |
111 conversion, the lstream that we take data from or write it to (depending | |
112 on whether this was created as a read stream or a write stream), a | |
113 buffer to hold extra data we retrieved but can't send on yet, some | |
114 flags, etc. It also needs some data specific to the particular coding | |
115 system and thus to the particular operation going on. This data is held | |
116 in a structure named (e.g.) `struct iso2022_coding_stream', and it's | |
117 held in a separate memory block and pointed to by the generic `struct | |
118 coding_stream'. It's not glommed into a single memory block both | |
119 because that would require making changes to the generic lstream code | |
120 and more importantly because the coding system used in a particular | |
121 coding lstream can be changed at any point during the lifetime of the | |
122 lstream, and possibly multiple times. (For example, it can be set using | |
123 the Lisp primitives `set-process-input-coding-system' and | |
124 `set-console-tty-input-coding-system', as well as getting set when a | |
125 conversion operation was started with coding system `undecided' and the | |
2297 | 126 correct coding system was then detected.) #### This suggests implementing |
127 compound text extended segments by saving the state of the ctext stream, | |
128 and installing an appropriate coding system for the duration of the segment. | |
428 | 129 |
771 | 130 IMPORTANT NOTE: There are at least two ancillary data structures |
131 associated with a coding system type. (There may also be detection data; | |
132 see elsewhere.) It's important, when writing a coding system type, to | |
133 keep straight which type of data goes where. In particular, `struct | |
134 foo_coding_system' is attached to the coding system object itself. This | |
135 is a permanent object and there's only one per coding system. It's | |
136 created once, usually at init time, and never destroyed. So, `struct | |
137 foo_coding_system' should in general not contain dynamic data! (Just | |
138 data describing the properties of the coding system.) In particular, | |
139 *NO* data about any conversion in progress. There may be many | |
140 conversions going on simultaneously using a particular coding system, | |
141 and by storing conversion data in the coding system, these conversions | |
142 will overwrite each other's data. | |
143 | |
144 Instead, use the lstream object, whose purpose is to encapsulate a | |
145 particular conversion and all associated data. From the lstream object, | |
146 you can get the struct coding_stream using something like | |
147 | |
148 struct coding_stream *str = LSTREAM_TYPE_DATA (lstr, coding); | |
149 | |
150 But usually this structure is already passed to you as one of the | |
151 parameters of the method being invoked. | |
152 | |
153 From the struct coding_stream, you can retrieve the | |
154 coding-system-type-specific data using something like | |
155 | |
156 struct foo_coding_stream *data = CODING_STREAM_TYPE_DATA (str, foo); | |
157 | |
158 Then, use this structure to hold all data relevant to the particular | |
159 conversion being done. | |
160 | |
161 Initialize this structure whenever init_coding_stream_method is called | |
162 (this may happen more than once), and finalize it (free resources, etc.) | |
163 when finalize_coding_stream_method is called. | |
164 */ | |
165 | |
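The split described in the comment above (permanent coding-system properties on one side, per-conversion state on the other) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the XEmacs implementation: every `toy_` name and the `TOY_STREAM_TYPE_DATA` macro are hypothetical stand-ins for `struct coding_stream`, `CODING_STREAM_TYPE_DATA` and the init/finalize coding stream methods.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real structures carry far more data. */
struct toy_coding_system        /* permanent, shared by all conversions */
{
  const char *name;
};

struct toy_coding_stream        /* generic per-conversion data */
{
  const struct toy_coding_system *codesys;
  void *data;                   /* separately allocated type-specific state */
};

struct foo_coding_stream        /* state of one "foo" conversion */
{
  int in_escape_sequence;
  long bytes_seen;
};

/* Plays the role of CODING_STREAM_TYPE_DATA (str, foo). */
#define TOY_STREAM_TYPE_DATA(str, type) \
  ((struct type##_coding_stream *) (str)->data)

/* Analogue of the init coding stream method: zeroed type-specific data. */
static int
toy_init_stream (struct toy_coding_stream *str,
                 const struct toy_coding_system *cs)
{
  assert (str != NULL && cs != NULL);
  str->codesys = cs;
  str->data = calloc (1, sizeof (struct foo_coding_stream));
  return str->data != NULL;
}

/* Analogue of the finalize coding stream method: release resources. */
static void
toy_finalize_stream (struct toy_coding_stream *str)
{
  free (str->data);
  str->data = NULL;
}

/* Two conversions share one coding system but no conversion state.
   Returns 1 if the per-stream data stayed independent. */
static int
toy_streams_are_independent (void)
{
  static const struct toy_coding_system cs = { "foo" };
  struct toy_coding_stream a, b;
  int ok;

  if (!toy_init_stream (&a, &cs))
    return 0;
  if (!toy_init_stream (&b, &cs))
    {
      toy_finalize_stream (&a);
      return 0;
    }

  TOY_STREAM_TYPE_DATA (&a, foo)->bytes_seen = 10;
  TOY_STREAM_TYPE_DATA (&b, foo)->in_escape_sequence = 1;

  ok = (TOY_STREAM_TYPE_DATA (&a, foo)->in_escape_sequence == 0
        && TOY_STREAM_TYPE_DATA (&b, foo)->bytes_seen == 0
        && a.codesys == b.codesys);

  toy_finalize_stream (&a);
  toy_finalize_stream (&b);
  return ok;
}
```

Because each stream owns its own block, a second conversion started with the same coding system cannot overwrite the first one's state, which is the failure mode the comment warns about.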
166 struct coding_stream; | |
167 struct detection_state; | |
168 | |
1204 | 169 extern const struct sized_memory_description coding_system_methods_description; |
771 | 170 |
171 struct coding_system_methods; | |
172 | |
173 enum source_sink_type | |
428 | 174 { |
771 | 175 DECODES_CHARACTER_TO_BYTE, |
176 DECODES_BYTE_TO_BYTE, | |
177 DECODES_BYTE_TO_CHARACTER, | |
178 DECODES_CHARACTER_TO_CHARACTER | |
428 | 179 }; |
180 | |
181 enum eol_type | |
182 { | |
183 EOL_LF, | |
184 EOL_CRLF, | |
771 | 185 EOL_CR, |
1429 | 186 EOL_AUTODETECT |
428 | 187 }; |
188 | |
189 struct Lisp_Coding_System | |
190 { | |
3017 | 191 struct LCRECORD_HEADER header; |
771 | 192 struct coding_system_methods *methods; |
428 | 193 |
1204 | 194 #define CODING_SYSTEM_SLOT_DECLARATION |
195 #define MARKED_SLOT(x) Lisp_Object x; | |
196 #include "coding-system-slots.h" | |
771 | 197 |
1204 | 198 /* Eol type requested by user. See comment about EOL junk in |
199 coding-system-slots.h. */ | |
771 | 200 enum eol_type eol_type; |
428 | 201 |
2132 | 202 /* If true, this is an internal coding system, which will not show up in |
203 coding-system-list unless a special parameter is given to it. */ | |
204 int internal_p; | |
205 | |
771 | 206 /* type-specific extra data attached to a coding_system */ |
207 char data[1]; | |
428 | 208 }; |
209 typedef struct Lisp_Coding_System Lisp_Coding_System; | |
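The trailing `char data[1]` member is what lets the generic object and its type-specific data live in one memory block. The following is a rough sketch of that idiom, assuming hypothetical `toy_` names and a simplified allocator; the `_Alignas` specifier is an addition made here so the sketch is portable, and is not taken from the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical generic object: common fields, then type-specific data. */
struct toy_coding_system
{
  const char *name;
  int eol_type;
  /* Type-specific data starts here; the block is sized at allocation
     time.  Alignment is forced (an assumption of this sketch) so that
     any structure can be placed at this offset. */
  _Alignas (max_align_t) char data[1];
};

/* Hypothetical type-specific properties for one coding system type. */
struct toy_iso2022_coding_system
{
  int initial_charset[4];
  int short_p;
};

/* Plays the role of CODING_SYSTEM_TYPE_DATA: a typed pointer into the
   trailing data of the same block. */
#define TOY_TYPE_DATA(cs, type) \
  ((struct toy_##type##_coding_system *) (void *) (cs)->data)

/* Allocate the generic part plus EXTRA_DATA_SIZE bytes in one block. */
static struct toy_coding_system *
toy_allocate (const char *name, size_t extra_data_size)
{
  size_t size = offsetof (struct toy_coding_system, data) + extra_data_size;
  struct toy_coding_system *cs;

  if (size < sizeof (struct toy_coding_system))
    size = sizeof (struct toy_coding_system);
  cs = calloc (1, size);
  if (cs != NULL)
    cs->name = name;
  return cs;
}

/* Returns 1 if the type-specific data sits directly inside the block. */
static int
toy_trailing_data_demo (void)
{
  struct toy_coding_system *cs =
    toy_allocate ("iso-2022-sketch",
                  sizeof (struct toy_iso2022_coding_system));
  struct toy_iso2022_coding_system *data;
  int ok;

  if (cs == NULL)
    return 0;
  data = TOY_TYPE_DATA (cs, iso2022);
  data->short_p = 1;
  data->initial_charset[3] = 42;

  ok = ((char *) data
        == (char *) cs + offsetof (struct toy_coding_system, data)
        && data->short_p == 1
        && data->initial_charset[3] == 42
        && cs->name != NULL);
  free (cs);
  return ok;
}
```

One `free` releases both parts, which is the main practical advantage over the alternative of a separate block reached through a `void *` pointer.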
210 | |
440 | 211 DECLARE_LRECORD (coding_system, Lisp_Coding_System); |
212 #define XCODING_SYSTEM(x) XRECORD (x, coding_system, Lisp_Coding_System) | |
617 | 213 #define wrap_coding_system(p) wrap_record (p, coding_system) |
428 | 214 #define CODING_SYSTEMP(x) RECORDP (x, coding_system) |
215 #define CHECK_CODING_SYSTEM(x) CHECK_RECORD (x, coding_system) | |
216 #define CONCHECK_CODING_SYSTEM(x) CONCHECK_RECORD (x, coding_system) | |
217 | |
1204 | 218 enum coding_system_variant |
219 { | |
220 no_conversion_coding_system, | |
221 convert_eol_coding_system, | |
222 undecided_coding_system, | |
223 chain_coding_system, | |
224 text_file_wrapper_coding_system, | |
225 internal_coding_system, | |
226 gzip_coding_system, | |
227 mswindows_multibyte_to_unicode_coding_system, | |
228 mswindows_multibyte_coding_system, | |
229 iso2022_coding_system, | |
230 ccl_coding_system, | |
231 shift_jis_coding_system, | |
232 big5_coding_system, | |
4690 | 233 unicode_coding_system, |
234 fixed_width_coding_system | |
1204 | 235 }; |
236 | |
771 | 237 struct coding_system_methods |
238 { | |
239 Lisp_Object type; | |
240 Lisp_Object predicate_symbol; | |
241 | |
1204 | 242 /* Type expressed as an enum, needed for KKCC marking of the |
243 type-specific lstream data; copied into the struct coding_stream. */ | |
244 | |
245 enum coding_system_variant enumtype; | |
246 | |
771 | 247 /* Implementation specific methods: */ |
248 | |
249 /* Init method: Initialize coding-system data. Optional. */ | |
250 void (*init_method) (Lisp_Object coding_system); | |
251 | |
252 /* Mark method: Mark any Lisp objects in the type-specific data | |
253 attached to the coding-system object. Optional. */ | |
254 void (*mark_method) (Lisp_Object coding_system); | |
255 | |
256 /* Print method: Print the type-specific properties of this coding | |
257 system, as part of `print'-ing the object. If this method is defined | |
258 and prints anything, it should print a space as the first thing it | |
259 does. Optional. */ | |
260 void (*print_method) (Lisp_Object cs, Lisp_Object printcharfun, | |
261 int escapeflag); | |
262 | |
263 /* Canonicalize method: Convert this coding system to another one; called | |
264 once, at creation time, after all properties have been parsed. The | |
265 returned value should be a coding system created with | |
266 make_internal_coding_system() (passing the existing coding system as the | |
267 first argument), and will become the coding system returned by | |
268 `make-coding-system'. Optional. | |
269 | |
270 NOTE: There are *three* different uses of "canonical" or "canonicalize" | |
271 w.r.t. coding systems, and it's important to keep them straight. | |
272 | |
273 1. The canonicalize method. Used to specify a different coding | |
274 system, used when doing conversions, in place of the actual coding | |
275 system itself. Stored in the CANONICAL field of a coding system. | |
276 | |
277 2. The canonicalize-after-coding method. Used to return the encoding | |
278 that was "actually" used to decode some text, such that this | |
279 particular encoding can be used to encode the text again with the | |
280 expectation that the result will be the same as the original encoding. | |
281 Particularly important with auto-detecting coding systems. | |
282 | |
283 3. From the perspective of aliases, a "canonical" coding system is one | |
284 that's not an alias to some other coding system, and "canonicalization" | |
285 is the process of traversing the alias pointers to find the canonical | |
286 coding system that's equivalent to the alias. | |
287 */ | |
288 Lisp_Object (*canonicalize_method) (Lisp_Object coding_system); | |
289 | |
290 /* Canonicalize after coding method: Convert this coding system to | |
291 another one, after coding (usually decoding) has finished. This is | |
292 meant to be used by auto-detecting coding systems, which should return | |
293 the actually detected coding system. Optional. */ | |
294 Lisp_Object (*canonicalize_after_coding_method) | |
295 (struct coding_stream *str); | |
296 | |
297 /* Convert method: Decode or encode the data in SRC of size N, writing | |
298 the results into the Dynarr DST. If the conversion_end_type method | |
299 indicates that the source is characters (as opposed to bytes), you are | |
300 guaranteed to get only whole characters in the data in SRC/N. STR, a | |
301 struct coding_stream, stores all necessary state and other info about | |
302 the conversion. Coding-specific state (struct TYPE_coding_stream) can | |
303 be retrieved from STR using CODING_STREAM_TYPE_DATA(). Return value | |
304 indicates the number of bytes of the *INPUT* that were converted (not | |
305 the number of bytes written to the Dynarr!). This can be less than | |
306 the total amount of input passed in; if so, the remainder is | |
307 considered "rejected" and will appear again at the beginning of the | |
308 data passed in the next time the convert method is called. When EOF | |
309 is returned on the other end and there's no more data, the convert | |
310 method will be called one last time with STR->eof set, and the passed-in | |
311 data will consist only of any rejected data from the previous | |
312 call. (At this point, file handles and similar resources can be | |
313 closed, but do NOT arbitrarily free data structures in the | |
314 type-specific data, because there are operations that can be done on | |
315 closed streams to query the results of the processing -- specifically, | |
316 for coding streams, there's the canonicalize_after_coding() method.) | |
317 Required. */ | |
318 Bytecount (*convert_method) (struct coding_stream *str, | |
319 const unsigned char *src, | |
320 unsigned_char_dynarr *dst, Bytecount n); | |
321 | |
4690 | 322 /* Query method: Check whether the buffer text between point and END |
323 can be encoded by this coding system. Returns | |
324 either nil (meaning the text can be encoded by the coding system) or a | |
325 range table object describing the stretches that the coding system | |
326 cannot encode. | |
327 | |
328 Possible values for flags are below, search for | |
329 QUERY_METHOD_IGNORE_INVALID_SEQUENCES. | |
330 | |
331 Coding systems are expected to be able to behave sensibly with all | |
332 possible octets on decoding, which is why this method is only available | |
333 for encoding. */ | |
334 Lisp_Object (*query_method) (Lisp_Object coding_system, struct buffer *buf, | |
335 Charbpos end, int flags); | |
336 | |
337 /* Same as the previous method, but this works in the context of | |
338 lstreams. (Where the data do need to be copied, unfortunately.) The | |
339 intention is to implement the query method for the mswindows-multibyte | |
340 coding systems in terms of a query_lstream method. */ | |
341 Lisp_Object (*query_lstream_method) (struct coding_stream *str, | |
342 const Ibyte *start, Bytecount n); | |
343 | |
771 | 344 /* Coding mark method: Mark any Lisp objects in the type-specific data |
345 attached to `struct coding_stream'. Optional. */ | |
346 void (*mark_coding_stream_method) (struct coding_stream *str); | |
347 | |
348 /* Init coding stream method: Initialize the type-specific data attached | |
349 to the coding stream (i.e. in struct TYPE_coding_stream), when the | |
350 coding stream is opened. The type-specific data will be zeroed out. | |
351 Optional. */ | |
352 void (*init_coding_stream_method) (struct coding_stream *str); | |
353 | |
354 /* Rewind coding stream method: Reset any necessary type-specific data as | |
355 a result of the stream being rewound. Optional. */ | |
356 void (*rewind_coding_stream_method) (struct coding_stream *str); | |
357 | |
358 /* Finalize coding stream method: Clean up the type-specific data | |
359 attached to the coding stream (i.e. in struct TYPE_coding_stream). | |
360 Happens when the Lstream is deleted using Lstream_delete() or is | |
361 garbage-collected. Most streams are deleted after they've been used, | |
362 so it's less likely (but still possible) that allocated data will | |
363 stick around until GC time. (File handles can also be closed when EOF | |
364 is signalled; but some data must stick around after this point, for | |
365 the benefit of canonicalize_after_coding. See the convert method.) | |
366 Called only once (NOT called at disksave time). Optional. */ | |
367 void (*finalize_coding_stream_method) (struct coding_stream *str); | |
368 | |
369 /* Finalize method: Clean up type-specific data (e.g. free allocated | |
370 data) attached to the coding system (i.e. in struct | |
371 TYPE_coding_system), when the coding system is about to be garbage | |
372 collected. (Currently not called.) Called only once (NOT called at | |
373 disksave time). Optional. */ | |
374 void (*finalize_method) (Lisp_Object codesys); | |
375 | |
376 /* Conversion end type method: Does this coding system encode bytes -> | |
377 characters, characters -> characters, bytes -> bytes, or | |
378 characters -> bytes? Default is characters -> bytes. Optional. */ | |
379 enum source_sink_type (*conversion_end_type_method) (Lisp_Object codesys); | |
380 | |
381 /* Putprop method: Set the value of a type-specific property. If | |
382 the property name is unrecognized, return 0. If the value is disallowed | |
383 or erroneous, signal an error. Currently called only at creation time. | |
384 Optional. */ | |
385 int (*putprop_method) (Lisp_Object codesys, | |
386 Lisp_Object key, | |
387 Lisp_Object value); | |
388 | |
389 /* Getprop method: Return the value of a type-specific property. If | |
390 the property name is unrecognized, return Qunbound. Optional. | |
391 */ | |
392 Lisp_Object (*getprop_method) (Lisp_Object coding_system, | |
393 Lisp_Object prop); | |
394 | |
395 /* These next three are set as part of the call to | |
396 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA. */ | |
397 | |
398 /* Description of the extra data (struct foo_coding_system) attached to a | |
1204 | 399 coding system, for pdump purposes. */ |
400 const struct sized_memory_description *extra_description; | |
771 | 401 /* size of struct foo_coding_system -- extra data associated with |
402 the coding system */ | |
403 int extra_data_size; | |
404 /* size of struct foo_coding_stream -- extra data associated with the | |
405 struct coding_stream, needed for each active coding process | |
406 using this coding system. note that we can have more than one | |
407 process active at once (simply by creating more than one coding | |
408 lstream using this coding system), so we can't store this data in | |
409 the coding system object. */ | |
410 int coding_data_size; | |
411 }; | |
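The third sense of "canonical" in the canonicalize method comment (following alias pointers until a non-alias coding system is reached) can be illustrated with a small lookup table. This is only a sketch: the table, the `toy_` names and the loop guard are assumptions made for the example and do not reflect how XEmacs actually stores aliases.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry entry: a name that is either canonical or an
   alias for another name. */
struct toy_entry
{
  const char *name;
  const char *alias_of;         /* NULL for a canonical (non-alias) entry */
};

static const struct toy_entry toy_table[] = {
  { "toy-utf-8", NULL },
  { "toy-u8", "toy-utf-8" },
  { "toy-unicode", "toy-u8" },
};

#define TOY_TABLE_SIZE (sizeof toy_table / sizeof toy_table[0])

/* Linear search by name; returns NULL if the name is unknown. */
static const struct toy_entry *
toy_lookup (const char *name)
{
  size_t i;

  for (i = 0; i < TOY_TABLE_SIZE; i++)
    if (strcmp (toy_table[i].name, name) == 0)
      return &toy_table[i];
  return NULL;
}

/* Follow alias links until a non-alias entry is found.  Returns the
   canonical name, or NULL for an unknown name or a circular chain. */
static const char *
toy_canonical_name (const char *name)
{
  const struct toy_entry *e = toy_lookup (name);
  size_t hops;

  for (hops = 0; e != NULL && hops <= TOY_TABLE_SIZE; hops++)
    {
      if (e->alias_of == NULL)
        return e->name;
      e = toy_lookup (e->alias_of);
    }
  return NULL;
}
```

Bounding the number of hops by the table size means a circular alias chain yields NULL instead of looping forever.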
412 | |
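The convert method contract above (return the number of input bytes consumed; anything left over is "rejected" and presented again on the next call; one final call is made at EOF) is easiest to see with a toy converter. The sketch below turns CRLF into LF and rejects a trailing lone CR until it knows what follows. All names are hypothetical, a fixed buffer stands in for the Dynarr, and an `eof` argument stands in for STR->eof.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size output buffer standing in for the Dynarr. */
struct toy_out
{
  unsigned char buf[64];
  size_t len;
};

static void
toy_out_add (struct toy_out *dst, unsigned char c)
{
  if (dst->len < sizeof dst->buf)
    dst->buf[dst->len++] = c;
}

/* Convert CRLF to LF.  Returns the number of INPUT bytes consumed,
   which may be less than N: a CR at the very end of the data is
   rejected unless EOF is set, because the next byte decides its fate. */
static size_t
toy_convert (const unsigned char *src, size_t n, int eof,
             struct toy_out *dst)
{
  size_t i = 0;

  while (i < n)
    {
      if (src[i] == '\r')
        {
          if (i + 1 == n && !eof)
            break;              /* cannot decide yet; reject the CR */
          if (i + 1 < n && src[i + 1] == '\n')
            {
              toy_out_add (dst, '\n');
              i += 2;
              continue;
            }
        }
      toy_out_add (dst, src[i]);
      i++;
    }
  return i;
}

/* Plays the role of the caller: rejected bytes are kept and placed at
   the beginning of the data passed in the next call. */
static void
toy_run (const char *const *chunks, size_t nchunks, struct toy_out *dst)
{
  unsigned char work[64];
  size_t pending = 0, k;

  for (k = 0; k < nchunks; k++)
    {
      size_t len = strlen (chunks[k]);
      size_t used;

      assert (pending + len <= sizeof work);
      memcpy (work + pending, chunks[k], len);
      used = toy_convert (work, pending + len, 0, dst);
      pending = pending + len - used;
      memmove (work, work + used, pending);
    }
  /* Final call at EOF: only previously rejected data is passed. */
  toy_convert (work, pending, 1, dst);
}
```

Feeding the chunks "ab\r" and "\ncd\r" shows both cases: the first CR is rejected, rejoined with the following LF and emitted as a single LF, while the last CR is only emitted by the final EOF call.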
4690
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
413 /* Values for flags, as passed to query_method. */ |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
414 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
415 #define QUERY_METHOD_IGNORE_INVALID_SEQUENCES 0x0001 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
416 #define QUERY_METHOD_ERRORP 0x0002 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
417 #define QUERY_METHOD_HIGHLIGHT 0x0004 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
418 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
419 enum query_coding_failure_reasons |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
420 { |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
421 query_coding_succeeded = 0, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
422 query_coding_unencodable = 1, |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
423 query_coding_invalid_sequence = 2 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
424 }; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
425 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
426 extern Lisp_Object Qquery_coding_warning_face; |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
427 |
257b468bf2ca
Move the #'query-coding-region implementation to C.
Aidan Kehoe <kehoea@parhasard.net>
parents:
4569
diff
changeset
|
428 Lisp_Object default_query_method (Lisp_Object, struct buffer *, Charbpos, |
429 int); |
430 |
771 | 431 /***** Calling a coding-system method *****/ |
432 | |
433 #define RAW_CODESYSMETH(cs, m) ((cs)->methods->m##_method) | |
434 #define HAS_CODESYSMETH_P(cs, m) (!!RAW_CODESYSMETH (cs, m)) | |
435 #define CODESYSMETH(cs, m, args) (((cs)->methods->m##_method) args) | |
436 | |
437 /* Call a void-returning coding-system method, if it exists. */ | |
438 #define MAYBE_CODESYSMETH(cs, m, args) do { \ | |
439 Lisp_Coding_System *maybe_codesysmeth_cs = (cs); \ | |
440 if (HAS_CODESYSMETH_P (maybe_codesysmeth_cs, m)) \ | |
441 CODESYSMETH (maybe_codesysmeth_cs, m, args); \ | |
442 } while (0) | |
443 | |
444 /* Call a coding-system method, if it exists, or return GIVEN. | |
445 NOTE: Multiply-evaluates CS. */ | |
446 #define CODESYSMETH_OR_GIVEN(cs, m, args, given) \ | |
447 (HAS_CODESYSMETH_P (cs, m) ? \ | |
448 CODESYSMETH (cs, m, args) : (given)) | |
449 | |
450 #define XCODESYSMETH(cs, m, args) \ | |
451 CODESYSMETH (XCODING_SYSTEM (cs), m, args) | |
452 #define MAYBE_XCODESYSMETH(cs, m, args) \ | |
453 MAYBE_CODESYSMETH (XCODING_SYSTEM (cs), m, args) | |
454 #define XCODESYSMETH_OR_GIVEN(cs, m, args, given) \ | |
455 CODESYSMETH_OR_GIVEN (XCODING_SYSTEM (cs), m, args, given) | |
456 | |
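The dispatch macros above can be tried outside XEmacs with a cut-down method table. The following is only a sketch: `struct toy_methods`, `struct toy_cs`, the `rename` and `size` methods, and the helper functions are hypothetical stand-ins, not the real `struct coding_system_methods`. Only the macro shapes are taken from the header.

```c
#include <stddef.h>

/* Hypothetical, cut-down method table; the real struct
   coding_system_methods has many more slots. */
struct toy_methods
{
  void (*rename_method) (int *counter);   /* optional, returns void */
  int (*size_method) (int n);             /* optional, returns int */
};

struct toy_cs
{
  struct toy_methods *methods;
};

/* Same shapes as the header's macros, minus the Lisp_Coding_System type. */
#define RAW_CODESYSMETH(cs, m) ((cs)->methods->m##_method)
#define HAS_CODESYSMETH_P(cs, m) (!!RAW_CODESYSMETH (cs, m))
#define CODESYSMETH(cs, m, args) (((cs)->methods->m##_method) args)

/* Evaluates CS once by copying it into a local first. */
#define MAYBE_CODESYSMETH(cs, m, args) do {             \
  struct toy_cs *maybe_codesysmeth_cs = (cs);           \
  if (HAS_CODESYSMETH_P (maybe_codesysmeth_cs, m))      \
    CODESYSMETH (maybe_codesysmeth_cs, m, args);        \
} while (0)

/* Evaluates CS twice, so CS must be free of side effects. */
#define CODESYSMETH_OR_GIVEN(cs, m, args, given)        \
  (HAS_CODESYSMETH_P (cs, m) ?                          \
   CODESYSMETH (cs, m, args) : (given))

/* Hypothetical method implementations. */
void bump (int *counter) { ++*counter; }
int twice (int n) { return 2 * n; }

/* Returns the size method's result if CS has one, otherwise -1. */
int
toy_size_or_default (struct toy_cs *cs, int n)
{
  return CODESYSMETH_OR_GIVEN (cs, size, (n), -1);
}

/* Calls the rename method if present; returns the counter afterwards. */
int
toy_maybe_rename (struct toy_cs *cs)
{
  int counter = 0;
  MAYBE_CODESYSMETH (cs, rename, (&counter));
  return counter;
}
```

Note that ARGS is passed with its own parentheses, for example `(n)` or `(&counter)`, because the macros paste it directly after the function pointer.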
457 /***** Defining new coding-system types *****/ | |
458 | |
1204 | 459 extern const struct sized_memory_description coding_system_empty_extra_description; |
771 | 460 |
800 | 461 #ifdef ERROR_CHECK_TYPES |
771 | 462 #define DECLARE_CODING_SYSTEM_TYPE(type) \ |
463 \ | |
464 extern struct coding_system_methods * type##_coding_system_methods; \ | |
826 | 465 DECLARE_INLINE_HEADER ( \ |
466 struct type##_coding_system * \ | |
771 | 467 error_check_##type##_coding_system_data (Lisp_Coding_System *cs) \ |
826 | 468 ) \ |
771 | 469 { \ |
470 assert (CODING_SYSTEM_TYPE_P (cs, type)); \ | |
471 /* Catch accidental use of INITIALIZE_CODING_SYSTEM_TYPE in place \ | |
472 of INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA. */ \ | |
473 assert (cs->methods->extra_data_size > 0); \ | |
474 return (struct type##_coding_system *) cs->data; \ | |
475 } \ | |
476 \ | |
826 | 477 DECLARE_INLINE_HEADER ( \ |
478 struct type##_coding_stream * \ | |
771 | 479 error_check_##type##_coding_stream_data (struct coding_stream *s) \ |
826 | 480 ) \ |
771 | 481 { \ |
482 assert (XCODING_SYSTEM_TYPE_P (s->codesys, type)); \ | |
483 return (struct type##_coding_stream *) s->data; \ | |
484 } \ | |
485 \ | |
826 | 486 DECLARE_INLINE_HEADER ( \ |
487 Lisp_Coding_System * \ | |
771 | 488 error_check_##type##_coding_system_type (Lisp_Object obj) \ |
826 | 489 ) \ |
771 | 490 { \ |
491 Lisp_Coding_System *cs = XCODING_SYSTEM (obj); \ | |
492 assert (CODING_SYSTEM_TYPE_P (cs, type)); \ | |
493 return cs; \ | |
494 } \ | |
495 \ | |
496 DECLARE_NOTHING | |
497 #else | |
498 #define DECLARE_CODING_SYSTEM_TYPE(type) \ | |
499 extern struct coding_system_methods * type##_coding_system_methods | |
800 | 500 #endif /* ERROR_CHECK_TYPES */ |
771 | 501 |
502 #define DEFINE_CODING_SYSTEM_TYPE(type) \ | |
503 struct coding_system_methods * type##_coding_system_methods | |
504 | |
1204 | 505 #define DEFINE_CODING_SYSTEM_TYPE_WITH_DATA(type) \ |
506 struct coding_system_methods * type##_coding_system_methods; \ | |
507 static const struct sized_memory_description \ | |
508 type##_coding_system_description_0 = { \ | |
509 sizeof (struct type##_coding_system), \ | |
510 type##_coding_system_description \ | |
511 } | |
512 | |
771 | 513 #define INITIALIZE_CODING_SYSTEM_TYPE(ty, pred_sym) do { \ |
514 ty##_coding_system_methods = \ | |
515 xnew_and_zero (struct coding_system_methods); \ | |
516 ty##_coding_system_methods->type = Q##ty; \ | |
517 ty##_coding_system_methods->extra_description = \ | |
1204 | 518 &coding_system_empty_extra_description; \ |
519 ty##_coding_system_methods->enumtype = ty##_coding_system; \ | |
4690 | 520 ty##_coding_system_methods->query_method = default_query_method; \ |
771 | 521 defsymbol_nodump (&ty##_coding_system_methods->predicate_symbol, \ |
522 pred_sym); \ | |
523 add_entry_to_coding_system_type_list (ty##_coding_system_methods); \ | |
2367 | 524 dump_add_root_block_ptr (&ty##_coding_system_methods, \ |
771 | 525 &coding_system_methods_description); \ |
526 } while (0) | |
527 | |
528 #define REINITIALIZE_CODING_SYSTEM_TYPE(type) do { \ | |
529 staticpro_nodump (&type##_coding_system_methods->predicate_symbol); \ | |
530 } while (0) | |
531 | |
532 /* This assumes the existence of two structures: | |
533 | |
534 struct foo_coding_system (attached to the coding system) | |
535 struct foo_coding_stream (per coding process, attached to the | |
536 struct coding_stream) | |
1204 | 537 const struct memory_description foo_coding_system_description[] |
538 (data description of struct foo_coding_system) | |
771 | 539 |
1204 | 540 For an example of how to do the description, see |
771 | 541 chain_coding_system_description. |
542 */ | |
543 #define INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA(type, pred_sym) \ | |
544 do { \ | |
545 INITIALIZE_CODING_SYSTEM_TYPE (type, pred_sym); \ | |
546 type##_coding_system_methods->extra_data_size = \ | |
547 sizeof (struct type##_coding_system); \ | |
548 type##_coding_system_methods->extra_description = \ | |
1204 | 549 &type##_coding_system_description_0; \ |
771 | 550 type##_coding_system_methods->coding_data_size = \ |
551 sizeof (struct type##_coding_stream); \ | |
552 } while (0) | |
553 | |
554 /* Declare that coding-system-type TYPE has method METH; used in | |
555 initialization routines */ | |
556 #define CODING_SYSTEM_HAS_METHOD(type, meth) \ | |
557 (type##_coding_system_methods->meth##_method = type##_##meth) | |
558 | |
559 /***** Macros for accessing coding-system types *****/ | |
560 | |
561 #define CODING_SYSTEM_TYPE_P(cs, type) \ | |
562 ((cs)->methods == type##_coding_system_methods) | |
563 #define XCODING_SYSTEM_TYPE_P(cs, type) \ | |
564 CODING_SYSTEM_TYPE_P (XCODING_SYSTEM (cs), type) | |
565 | |
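The registration scheme above gives each type one global method-table pointer, named by token pasting, and tests an object's type by pointer identity against that global. A minimal standalone sketch of the same idea follows. All `toy_` and `TOY_` names and the `upcase` and `identity` types are hypothetical, and plain `calloc` is an assumption standing in for `xnew_and_zero`.

```c
#include <stdlib.h>

/* Hypothetical, cut-down stand-ins for the real structures. */
struct toy_methods
{
  const char *type_name;
  int (*convert_method) (int c);
};

struct toy_cs
{
  struct toy_methods *methods;
};

/* One global method-table pointer per type, named by token pasting. */
#define DEFINE_TOY_TYPE(type) \
  struct toy_methods *type##_toy_methods

/* Allocate and zero the table; every method slot starts out NULL. */
#define INITIALIZE_TOY_TYPE(type) do {                          \
  type##_toy_methods = calloc (1, sizeof (struct toy_methods)); \
  if (type##_toy_methods)                                       \
    type##_toy_methods->type_name = #type;                      \
} while (0)

/* Hook up TYPE's implementation of METH, which must be named TYPE_METH. */
#define TOY_TYPE_HAS_METHOD(type, meth) \
  (type##_toy_methods->meth##_method = type##_##meth)

/* Type test by pointer identity on the method table. */
#define TOY_TYPE_P(cs, type) ((cs)->methods == type##_toy_methods)

DEFINE_TOY_TYPE (upcase);
DEFINE_TOY_TYPE (identity);

/* The convert method for the hypothetical upcase type. */
int
upcase_convert (int c)
{
  return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}

/* Returns 0 on success, -1 if allocation failed. */
int
toy_types_init (void)
{
  INITIALIZE_TOY_TYPE (upcase);
  INITIALIZE_TOY_TYPE (identity);
  if (!upcase_toy_methods || !identity_toy_methods)
    return -1;
  TOY_TYPE_HAS_METHOD (upcase, convert);  /* identity leaves the slot NULL */
  return 0;
}
```

Because the tables are zeroed on allocation, a type that never registers a method leaves that slot NULL, which is what lets callers test for optional methods before calling them.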
800 | 566 #ifdef ERROR_CHECK_TYPES |
771 | 567 # define CODING_SYSTEM_TYPE_DATA(cs, type) \ |
568 error_check_##type##_coding_system_data (cs) | |
569 #else | |
570 # define CODING_SYSTEM_TYPE_DATA(cs, type) \ | |
571 ((struct type##_coding_system *) \ | |
572 (cs)->data) | |
573 #endif | |
574 | |
575 #define XCODING_SYSTEM_TYPE_DATA(cs, type) \ | |
576 CODING_SYSTEM_TYPE_DATA (XCODING_SYSTEM_OF_TYPE (cs, type), type) | |
577 | |
800 | 578 #ifdef ERROR_CHECK_TYPES |
771 | 579 # define XCODING_SYSTEM_OF_TYPE(x, type) \ |
580 error_check_##type##_coding_system_type (x) | |
581 # define XSETCODING_SYSTEM_OF_TYPE(x, p, type) do \ | |
582 { \ | |
793 | 583 x = wrap_coding_system (p); \ |
584 assert (CODING_SYSTEM_TYPE_P (XCODING_SYSTEM (x), type)); \ | |
771 | 585 } while (0) |
586 #else | |
587 # define XCODING_SYSTEM_OF_TYPE(x, type) XCODING_SYSTEM (x) | |
793 | 588 # define XSETCODING_SYSTEM_OF_TYPE(x, p, type) do \ |
589 { \ | |
590 x = wrap_coding_system (p); \ | |
591 } while (0) | |
771 | 592 #endif /* ERROR_CHECK_TYPES */ |
593 | |
594 #define CODING_SYSTEM_TYPEP(x, type) \ | |
595 (CODING_SYSTEMP (x) && CODING_SYSTEM_TYPE_P (XCODING_SYSTEM (x), type)) | |
596 #define CHECK_CODING_SYSTEM_OF_TYPE(x, type) do { \ | |
597 CHECK_CODING_SYSTEM (x); \ | |
598 if (!CODING_SYSTEM_TYPE_P (XCODING_SYSTEM (x), type)) \ | |
599 dead_wrong_type_argument \ | |
600 (type##_coding_system_methods->predicate_symbol, x); \ | |
601 } while (0) | |
602 #define CONCHECK_CODING_SYSTEM_OF_TYPE(x, type) do { \ | |
603 CONCHECK_CODING_SYSTEM (x); \ | |
604 if (!(CODING_SYSTEM_TYPEP (x, type))) \ | |
605 x = wrong_type_argument \ | |
606 (type##_coding_system_methods->predicate_symbol, x); \ | |
607 } while (0) | |
608 | |
609 #define CODING_SYSTEM_METHODS(codesys) ((codesys)->methods) | |
428 | 610 #define CODING_SYSTEM_NAME(codesys) ((codesys)->name) |
771 | 611 #define CODING_SYSTEM_DESCRIPTION(codesys) ((codesys)->description) |
612 #define CODING_SYSTEM_TYPE(codesys) ((codesys)->methods->type) | |
428 | 613 #define CODING_SYSTEM_MNEMONIC(codesys) ((codesys)->mnemonic) |
771 | 614 #define CODING_SYSTEM_DOCUMENTATION(codesys) ((codesys)->documentation) |
428 | 615 #define CODING_SYSTEM_POST_READ_CONVERSION(codesys) \ |
616 ((codesys)->post_read_conversion) | |
617 #define CODING_SYSTEM_PRE_WRITE_CONVERSION(codesys) \ | |
618 ((codesys)->pre_write_conversion) | |
619 #define CODING_SYSTEM_EOL_TYPE(codesys) ((codesys)->eol_type) | |
771 | 620 #define CODING_SYSTEM_EOL_LF(codesys) ((codesys)->eol[EOL_LF]) |
621 #define CODING_SYSTEM_EOL_CRLF(codesys) ((codesys)->eol[EOL_CRLF]) | |
622 #define CODING_SYSTEM_EOL_CR(codesys) ((codesys)->eol[EOL_CR]) | |
623 #define CODING_SYSTEM_TEXT_FILE_WRAPPER(codesys) ((codesys)->text_file_wrapper) | |
624 #define CODING_SYSTEM_AUTO_EOL_WRAPPER(codesys) ((codesys)->auto_eol_wrapper) | |
625 #define CODING_SYSTEM_SUBSIDIARY_PARENT(codesys) ((codesys)->subsidiary_parent) | |
626 #define CODING_SYSTEM_CANONICAL(codesys) ((codesys)->canonical) | |
4568
1d74a1d115ee
Add #'query-coding-region tests; do the work necessary to get them running.
Aidan Kehoe <kehoea@parhasard.net>
parents:
3017
diff
changeset
|
627 #define CODING_SYSTEM_SAFE_CHARSETS(codesys) ((codesys)->safe_charsets) |
628 #define CODING_SYSTEM_SAFE_CHARS(codesys) ((codesys)->safe_chars) |
428 | 629 |
771 | 630 #define CODING_SYSTEM_CHAIN_CHAIN(codesys) \ |
631 (CODING_SYSTEM_TYPE_DATA (codesys, chain)->chain) | |
632 #define CODING_SYSTEM_CHAIN_COUNT(codesys) \ | |
633 (CODING_SYSTEM_TYPE_DATA (codesys, chain)->count) | |
634 #define CODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING(codesys) \ | |
635 (CODING_SYSTEM_TYPE_DATA (codesys, chain)->canonicalize_after_coding) | |
428 | 636 |
771 | 637 #define XCODING_SYSTEM_METHODS(codesys) \ |
638 CODING_SYSTEM_METHODS (XCODING_SYSTEM (codesys)) | |
428 | 639 #define XCODING_SYSTEM_NAME(codesys) \ |
640 CODING_SYSTEM_NAME (XCODING_SYSTEM (codesys)) | |
771 | 641 #define XCODING_SYSTEM_DESCRIPTION(codesys) \ |
642 CODING_SYSTEM_DESCRIPTION (XCODING_SYSTEM (codesys)) | |
428 | 643 #define XCODING_SYSTEM_TYPE(codesys) \ |
644 CODING_SYSTEM_TYPE (XCODING_SYSTEM (codesys)) | |
645 #define XCODING_SYSTEM_MNEMONIC(codesys) \ | |
646 CODING_SYSTEM_MNEMONIC (XCODING_SYSTEM (codesys)) | |
771 | 647 #define XCODING_SYSTEM_DOCUMENTATION(codesys) \ |
648 CODING_SYSTEM_DOCUMENTATION (XCODING_SYSTEM (codesys)) | |
428 | 649 #define XCODING_SYSTEM_POST_READ_CONVERSION(codesys) \ |
650 CODING_SYSTEM_POST_READ_CONVERSION (XCODING_SYSTEM (codesys)) | |
651 #define XCODING_SYSTEM_PRE_WRITE_CONVERSION(codesys) \ | |
652 CODING_SYSTEM_PRE_WRITE_CONVERSION (XCODING_SYSTEM (codesys)) | |
653 #define XCODING_SYSTEM_EOL_TYPE(codesys) \ | |
654 CODING_SYSTEM_EOL_TYPE (XCODING_SYSTEM (codesys)) | |
655 #define XCODING_SYSTEM_EOL_LF(codesys) \ | |
656 CODING_SYSTEM_EOL_LF (XCODING_SYSTEM (codesys)) | |
657 #define XCODING_SYSTEM_EOL_CRLF(codesys) \ | |
658 CODING_SYSTEM_EOL_CRLF (XCODING_SYSTEM (codesys)) | |
659 #define XCODING_SYSTEM_EOL_CR(codesys) \ | |
660 CODING_SYSTEM_EOL_CR (XCODING_SYSTEM (codesys)) | |
771 | 661 #define XCODING_SYSTEM_TEXT_FILE_WRAPPER(codesys) \ |
662 CODING_SYSTEM_TEXT_FILE_WRAPPER (XCODING_SYSTEM (codesys)) | |
663 #define XCODING_SYSTEM_AUTO_EOL_WRAPPER(codesys) \ | |
664 CODING_SYSTEM_AUTO_EOL_WRAPPER (XCODING_SYSTEM (codesys)) | |
665 #define XCODING_SYSTEM_SUBSIDIARY_PARENT(codesys) \ | |
666 CODING_SYSTEM_SUBSIDIARY_PARENT (XCODING_SYSTEM (codesys)) | |
667 #define XCODING_SYSTEM_CANONICAL(codesys) \ | |
668 CODING_SYSTEM_CANONICAL (XCODING_SYSTEM (codesys)) | |
4568 | 669 #define XCODING_SYSTEM_SAFE_CHARSETS(codesys) \ |
670 CODING_SYSTEM_SAFE_CHARSETS (XCODING_SYSTEM (codesys)) |
671 #define XCODING_SYSTEM_SAFE_CHARS(codesys) \ |
672 CODING_SYSTEM_SAFE_CHARS (XCODING_SYSTEM (codesys)) |
428 | 673 |
771 | 674 #define XCODING_SYSTEM_CHAIN_CHAIN(codesys) \ |
675 CODING_SYSTEM_CHAIN_CHAIN (XCODING_SYSTEM (codesys)) | |
676 #define XCODING_SYSTEM_CHAIN_COUNT(codesys) \ | |
677 CODING_SYSTEM_CHAIN_COUNT (XCODING_SYSTEM (codesys)) | |
678 #define XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING(codesys) \ | |
679 CODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (XCODING_SYSTEM (codesys)) | |
428 | 680 |
771 | 681 /**************************************************/ |
682 /* Detection */ | |
683 /**************************************************/ | |
428 | 684 |
771 | 685 #define MAX_DETECTOR_CATEGORIES 256 |
686 #define MAX_DETECTORS 64 | |
428 | 687 |
771 | 688 #define MAX_BYTES_PROCESSED_FOR_DETECTION 65536 |
428 | 689 |
771 | 690 struct detection_state |
428 | 691 { |
771 | 692 int seen_non_ascii; |
693 Bytecount bytes_seen; | |
428 | 694 |
771 | 695 char categories[MAX_DETECTOR_CATEGORIES]; |
696 Bytecount data_offset[MAX_DETECTORS]; | |
697 /* ... more data follows; data_offset[detector_##TYPE] points to | |
698 the data for that type */ | |
428 | 699 }; |
700 | |
771 | 701 #define DETECTION_STATE_DATA(st, type) \ |
702 ((struct type##_detector *) \ | |
703 ((char *) (st) + (st)->data_offset[detector_##type])) | |
428 | 704 |
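DETECTION_STATE_DATA relies on every detector's private data living in the same allocation as the state header, at the byte offset recorded in data_offset. The sketch below shows one way such a block could be laid out. The `toy_` names, the two detector structs, and the alignment policy are assumptions for illustration and do not come from the XEmacs sources.

```c
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_DETECTORS 4

/* Hypothetical, cut-down version of struct detection_state. */
struct toy_state
{
  size_t data_offset[TOY_MAX_DETECTORS];
  /* per-detector data follows in the same allocation */
};

/* Hypothetical per-detector data. */
struct utf8_detector { int continuation_bytes; };
struct latin1_detector { long high_bytes; double score; };

enum { detector_utf8 = 0, detector_latin1 = 1, toy_detector_count = 2 };

/* Same shape as DETECTION_STATE_DATA in the header. */
#define TOY_STATE_DATA(st, type)                                \
  ((struct type##_detector *)                                   \
   ((char *) (st) + (st)->data_offset[detector_##type]))

/* Round N up to a multiple of the strictest fundamental alignment.
   Using max_align_t is this sketch's choice, not the header's. */
static size_t
align_up (size_t n)
{
  size_t a = _Alignof (max_align_t);
  return (n + a - 1) / a * a;
}

/* Allocate the header plus every detector's zeroed data in one block.
   Returns NULL if allocation fails. */
struct toy_state *
toy_state_new (void)
{
  const size_t sizes[toy_detector_count] =
    { sizeof (struct utf8_detector), sizeof (struct latin1_detector) };
  size_t offsets[toy_detector_count];
  size_t total = align_up (sizeof (struct toy_state));
  struct toy_state *st;
  int i;

  for (i = 0; i < toy_detector_count; i++)
    {
      offsets[i] = total;
      total += align_up (sizes[i]);
    }
  st = calloc (1, total);
  if (!st)
    return NULL;
  for (i = 0; i < toy_detector_count; i++)
    st->data_offset[i] = offsets[i];
  return st;
}
```

A single allocation means the whole state can be released with one `free`, at the cost of computing the offsets up front.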
448 | 705 /* Distinguishable categories of encodings. |
706 | |
707 This list determines the initial priority of the categories. | |
708 | |
709 For better or worse, currently Mule files are encoded in 7-bit ISO 2022. | |
710 For this reason, under Mule ISO_7 gets highest priority. | |
711 | |
712 Putting NO_CONVERSION second prevents "binary corruption" in the | |
713 default case in all but the (presumably) extremely rare case of a | |
714 binary file which contains redundant escape sequences but no 8-bit | |
715 characters. | |
716 | |
717 The remaining priorities are based on perceived "internationalization | |
718 political correctness." An exception is UCS-4 at the bottom, since | |
719 basically everything is compatible with UCS-4, but it is likely to | |
720 be very rare as an external encoding. */ | |
721 | |
771 | 722 /* Macros to define code of control characters for ISO2022's functions. */ |
723 /* Used by the detection routines of other coding system types as well. */ | |
724 /* code */ /* function */ | |
725 #define ISO_CODE_LF 0x0A /* line-feed */ | |
726 #define ISO_CODE_CR 0x0D /* carriage-return */ | |
727 #define ISO_CODE_SO 0x0E /* shift-out */ | |
728 #define ISO_CODE_SI 0x0F /* shift-in */ | |
729 #define ISO_CODE_ESC 0x1B /* escape */ | |
730 #define ISO_CODE_DEL 0x7F /* delete */ | |
731 #define ISO_CODE_SS2 0x8E /* single-shift-2 */ | |
732 #define ISO_CODE_SS3 0x8F /* single-shift-3 */ | |
733 #define ISO_CODE_CSI 0x9B /* control-sequence-introducer */ | |
734 | |
735 enum detection_result | |
736 { | |
737 /* Basically means a magic cookie was seen indicating this type, or | |
738 something similar. */ | |
739 DET_NEAR_CERTAINTY = 4, | |
740 DET_HIGHEST = 4, | |
741 /* Characteristics seen that are unlikely to be other coding system types | |
742 -- e.g. ISO-2022 escape sequences, or perhaps a consistent pattern of | |
743 alternating zero bytes in UTF-16, along with Unicode LF or CRLF | |
744 sequences at regular intervals. (Zero bytes are unlikely or impossible | |
745 in most text encodings.) */ | |
746 DET_QUITE_PROBABLE = 3, | |
747 /* Strong or medium statistical likelihood. At least some | |
748 characteristics seen that match what's normally found in this encoding | |
749 -- e.g. in Shift-JIS, a number of two-byte Japanese character | |
750 sequences in the right range, and nothing out of range; or in Unicode, | |
751 much higher statistical variance in the odd bytes than in the even | |
752 bytes, or vice-versa (perhaps the presence of regular EOL sequences | |
753 would bump this too to DET_QUITE_PROBABLE). This is quite often a | |
754 statistical test. */ | |
755 DET_SOMEWHAT_LIKELY = 2, | |
756 /* Weak statistical likelihood. Pretty much any features at all that | |
757 characterize this encoding, and nothing that rules against it. */ | |
758 DET_SLIGHTLY_LIKELY = 1, | |
759 /* Default state. Perhaps it indicates pure ASCII or something similarly | |
760 vague seen in Shift-JIS, or, exactly as the level says, it might mean | |
761 in a statistical-based detector that the pros and cons are balanced | |
762 out. This is also the lowest level that will be accepted by the | |
763 auto-detector without asking the user: If all available detectors | |
764 report lower levels for all categories with attached coding systems, | |
765 the user will be shown the results and explicitly prompted for action. | |
766 The user will also be prompted if this is the highest available level | |
767 and more than one detector reports the level. (See below about the | |
768 consequent necessity of an "ASCII" detector, which will return level 1 | |
769 or higher for most plain text files.) */ | |
770 DET_AS_LIKELY_AS_UNLIKELY = 0, | |
771 /* Some characteristics seen that are unusual for this encoding -- | |
772 e.g. unusual control characters in a plain-text encoding, lots of | |
773 8-bit characters, or little statistical variance in the odd and even | |
774 bytes in UTF-16. */ | |
775 DET_SOMEWHAT_UNLIKELY = -1, | |
776 /* This indicates that there is very little chance the data is in the | |
777 right format; this is probably the lowest level you can get when | |
778 presenting random binary data to a text-file detector, because there are | |
779 no "specific sequences" you can see that would totally rule out | |
780 recognition. */ | |
781 DET_QUITE_IMPROBABLE = -2, | |
782 /* An erroneous sequence was seen. */ | |
783 DET_NEARLY_IMPOSSIBLE = -3, | |
1429 | 784 DET_LOWEST = -3 |
771 | 785 }; |
786 | |
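The comment on DET_AS_LIKELY_AS_UNLIKELY describes when the auto-detector has to ask the user: when nothing reaches level 0, or when level 0 is the best result and it is shared. The sketch below turns that rule into a function. It is an illustration of the comment only: the function name and the flat array of per-category results are hypothetical, and treating ties per category rather than per detector is a simplification.

```c
/* Values copied from enum detection_result in the header. */
enum toy_detection_result
{
  TOY_DET_HIGHEST = 4,
  TOY_DET_AS_LIKELY_AS_UNLIKELY = 0,
  TOY_DET_LOWEST = -3
};

/* RESULTS holds one level per candidate category; N is their count.
   Returns 1 if, by the rule in the comment, the user should be asked. */
int
toy_should_prompt (const int *results, int n)
{
  int best = TOY_DET_LOWEST - 1;  /* below every real level */
  int ties = 0;
  int i;

  for (i = 0; i < n; i++)
    {
      if (results[i] > best)
        {
          best = results[i];
          ties = 1;
        }
      else if (results[i] == best)
        ties++;
    }

  if (best < TOY_DET_AS_LIKELY_AS_UNLIKELY)
    return 1;                     /* nothing reached the default level */
  if (best == TOY_DET_AS_LIKELY_AS_UNLIKELY && ties > 1)
    return 1;                     /* several candidates tied at level 0 */
  return 0;
}
```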
787 extern int coding_detector_count; | |
788 extern int coding_detector_category_count; | |
789 | |
790 struct detector_category | |
428 | 791 { |
771 | 792 int id; |
793 Lisp_Object sym; | |
794 }; | |
795 | |
796 typedef struct | |
797 { | |
798 Dynarr_declare (struct detector_category); | |
799 } detector_category_dynarr; | |
800 | |
801 struct detector | |
802 { | |
803 int id; | |
804 detector_category_dynarr *cats; | |
805 Bytecount data_size; | |
806 /* Detect method: Required. */ | |
807 void (*detect_method) (struct detection_state *st, | |
808 const unsigned char *src, Bytecount n); | |
809 /* Finalize detection state method: Clean up any allocated data in the | |
810 detection state. Called only once (NOT called at disksave time). | |
811 Optional. */ | |
812 void (*finalize_detection_state_method) (struct detection_state *st); | |
428 | 813 }; |
814 | |
771 | 815 /* Lvalue for a particular detection result -- detection state ST, |
816 category CAT */ | |
817 #define DET_RESULT(st, cat) ((st)->categories[detector_category_##cat]) | |
818 /* In state ST, set all detection results associated with detector DET to | |
819 RESULT. */ | |
820 #define SET_DET_RESULTS(st, det, result) \ | |
821 set_detection_results (st, detector_##det, result) | |
822 | |
823 typedef struct | |
824 { | |
825 Dynarr_declare (struct detector); | |
826 } detector_dynarr; | |
827 | |
828 extern detector_dynarr *all_coding_detectors; | |
829 | |
830 #define DEFINE_DETECTOR_CATEGORY(detector, cat) \ | |
831 int detector_category_##cat | |
832 #define DECLARE_DETECTOR_CATEGORY(detector, cat) \ | |
833 extern int detector_category_##cat | |
834 #define INITIALIZE_DETECTOR_CATEGORY(detector, cat) \ | |
835 do { \ | |
836 struct detector_category dog; \ | |
837 xzero (dog); \ | |
838 detector_category_##cat = coding_detector_category_count++; \ | |
839 dump_add_opaque_int (&detector_category_##cat); \ | |
840 dog.id = detector_category_##cat; \ | |
841 dog.sym = Q##cat; \ | |
842 Dynarr_add (Dynarr_at (all_coding_detectors, detector_##detector).cats, \ | |
843 dog); \ | |
844 } while (0) | |
845 | |
846 #define DEFINE_DETECTOR(Detector) \ | |
847 int detector_##Detector | |
848 #define DECLARE_DETECTOR(Detector) \ | |
849 extern int detector_##Detector | |
850 #define INITIALIZE_DETECTOR(Detector) \ | |
851 do { \ | |
852 struct detector det; \ | |
853 xzero (det); \ | |
854 detector_##Detector = coding_detector_count++; \ | |
855 dump_add_opaque_int (&detector_##Detector); \ | |
856 det.id = detector_##Detector; \ | |
857 det.cats = Dynarr_new2 (detector_category_dynarr, \ | |
858 struct detector_category); \ | |
859 det.data_size = sizeof (struct Detector##_detector); \ | |
860 Dynarr_add (all_coding_detectors, det); \ | |
861 } while (0) | |
862 #define DETECTOR_HAS_METHOD(Detector, Meth) \ | |
863 Dynarr_at (all_coding_detectors, detector_##Detector).Meth##_method = \ | |
802 | 864 Detector##_##Meth |
771 | 865 |
866 | |
867 /**************************************************/ | |
868 /* Decoding/Encoding */ | |
869 /**************************************************/ | |
870 | |
871 /* Is the source (SOURCEP == 1) or sink (SOURCEP == 0), when encoding, specified | |
872 in characters? */ | |
873 | |
874 enum source_or_sink | |
875 { | |
876 CODING_SOURCE, | |
877 CODING_SINK | |
878 }; | |
879 | |
880 enum encode_decode | |
881 { | |
882 CODING_ENCODE, | |
883 CODING_DECODE | |
884 }; | |
885 | |
886 /* Data structure attached to an lstream of type `coding', | |
887 containing values specific to the coding process. Additional | |
888 data is stored in the DATA field below; the exact form of that data | |
889 is controlled by the type of the coding system that governs the | |
890 conversion (field CODESYS). CODESYS may be set at any time | |
891 throughout the lifetime of the lstream and possibly more than once. | |
892 See long comment above for more info. */ | |
893 | |
894 struct coding_stream | |
895 { | |
1204 | 896 /* Enumerated constant listing which type of coding system this is |
897 (no-conversion, ISO 2022, Unicode, etc.). This duplicates the method structure in | |
898 XCODING_SYSTEM (str->codesys)->methods->type, which formerly was the | |
899 only way to determine the coding system type. We need this constant | |
900 now for KKCC, so that it can be used in an XD_UNION clause to | |
901 determine the Lisp objects in the type-specific data. */ | |
902 enum coding_system_variant type; | |
903 | |
771 | 904 /* Coding system that governs the conversion. */ |
905 Lisp_Object codesys; | |
906 /* Original coding system, pre-canonicalization. */ | |
907 Lisp_Object orig_codesys; | |
908 | |
909 /* Back pointer to current stream. */ | |
910 Lstream *us; | |
911 | |
912 /* Stream that we read the unprocessed data from or write the processed | |
913 data to. */ | |
914 Lstream *other_end; | |
915 | |
916 /* In order to handle both reading from and writing to a coding stream, | |
917 we phrase the conversion methods like write methods -- we can | |
918 implement reading in terms of a write method but not vice-versa, | |
919 because the write method is forced to take only what it's given but | |
920 the read method can read more data from the other end if necessary. | |
921 On the other hand, the write method is free to generate all the data | |
2297 | 922 it wants (and just write it to the other end), but the read method |
771 | 923 can return only as much as was asked for, so we need to implement our |
924 own buffering. */ | |
925 | |
926 /* If we are reading, then we can return only a fixed amount of data, but | |
927 the converter is free to return as much as it wants, so we direct it | |
928 to store the data here and lop off chunks as we need them. If we are | |
929 writing, we use this because the converter takes a Dynarr but we are | |
930 supposed to write into a fixed buffer. (NOTE: This introduces an extra | |
931 memory copy.) */ | |
932 unsigned_char_dynarr *convert_to; | |
933 | |
934 /* The conversion method might reject some of the data -- this typically | |
935 includes partial characters, partial escape sequences, etc. When | |
936 writing, we just pass the rejection up to the Lstream module, and it | |
937 will buffer the data. When reading, however, we need to do the | |
938 buffering ourselves, and we put it here, combined with newly read | |
939 data. */ | |
940 unsigned_char_dynarr *convert_from; | |
941 | |
942 /* If set, this is the last chunk of data being processed. When this is | |
943 finished, output any necessary terminating control characters, escape | |
944 sequences, etc. */ | |
945 unsigned int eof:1; | |
946 | |
947 /* CH holds a partially built-up character. This is really part of the | |
948 state-dependent data and should be moved there. */ | |
949 unsigned int ch; | |
950 | |
951 /* Coding-system-specific data holding extra state about the | |
952 conversion. Logically a struct TYPE_coding_stream; a pointer | |
800 | 953 to such a struct, with (when ERROR_CHECK_TYPES is defined) |
771 | 954 error-checking that this is really a structure of that type |
955 (checking the corresponding coding system type) can be retrieved using | |
956 CODING_STREAM_TYPE_DATA(). Allocated at the same time that | |
957 CODESYS is set (which may occur at any time, even multiple times, | |
958 during the lifetime of the stream). The size comes from | |
959 methods->coding_data_size. */ | |
960 void *data; | |
961 | |
962 enum encode_decode direction; | |
963 | |
800 | 964 /* If set, don't close the stream at the other end when being closed. */ |
965 unsigned int no_close_other:1; | |
802 | 966 /* If set, read only one byte at a time from other end to avoid any |
967 possible blocking. */ | |
968 unsigned int one_byte_at_a_time:1; | |
814 | 969 /* If set, and we're a read stream, we init char mode on ourselves as |
970 necessary to prevent the caller from getting partial characters. (the | |
971 default) */ | |
972 unsigned int set_char_mode_on_us_when_reading:1; | |
800 | 973 |
771 | 974 /* #### Temporary test */ |
975 unsigned int finalized:1; | |
976 }; | |
977 | |
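The comments on convert_to explain why reading is built on a write-style converter: the converter may emit any amount of output, while a read may return only what was asked for, so the surplus has to be buffered. A minimal sketch of that arrangement follows. The `toy_` types, the LF-to-CRLF converter, and the 4-byte chunk size are all hypothetical, and a plain growable array stands in for `unsigned_char_dynarr`.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer standing in for unsigned_char_dynarr. */
struct toy_buf
{
  unsigned char *data;
  size_t len, cap;
};

/* Append one byte; returns 0 on success, -1 on allocation failure. */
static int
toy_buf_add (struct toy_buf *b, unsigned char c)
{
  if (b->len == b->cap)
    {
      size_t cap = b->cap ? b->cap * 2 : 16;
      unsigned char *p = realloc (b->data, cap);
      if (!p)
        return -1;
      b->data = p;
      b->cap = cap;
    }
  b->data[b->len++] = c;
  return 0;
}

/* Write-style converter: consumes all N bytes of SRC and appends as much
   output as it likes to DST.  Here it expands LF to CRLF. */
static int
toy_convert (const unsigned char *src, size_t n, struct toy_buf *dst)
{
  size_t i;
  for (i = 0; i < n; i++)
    {
      if (src[i] == '\n' && toy_buf_add (dst, '\r') < 0)
        return -1;
      if (toy_buf_add (dst, src[i]) < 0)
        return -1;
    }
  return 0;
}

struct toy_reader
{
  const unsigned char *src;     /* the "other end" */
  size_t src_left;
  struct toy_buf convert_to;    /* converted bytes not yet handed out */
};

/* Read-style entry point built on the write-style converter: returns at
   most WANT bytes and keeps the surplus buffered for the next call.
   Returns the number of bytes stored in OUT, or (size_t) -1 on error. */
size_t
toy_read (struct toy_reader *r, unsigned char *out, size_t want)
{
  size_t got;
  while (r->convert_to.len < want && r->src_left > 0)
    {
      size_t chunk = r->src_left < 4 ? r->src_left : 4;
      if (toy_convert (r->src, chunk, &r->convert_to) < 0)
        return (size_t) -1;
      r->src += chunk;
      r->src_left -= chunk;
    }
  got = r->convert_to.len < want ? r->convert_to.len : want;
  if (got > 0)
    {
      memcpy (out, r->convert_to.data, got);
      memmove (r->convert_to.data, r->convert_to.data + got,
               r->convert_to.len - got);
      r->convert_to.len -= got;
    }
  return got;
}
```

The `memmove` that shifts the surplus down is the extra memory copy the comment mentions; a ring buffer would avoid it at the cost of more bookkeeping.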
978 #define CODING_STREAM_DATA(stream) LSTREAM_TYPE_DATA (stream, coding) | |
979 | |
800 | 980 #ifdef ERROR_CHECK_TYPES |
771 | 981 # define CODING_STREAM_TYPE_DATA(s, type) \ |
982 error_check_##type##_coding_stream_data (s) | |
983 #else | |
984 # define CODING_STREAM_TYPE_DATA(s, type) \ | |
985 ((struct type##_coding_stream *) (s)->data) | |
986 #endif | |
987 | |
988 /* C should be a binary character in the range 0 - 255; convert | |
989 to internal format and add to Dynarr DST. */ | |
990 | |
428 | 991 #ifdef MULE |
771 | 992 |
993 #define DECODE_ADD_BINARY_CHAR(c, dst) \ | |
994 do { \ | |
826 | 995 if (byte_ascii_p (c)) \ |
771 | 996 Dynarr_add (dst, c); \ |
826 | 997 else if (byte_c1_p (c)) \ |
771 | 998 { \ |
999 Dynarr_add (dst, LEADING_BYTE_CONTROL_1); \ | |
1000 Dynarr_add (dst, c + 0x20); \ | |
1001 } \ | |
1002 else \ | |
1003 { \ | |
1004 Dynarr_add (dst, LEADING_BYTE_LATIN_ISO8859_1); \ | |
1005 Dynarr_add (dst, c); \ | |
1006 } \ | |
1007 } while (0) | |
1008 | |
1009 #else /* not MULE */ | |
1010 | |
1011 #define DECODE_ADD_BINARY_CHAR(c, dst) \ | |
1012 do { \ | |
1013 Dynarr_add (dst, c); \ | |
1014 } while (0) | |
1015 | |
1016 #endif /* MULE */ | |
1017 | |
1018 #define DECODE_OUTPUT_PARTIAL_CHAR(ch, dst) \ | |
1019 do { \ | |
1020 if (ch) \ | |
1021 { \ | |
1022 DECODE_ADD_BINARY_CHAR (ch, dst); \ | |
1023 ch = 0; \ | |
1024 } \ | |
1025 } while (0) | |
428 | 1026 |
1027 #ifdef MULE | |
1028 /* Convert Shift-JIS code (sj1, sj2) into internal string | |
1029 representation (c1, c2). (The leading byte is assumed.) */ | |
1030 | |
771 | 1031 #define DECODE_SHIFT_JIS(sj1, sj2, c1, c2) \ |
428 | 1032 do { \ |
1033 int I1 = sj1, I2 = sj2; \ | |
1034 if (I2 >= 0x9f) \ | |
1035 c1 = (I1 << 1) - ((I1 >= 0xe0) ? 0xe0 : 0x60), \ | |
1036 c2 = I2 + 2; \ | |
1037 else \ | |
1038 c1 = (I1 << 1) - ((I1 >= 0xe0) ? 0xe1 : 0x61), \ | |
1039 c2 = I2 + ((I2 >= 0x7f) ? 0x60 : 0x61); \ | |
1040 } while (0) | |
1041 | |
1042 /* Convert the internal string representation of a Shift-JIS character | |
1043 (c1, c2) into Shift-JIS code (sj1, sj2). The leading byte is | |
1044 assumed. */ | |
1045 | |
771 | 1046 #define ENCODE_SHIFT_JIS(c1, c2, sj1, sj2) \ |
428 | 1047 do { \ |
1048 int I1 = c1, I2 = c2; \ | |
1049 if (I1 & 1) \ | |
1050 sj1 = (I1 >> 1) + ((I1 < 0xdf) ? 0x31 : 0x71), \ | |
1051 sj2 = I2 - ((I2 >= 0xe0) ? 0x60 : 0x61); \ | |
1052 else \ | |
1053 sj1 = (I1 >> 1) + ((I1 < 0xdf) ? 0x30 : 0x70), \ | |
1054 sj2 = I2 - 2; \ | |
1055 } while (0) | |
1056 #endif /* MULE */ | |
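To check the Shift-JIS arithmetic by hand, the two macros can be rewritten as ordinary functions. The sketch below copies the expressions from DECODE_SHIFT_JIS and ENCODE_SHIFT_JIS unchanged; the function names and the pointer-based output parameters are this sketch's own. As in the macros, the results are the two position bytes of the internal representation, with the leading byte assumed.

```c
/* Function versions of the header's DECODE_SHIFT_JIS and ENCODE_SHIFT_JIS
   macros, with the same arithmetic.  The names are hypothetical. */

/* Shift-JIS bytes (sj1, sj2) -> internal position bytes (*c1, *c2). */
void
toy_decode_shift_jis (int sj1, int sj2, int *c1, int *c2)
{
  if (sj2 >= 0x9f)
    {
      *c1 = (sj1 << 1) - ((sj1 >= 0xe0) ? 0xe0 : 0x60);
      *c2 = sj2 + 2;
    }
  else
    {
      *c1 = (sj1 << 1) - ((sj1 >= 0xe0) ? 0xe1 : 0x61);
      *c2 = sj2 + ((sj2 >= 0x7f) ? 0x60 : 0x61);
    }
}

/* Internal position bytes (c1, c2) -> Shift-JIS bytes (*sj1, *sj2). */
void
toy_encode_shift_jis (int c1, int c2, int *sj1, int *sj2)
{
  if (c1 & 1)
    {
      *sj1 = (c1 >> 1) + ((c1 < 0xdf) ? 0x31 : 0x71);
      *sj2 = c2 - ((c2 >= 0xe0) ? 0x60 : 0x61);
    }
  else
    {
      *sj1 = (c1 >> 1) + ((c1 < 0xdf) ? 0x30 : 0x70);
      *sj2 = c2 - 2;
    }
}
```

For example, the Shift-JIS pair 0x82 0xA0 decodes to 0xA4 0xA2, and encoding 0xA4 0xA2 gives back 0x82 0xA0.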
1057 | |
771 | 1058 DECLARE_CODING_SYSTEM_TYPE (no_conversion); |
1059 DECLARE_CODING_SYSTEM_TYPE (convert_eol); | |
1060 #if 0 | |
1061 DECLARE_CODING_SYSTEM_TYPE (text_file_wrapper); | |
1062 #endif /* 0 */ | |
1063 DECLARE_CODING_SYSTEM_TYPE (undecided); | |
1064 DECLARE_CODING_SYSTEM_TYPE (chain); | |
1065 | |
1066 #ifdef DEBUG_XEMACS | |
1067 DECLARE_CODING_SYSTEM_TYPE (internal); | |
1068 #endif | |
1069 | |
1070 #ifdef MULE | |
1071 DECLARE_CODING_SYSTEM_TYPE (iso2022); | |
1072 DECLARE_CODING_SYSTEM_TYPE (ccl); | |
4690 | 1073 DECLARE_CODING_SYSTEM_TYPE (fixed_width); |
771 | 1074 DECLARE_CODING_SYSTEM_TYPE (shift_jis); |
1075 DECLARE_CODING_SYSTEM_TYPE (big5); | |
1076 #endif | |
1077 | |
1078 #ifdef HAVE_ZLIB | |
1079 DECLARE_CODING_SYSTEM_TYPE (gzip); | |
1080 #endif | |
428 | 1081 |
771 | 1082 DECLARE_CODING_SYSTEM_TYPE (unicode); |
428 | 1083 |
1315 | 1084 #ifdef WIN32_ANY |
771 | 1085 DECLARE_CODING_SYSTEM_TYPE (mswindows_multibyte_to_unicode); |
1086 DECLARE_CODING_SYSTEM_TYPE (mswindows_multibyte); | |
428 | 1087 #endif |
771 | 1088 |
1089 Lisp_Object coding_stream_detected_coding_system (Lstream *stream); | |
1090 Lisp_Object coding_stream_coding_system (Lstream *stream); | |
1091 void set_coding_stream_coding_system (Lstream *stream, | |
1092 Lisp_Object codesys); | |
1093 Lisp_Object detect_coding_stream (Lisp_Object stream); | |
867 | 1094 Ichar decode_big5_char (int o1, int o2); |
771 | 1095 void add_entry_to_coding_system_type_list (struct coding_system_methods *m); |
1096 Lisp_Object make_internal_coding_system (Lisp_Object existing, | |
4528
726060ee587c
First draft of g++ 4.3 warning removal patch. Builds. *Needs ChangeLogs.*
Stephen J. Turnbull <stephen@xemacs.org>
parents:
4522
diff
changeset
|
1097 const Ascbyte *prefix, |
771 | 1098 Lisp_Object type, |
1099 Lisp_Object description, | |
1100 Lisp_Object props); | |
802 | 1101 |
814 | 1102 #define LSTREAM_FL_NO_CLOSE_OTHER (1 << 16) |
1103 #define LSTREAM_FL_READ_ONE_BYTE_AT_A_TIME (1 << 17) | |
1104 #define LSTREAM_FL_NO_INIT_CHAR_MODE_WHEN_READING (1 << 18) | |
1105 | |
771 | 1106 Lisp_Object make_coding_input_stream (Lstream *stream, Lisp_Object codesys, |
800 | 1107 enum encode_decode direction, |
802 | 1108 int flags); |
771 | 1109 Lisp_Object make_coding_output_stream (Lstream *stream, Lisp_Object codesys, |
800 | 1110 enum encode_decode direction, |
802 | 1111 int flags); |
771 | 1112 void set_detection_results (struct detection_state *st, int detector, |
1113 int given); | |
428 | 1114 |
440 | 1115 #endif /* INCLUDED_file_coding_h_ */ |
1116 |