xemacs-beta: src/file-coding.h annotate

annotate src/file-coding.h @ 771:943eaba38521

[xemacs-hg @ 2002-03-13 08:51:24 by ben] The big ben-mule-21-5 check-in! Various files were added and deleted. See CHANGES-ben-mule. There are still some test suite failures. No crashes, though. Many of the failures have to do with problems in the test suite itself rather than in the actual code. I'll be addressing these in the next day or so -- none of the test suite failures are at all critical. Meanwhile I'll be trying to address the biggest issues -- i.e. build or run failures, which will almost certainly happen on various platforms. All comments should be sent to ben@xemacs.org -- use a Cc: if necessary when sending to mailing lists. There will be pre- and post- tags, something like pre-ben-mule-21-5-merge-in, and post-ben-mule-21-5-merge-in.

author	ben
date	Wed, 13 Mar 2002 08:54:06 +0000
parents	fdefd0186b75
children	e38acbeb1cae

rev	line source
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1 /* Header for encoding conversion functions; coding-system object.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	2 #### rename me to coding-system.h
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	3 Copyright (C) 1991, 1995 Free Software Foundation, Inc.
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	4 Copyright (C) 1995 Sun Microsystems, Inc.
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	5 Copyright (C) 2000, 2001 Ben Wing.
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	6
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	7 This file is part of XEmacs.
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	8
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	9 XEmacs is free software; you can redistribute it and/or modify it
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	10 under the terms of the GNU General Public License as published by the
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	11 Free Software Foundation; either version 2, or (at your option) any
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	12 later version.
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	13
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	14 XEmacs is distributed in the hope that it will be useful, but WITHOUT
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	15 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	16 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	17 for more details.
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	18
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	19 You should have received a copy of the GNU General Public License
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	20 along with XEmacs; see the file COPYING. If not, write to
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	21 the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	22 Boston, MA 02111-1307, USA. */
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	23
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	24 /* Synched up with: Mule 2.3. Not in FSF. */
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	25
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	26 /* Authorship:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	27
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	28 Current primary author: Ben Wing <ben@xemacs.org>
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	29
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	30 Written by Ben Wing <ben@xemacs.org> for XEmacs, 1995, loosely based
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	31 on code written 91.10.09 by K.Handa <handa@etl.go.jp>.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	32 Rewritten again 2000-2001 by Ben Wing to support properly
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	33 abstracted coding systems.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	34 September 2001: Finished last part of abstraction, the detection
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	35 mechanism.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	36 */
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	37
440 8de8e3f6228a Import from CVS: tag r21-2-28 cvs parents: 438 diff changeset	38 #ifndef INCLUDED_file_coding_h_
8de8e3f6228a Import from CVS: tag r21-2-28 cvs parents: 438 diff changeset	39 #define INCLUDED_file_coding_h_
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	40
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	41 /* Capsule description of the different structures, what their purpose is,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	42 how they fit together, and where various bits of data are stored.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	43
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	44 A "coding system" is an algorithm for converting data in one format into
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	45 data in another format. Currently most of the coding systems we have
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	46 created concern internationalized text, and convert between the XEmacs
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	47 internal format for multilingual text, and various external
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	48 representations of such text. However, any such conversion is possible,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	49 for example, compressing or uncompressing text using the gzip algorithm.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	50 All coding systems provide both encode and decode routines, so that the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	51 conversion can go both ways.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	52
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	53 The way we handle this is by dividing the various potential coding
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	54 systems into types, analogous to classes in C++. Each coding system
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	55 type encompasses a series of related coding systems that it can
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	56 implement, and it has properties which control how exactly the encoding
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	57 works. A particular set of values for each of the properties makes up a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	58 "coding system", and specifies one particular encoding. A `struct
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	59 Lisp_Coding_System' object encapsulates those settings -- its type, the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	60 values chosen for all properties of that type, a name for the coding
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	61 system, and some documentation.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	62
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	63 In addition, there are of course methods associated with a coding system
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	64 type, implementing the encoding, decoding, etc. These are stored in a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	65 `struct coding_system_methods' object, one per coding-system type, which
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	66 contains mostly function pointers. This is retrievable from the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	67 coding-system object (i.e. the struct Lisp_Coding_System), which has a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	68 pointer to it.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	69
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	70 In order to actually use a coding system to do an encoding or decoding
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	71 operation, you need to use a coding Lstream.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	72
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	73 Now let's look more at attached data. All coding systems have certain
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	74 common data fields -- name, type, documentation, etc. -- as well as a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	75 bunch more that are defined by the coding system type. To handle this
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	76 cleanly, each coding system type defines a structure that holds just the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	77 fields of data particular to it, and calls it e.g. `struct
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	78 iso2022_coding_system' for coding system type `iso2022'. When the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	79 memory block holding the coding system object is created, it is sized
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	80 such that it can hold both the struct Lisp_Coding_System and the struct
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	81 iso2022_coding_system (or whatever) directly following it. (This is a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	82 common trick; another possibility is to have a void * pointer in the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	83 struct Lisp_Coding_System, which points to another memory block holding
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	84 the struct iso2022_coding_system.) A macro is provided
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	85 (CODING_SYSTEM_TYPE_DATA) to retrieve a pointer of the right type to the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	86 type-specific data contained within the overall `struct
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	87 Lisp_Coding_System' block.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	88
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	89 Lstreams, similarly, are objects of type `struct lstream' holding data
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	90 about the stream operation (how much data has been read or written, any
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	91 buffered data, any error conditions, etc.), and like coding systems have
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	92 different types. They have a structure called `Lstream_implementation',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	93 one per lstream type, exactly analogous to `struct
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	94 coding_system_methods'. In addition, they have type-specific data
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	95 (specifying, e.g., the file number, FILE *, memory location, other
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	96 lstream, etc. to read the data from or write it to, and for conversion
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	97 processes, the current state of the process -- are we decoding ASCII or
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	98 Kanji characters? are we in the middle of processing an escape
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	99 sequence? etc.). For a coding lstream, this type-specific data is stored
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	100 in a structure named `struct coding_stream'. Just like for coding systems, the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	101 type-independent data in the `struct lstream' and the type-dependent
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	102 data in the `struct coding_stream' are stored together in the same
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	103 memory block.
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	104
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	105 Now things get a bit tricky. The `struct coding_stream' is
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	106 type-specific from the point of view of an lstream, but not from the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	107 point of view of a coding system. It contains only general data about
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	108 the conversion process, e.g. the name of the coding system used for
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	109 conversion, the lstream that we take data from or write it to (depending
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	110 on whether this was created as a read stream or a write stream), a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	111 buffer to hold extra data we retrieved but can't send on yet, some
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	112 flags, etc. It also needs some data specific to the particular coding
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	113 system and thus to the particular operation going on. This data is held
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	114 in a structure named (e.g.) `struct iso2022_coding_stream', and it's
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	115 held in a separate memory block and pointed to by the generic `struct
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	116 coding_stream'. It's not glommed into a single memory block both
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	117 because that would require making changes to the generic lstream code
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	118 and more importantly because the coding system used in a particular
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	119 coding lstream can be changed at any point during the lifetime of the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	120 lstream, and possibly multiple times. (For example, it can be set using
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	121 the Lisp primitives `set-process-input-coding-system' and
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	122 `set-console-tty-input-coding-system', as well as getting set when a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	123 conversion operation was started with coding system `undecided' and the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	124 correct coding system was then detected.)
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	125
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	126 IMPORTANT NOTE: There are at least two ancillary data structures
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	127 associated with a coding system type. (There may also be detection data;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	128 see elsewhere.) It's important, when writing a coding system type, to
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	129 keep straight which type of data goes where. In particular, `struct
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	130 foo_coding_system' is attached to the coding system object itself. This
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	131 is a permanent object and there's only one per coding system. It's
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	132 created once, usually at init time, and never destroyed. So, `struct
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	133 foo_coding_system' should in general not contain dynamic data! (Just
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	134 data describing the properties of the coding system.) In particular,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	135 NO data about any conversion in progress. There may be many
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	136 conversions going on simultaneously using a particular coding system,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	137 and if conversion data were stored in the coding system, these conversions
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	138 would overwrite each other's data.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	139
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	140 Instead, use the lstream object, whose purpose is to encapsulate a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	141 particular conversion and all associated data. From the lstream object,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	142 you can get the struct coding_stream using something like
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	143
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	144 struct coding_stream *str = LSTREAM_TYPE_DATA (lstr, coding);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	145
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	146 But usually this structure is already passed to you as one of the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	147 parameters of the method being invoked.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	148
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	149 From the struct coding_stream, you can retrieve the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	150 coding-system-type-specific data using something like
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	151
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	152 struct foo_coding_stream *data = CODING_STREAM_TYPE_DATA (str, foo);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	153
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	154 Then, use this structure to hold all data relevant to the particular
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	155 conversion being done.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	156
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	157 Initialize this structure whenever init_coding_stream_method is called
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	158 (this may happen more than once), and finalize it (free resources, etc.)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	159 when finalize_coding_stream_method is called.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	160 */
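The layout described in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the XEmacs implementation: every `demo_`/`DEMO_` name and both `foo_` structures are hypothetical stand-ins, the real objects are Lisp lcrecords rather than plain `calloc` blocks, and the alignment rounding (which uses C11 `max_align_t`) is an assumption of the sketch. It shows static properties co-allocated behind the generic header, and per-conversion state kept in a separate block owned by the stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names below are hypothetical stand-ins, not the real XEmacs
   definitions. */

/* Generic, type-independent part of a coding system object. */
struct demo_coding_system
{
  const char *name;
};

/* Static properties of the made-up coding system type `foo'.  No
   per-conversion state belongs here. */
struct foo_coding_system
{
  int use_escapes;
};

/* Size of the generic header, rounded up so that whatever follows it is
   suitably aligned (an assumption of this sketch). */
#define DEMO_ALIGN _Alignof (max_align_t)
#define DEMO_HEADER_SIZE \
  ((sizeof (struct demo_coding_system) + DEMO_ALIGN - 1) \
   / DEMO_ALIGN * DEMO_ALIGN)

/* Type-specific properties live directly after the generic header, in the
   same memory block. */
#define DEMO_CODING_SYSTEM_TYPE_DATA(cs, type) \
  ((struct type##_coding_system *) ((char *) (cs) + DEMO_HEADER_SIZE))

/* Generic state of one conversion in progress. */
struct demo_coding_stream
{
  struct demo_coding_system *codesys;
  void *data;   /* separately allocated, type-specific state */
};

/* Dynamic state of one `foo' conversion. */
struct foo_coding_stream
{
  int in_escape;
  long bytes_seen;
};

#define DEMO_CODING_STREAM_TYPE_DATA(str, type) \
  ((struct type##_coding_stream *) (str)->data)

/* Allocate the header and the type-specific properties as one block. */
static struct demo_coding_system *
demo_make_foo_coding_system (const char *name, int use_escapes)
{
  struct demo_coding_system *cs =
    calloc (1, DEMO_HEADER_SIZE + sizeof (struct foo_coding_system));
  assert (cs != NULL);
  cs->name = name;
  DEMO_CODING_SYSTEM_TYPE_DATA (cs, foo)->use_escapes = use_escapes;
  return cs;
}

/* (Re)initialize a stream for CS.  May be called more than once on the
   same stream, e.g. when the coding system changes mid-stream; STR->data
   must be NULL or a block left by a previous call. */
static void
demo_init_foo_stream (struct demo_coding_stream *str,
                      struct demo_coding_system *cs)
{
  free (str->data);
  str->codesys = cs;
  str->data = calloc (1, sizeof (struct foo_coding_stream));
  assert (str->data != NULL);
}

/* Release the per-conversion state. */
static void
demo_finalize_foo_stream (struct demo_coding_stream *str)
{
  free (str->data);
  str->data = NULL;
}
```

Two streams initialized from the same coding system share its properties but each gets its own `struct foo_coding_stream`, so concurrent conversions cannot overwrite each other's state. Keeping the stream data behind a pointer is also what lets the sketch re-run the init step when the coding system changes.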
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	161
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	162 struct coding_stream;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	163 struct detection_state;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	164
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	165 extern const struct struct_description coding_system_methods_description;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	166
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	167 struct coding_system_methods;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	168
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	169 enum source_sink_type
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	170 {
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	171 DECODES_CHARACTER_TO_BYTE,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	172 DECODES_BYTE_TO_BYTE,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	173 DECODES_BYTE_TO_CHARACTER,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	174 DECODES_CHARACTER_TO_CHARACTER
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	175 };
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	176
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	177 enum eol_type
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	178 {
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	179 EOL_LF,
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	180 EOL_CRLF,
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	181 EOL_CR,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	182 EOL_AUTODETECT,
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	183 };
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	184
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	185 struct Lisp_Coding_System
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	186 {
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	187 struct lcrecord_header header;
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	188 struct coding_system_methods *methods;
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	189
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	190 /* Name and description of this coding system. The description
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	191 should be suitable for a menu entry. */
440 8de8e3f6228a Import from CVS: tag r21-2-28 cvs parents: 438 diff changeset	192 Lisp_Object name;
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	193 Lisp_Object description;
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	194
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	195 /* Mnemonic string displayed in the modeline when this coding
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	196 system is active for a particular buffer. */
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	197 Lisp_Object mnemonic;
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	198
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	199 /* Long documentation on the coding system. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	200 Lisp_Object documentation;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	201 /* Functions to handle additional conversion after reading or before
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	202 writing. #### This mechanism should be replaced by the ability to
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	203 simply create new coding system types. */
440 8de8e3f6228a Import from CVS: tag r21-2-28 cvs parents: 438 diff changeset	204 Lisp_Object post_read_conversion;
8de8e3f6228a Import from CVS: tag r21-2-28 cvs parents: 438 diff changeset	205 Lisp_Object pre_write_conversion;
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	206
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	207 /* If this coding system is not of the correct type for text file
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	208 conversion (i.e. one that decodes byte->char), we wrap it with appropriate
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	209 char<->byte converters. This is created dynamically, when it's
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	210 needed, and cached here. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	211 Lisp_Object text_file_wrapper;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	212
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	213 /* If true, this is an internal coding system, which will not show up in
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	214 coding-system-list unless a special parameter is given to it. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	215 int internal_p;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	216
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	217 /* ------------------------ junk to handle EOL -------------------------
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	218 I had hoped that we could handle this without lots of special-case
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	219 code, but it appears not to be the case if we want to maintain
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	220 compatibility with the existing way. However, at least with the way
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	221 we do things now, we avoid EOL junk in most of the coding system
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	222 methods themselves, or in the decode/encode functions. The EOL
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	223 special-case code is limited to coding-system creation and to the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	224 convert-eol and undecided coding system types. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	225
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	226 /* If this coding system wants autodetection of the EOL type, then at the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	227 appropriate time we wrap this coding system with
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	228 convert-eol-autodetect. (We do NOT do this at creation time because
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	229 then we end up with multiple convert-eols wrapped into the final
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	230 result -- esp. with autodetection using `undecided' -- leading to a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	231 big mess.) We cache the wrapped coding system here. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	232 Lisp_Object auto_eol_wrapper;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	233
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	234 /* Eol type requested by user. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	235 enum eol_type eol_type;
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	236
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	237 /* Subsidiary coding systems that specify a particular type of EOL
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	238 marking, rather than autodetecting it. These will only be non-nil
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	239 if (eol_type == EOL_AUTODETECT). These are chains. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	240 Lisp_Object eol[3];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	241 /* If this coding system is a subsidiary, this element points back to its
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	242 parent. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	243 Lisp_Object subsidiary_parent;
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	244
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	245 /* At decoding or encoding time, we use the following coding system, if
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	246 it exists, in place of the coding system object. This is how we
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	247 handle coding systems with EOL types of CRLF or CR. Formerly, we did
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	248 the canonicalization at creation time, returning a chain in place of
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	249 the original coding system; but that interferes with
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	250 `coding-system-property' and causes other complications. CANONICAL is
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	251 used when determining the end types of a coding system.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	252 canonicalize-after-coding also consults CANONICAL (it has to, because
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	253 the data in the lstream is based on CANONICAL, not on the original
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	254 coding system). */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	255 Lisp_Object canonical;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	256
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	257 /* type-specific extra data attached to a coding_system */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	258 char data[1];
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	259 };
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	260 typedef struct Lisp_Coding_System Lisp_Coding_System;
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	261
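/* Illustrative sketch, not part of XEmacs: how a trailing `char data[1]'
   member can carry type-specific data when the object is allocated with
   room for it, as coding_system_data_offset and extra_data_size suggest.
   Every toy_* name below is hypothetical, and the allocator is an
   assumption; the real allocation code is not shown in this header. */

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for Lisp_Coding_System. */
struct toy_coding_system
{
  void *canonical;   /* pointer-sized member keeps DATA suitably aligned */
  char data[1];      /* type-specific data starts here */
};

/* Hypothetical type-specific payload. */
struct toy_utf16_data
{
  int little_endian;
  int need_bom;
};

#define toy_data_offset (offsetof (struct toy_coding_system, data))
#define TOY_DATA(cs, type) ((struct toy_##type##_data *) (cs)->data)

/* Allocate the header up to DATA plus EXTRA_DATA_SIZE bytes, zeroed. */
struct toy_coding_system *
toy_make (size_t extra_data_size)
{
  size_t size = toy_data_offset + extra_data_size;
  if (size < sizeof (struct toy_coding_system))
    size = sizeof (struct toy_coding_system);
  return (struct toy_coding_system *) calloc (1, size);
}

/* Store into and read back from the trailing payload. */
int
toy_data_demo (void)
{
  struct toy_coding_system *cs = toy_make (sizeof (struct toy_utf16_data));
  int result;

  assert (cs != NULL);
  TOY_DATA (cs, utf16)->little_endian = 1;
  TOY_DATA (cs, utf16)->need_bom = 0;
  result = TOY_DATA (cs, utf16)->little_endian
    + TOY_DATA (cs, utf16)->need_bom;
  free (cs);
  return result;
}
```

/* This is the pre-C99 "struct hack"; it relies on the compiler treating
   the one-element trailing array as extensible. */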
440 8de8e3f6228a Import from CVS: tag r21-2-28 cvs parents: 438 diff changeset	262 DECLARE_LRECORD (coding_system, Lisp_Coding_System);
8de8e3f6228a Import from CVS: tag r21-2-28 cvs parents: 438 diff changeset	263 #define XCODING_SYSTEM(x) XRECORD (x, coding_system, Lisp_Coding_System)
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	264 #define XSETCODING_SYSTEM(x, p) XSETRECORD (x, p, coding_system)
617 af57a77cbc92 [xemacs-hg @ 2001-06-18 07:09:50 by ben] ben parents: 528 diff changeset	265 #define wrap_coding_system(p) wrap_record (p, coding_system)
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	266 #define CODING_SYSTEMP(x) RECORDP (x, coding_system)
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	267 #define CHECK_CODING_SYSTEM(x) CHECK_RECORD (x, coding_system)
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	268 #define CONCHECK_CODING_SYSTEM(x) CONCHECK_RECORD (x, coding_system)
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	269
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	270 struct coding_system_methods
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	271 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	272 Lisp_Object type;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	273 Lisp_Object predicate_symbol;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	274
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	275 /* Implementation specific methods: */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	276
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	277 /* Init method: Initialize coding-system data. Optional. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	278 void (*init_method) (Lisp_Object coding_system);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	279
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	280 /* Mark method: Mark any Lisp objects in the type-specific data
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	281 attached to the coding-system object. Optional. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	282 void (*mark_method) (Lisp_Object coding_system);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	283
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	284 /* Print method: Print the type-specific properties of this coding
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	285 system, as part of `print'-ing the object. If this method is defined
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	286 and prints anything, it should print a space as the first thing it
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	287 does. Optional. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	288 void (*print_method) (Lisp_Object cs, Lisp_Object printcharfun,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	289 int escapeflag);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	290
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	291 /* Canonicalize method: Convert this coding system to another one; called
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	292 once, at creation time, after all properties have been parsed. The
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	293 returned value should be a coding system created with
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	294 make_internal_coding_system() (passing the existing coding system as the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	295 first argument), and will become the coding system returned by
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	296 `make-coding-system'. Optional.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	297
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	298 NOTE: There are three different uses of "canonical" or "canonicalize"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	299 w.r.t. coding systems, and it's important to keep them straight.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	300
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	301 1. The canonicalize method. Used to specify a different coding
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	302 system, used when doing conversions, in place of the actual coding
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	303 system itself. Stored in the CANONICAL field of a coding system.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	304
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	305 2. The canonicalize-after-coding method. Used to return the encoding
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	306 that was "actually" used to decode some text, such that this
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	307 particular encoding can be used to encode the text again with the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	308 expectation that the result will be the same as the original encoding.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	309 Particularly important with auto-detecting coding systems.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	310
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	311 3. From the perspective of aliases, a "canonical" coding system is one
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	312 that's not an alias to some other coding system, and "canonicalization"
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	313 is the process of traversing the alias pointers to find the canonical
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	314 coding system that's equivalent to the alias.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	315 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	316 Lisp_Object (*canonicalize_method) (Lisp_Object coding_system);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	317
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	318 /* Canonicalize after coding method: Convert this coding system to
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	319 another one, after coding (usually decoding) has finished. This is
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	320 meant to be used by auto-detecting coding systems, which should return
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	321 the actually detected coding system. Optional. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	322 Lisp_Object (*canonicalize_after_coding_method)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	323 (struct coding_stream *str);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	324
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	325 /* Convert method: Decode or encode the data in SRC of size N, writing
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	326 the results into the Dynarr DST. If the conversion_end_type method
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	327 indicates that the source is characters (as opposed to bytes), you are
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	328 guaranteed to get only whole characters in the data in SRC/N. STR, a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	329 struct coding_stream, stores all necessary state and other info about
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	330 the conversion. Coding-specific state (struct TYPE_coding_stream) can
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	331 be retrieved from STR using CODING_STREAM_TYPE_DATA(). Return value
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	332 indicates the number of bytes of the INPUT that were converted (not
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	333 the number of bytes written to the Dynarr!). This can be less than
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	334 the total amount of input passed in; if so, the remainder is
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	335 considered "rejected" and will appear again at the beginning of the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	336 data passed in the next time the convert method is called. When EOF
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	337 is returned on the other end and there's no more data, the convert
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	338 method will be called one last time with STR->eof set, and the passed-in
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	339 data will consist only of any rejected data from the previous
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	340 call. (At this point, file handles and similar resources can be
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	341 closed, but do NOT arbitrarily free data structures in the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	342 type-specific data, because there are operations that can be done on
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	343 closed streams to query the results of the processing -- specifically,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	344 for coding streams, there's the canonicalize_after_coding() method.)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	345 Required. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	346 Bytecount (*convert_method) (struct coding_stream *str,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	347 const unsigned char *src,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	348 unsigned_char_dynarr *dst, Bytecount n);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	349
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	350 /* Coding mark method: Mark any Lisp objects in the type-specific data
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	351 attached to `struct coding_stream'. Optional. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	352 void (*mark_coding_stream_method) (struct coding_stream *str);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	353
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	354 /* Init coding stream method: Initialize the type-specific data attached
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	355 to the coding stream (i.e. in struct TYPE_coding_stream), when the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	356 coding stream is opened. The type-specific data will be zeroed out.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	357 Optional. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	358 void (*init_coding_stream_method) (struct coding_stream *str);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	359
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	360 /* Rewind coding stream method: Reset any necessary type-specific data as
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	361 a result of the stream being rewound. Optional. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	362 void (*rewind_coding_stream_method) (struct coding_stream *str);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	363
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	364 /* Finalize coding stream method: Clean up the type-specific data
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	365 attached to the coding stream (i.e. in struct TYPE_coding_stream).
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	366 Happens when the Lstream is deleted using Lstream_delete() or is
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	367 garbage-collected. Most streams are deleted after they've been used,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	368 so it's less likely (but still possible) that allocated data will
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	369 stick around until GC time. (File handles can also be closed when EOF
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	370 is signalled; but some data must stick around after this point, for
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	371 the benefit of canonicalize_after_coding. See the convert method.)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	372 Called only once (NOT called at disksave time). Optional. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	373 void (*finalize_coding_stream_method) (struct coding_stream *str);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	374
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	375 /* Finalize method: Clean up type-specific data (e.g. free allocated
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	376 data) attached to the coding system (i.e. in struct
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	377 TYPE_coding_system), when the coding system is about to be garbage
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	378 collected. (Currently not called.) Called only once (NOT called at
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	379 disksave time). Optional. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	380 void (*finalize_method) (Lisp_Object codesys);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	381
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	382 /* Conversion end type method: Does this coding system encode bytes ->
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	383 characters, characters -> characters, bytes -> bytes, or
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	384 characters -> bytes? Default is characters -> bytes. Optional. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	385 enum source_sink_type (*conversion_end_type_method) (Lisp_Object codesys);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	386
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	387 /* Putprop method: Set the value of a type-specific property. If
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	388 the property name is unrecognized, return 0. If the value is disallowed
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	389 or erroneous, signal an error. Currently called only at creation time.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	390 Optional. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	391 int (*putprop_method) (Lisp_Object codesys,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	392 Lisp_Object key,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	393 Lisp_Object value);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	394
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	395 /* Getprop method: Return the value of a type-specific property. If
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	396 the property name is unrecognized, return Qunbound. Optional.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	397 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	398 Lisp_Object (*getprop_method) (Lisp_Object coding_system,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	399 Lisp_Object prop);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	400
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	401 /* These next three are set as part of the call to
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	402 INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	403
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	404 /* Description of the extra data (struct foo_coding_system) attached to a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	405 coding system, for pdump purposes. NOTE: All offsets must have
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	406 coding_system_data_offset added to them! */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	407 const struct lrecord_description *extra_description;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	408 /* size of struct foo_coding_system -- extra data associated with
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	409 the coding system */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	410 int extra_data_size;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	411 /* size of struct foo_coding_stream -- extra data associated with the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	412 struct coding_stream, needed for each active coding process
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	413 using this coding system. Note that we can have more than one
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	414 process active at once (simply by creating more than one coding
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	415 lstream using this coding system), so we can't store this data in
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	416 the coding system object. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	417 int coding_data_size;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	418 };
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	419
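/* Illustrative sketch, not part of XEmacs: the convert-method contract
   described above, reduced to a toy.  The return value counts INPUT bytes
   consumed; unconsumed ("rejected") bytes are presented again at the start
   of the next call, and the final call has eof set and receives only the
   leftover rejected data.  All toy_* names, the fixed-size output buffer
   (standing in for the Dynarr) and the driver loop are assumptions made
   for illustration. */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct coding_stream. */
struct toy_stream
{
  int eof;                 /* set on the final call */
  unsigned char out[64];   /* stand-in for the Dynarr DST */
  size_t out_len;
};

/* Swap each complete 2-byte unit.  A trailing odd byte is rejected,
   except at eof, where it is passed through unchanged.
   Returns the number of input bytes consumed. */
long
toy_convert (struct toy_stream *str, const unsigned char *src, long n)
{
  long used = 0;

  while (n - used >= 2)
    {
      str->out[str->out_len++] = src[used + 1];
      str->out[str->out_len++] = src[used];
      used += 2;
    }
  if (str->eof && used < n)
    str->out[str->out_len++] = src[used++];
  return used;
}

/* Toy driver: feed CHUNK bytes at a time, putting rejected bytes first. */
size_t
toy_run (struct toy_stream *str, const unsigned char *in, long len,
         long chunk)
{
  unsigned char buf[32];
  long pending = 0;   /* rejected bytes carried over to the next call */
  long pos = 0;

  assert (chunk > 0 && chunk <= 31 && len <= 64);
  while (pos < len)
    {
      long take = len - pos < chunk ? len - pos : chunk;
      long used;

      memcpy (buf + pending, in + pos, (size_t) take);
      pos += take;
      used = toy_convert (str, buf, pending + take);
      pending = pending + take - used;
      memmove (buf, buf + used, (size_t) pending);
    }
  /* Final call: eof set, input is only the leftover rejected data. */
  str->eof = 1;
  toy_convert (str, buf, pending);
  return str->out_len;
}

/* Run "abcde" through the driver in chunks of 3 bytes. */
size_t
toy_convert_demo (void)
{
  struct toy_stream s;
  size_t n;

  memset (&s, 0, sizeof s);
  n = toy_run (&s, (const unsigned char *) "abcde", 5, 3);
  assert (memcmp (s.out, "badce", 5) == 0);
  return n;
}
```

/* In the demo, the byte `c' is rejected by the first call and reappears at
   the head of the second; `e' is rejected by the second call and is
   consumed only by the final eof call. */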
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	420 /*** Calling a coding-system method ***/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	421
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	422 #define RAW_CODESYSMETH(cs, m) ((cs)->methods->m##_method)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	423 #define HAS_CODESYSMETH_P(cs, m) (!!RAW_CODESYSMETH (cs, m))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	424 #define CODESYSMETH(cs, m, args) (((cs)->methods->m##_method) args)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	425
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	426 /* Call a void-returning coding-system method, if it exists. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	427 #define MAYBE_CODESYSMETH(cs, m, args) do { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	428 Lisp_Coding_System *maybe_codesysmeth_cs = (cs); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	429 if (HAS_CODESYSMETH_P (maybe_codesysmeth_cs, m)) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	430 CODESYSMETH (maybe_codesysmeth_cs, m, args); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	431 } while (0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	432
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	433 /* Call a coding-system method, if it exists, or return GIVEN.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	434 NOTE: Multiply-evaluates CS. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	435 #define CODESYSMETH_OR_GIVEN(cs, m, args, given) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	436 (HAS_CODESYSMETH_P (cs, m) ? \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	437 CODESYSMETH (cs, m, args) : (given))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	438
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	439 #define XCODESYSMETH(cs, m, args) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	440 CODESYSMETH (XCODING_SYSTEM (cs), m, args)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	441 #define MAYBE_XCODESYSMETH(cs, m, args) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	442 MAYBE_CODESYSMETH (XCODING_SYSTEM (cs), m, args)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	443 #define XCODESYSMETH_OR_GIVEN(cs, m, args, given) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	444 CODESYSMETH_OR_GIVEN (XCODING_SYSTEM (cs), m, args, given)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	445
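/* Illustrative sketch, not part of XEmacs: the optional-method dispatch
   pattern used by the macros above, rebuilt on hypothetical toy types so it
   compiles on its own.  The TOY_* macros mirror RAW_CODESYSMETH,
   HAS_CODESYSMETH_P, CODESYSMETH, MAYBE_CODESYSMETH and
   CODESYSMETH_OR_GIVEN; the structs and methods are assumptions. */

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical method table with two optional methods. */
struct toy_methods
{
  void (*init_method) (int *counter);   /* returns void */
  int (*end_type_method) (int arg);     /* returns a value */
};

struct toy_coding_system
{
  struct toy_methods *methods;
};

#define TOY_RAW_METH(cs, m) ((cs)->methods->m##_method)
#define TOY_HAS_METH_P(cs, m) (!!TOY_RAW_METH (cs, m))
#define TOY_METH(cs, m, args) (((cs)->methods->m##_method) args)

/* Call a void-returning method if present; evaluates CS once. */
#define TOY_MAYBE_METH(cs, m, args) do {                \
  struct toy_coding_system *toy_maybe_cs = (cs);        \
  if (TOY_HAS_METH_P (toy_maybe_cs, m))                 \
    TOY_METH (toy_maybe_cs, m, args);                   \
} while (0)

/* Call a method if present, else yield GIVEN; evaluates CS twice. */
#define TOY_METH_OR_GIVEN(cs, m, args, given) \
  (TOY_HAS_METH_P (cs, m) ? TOY_METH (cs, m, args) : (given))

static void toy_init (int *counter) { ++*counter; }
static int toy_end_type (int arg) { return arg * 2; }

/* Exercise both macros against a full and an empty method table. */
int
toy_dispatch_demo (void)
{
  struct toy_methods full = { toy_init, toy_end_type };
  struct toy_methods empty = { NULL, NULL };
  struct toy_coding_system a = { &full };
  struct toy_coding_system b = { &empty };
  int calls = 0;

  TOY_MAYBE_METH (&a, init, (&calls));   /* method present: called */
  TOY_MAYBE_METH (&b, init, (&calls));   /* method absent: skipped */
  assert (calls == 1);

  assert (TOY_METH_OR_GIVEN (&a, end_type, (21), -1) == 42);
  assert (TOY_METH_OR_GIVEN (&b, end_type, (21), -1) == -1);
  return calls;
}
```

/* ARGS is passed already parenthesized, e.g. (&calls), so one macro can
   forward any number of arguments without variadic macros. */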
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	446
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	447 /*** Defining new coding-system types ***/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	448
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	449 #define coding_system_data_offset (offsetof (Lisp_Coding_System, data))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	450 extern const struct lrecord_description coding_system_empty_extra_description[];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	451
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	452 #ifdef ERROR_CHECK_TYPECHECK
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	453 #define DECLARE_CODING_SYSTEM_TYPE(type) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	454 \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	455 extern struct coding_system_methods * type##_coding_system_methods; \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	456 INLINE_HEADER struct type##_coding_system * \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	457 error_check_##type##_coding_system_data (Lisp_Coding_System *cs); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	458 INLINE_HEADER struct type##_coding_system * \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	459 error_check_##type##_coding_system_data (Lisp_Coding_System *cs) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	460 { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	461 assert (CODING_SYSTEM_TYPE_P (cs, type)); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	462 /* Catch accidental use of INITIALIZE_CODING_SYSTEM_TYPE in place \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	463 of INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA. */ \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	464 assert (cs->methods->extra_data_size > 0); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	465 return (struct type##_coding_system *) cs->data; \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	466 } \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	467 \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	468 INLINE_HEADER struct type##_coding_stream * \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	469 error_check_##type##_coding_stream_data (struct coding_stream *s); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	470 INLINE_HEADER struct type##_coding_stream * \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	471 error_check_##type##_coding_stream_data (struct coding_stream *s) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	472 { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	473 assert (XCODING_SYSTEM_TYPE_P (s->codesys, type)); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	474 return (struct type##_coding_stream *) s->data; \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	475 } \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	476 \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	477 INLINE_HEADER Lisp_Coding_System * \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	478 error_check_##type##_coding_system_type (Lisp_Object obj); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	479 INLINE_HEADER Lisp_Coding_System * \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	480 error_check_##type##_coding_system_type (Lisp_Object obj) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	481 { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	482 Lisp_Coding_System *cs = XCODING_SYSTEM (obj); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	483 assert (CODING_SYSTEM_TYPE_P (cs, type)); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	484 return cs; \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	485 } \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	486 \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	487 DECLARE_NOTHING
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	488 #else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	489 #define DECLARE_CODING_SYSTEM_TYPE(type) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	490 extern struct coding_system_methods * type##_coding_system_methods
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	491 #endif /* ERROR_CHECK_TYPECHECK */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	492
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	493 #define DEFINE_CODING_SYSTEM_TYPE(type) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	494 struct coding_system_methods * type##_coding_system_methods
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	495
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	496 #define INITIALIZE_CODING_SYSTEM_TYPE(ty, pred_sym) do { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	497 ty##_coding_system_methods = \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	498 xnew_and_zero (struct coding_system_methods); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	499 ty##_coding_system_methods->type = Q##ty; \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	500 ty##_coding_system_methods->extra_description = \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	501 coding_system_empty_extra_description; \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	502 defsymbol_nodump (&ty##_coding_system_methods->predicate_symbol, \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	503 pred_sym); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	504 add_entry_to_coding_system_type_list (ty##_coding_system_methods); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	505 dump_add_root_struct_ptr (&ty##_coding_system_methods, \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	506 &coding_system_methods_description); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	507 } while (0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	508
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	509 #define REINITIALIZE_CODING_SYSTEM_TYPE(type) do { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	510 staticpro_nodump (&type##_coding_system_methods->predicate_symbol); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	511 } while (0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	512
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	513 /* This assumes the existence of two structures and a pdump description:
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	514
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	515 struct foo_coding_system (attached to the coding system)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	516 struct foo_coding_stream (per coding process, attached to the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	517 struct coding_stream)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	518 const struct foo_coding_system_description[] (pdump description of
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	519 struct foo_coding_system)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	520
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	521 NOTE: The description must have coding_system_data_offset added to
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	522 all offsets in it! For an example of how to do things, see
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	523 chain_coding_system_description.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	524 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	525 #define INITIALIZE_CODING_SYSTEM_TYPE_WITH_DATA(type, pred_sym) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	526 do { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	527 INITIALIZE_CODING_SYSTEM_TYPE (type, pred_sym); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	528 type##_coding_system_methods->extra_data_size = \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	529 sizeof (struct type##_coding_system); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	530 type##_coding_system_methods->extra_description = \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	531 type##_coding_system_description; \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	532 type##_coding_system_methods->coding_data_size = \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	533 sizeof (struct type##_coding_stream); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	534 } while (0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	535
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	536 /* Declare that coding-system-type TYPE has method METH; used in
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	537 initialization routines */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	538 #define CODING_SYSTEM_HAS_METHOD(type, meth) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	539 (type##_coding_system_methods->meth##_method = type##_##meth)
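
The definition, initialization, and method-registration macros above all rely on token pasting to derive one global `type##_coding_system_methods` pointer per type. The following is a minimal standalone sketch of that pattern, not the XEmacs implementation. It assumes simplified stand-ins: the Lisp symbol, predicate symbol, and pdump registration are dropped, `type_name` replaces `Q##ty`, `calloc` replaces `xnew_and_zero`, and the `rot13` and `identity` types with their `convert` method are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for the real method table. */
struct coding_system_methods
{
  const char *type_name;          /* stands in for the Lisp symbol Q##ty */
  size_t extra_data_size;
  size_t coding_data_size;
  int (*convert_method) (int c);  /* hypothetical method slot */
};

struct coding_system
{
  struct coding_system_methods *methods;
};

#define DEFINE_CODING_SYSTEM_TYPE(type) \
  struct coding_system_methods *type##_coding_system_methods

/* Unlike the real macro, this takes no predicate symbol and does no
   dumper registration. */
#define INITIALIZE_CODING_SYSTEM_TYPE(ty) do {                    \
  ty##_coding_system_methods = (struct coding_system_methods *)   \
    calloc (1, sizeof (struct coding_system_methods));            \
  if (!ty##_coding_system_methods)                                \
    abort ();                                                     \
  ty##_coding_system_methods->type_name = #ty;                    \
} while (0)

#define CODING_SYSTEM_HAS_METHOD(type, meth) \
  (type##_coding_system_methods->meth##_method = type##_##meth)

#define CODING_SYSTEM_TYPE_P(cs, type) \
  ((cs)->methods == type##_coding_system_methods)

DEFINE_CODING_SYSTEM_TYPE (rot13);
DEFINE_CODING_SYSTEM_TYPE (identity);

/* Hypothetical methods; the names must follow the type##_##meth scheme. */
static int
rot13_convert (int c)
{
  if (c >= 'a' && c <= 'z') return 'a' + (c - 'a' + 13) % 26;
  if (c >= 'A' && c <= 'Z') return 'A' + (c - 'A' + 13) % 26;
  return c;
}

static int
identity_convert (int c)
{
  return c;
}

static void
init_types (void)
{
  INITIALIZE_CODING_SYSTEM_TYPE (rot13);
  CODING_SYSTEM_HAS_METHOD (rot13, convert);
  INITIALIZE_CODING_SYSTEM_TYPE (identity);
  CODING_SYSTEM_HAS_METHOD (identity, convert);
}
```

Because each type owns exactly one method table, a type test reduces to a pointer comparison, which is what `CODING_SYSTEM_TYPE_P` does.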
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	540
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	541 /*** Macros for accessing coding-system types ***/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	542
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	543 #define CODING_SYSTEM_TYPE_P(cs, type) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	544 ((cs)->methods == type##_coding_system_methods)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	545 #define XCODING_SYSTEM_TYPE_P(cs, type) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	546 CODING_SYSTEM_TYPE_P (XCODING_SYSTEM (cs), type)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	547
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	548 #ifdef ERROR_CHECK_TYPECHECK
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	549 # define CODING_SYSTEM_TYPE_DATA(cs, type) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	550 error_check_##type##_coding_system_data (cs)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	551 #else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	552 # define CODING_SYSTEM_TYPE_DATA(cs, type) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	553 ((struct type##_coding_system *) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	554 (cs)->data)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	555 #endif
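
The two definitions of `CODING_SYSTEM_TYPE_DATA` above differ only in whether the cast is guarded by assertions. The sketch below illustrates that checked-accessor pattern in isolation. It is an assumption-laden simplification: `data` is a plain `void *` here rather than storage inside the coding-system object, `struct chain_coding_system` is reduced to a single field, and the method table is a static object instead of one allocated at initialization.

```c
#include <assert.h>
#include <stddef.h>

struct coding_system_methods { size_t extra_data_size; };

struct coding_system
{
  struct coding_system_methods *methods;
  void *data;                  /* type-specific payload (simplified) */
};

struct chain_coding_system { int count; };   /* simplified stand-in */

static struct coding_system_methods chain_methods_storage =
  { sizeof (struct chain_coding_system) };
static struct coding_system_methods *chain_coding_system_methods =
  &chain_methods_storage;

#define CODING_SYSTEM_TYPE_P(cs, type) \
  ((cs)->methods == type##_coding_system_methods)

#define ERROR_CHECK_TYPECHECK 1   /* enabled for this sketch */

#ifdef ERROR_CHECK_TYPECHECK
/* Checked variant: verify the type and that the type carries data
   before handing out the typed pointer. */
static struct chain_coding_system *
error_check_chain_coding_system_data (struct coding_system *cs)
{
  assert (CODING_SYSTEM_TYPE_P (cs, chain));
  assert (cs->methods->extra_data_size > 0);
  return (struct chain_coding_system *) cs->data;
}
# define CODING_SYSTEM_TYPE_DATA(cs, type) \
  error_check_##type##_coding_system_data (cs)
#else
/* Unchecked variant: a bare cast. */
# define CODING_SYSTEM_TYPE_DATA(cs, type) \
  ((struct type##_coding_system *) (cs)->data)
#endif

#define CODING_SYSTEM_CHAIN_COUNT(cs) \
  (CODING_SYSTEM_TYPE_DATA (cs, chain)->count)
```

Both variants yield a pointer, so field accessors built on top of them, such as `CODING_SYSTEM_CHAIN_COUNT`, remain usable as lvalues in either build.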
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	556
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	557 #define XCODING_SYSTEM_TYPE_DATA(cs, type) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	558 CODING_SYSTEM_TYPE_DATA (XCODING_SYSTEM_OF_TYPE (cs, type), type)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	559
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	560 #ifdef ERROR_CHECK_TYPECHECK
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	561 # define XCODING_SYSTEM_OF_TYPE(x, type) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	562 error_check_##type##_coding_system_type (x)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	563 # define XSETCODING_SYSTEM_OF_TYPE(x, p, type) do \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	564 { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	565 XSETCODING_SYSTEM (x, p); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	566 assert (CODING_SYSTEM_TYPEP (x, type)); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	567 } while (0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	568 #else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	569 # define XCODING_SYSTEM_OF_TYPE(x, type) XCODING_SYSTEM (x)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	570 # define XSETCODING_SYSTEM_OF_TYPE(x, p, type) XSETCODING_SYSTEM (x, p)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	571 #endif /* ERROR_CHECK_TYPECHECK */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	572
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	573 #define CODING_SYSTEM_TYPEP(x, type) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	574 (CODING_SYSTEMP (x) && CODING_SYSTEM_TYPE_P (XCODING_SYSTEM (x), type))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	575 #define CHECK_CODING_SYSTEM_OF_TYPE(x, type) do { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	576 CHECK_CODING_SYSTEM (x); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	577 if (!CODING_SYSTEM_TYPE_P (XCODING_SYSTEM (x), type)) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	578 dead_wrong_type_argument \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	579 (type##_coding_system_methods->predicate_symbol, x); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	580 } while (0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	581 #define CONCHECK_CODING_SYSTEM_OF_TYPE(x, type) do { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	582 CONCHECK_CODING_SYSTEM (x); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	583 if (!(CODING_SYSTEM_TYPEP (x, type))) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	584 x = wrong_type_argument \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	585 (type##_coding_system_methods->predicate_symbol, x); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	586 } while (0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	587
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	588 #define CODING_SYSTEM_METHODS(codesys) ((codesys)->methods)
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	589 #define CODING_SYSTEM_NAME(codesys) ((codesys)->name)
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	590 #define CODING_SYSTEM_DESCRIPTION(codesys) ((codesys)->description)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	591 #define CODING_SYSTEM_TYPE(codesys) ((codesys)->methods->type)
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	592 #define CODING_SYSTEM_MNEMONIC(codesys) ((codesys)->mnemonic)
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	593 #define CODING_SYSTEM_DOCUMENTATION(codesys) ((codesys)->documentation)
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	594 #define CODING_SYSTEM_POST_READ_CONVERSION(codesys) \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	595 ((codesys)->post_read_conversion)
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	596 #define CODING_SYSTEM_PRE_WRITE_CONVERSION(codesys) \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	597 ((codesys)->pre_write_conversion)
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	598 #define CODING_SYSTEM_EOL_TYPE(codesys) ((codesys)->eol_type)
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	599 #define CODING_SYSTEM_EOL_LF(codesys) ((codesys)->eol[EOL_LF])
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	600 #define CODING_SYSTEM_EOL_CRLF(codesys) ((codesys)->eol[EOL_CRLF])
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	601 #define CODING_SYSTEM_EOL_CR(codesys) ((codesys)->eol[EOL_CR])
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	602 #define CODING_SYSTEM_TEXT_FILE_WRAPPER(codesys) ((codesys)->text_file_wrapper)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	603 #define CODING_SYSTEM_AUTO_EOL_WRAPPER(codesys) ((codesys)->auto_eol_wrapper)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	604 #define CODING_SYSTEM_SUBSIDIARY_PARENT(codesys) ((codesys)->subsidiary_parent)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	605 #define CODING_SYSTEM_CANONICAL(codesys) ((codesys)->canonical)
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	606
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	607 #define CODING_SYSTEM_CHAIN_CHAIN(codesys) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	608 (CODING_SYSTEM_TYPE_DATA (codesys, chain)->chain)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	609 #define CODING_SYSTEM_CHAIN_COUNT(codesys) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	610 (CODING_SYSTEM_TYPE_DATA (codesys, chain)->count)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	611 #define CODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING(codesys) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	612 (CODING_SYSTEM_TYPE_DATA (codesys, chain)->canonicalize_after_coding)
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	613
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	614 #define XCODING_SYSTEM_METHODS(codesys) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	615 CODING_SYSTEM_METHODS (XCODING_SYSTEM (codesys))
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	616 #define XCODING_SYSTEM_NAME(codesys) \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	617 CODING_SYSTEM_NAME (XCODING_SYSTEM (codesys))
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	618 #define XCODING_SYSTEM_DESCRIPTION(codesys) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	619 CODING_SYSTEM_DESCRIPTION (XCODING_SYSTEM (codesys))
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	620 #define XCODING_SYSTEM_TYPE(codesys) \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	621 CODING_SYSTEM_TYPE (XCODING_SYSTEM (codesys))
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	622 #define XCODING_SYSTEM_MNEMONIC(codesys) \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	623 CODING_SYSTEM_MNEMONIC (XCODING_SYSTEM (codesys))
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	624 #define XCODING_SYSTEM_DOCUMENTATION(codesys) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	625 CODING_SYSTEM_DOCUMENTATION (XCODING_SYSTEM (codesys))
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	626 #define XCODING_SYSTEM_POST_READ_CONVERSION(codesys) \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	627 CODING_SYSTEM_POST_READ_CONVERSION (XCODING_SYSTEM (codesys))
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	628 #define XCODING_SYSTEM_PRE_WRITE_CONVERSION(codesys) \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	629 CODING_SYSTEM_PRE_WRITE_CONVERSION (XCODING_SYSTEM (codesys))
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	630 #define XCODING_SYSTEM_EOL_TYPE(codesys) \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	631 CODING_SYSTEM_EOL_TYPE (XCODING_SYSTEM (codesys))
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	632 #define XCODING_SYSTEM_EOL_LF(codesys) \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	633 CODING_SYSTEM_EOL_LF (XCODING_SYSTEM (codesys))
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	634 #define XCODING_SYSTEM_EOL_CRLF(codesys) \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	635 CODING_SYSTEM_EOL_CRLF (XCODING_SYSTEM (codesys))
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	636 #define XCODING_SYSTEM_EOL_CR(codesys) \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	637 CODING_SYSTEM_EOL_CR (XCODING_SYSTEM (codesys))
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	638 #define XCODING_SYSTEM_TEXT_FILE_WRAPPER(codesys) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	639 CODING_SYSTEM_TEXT_FILE_WRAPPER (XCODING_SYSTEM (codesys))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	640 #define XCODING_SYSTEM_AUTO_EOL_WRAPPER(codesys) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	641 CODING_SYSTEM_AUTO_EOL_WRAPPER (XCODING_SYSTEM (codesys))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	642 #define XCODING_SYSTEM_SUBSIDIARY_PARENT(codesys) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	643 CODING_SYSTEM_SUBSIDIARY_PARENT (XCODING_SYSTEM (codesys))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	644 #define XCODING_SYSTEM_CANONICAL(codesys) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	645 CODING_SYSTEM_CANONICAL (XCODING_SYSTEM (codesys))
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	646
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	647 #define XCODING_SYSTEM_CHAIN_CHAIN(codesys) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	648 CODING_SYSTEM_CHAIN_CHAIN (XCODING_SYSTEM (codesys))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	649 #define XCODING_SYSTEM_CHAIN_COUNT(codesys) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	650 CODING_SYSTEM_CHAIN_COUNT (XCODING_SYSTEM (codesys))
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	651 #define XCODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING(codesys) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	652 CODING_SYSTEM_CHAIN_CANONICALIZE_AFTER_CODING (XCODING_SYSTEM (codesys))
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	653
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	654 /**************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	655 /* Detection */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	656 /**************************************************/
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	657
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	658 #define MAX_DETECTOR_CATEGORIES 256
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	659 #define MAX_DETECTORS 64
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	660
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	661 #define MAX_BYTES_PROCESSED_FOR_DETECTION 65536
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	662
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	663 struct detection_state
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	664 {
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	665 int seen_non_ascii;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	666 Bytecount bytes_seen;
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	667
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	668 char categories[MAX_DETECTOR_CATEGORIES];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	669 Bytecount data_offset[MAX_DETECTORS];
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	670 /* ... more data follows; data_offset[detector_##TYPE] points to
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	671 the data for that type */
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	672 };
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	673
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	674 #define DETECTION_STATE_DATA(st, type) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	675 ((struct type##_detector *) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	676 ((char *) (st) + (st)->data_offset[detector_##type]))
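
`DETECTION_STATE_DATA` treats the detection state as a header followed by per-detector blocks in the same allocation, each located through `data_offset`. The sketch below shows one way such a block could be laid out and accessed. It is illustrative only: how the real code computes the offsets is not shown in this header, so `make_detection_state`, the `foo` and `bar` detectors, the alignment helper, and the reduced `MAX_DETECTORS` are all assumptions, and `ptrdiff_t` stands in for `Bytecount`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_DETECTORS 4     /* reduced for the sketch */

enum { detector_foo, detector_bar };   /* hypothetical detector indices */

struct foo_detector { int escapes_seen; };                 /* hypothetical */
struct bar_detector { long odd_bytes; long even_bytes; };  /* hypothetical */

struct detection_state
{
  int seen_non_ascii;
  ptrdiff_t bytes_seen;
  ptrdiff_t data_offset[MAX_DETECTORS];
  /* per-detector data follows in the same allocation */
};

#define DETECTION_STATE_DATA(st, type) \
  ((struct type##_detector *) \
   ((char *) (st) + (st)->data_offset[detector_##type]))

/* Round N up to a multiple of a conservatively aligned unit so that
   every per-detector block starts at a suitably aligned address. */
union align_unit { long l; long long ll; double d; void *p; };
#define ALIGN_UP(n) \
  (((n) + sizeof (union align_unit) - 1) \
   / sizeof (union align_unit) * sizeof (union align_unit))

/* Allocate the header plus both detector blocks in one zeroed chunk
   and record where each block starts.  Returns NULL on failure. */
static struct detection_state *
make_detection_state (void)
{
  size_t foo_off = ALIGN_UP (sizeof (struct detection_state));
  size_t bar_off = ALIGN_UP (foo_off + sizeof (struct foo_detector));
  size_t total = bar_off + sizeof (struct bar_detector);
  struct detection_state *st =
    (struct detection_state *) calloc (1, total);

  if (!st)
    return NULL;
  st->data_offset[detector_foo] = (ptrdiff_t) foo_off;
  st->data_offset[detector_bar] = (ptrdiff_t) bar_off;
  return st;
}
```

A caller would then write, for example, `DETECTION_STATE_DATA (st, foo)->escapes_seen++` and release everything with a single `free (st)`.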
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	677
448 3078fd1074e8 Import from CVS: tag r21-2-39 cvs parents: 440 diff changeset	678 /* Distinguishable categories of encodings.
3078fd1074e8 Import from CVS: tag r21-2-39 cvs parents: 440 diff changeset	679
3078fd1074e8 Import from CVS: tag r21-2-39 cvs parents: 440 diff changeset	680 This list determines the initial priority of the categories.
3078fd1074e8 Import from CVS: tag r21-2-39 cvs parents: 440 diff changeset	681
3078fd1074e8 Import from CVS: tag r21-2-39 cvs parents: 440 diff changeset	682 For better or worse, currently Mule files are encoded in 7-bit ISO 2022.
3078fd1074e8 Import from CVS: tag r21-2-39 cvs parents: 440 diff changeset	683 For this reason, under Mule ISO_7 gets highest priority.
3078fd1074e8 Import from CVS: tag r21-2-39 cvs parents: 440 diff changeset	684
3078fd1074e8 Import from CVS: tag r21-2-39 cvs parents: 440 diff changeset	685 Putting NO_CONVERSION second prevents "binary corruption" in the
3078fd1074e8 Import from CVS: tag r21-2-39 cvs parents: 440 diff changeset	686 default case in all but the (presumably) extremely rare case of a
3078fd1074e8 Import from CVS: tag r21-2-39 cvs parents: 440 diff changeset	687 binary file which contains redundant escape sequences but no 8-bit
3078fd1074e8 Import from CVS: tag r21-2-39 cvs parents: 440 diff changeset	688 characters.
3078fd1074e8 Import from CVS: tag r21-2-39 cvs parents: 440 diff changeset	689
3078fd1074e8 Import from CVS: tag r21-2-39 cvs parents: 440 diff changeset	690 The remaining priorities are based on perceived "internationalization
3078fd1074e8 Import from CVS: tag r21-2-39 cvs parents: 440 diff changeset	691 political correctness." An exception is UCS-4 at the bottom, since
3078fd1074e8 Import from CVS: tag r21-2-39 cvs parents: 440 diff changeset	692 basically everything is compatible with UCS-4, but it is likely to
3078fd1074e8 Import from CVS: tag r21-2-39 cvs parents: 440 diff changeset	693 be very rare as an external encoding. */
3078fd1074e8 Import from CVS: tag r21-2-39 cvs parents: 440 diff changeset	694
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	695 /* Macros to define code of control characters for ISO2022's functions. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	696 /* Used by the detection routines of other coding system types as well. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	697 /* code */ /* function */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	698 #define ISO_CODE_LF 0x0A /* line-feed */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	699 #define ISO_CODE_CR 0x0D /* carriage-return */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	700 #define ISO_CODE_SO 0x0E /* shift-out */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	701 #define ISO_CODE_SI 0x0F /* shift-in */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	702 #define ISO_CODE_ESC 0x1B /* escape */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	703 #define ISO_CODE_DEL 0x7F /* delete */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	704 #define ISO_CODE_SS2 0x8E /* single-shift-2 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	705 #define ISO_CODE_SS3 0x8F /* single-shift-3 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	706 #define ISO_CODE_CSI 0x9B /* control-sequence-introducer */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	707
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	708 enum detection_result
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	709 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	710 /* Basically means a magic cookie was seen indicating this type, or
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	711 something similar. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	712 DET_NEAR_CERTAINTY = 4,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	713 DET_HIGHEST = 4,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	714 /* Characteristics seen that are unlikely to be other coding system types
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	715 -- e.g. ISO-2022 escape sequences, or perhaps a consistent pattern of
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	716 alternating zero bytes in UTF-16, along with Unicode LF or CRLF
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	717 sequences at regular intervals. (Zero bytes are unlikely or impossible
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	718 in most text encodings.) */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	719 DET_QUITE_PROBABLE = 3,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	720 /* Strong or medium statistical likelihood. At least some
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	721 characteristics seen that match what's normally found in this encoding
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	722 -- e.g. in Shift-JIS, a number of two-byte Japanese character
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	723 sequences in the right range, and nothing out of range; or in Unicode,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	724 much higher statistical variance in the odd bytes than in the even
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	725 bytes, or vice-versa (perhaps the presence of regular EOL sequences
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	726 would bump this too to DET_QUITE_PROBABLE). This is quite often a
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	727 statistical test. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	728 DET_SOMEWHAT_LIKELY = 2,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	729 /* Weak statistical likelihood. Pretty much any features at all that
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	730 characterize this encoding, and nothing that rules against it. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	731 DET_SLIGHTLY_LIKELY = 1,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	732 /* Default state. Perhaps it indicates pure ASCII or something similarly
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	733 vague seen in Shift-JIS, or, exactly as the level says, it might mean
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	734 in a statistical-based detector that the pros and cons are balanced
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	735 out. This is also the lowest level that will be accepted by the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	736 auto-detector without asking the user: If all available detectors
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	737 report lower levels for all categories with attached coding systems,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	738 the user will be shown the results and explicitly prompted for action.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	739 The user will also be prompted if this is the highest available level
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	740 and more than one detector reports the level. (See below about the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	741 consequent necessity of an "ASCII" detector, which will return level 1
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	742 or higher for most plain text files.) */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	743 DET_AS_LIKELY_AS_UNLIKELY = 0,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	744 /* Some characteristics seen that are unusual for this encoding --
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	745 e.g. unusual control characters in a plain-text encoding, lots of
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	746 8-bit characters, or little statistical variance in the odd and even
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	747 bytes in UTF-16. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	748 DET_SOMEWHAT_UNLIKELY = -1,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	749 /* This indicates that there is very little chance the data is in the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	750 right format; this is probably the lowest level you can get when
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	751 presenting random binary data as a text file, because there are no
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	752 "specific sequences" you can see that would totally rule out
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	753 recognition. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	754 DET_QUITE_IMPROBABLE = -2,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	755 /* An erroneous sequence was seen. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	756 DET_NEARLY_IMPOSSIBLE = -3,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	757 DET_LOWEST = -3,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	758 };
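The comment on DET_AS_LIKELY_AS_UNLIKELY describes when the auto-detector would fall back to asking the user. The following is a minimal sketch of that rule, not code from file-coding.h: it assumes the per-category results have been gathered into a plain int array, and the helper name needs_user_prompt is hypothetical. It reads "more than one detector reports the level" as "more than one entry ties at level 0".

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the levels shown above, for illustration only. */
enum detection_result
{
  DET_HIGHEST = 4,
  DET_QUITE_PROBABLE = 3,
  DET_SOMEWHAT_LIKELY = 2,
  DET_SLIGHTLY_LIKELY = 1,
  DET_AS_LIKELY_AS_UNLIKELY = 0,
  DET_SOMEWHAT_UNLIKELY = -1,
  DET_QUITE_IMPROBABLE = -2,
  DET_NEARLY_IMPOSSIBLE = -3,
  DET_LOWEST = -3
};

/* Hypothetical helper: return 1 if the user should be prompted, 0 if the
   best result can be accepted silently. */
static int
needs_user_prompt (const int *results, size_t n)
{
  int best = DET_LOWEST;
  size_t n_best = 0, i;

  /* Find the highest level and how many entries share it. */
  for (i = 0; i < n; i++)
    {
      if (results[i] > best)
        {
          best = results[i];
          n_best = 1;
        }
      else if (results[i] == best)
        n_best++;
    }

  /* Everything is below the default level: ask. */
  if (best < DET_AS_LIKELY_AS_UNLIKELY)
    return 1;
  /* The default level is the best we have, and it is ambiguous: ask. */
  if (best == DET_AS_LIKELY_AS_UNLIKELY && n_best > 1)
    return 1;
  return 0;
}
```

Under these assumptions, a single category at level 0 with everything else negative is accepted without a prompt, while two categories tied at level 0 are not.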
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	759
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	760 extern int coding_detector_count;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	761 extern int coding_detector_category_count;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	762
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	763 struct detector_category
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	764 {
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	765 int id;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	766 Lisp_Object sym;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	767 };
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	768
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	769 typedef struct
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	770 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	771 Dynarr_declare (struct detector_category);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	772 } detector_category_dynarr;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	773
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	774 struct detector
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	775 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	776 int id;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	777 detector_category_dynarr *cats;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	778 Bytecount data_size;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	779 /* Detect method: Required. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	780 void (*detect_method) (struct detection_state *st,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	781 const unsigned char *src, Bytecount n);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	782 /* Finalize detection state method: Clean up any allocated data in the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	783 detection state. Called only once (NOT called at disksave time).
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	784 Optional. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	785 void (*finalize_detection_state_method) (struct detection_state *st);
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	786 };
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	787
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	788 /* Lvalue for a particular detection result -- detection state ST,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	789 category CAT */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	790 #define DET_RESULT(st, cat) ((st)->categories[detector_category_##cat])
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	791 /* In state ST, set all detection results associated with detector DET to
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	792 RESULT. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	793 #define SET_DET_RESULTS(st, det, result) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	794 set_detection_results (st, detector_##det, result)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	795
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	796 typedef struct
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	797 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	798 Dynarr_declare (struct detector);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	799 } detector_dynarr;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	800
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	801 extern detector_dynarr *all_coding_detectors;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	802
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	803 #define DEFINE_DETECTOR_CATEGORY(detector, cat) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	804 int detector_category_##cat
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	805 #define DECLARE_DETECTOR_CATEGORY(detector, cat) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	806 extern int detector_category_##cat
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	807 #define INITIALIZE_DETECTOR_CATEGORY(detector, cat) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	808 do { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	809 struct detector_category dog; \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	810 xzero (dog); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	811 detector_category_##cat = coding_detector_category_count++; \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	812 dump_add_opaque_int (&detector_category_##cat); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	813 dog.id = detector_category_##cat; \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	814 dog.sym = Q##cat; \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	815 Dynarr_add (Dynarr_at (all_coding_detectors, detector_##detector).cats, \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	816 dog); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	817 } while (0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	818
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	819 #define DEFINE_DETECTOR(Detector) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	820 int detector_##Detector
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	821 #define DECLARE_DETECTOR(Detector) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	822 extern int detector_##Detector
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	823 #define INITIALIZE_DETECTOR(Detector) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	824 do { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	825 struct detector det; \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	826 xzero (det); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	827 detector_##Detector = coding_detector_count++; \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	828 dump_add_opaque_int (&detector_##Detector); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	829 det.id = detector_##Detector; \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	830 det.cats = Dynarr_new2 (detector_category_dynarr, \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	831 struct detector_category); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	832 det.data_size = sizeof (struct Detector##_detector); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	833 Dynarr_add (all_coding_detectors, det); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	834 } while (0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	835 #define DETECTOR_HAS_METHOD(Detector, Meth) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	836 Dynarr_at (all_coding_detectors, detector_##Detector).Meth##_method = \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	837 Detector##_##Meth
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	838
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	839
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	840 /**************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	841 /* Decoding/Encoding */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	842 /**************************************************/
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	843
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	844 /* Is the source (SOURCEP == 1) or sink (SOURCEP == 0) when encoding specified
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	845 in characters? */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	846
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	847 enum source_or_sink
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	848 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	849 CODING_SOURCE,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	850 CODING_SINK
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	851 };
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	852
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	853 enum encode_decode
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	854 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	855 CODING_ENCODE,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	856 CODING_DECODE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	857 };
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	858
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	859 /* Data structure attached to an lstream of type `coding',
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	860 containing values specific to the coding process. Additional
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	861 data is stored in the DATA field below; the exact form of that data
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	862 is controlled by the type of the coding system that governs the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	863 conversion (field CODESYS). CODESYS may be set at any time
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	864 throughout the lifetime of the lstream and possibly more than once.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	865 See long comment above for more info. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	866
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	867 struct coding_stream
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	868 {
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	869 /* Coding system that governs the conversion. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	870 Lisp_Object codesys;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	871 /* Original coding system, pre-canonicalization. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	872 Lisp_Object orig_codesys;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	873
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	874 /* Back pointer to current stream. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	875 Lstream *us;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	876
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	877 /* Stream that we read the unprocessed data from or write the processed
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	878 data to. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	879 Lstream *other_end;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	880
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	881 /* In order to handle both reading from and writing to a coding stream,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	882 we phrase the conversion methods like write methods -- we can
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	883 implement reading in terms of a write method but not vice-versa,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	884 because the write method is forced to take only what it's given but
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	885 the read method can read more data from the other end if necessary.
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	886 On the other hand, the write method is free to generate all the data
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	887 it wants (and just write it to the other end), but the read method
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	888 can return only as much as was asked for, so we need to implement our
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	889 own buffering. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	890
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	891 /* If we are reading, then we can return only a fixed amount of data, but
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	892 the converter is free to return as much as it wants, so we direct it
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	893 to store the data here and lop off chunks as we need them. If we are
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	894 writing, we use this because the converter takes a Dynarr but we are
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	895 supposed to write into a fixed buffer. (NOTE: This introduces an extra
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	896 memory copy.) */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	897 unsigned_char_dynarr *convert_to;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	898
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	899 /* The conversion method might reject some of the data -- this typically
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	900 includes partial characters, partial escape sequences, etc. When
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	901 writing, we just pass the rejection up to the Lstream module, and it
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	902 will buffer the data. When reading, however, we need to do the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	903 buffering ourselves, and we put it here, combined with newly read
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	904 data. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	905 unsigned_char_dynarr *convert_from;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	906
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	907 /* If set, this is the last chunk of data being processed. When this is
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	908 finished, output any necessary terminating control characters, escape
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	909 sequences, etc. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	910 unsigned int eof:1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	911
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	912 /* CH holds a partially built-up character. This is really part of the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	913 state-dependent data and should be moved there. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	914 unsigned int ch;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	915
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	916 /* Coding-system-specific data holding extra state about the
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	917 conversion. Logically a struct TYPE_coding_stream; a pointer
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	918 to such a struct, with (when ERROR_CHECK_TYPECHECK is defined)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	919 error-checking that this is really a structure of that type
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	920 (checking the corresponding coding system type) can be retrieved using
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	921 CODING_STREAM_TYPE_DATA(). Allocated at the same time that
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	922 CODESYS is set (which may occur at any time, even multiple times,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	923 during the lifetime of the stream). The size comes from
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	924 methods->coding_data_size. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	925 void *data;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	926
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	927 enum encode_decode direction;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	928
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	929 /* #### Temporary test */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	930 unsigned int finalized:1;
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	931 };
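The comment on convert_to explains why the read side needs its own buffering: the converter, phrased like a write method, may emit any amount of output, while a reader may take only a fixed amount per call. Here is a rough sketch of that idea, using a fixed-capacity array in place of a Dynarr. Every name here (sketch_out_buffer, sketch_convert, sketch_read) is hypothetical, and the "converter" is a toy that simply doubles each byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the convert_to Dynarr: a fixed-capacity
   byte queue. */
struct sketch_out_buffer
{
  unsigned char bytes[256];
  size_t len;
};

/* Toy converter phrased like a write method: it consumes exactly the N
   bytes it is given and appends as much output as it likes (here, each
   input byte twice).  The caller must ensure 2 * N bytes of room. */
static void
sketch_convert (struct sketch_out_buffer *to,
                const unsigned char *src, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    {
      to->bytes[to->len++] = src[i];
      to->bytes[to->len++] = src[i];
    }
}

/* Read side: hand back at most SIZE converted bytes and keep the rest
   buffered for the next call.  The memcpy is the extra memory copy that
   the comment mentions. */
static size_t
sketch_read (struct sketch_out_buffer *to, unsigned char *dst, size_t size)
{
  size_t chunk = to->len < size ? to->len : size;
  memcpy (dst, to->bytes, chunk);
  memmove (to->bytes, to->bytes + chunk, to->len - chunk);
  to->len -= chunk;
  return chunk;
}
```

Converting two input bytes yields four output bytes; a reader asking for three gets three, and the fourth stays queued until the next read.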
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	932
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	933 #define CODING_STREAM_DATA(stream) LSTREAM_TYPE_DATA (stream, coding)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	934
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	935 #ifdef ERROR_CHECK_TYPECHECK
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	936 # define CODING_STREAM_TYPE_DATA(s, type) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	937 error_check_##type##_coding_stream_data (s)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	938 #else
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	939 # define CODING_STREAM_TYPE_DATA(s, type) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	940 ((struct type##_coding_stream *) (s)->data)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	941 #endif
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	942
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	943 /* C should be a binary character in the range 0 - 255; convert
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	944 to internal format and add to Dynarr DST. */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	945
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	946 #ifdef MULE
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	947
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	948 #define DECODE_ADD_BINARY_CHAR(c, dst) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	949 do { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	950 if (BYTE_ASCII_P (c)) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	951 Dynarr_add (dst, c); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	952 else if (BYTE_C1_P (c)) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	953 { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	954 Dynarr_add (dst, LEADING_BYTE_CONTROL_1); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	955 Dynarr_add (dst, c + 0x20); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	956 } \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	957 else \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	958 { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	959 Dynarr_add (dst, LEADING_BYTE_LATIN_ISO8859_1); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	960 Dynarr_add (dst, c); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	961 } \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	962 } while (0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	963
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	964 #else /* not MULE */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	965
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	966 #define DECODE_ADD_BINARY_CHAR(c, dst) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	967 do { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	968 Dynarr_add (dst, c); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	969 } while (0)
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	970
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	971 #endif /* MULE */
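To make the three branches of the MULE variant of DECODE_ADD_BINARY_CHAR concrete, here is a minimal sketch that writes into a plain array instead of a Dynarr. The constant values and byte-range predicates below are assumptions made for illustration (the real definitions live elsewhere in the XEmacs sources), and every SKETCH_/sketch_ name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, for illustration only. */
#define SKETCH_LEADING_BYTE_LATIN_ISO8859_1 0x81
#define SKETCH_LEADING_BYTE_CONTROL_1 0x8F
#define SKETCH_BYTE_ASCII_P(c) ((c) < 0x80)
#define SKETCH_BYTE_C1_P(c) ((c) >= 0x80 && (c) < 0xA0)

/* Hypothetical stand-in for DECODE_ADD_BINARY_CHAR: store the internal
   form of binary byte C in OUT and return the number of bytes written
   (1 or 2).  OUT must have room for 2 bytes. */
static size_t
sketch_add_binary_char (unsigned char c, unsigned char *out)
{
  if (SKETCH_BYTE_ASCII_P (c))
    {
      /* ASCII is stored as-is. */
      out[0] = c;
      return 1;
    }
  if (SKETCH_BYTE_C1_P (c))
    {
      /* C1 controls get a leading byte and are shifted up by 0x20. */
      out[0] = SKETCH_LEADING_BYTE_CONTROL_1;
      out[1] = (unsigned char) (c + 0x20);
      return 2;
    }
  /* Everything else is treated as Latin-1. */
  out[0] = SKETCH_LEADING_BYTE_LATIN_ISO8859_1;
  out[1] = c;
  return 2;
}
```

With these assumed values, an ASCII byte stays one byte long, while any byte of 0x80 or above becomes a two-byte sequence whose second byte is always 0xA0 or higher.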
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	972
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	973 #define DECODE_OUTPUT_PARTIAL_CHAR(ch, dst) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	974 do { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	975 if (ch) \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	976 { \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	977 DECODE_ADD_BINARY_CHAR (ch, dst); \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	978 ch = 0; \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	979 } \
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	980 } while (0)
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	981
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	982 #ifdef MULE
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	983 /* Convert Shift-JIS code (sj1, sj2) into internal string
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	984 representation (c1, c2). (The leading byte is assumed.) */
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	985
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	986 #define DECODE_SHIFT_JIS(sj1, sj2, c1, c2) \
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	987 do { \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	988 int I1 = sj1, I2 = sj2; \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	989 if (I2 >= 0x9f) \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	990 c1 = (I1 << 1) - ((I1 >= 0xe0) ? 0xe0 : 0x60), \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	991 c2 = I2 + 2; \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	992 else \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	993 c1 = (I1 << 1) - ((I1 >= 0xe0) ? 0xe1 : 0x61), \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	994 c2 = I2 + ((I2 >= 0x7f) ? 0x60 : 0x61); \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	995 } while (0)
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	996
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	997 /* Convert the internal string representation of a Shift-JIS character
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	998 (c1, c2) into Shift-JIS code (sj1, sj2). The leading byte is
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	999 assumed. */
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	1000
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1001 #define ENCODE_SHIFT_JIS(c1, c2, sj1, sj2) \
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	1002 do { \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	1003 int I1 = c1, I2 = c2; \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	1004 if (I1 & 1) \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	1005 sj1 = (I1 >> 1) + ((I1 < 0xdf) ? 0x31 : 0x71), \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	1006 sj2 = I2 - ((I2 >= 0xe0) ? 0x60 : 0x61); \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	1007 else \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	1008 sj1 = (I1 >> 1) + ((I1 < 0xdf) ? 0x30 : 0x70), \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	1009 sj2 = I2 - 2; \
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	1010 } while (0)
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	1011 #endif /* MULE */
3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	1012
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1013 DECLARE_CODING_SYSTEM_TYPE (no_conversion);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1014 DECLARE_CODING_SYSTEM_TYPE (convert_eol);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1015 #if 0
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1016 DECLARE_CODING_SYSTEM_TYPE (text_file_wrapper);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1017 #endif /* 0 */
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1018 DECLARE_CODING_SYSTEM_TYPE (undecided);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1019 DECLARE_CODING_SYSTEM_TYPE (chain);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1020
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1021 #ifdef DEBUG_XEMACS
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1022 DECLARE_CODING_SYSTEM_TYPE (internal);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1023 #endif
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1024
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1025 #ifdef MULE
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1026 DECLARE_CODING_SYSTEM_TYPE (iso2022);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1027 DECLARE_CODING_SYSTEM_TYPE (ccl);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1028 DECLARE_CODING_SYSTEM_TYPE (shift_jis);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1029 DECLARE_CODING_SYSTEM_TYPE (big5);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1030 #endif
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1031
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1032 #ifdef HAVE_ZLIB
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1033 DECLARE_CODING_SYSTEM_TYPE (gzip);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1034 #endif
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	1035
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1036 DECLARE_CODING_SYSTEM_TYPE (unicode);
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	1037
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1038 #ifdef HAVE_WIN32_CODING_SYSTEMS
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1039 DECLARE_CODING_SYSTEM_TYPE (mswindows_multibyte_to_unicode);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1040 DECLARE_CODING_SYSTEM_TYPE (mswindows_multibyte);
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	1041 #endif
771 943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1042
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1043 Lisp_Object coding_stream_detected_coding_system (Lstream *stream);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1044 Lisp_Object coding_stream_coding_system (Lstream *stream);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1045 void set_coding_stream_coding_system (Lstream *stream,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1046 Lisp_Object codesys);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1047 Lisp_Object detect_coding_stream (Lisp_Object stream);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1048 Emchar decode_big5_char (int o1, int o2);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1049 void add_entry_to_coding_system_type_list (struct coding_system_methods *m);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1050 Lisp_Object make_internal_coding_system (Lisp_Object existing,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1051 Char_ASCII *prefix,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1052 Lisp_Object type,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1053 Lisp_Object description,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1054 Lisp_Object props);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1055 Lisp_Object make_coding_input_stream (Lstream *stream, Lisp_Object codesys,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1056 enum encode_decode direction);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1057 Lisp_Object make_coding_output_stream (Lstream *stream, Lisp_Object codesys,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1058 enum encode_decode direction);
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1059 void set_detection_results (struct detection_state *st, int detector,
943eaba38521 [xemacs-hg @ 2002-03-13 08:51:24 by ben] ben parents: 665 diff changeset	1060 int given);
428 3ecd8885ac67 Import from CVS: tag r21-2-22 cvs parents: diff changeset	1061
440 8de8e3f6228a Import from CVS: tag r21-2-28 cvs parents: 438 diff changeset	1062 #endif /* INCLUDED_file_coding_h_ */
8de8e3f6228a Import from CVS: tag r21-2-28 cvs parents: 438 diff changeset	1063
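The DECODE_SHIFT_JIS and ENCODE_SHIFT_JIS macros in the listing are plain byte arithmetic, so they can be tried outside XEmacs. The following is a minimal sketch, not code from the XEmacs tree: the two macros are copied from the listing, while `byte_pair`, `sjis_to_internal`, `internal_to_sjis` and `sjis_roundtrip_failures` are hypothetical helper names introduced only for illustration. The lead-byte and trail-byte ranges used in the loop are the usual Shift-JIS double-byte ranges and are an assumption of this sketch, not something the header states.

```c
#include <assert.h>

/* Copied from the listing so that this sketch compiles on its own. */
#define DECODE_SHIFT_JIS(sj1, sj2, c1, c2)              \
do {                                                    \
  int I1 = sj1, I2 = sj2;                               \
  if (I2 >= 0x9f)                                       \
    c1 = (I1 << 1) - ((I1 >= 0xe0) ? 0xe0 : 0x60),      \
    c2 = I2 + 2;                                        \
  else                                                  \
    c1 = (I1 << 1) - ((I1 >= 0xe0) ? 0xe1 : 0x61),      \
    c2 = I2 + ((I2 >= 0x7f) ? 0x60 : 0x61);             \
} while (0)

#define ENCODE_SHIFT_JIS(c1, c2, sj1, sj2)              \
do {                                                    \
  int I1 = c1, I2 = c2;                                 \
  if (I1 & 1)                                           \
    sj1 = (I1 >> 1) + ((I1 < 0xdf) ? 0x31 : 0x71),      \
    sj2 = I2 - ((I2 >= 0xe0) ? 0x60 : 0x61);            \
  else                                                  \
    sj1 = (I1 >> 1) + ((I1 < 0xdf) ? 0x30 : 0x70),      \
    sj2 = I2 - 2;                                       \
} while (0)

/* Hypothetical helper type: a pair of bytes held as ints. */
struct byte_pair { int b1, b2; };

/* Shift-JIS bytes -> internal position bytes (c1, c2). */
struct byte_pair sjis_to_internal (int lead, int trail)
{
  struct byte_pair out;
  DECODE_SHIFT_JIS (lead, trail, out.b1, out.b2);
  return out;
}

/* Internal position bytes (c1, c2) -> Shift-JIS bytes. */
struct byte_pair internal_to_sjis (int pos1, int pos2)
{
  struct byte_pair out;
  ENCODE_SHIFT_JIS (pos1, pos2, out.b1, out.b2);
  return out;
}

/* Count byte pairs that do not survive decode followed by encode.
   Assumed ranges: lead 0x81-0x9f and 0xe0-0xef, trail 0x40-0x7e
   and 0x80-0xfc. */
int sjis_roundtrip_failures (void)
{
  int failures = 0;
  int lead, trail;
  for (lead = 0x81; lead <= 0xef; lead++)
    {
      if (lead > 0x9f && lead < 0xe0)
        continue;               /* not a double-byte lead byte */
      for (trail = 0x40; trail <= 0xfc; trail++)
        {
          struct byte_pair in, back;
          if (trail == 0x7f)
            continue;           /* 0x7f is not a trail byte */
          in = sjis_to_internal (lead, trail);
          back = internal_to_sjis (in.b1, in.b2);
          if (back.b1 != lead || back.b2 != trail)
            failures++;
        }
    }
  return failures;
}
```

For instance, `sjis_to_internal (0x82, 0xa0)` yields the pair (0xa4, 0xa2), and `internal_to_sjis (0xa4, 0xa2)` gives back (0x82, 0xa0). Note that the macros declare locals named `I1` and `I2`, so a caller should avoid passing arguments with those names.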
