xemacs-beta: man/internals/internals.texi comparison

comparison man/internals/internals.texi @ 5178:97eb4942aec8

merge

author	Ben Wing <ben@xemacs.org>
date	Mon, 29 Mar 2010 21:28:13 -0500
parents	8b2f75cecb89 f965e31a35f0
children	3889ef128488

-:b785049378e3
+:97eb4942aec8
 that has been formatted into ASCII lists and tables.
 Note: to define these routines, put point after the end of the definition
 and type C-x C-e.
-(defun list-to-texinfo (b e)
+(defun convert-list-to-texinfo (b e)
 "Convert the selected region from an ASCII list to a Texinfo list."
 (interactive "r")
 (save-restriction
 (narrow-to-region b e)
 (goto-char (point-min))
-(let ((dash-type "^ *-+ +")
+(let ((dash-type "^ *\\(-+\\|o\\) +")
 	  ;; allow single-letter numbering or roman numerals
 	  (letter-type "^ *[[(]?\\([a-zA-Z]\\|[IVXivx]+\\)[]).] +")
 	  (num-type "^ *[[(]?[0-9]+[]).] +")
 	  dash regexp)
 (save-excursion
 	    (insert-char ?\  (- min (current-column)))
 	  (beginning-of-line)
 	  (forward-char min))
 	(kill-rectangle b (point))))))
-(defun table-to-texinfo (b e)
+(defun convert-table-to-texinfo (b e)
 "Convert the selected region from an ASCII table to a Texinfo table.
 Assumes entries are separated by a blank line, and the first sexp in
 each entry is the table heading."
 (interactive "r")
 (save-restriction
 If the region is active, do the region; otherwise, go from point to the end
 of the buffer.  This query-replaces for various kinds of conventions used
 in text: @code{} surrounded by ` and ' or followed by a (); @strong{}
 surrounded by *'s; @file{} something that looks like a file name."
 (interactive)
-(if (and (not no-narrow) (region-active-p))
-(save-restriction
-	(narrow-to-region (region-beginning) (region-end))
-	(convert-text-to-texinfo t))
-(let ((p (point))
-	  (case-replace nil))
-(query-replace-regexp "`\\([^']+\\)'\\([^']\\)" "@code{\\1}\\2" nil)
-(goto-char p)
-(query-replace-regexp "\\(\\Sw\\)\\*\\(\\(?:\\s_\\|\\sw\\)+\\)\\*\\([^A-Za-z.}]\\)" "\\1@strong{\\2}\\3" nil)
-(goto-char p)
-(query-replace-regexp "\\(\\(\\s_\\|\\sw\\)+()\\)\\([^}]\\)" "@code{\\1}\\3" nil)
-(goto-char p)
-(query-replace-regexp "\\(\\(\\s_\\|\\sw\\)+\\.[A-Za-z]+\\)\\([^A-Za-z.}]\\)" "@file{\\1}\\3" nil)
-)))
+(save-excursion
+(if (and (not no-narrow) (region-active-p))
+	(save-restriction
+	  (narrow-to-region (region-beginning) (region-end))
+	  (goto-char (region-beginning))
+	  (zmacs-deactivate-region)
+	  (convert-text-to-texinfo t))
+(let ((p (point))
+	    (case-replace nil))
+	(message "Point is %d" (point))
+	(query-replace-regexp "`\\([^']+\\)'\\([^']\\)" "@code{\\1}\\2" nil)
+	(goto-char p)
+	(query-replace-regexp "\\(\\Sw\\)\\*\\(\\(?:\\s_\\|\\sw\\)+\\)\\*\\([^A-Za-z.}]\\)" "\\1@strong{\\2}\\3" nil)
+	(goto-char p)
+	(query-replace-regexp "\\(\\(\\s_\\|\\sw\\)+()\\)\\([^}]\\)" "@code{\\1}\\3" nil)
+	(goto-char p)
+	(query-replace-regexp "\\(\\(\\s_\\|\\sw\\)+\\.[A-Za-z]+\\)\\([^A-Za-z.}]\\)" "@file{\\1}\\3" nil)
+	))))
 4. Adding new sections:
 -----------------------
 NOTE: These are in the form of macros. #### FIXME Convert them to
 XEmacs is a powerful, customizable text editor and development
 environment.  It began in 1991 as Lucid Emacs, which was in turn
 derived from GNU Emacs, a program written by Richard Stallman of the
 Free Software Foundation.  GNU Emacs dates back to 1985 and was
 modelled after Unipress Emacs, an editor written by James Gosling in
-1981 and based on a series of other "Emacs"-like editors, including
+1981 and based on a series of other ``Emacs''-like editors, including
 EINE (EINE Is Not EMACS), c. 1976, by Dan Weinreb, which ran on the
 MIT Lisp Machine and was the first Emacs written in Lisp; ZWEI (ZWEI
 Was EINE Initially), c. 1978, by Dan Weinreb and Mike McMahon; Multics
 Emacs, c. 1978, by Bernie Greenberg, which was written in MacLisp and
 also used Lisp as its extension language; and ZMACS, c. 1980, a direct
 descendant of ZWEI that ran on the Symbolics LM-2, LMI LispM, and
 later, TI Explorer (1983-1989).  These in turn were inspired by the
 first Emacs, a package called EMACS, written in 1976 by Richard
 Stallman, Guy Steele, and Dave Moon.  This was a merger of TECMAC and
-TMACS, a pair of "TECO-macro realtime editors" written by Guy Steele,
+TMACS, a pair of ``TECO-macro realtime editors'' written by Guy Steele,
 Dave Moon, Richard Greenblatt, Charles Frankston, et al., and added a
 dynamic loader and Meta-key cmds.  It ran under ITS (the Incompatible
 Timesharing System) on a DEC PDP 10 and under TWENEX on a Tops-20 and
 was written in TECO and PDP 10 assembly.  ITS was one of the first
 time-sharing operating systems and dates back well before Unix.  ITS,
 M. Stallman (RMS) and James Gosling (the creator of Java); its extension
 language was known as @dfn{Mocklisp}.  This version of Emacs-in-C formed
 the basis for the early versions of GNU Emacs and also for Gosling's
 Unipress Emacs, a commercial product.  Because of bad blood between the
 two over the issue of commercialism, RMS pretty much disowned this
-collaboration, referring to it as "Gosling Emacs".
+collaboration, referring to it as ``Gosling Emacs''.
 At this point we pick up with a time line of events. (A broader timeline
 is available at @uref{http://www.jwz.org/doc/emacs-timeline.html,
 ``Emacs Timeline''}.)
 redisplay code, preliminary I18N support, code merged from GNU Emacs
 19.8 beta)
 @item
 Version 19.9 released January 12, 1994. (Scrollbars, Athena.)
 @item
-Version 19.10 released May 27, 1994. (Uses `configure'; code merged
+Version 19.10 released May 27, 1994. (Uses @code{configure}; code merged
 from GNU Emacs 19.23 beta and further merging with Epoch 4.0) Known as
-"Lucid Emacs" when shipped by Lucid, and as "XEmacs" when shipped by
+``Lucid Emacs'' when shipped by Lucid, and as ``XEmacs'' when shipped by
 Sun; but Lucid went out of business a few days later and it's unclear
 whether very many copies of 19.10 were released by Lucid. (Last release by
 Jamie Zawinski.)
 @end itemize
 rewritten redisplay, TTY support, multi-device support, device and
 console objects, specifiers, glyphs, toolbars, horizontal scrollbars,
 Lucid scrollbar widget, 3-d modeline, stay-up Lucid menus, resizable
 minibuffer, echo area is a true buffer, MD5 hashing support, expanded
 menubar, redone menu specification format (including menu filters),
-rewritten extents, renamed "screen" to "frame", misc-user events,
+rewritten extents, renamed ``screen'' to ``frame'', misc-user events,
 rewritten face code, rewritten mouse code, warnings system, CL
 backquote syntax, critical C-g, code merging with GNU Emacs 19.28.
 New packages Hyperbole, OOBR, hm--html-menus, viper, lazy-lock,
 ksh-mode, rsz-minibuf.)
 @item
 version 20.4 released February 28, 1998.
 @item
 version 21.0.60 released December 10, 1998. (The version naming scheme was
 changed at this point: [a] the second version number is odd for stable
 versions, even for beta versions; [b] a third version number is added,
-replacing the "beta xxx" ending for beta versions and allowing for
+replacing the ``beta xxx'' ending for beta versions and allowing for
 periodic maintenance releases for stable versions.  Therefore, 21.0 was
-never "officially" released; similarly for 21.2, etc.)
+never ``officially'' released; similarly for 21.2, etc.)
 @item
 version 21.0.61 released January 4, 1999.
 @item
 version 21.0.63 released February 3, 1999.
 @item
 @item
 version 21.0.67 released March 25, 1999.
 @item
 version 21.1.2 released May 14, 1999. (This is the followup to 21.0.67.
 The second version number was bumped to indicate the beginning of the
-"stable" series.)
+``stable'' series.)
 @item
 version 21.1.3 released June 26, 1999.
 @item
 version 21.1.4 released July 8, 1999.
 @item
 @item
 version 21.2.39 released December 31, 2000.
 @item
 version 21.2.40 released January 8, 2001.
 @item
-version 21.2.41 "Polyhymnia" released January 17, 2001.
+version 21.2.41 ``Polyhymnia'' released January 17, 2001.
 @item
-version 21.2.42 "Poseidon" released January 20, 2001.
+version 21.2.42 ``Poseidon'' released January 20, 2001.
 @item
-version 21.2.43 "Terspichore" released January 26, 2001.
+version 21.2.43 ``Terspichore'' released January 26, 2001.
 @item
-version 21.2.44 "Thalia" released February 8, 2001.
+version 21.2.44 ``Thalia'' released February 8, 2001.
 @item
-version 21.2.45 "Thelxepeia" released February 23, 2001.
+version 21.2.45 ``Thelxepeia'' released February 23, 2001.
 @item
-version 21.2.46 "Urania" released March 21, 2001.
+version 21.2.46 ``Urania'' released March 21, 2001.
 @item
-version 21.2.47 "Zephir" released April 14, 2001.
+version 21.2.47 ``Zephir'' released April 14, 2001.
 @item
-XEmacs 21.4.0 "Solid Vapor" released April 16, 2001.
+XEmacs 21.4.0 ``Solid Vapor'' released April 16, 2001.
 @item
-XEmacs 21.4.1 "Copyleft" released April 19, 2001.
+XEmacs 21.4.1 ``Copyleft'' released April 19, 2001.
 @item
-XEmacs 21.4.2 "Developer-Friendly Unix APIs" released May 10, 2001.
+XEmacs 21.4.2 ``Developer-Friendly Unix APIs'' released May 10, 2001.
 @item
-XEmacs 21.4.3 "Academic Rigor" released May 17, 2001.
+XEmacs 21.4.3 ``Academic Rigor'' released May 17, 2001.
 @item
-XEmacs 21.4.4 "Artificial Intelligence" released July 28, 2001.
+XEmacs 21.4.4 ``Artificial Intelligence'' released July 28, 2001.
 @item
-XEmacs 21.4.5 "Civil Service" released October 23, 2001.
+XEmacs 21.4.5 ``Civil Service'' released October 23, 2001.
 @item
-XEmacs 21.4.6 "Common Lisp" released December 17, 2001.
+XEmacs 21.4.6 ``Common Lisp'' released December 17, 2001.
 @item
-XEmacs 21.4.7 "Economic Science" released May 4, 2002.
+XEmacs 21.4.7 ``Economic Science'' released May 4, 2002.
 @item
-XEmacs 21.4.8 "Honest Recruiter" released May 9, 2002.
+XEmacs 21.4.8 ``Honest Recruiter'' released May 9, 2002.
 @item
-XEmacs 21.4.9 "Informed Management" released August 23, 2002.
+XEmacs 21.4.9 ``Informed Management'' released August 23, 2002.
 @item
-XEmacs 21.4.10 "Military Intelligence" released November 2, 2002.
+XEmacs 21.4.10 ``Military Intelligence'' released November 2, 2002.
 @item
-XEmacs 21.4.11 "Native Windows TTY Support" released January 3, 2003.
+XEmacs 21.4.11 ``Native Windows TTY Support'' released January 3, 2003.
 @item
-XEmacs 21.4.12 "Portable Code" released January 15, 2003.
+XEmacs 21.4.12 ``Portable Code'' released January 15, 2003.
 @item
-XEmacs 21.4.13 "Rational FORTRAN" released May 25, 2003.
+XEmacs 21.4.13 ``Rational FORTRAN'' released May 25, 2003.
 @item
-XEmacs 21.4.14 "Reasonable Discussion" released September 3, 2003.
+XEmacs 21.4.14 ``Reasonable Discussion'' released September 3, 2003.
 @item
-XEmacs 21.4.15 "Security Through Obscurity" released February 2, 2004.
+XEmacs 21.4.15 ``Security Through Obscurity'' released February 2, 2004.
 @item
-XEmacs 21.4.16 "Successful IPO" released December 5, 2004.
+XEmacs 21.4.16 ``Successful IPO'' released December 5, 2004.
 @item
-version 21.5.0 "alfalfa" released April 18, 2001.
+version 21.5.0 ``alfalfa'' released April 18, 2001.
 @item
-version 21.5.1 "anise" released May 9, 2001.
+version 21.5.1 ``anise'' released May 9, 2001.
 @item
-version 21.5.2 "artichoke" released July 28, 2001.
+version 21.5.2 ``artichoke'' released July 28, 2001.
 @item
-version 21.5.3 "asparagus" released September 7, 2001.
+version 21.5.3 ``asparagus'' released September 7, 2001.
 @item
-version 21.5.4 "bamboo" released January 8, 2002.
+version 21.5.4 ``bamboo'' released January 8, 2002.
 @item
-version 21.5.5 "beets" released March 5, 2002.
+version 21.5.5 ``beets'' released March 5, 2002.
 @item
-version 21.5.6 "bok choi" released April 5, 2002.
+version 21.5.6 ``bok choi'' released April 5, 2002.
 @item
-version 21.5.7 "broccoflower" released July 2, 2002.
+version 21.5.7 ``broccoflower'' released July 2, 2002.
 @item
-version 21.5.8 "broccoli" released July 27, 2002.
+version 21.5.8 ``broccoli'' released July 27, 2002.
 @item
-version 21.5.9 "brussels sprouts" released August 30, 2002.
+version 21.5.9 ``brussels sprouts'' released August 30, 2002.
 @item
-version 21.5.10 "burdock" released January 4, 2003.
+version 21.5.10 ``burdock'' released January 4, 2003.
 @item
-version 21.5.11 "cabbage" released February 16, 2003.
+version 21.5.11 ``cabbage'' released February 16, 2003.
 @item
-version 21.5.12 "carrot" released April 24, 2003.
+version 21.5.12 ``carrot'' released April 24, 2003.
 @item
-version 21.5.13 "cauliflower" released May 10, 2003.
+version 21.5.13 ``cauliflower'' released May 10, 2003.
 @item
-version 21.5.14 "cassava" released June 1, 2003.
+version 21.5.14 ``cassava'' released June 1, 2003.
 @item
-version 21.5.15 "celery" released September 3, 2003.
+version 21.5.15 ``celery'' released September 3, 2003.
 @item
-version 21.5.16 "celeriac" released September 26, 2003.
+version 21.5.16 ``celeriac'' released September 26, 2003.
 @item
-version 21.5.17 "chayote" released March 22, 2004.
+version 21.5.17 ``chayote'' released March 22, 2004.
 @item
-version 21.5.18 "chestnut" released October 22, 2004.
+version 21.5.18 ``chestnut'' released October 22, 2004.
 @end itemize
 @node The XEmacs Split, XEmacs from the Outside, A History of Emacs, Top
 @chapter The XEmacs Split
 @cindex XEmacs split
 to cooperate a bit with RMS, and the two versions of Emacs will merge. In
 fact there have been six to seven major attempts at merging, each running
 hundreds of messages long and all of them coming from the XEmacs side. All
 have failed because they have eventually come to the same conclusion, which
 is that RMS has no real interest in cooperation at all. If you work with
-him, you have to do it his way -- "my way or the highway".  Specifically:
+him, you have to do it his way -- ``my way or the highway''.  Specifically:
 @enumerate
 @item
 RMS insists on having legal papers signed for every bit of code that goes
 zero or more Kanji characters followed by zero or more
 Hiragana characters.
 @end display
 Then, the problem is that now we can't say that a sequence of
-word-constituents makes up a word.  For instance, both Hiragana "A"
+word-constituents makes up a word.  For instance, both Hiragana ``A''
-and Kanji "KAN" are word-constituents but the sequence of these two
+and Kanji ``KAN'' are word-constituents but the sequence of these two
 letters can't be a single word.
 So, we introduced Sextword for Japanese letters.
 @end quotation
 @item
 Any header-file declarations of the sort
 struct foobar;
-go into the "types" section of lisp.h.
+go into the ``types'' section of @file{lisp.h}.
 @end itemize
 @node Writing New Modules, Working with Lisp Objects, Introduction to Writing C Code, Rules When Writing New C Code
 @section Writing New Modules
 @cindex writing new modules
 style now forbids passing pointers to @samp{Lisp_<Type>} structures into
 or out of a function; instead, a @samp{Lisp_Object} should be passed or
 returned (created using @samp{wrap_<type>}, if necessary).
 @c #### declaration
-@item DECLARE_LRECORD (<type>, Lisp_<Type>)
+@item DECLARE_LISP_OBJECT (<type>, Lisp_<Type>)
-Declares an @samp{lrecord} for @samp{<Type>}, which is the unit of
+Declares a Lisp object for @samp{<Type>}, which is the unit of
 allocation.
 @item #define X<TYPE>(x) XRECORD (x, <type>, Lisp_<Type>)
 Turns a @code{Lisp_Object} into a pointer to @samp{struct Lisp_<Type>}.
 Here is a checklist of things to do when creating a new lisp object type
 named @var{foo}:
 @enumerate
 @item
-create @var{foo}.h
+Create @var{foo}.h
 @item
-create @var{foo}.c
+Create @var{foo}.c
 @item
-add definitions of @code{syms_of_@var{foo}}, etc. to @file{@var{foo}.c}
+Add definitions of @code{syms_of_@var{foo}}, etc. to @file{@var{foo}.c}
 @item
-add declarations of @code{syms_of_@var{foo}}, etc. to @file{symsinit.h}
+Add declarations of @code{syms_of_@var{foo}}, etc. to @file{symsinit.h}
 @item
-add calls to @code{syms_of_@var{foo}}, etc. to @file{emacs.c}
+Add calls to @code{syms_of_@var{foo}}, etc. to @file{emacs.c}
 @item
-add definitions of macros like @code{CHECK_@var{FOO}} and
+Add definitions of macros like @code{CHECK_@var{FOO}} and
 @code{@var{FOO}P} to @file{@var{foo}.h}
 @item
-add the new type index to @code{enum lrecord_type}
+Add the new type index to @code{enum lrecord_type}
 @item
-add a DEFINE_LRECORD_IMPLEMENTATION call to @file{@var{foo}.c}
+Add a @code{DEFINE_*_LISP_OBJECT()} to @file{@var{foo}.c}
 @item
-add an INIT_LRECORD_IMPLEMENTATION call to @code{syms_of_@var{foo}.c}
+Add an @code{INIT_LISP_OBJECT} call to @code{syms_of_@var{foo}.c}
 @end enumerate
 @node Writing Lisp Primitives, Writing Good Comments, Working with Lisp Objects, Rules When Writing New C Code
 @section Writing Lisp Primitives
 correct it or flag it as incorrect, as described in the previous
 paragraph.  Whenever you work on a section of code, @emph{always} make
 sure to update any comments to be correct -- or, at the very least, flag
 them as incorrect.
-To indicate a "todo" or other problem, use four pound signs --
+To indicate a ``todo'' or other problem, use four pound signs --
 i.e. @samp{####}.
 @node Adding Global Lisp Variables, Writing Macros, Writing Good Comments, Rules When Writing New C Code
 @section Adding Global Lisp Variables
 @cindex global Lisp variables, adding
 functions a gcc bug, but the gcc maintainers disagree.
 @cindex inline functions, headers
 @cindex header files, inline functions
 Every header which contains inline functions, either directly by using
-@code{DECLARE_INLINE_HEADER} or indirectly by using @code{DECLARE_LRECORD} must
-be added to @file{inline.c}'s includes to make the optimization
-described above work.  (Optimization note: if all INLINE_HEADER
-functions are in fact inlined in all translation units, then the linker
-can just discard @code{inline.o}, since it contains only unreferenced code).
+@code{DECLARE_INLINE_HEADER} or indirectly by using
+@code{DECLARE_LISP_OBJECT} must be added to @file{inline.c}'s includes
+to make the optimization described above work.  (Optimization note: if
+all INLINE_HEADER functions are in fact inlined in all translation
+units, then the linker can just discard @code{inline.o}, since it
+contains only unreferenced code).
 The three golden rules of macros:
 @enumerate
 @item
 Anything that's an lvalue can be evaluated more than once.
 @item
 Macros where anything else can be evaluated more than once should
-have the word "unsafe" in their name (exceptions may be made for
+have the word ``unsafe'' in their name (exceptions may be made for
 large sets of macros that evaluate arguments of certain types more
 than once, e.g. struct buffer * arguments, when clearly indicated in
 the macro documentation).  These macros are generally meant to be
 called only by other macros that have already stored the calling
 values in temporary variables.
 Capitalize macros doing stuff obviously impossible with (C)
 functions, e.g. directly modifying arguments as if they were passed by
 reference.
 @item
 Capitalize macros that evaluate @strong{any} argument more than once regardless
-of whether that's "allowed" (e.g. buffer arguments).
+of whether that's ``allowed'' (e.g. buffer arguments).
 @item
 Capitalize macros that directly access a field in a Lisp_Object or
 its equivalent underlying structure.  In such cases, access through the
 Lisp_Object precedes the macro with an X, and access through the underlying
 structure doesn't.
 a search-and-replace is done to change type names and such.  Some people
 disagree with such changes, and certainly if done without good reason
 will just lead to headaches.  But it's important to keep the code clean
 and understandable, and consistent naming goes a long way towards this.
-An example of the right way to do this was the so-called "great integral
+An example of the right way to do this was the so-called ``great integral
-type renaming".
+type renaming''.
 @menu
 * Great Integral Type Renaming::
 * Text/Char Type Renaming::
 @end menu
 @item
 All integral types that measure quantities of anything are signed.  Some
 people disagree vociferously with this, but their arguments are mostly
 theoretical, and are vastly outweighed by the practical headaches of
 mixing signed and unsigned values, and more importantly by the far
-increased likelihood of inadvertent bugs: Because of the broken "viral"
+increased likelihood of inadvertent bugs: Because of the broken ``viral''
 nature of unsigned quantities in C (operations involving mixed
 signed/unsigned are done unsigned, when exactly the opposite is nearly
 always wanted), even a single error in declaring a quantity unsigned
 that should be signed, or the even more subtle error of comparing
 signed and unsigned values and forgetting the necessary cast, can be
-catastrophic, as comparisons will yield wrong results.  -Wsign-compare
+catastrophic, as comparisons will yield wrong results.  @samp{-Wsign-compare}
 is turned on specifically to catch this, but this tends to result in a
 great number of warnings when mixing signed and unsigned, and the casts
 are annoying.  More has been written on this elsewhere.
 @item
 Type names should be relatively short (no more than 10 characters or
 so), with the first letter capitalized and no underscores if they can at
 all be avoided.
 @item
-"count" == a zero-based measurement of some quantity.  Includes sizes,
+``count'' == a zero-based measurement of some quantity.  Includes sizes,
 offsets, and indexes.
 @item
-"bpos" == a one-based measurement of a position in a buffer.  "Charbpos"
+``bpos'' == a one-based measurement of a position in a buffer.  ``Charbpos''
-and "Bytebpos" count text in the buffer, rather than bytes in memory;
+and ``Bytebpos'' count text in the buffer, rather than bytes in memory;
 thus Bytebpos does not directly correspond to the memory representation.
-Use "Membpos" for this.
+Use ``Membpos'' for this.
 @item
-"Char" refers to internal-format characters, not to the C type "char",
+``Char'' refers to internal-format characters, not to the C type ``char'',
 which is really a byte.
 @end itemize
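The ``viral'' behaviour of unsigned quantities described in the list above can be seen in a few lines of C.  This is only an illustrative sketch: the helper names @code{before_end_unsigned} and @code{before_end_signed} are hypothetical and do not appear in the XEmacs sources.

```c
/* Hypothetical helpers, not taken from XEmacs. */

/* `len' is unsigned, so C converts `offset' to unsigned before the
   comparison.  An offset of -1 becomes UINT_MAX and the test fails. */
static int before_end_unsigned(int offset, unsigned int len)
{
    return offset < len;
}

/* With both operands signed, the comparison is done as written. */
static int before_end_signed(int offset, int len)
{
    return offset < len;
}
```

Calling both helpers with an offset of -1 and a length of 10 gives different answers, which is the kind of silent error that @samp{-Wsign-compare} is meant to flag.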
 For the actual name changes, see the script below.
 #endif
 /* There have been some arguments over what the type should be that
 specifies a count of bytes in a data block to be written out or read in,
 using @code{Lstream_read()}, @code{Lstream_write()}, and related functions.
-Originally it was long, which worked fine; Martin "corrected" these to
+Originally it was long, which worked fine; Martin ``corrected'' these to
 size_t and ssize_t on the grounds that this is theoretically cleaner and
 is in keeping with the C standards.  Unfortunately, this practice is
 horribly error-prone due to design flaws in the way that mixed
 signed/unsigned arithmetic happens.  In fact, by doing this change,
 Martin introduced a subtle but fatal error that caused the operation of
 fixed---use the @code{Known-Bug-Expect-Failure} wrapper macro to mark
 them.
 @deffn Macro Known-Bug-Expect-Failure body
 Arrange for failing tests in @var{body} to generate messages prefixed
-with "KNOWN BUG:" instead of "FAIL:".  @var{body} is a @code{progn}-like
+with ``KNOWN BUG:'' instead of ``FAIL:''.  @var{body} is a @code{progn}-like
 body, and may contain several tests.
 @end deffn
 A lot of the tests we run push limits; suppress Ebola warning messages
 with the @code{Ignore-Ebola} wrapper macro.
 with added or deleted files.} If you are lucky, the operation will
 simply fail.  If you are less lucky, it will proceed, but make the
 adds and deletes on the main line, which you do not want at all.
 Therefore, you must undo all adds and deletes.  To find out what is
 added and deleted, use something like @code{cvs -n update >&!
-cvs.out}, which does a "dry run". (You did make a backup copy first,
+cvs.out}, which does a ``dry run''. (You did make a backup copy first,
 right?  What if you forgot the @samp{-n}, for example, and weren't
 prepared for the sudden onslaught of merging action?) Take a look at
 the output file @file{cvs.out} and check very carefully for newly
 added files (marked with an @samp{A}) and newly removed files (marked
 with an @samp{R}).  Double check that your newly added files are in
 crw tag -b ben-mule-21-5
 @end example
 Note that this doesn't actually do anything to your local workspace!
 It basically just creates another tag in the repository, identical to
-the branch point tag but internally marked as a "branch tag" rather
+the branch point tag but internally marked as a ``branch tag'' rather
 than a regular tag.
 @item
 Now, move your workspace onto the branch:
 and when you add a new element, the array automatically resizes itself
 if it isn't big enough.  Dynarrs are extensively used in the redisplay
 mechanism.
-A "dynamic array" is a contiguous array of fixed-size elements where there
+A ``dynamic array'' is a contiguous array of fixed-size elements where there
 is no upper limit (except available memory) on the number of elements in the
 array.  Because the elements are maintained contiguously, space is used
 efficiently (no per-element pointers necessary) and random access to a
 particular element is in constant time.  At any one point, the block of memory
 that holds the array has an upper limit; if this limit is exceeded, the
-memory is realloc()ed into a new array that is twice as big.  Assuming that
+memory is @code{realloc()}ed into a new array that is twice as big.  Assuming that
 the time to grow the array is on the order of the new size of the array
 block, this scheme has a provably constant amortized time (i.e. average
 time over all additions).
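The doubling scheme described above can be sketched as follows.  This is a minimal illustration under stated assumptions, not the actual Dynarr implementation: the type @code{IntArr}, the functions @code{intarr_add} and @code{intarr_free}, and the initial capacity of 4 are all hypothetical.

```c
#include <stdlib.h>

/* Hypothetical growable array of ints; not the XEmacs Dynarr API. */
typedef struct {
    int *data;
    size_t len;   /* elements in use */
    size_t cap;   /* elements allocated */
} IntArr;

/* Append v.  Returns 0 on success, -1 if memory could not be obtained. */
static int intarr_add(IntArr *a, int v)
{
    if (a->len == a->cap) {
        /* Limit reached: grow to twice the old size (4 is an assumed
           starting capacity). */
        size_t new_cap = a->cap ? a->cap * 2 : 4;
        int *p = realloc(a->data, new_cap * sizeof *p);
        if (!p)
            return -1;
        a->data = p;
        a->cap = new_cap;
    }
    a->data[a->len++] = v;
    return 0;
}

/* Release the storage and reset the array to its empty state. */
static void intarr_free(IntArr *a)
{
    free(a->data);
    a->data = NULL;
    a->len = a->cap = 0;
}
```

Because the capacity doubles each time, adding 100 elements triggers only a handful of reallocations, which is where the constant amortized cost comes from.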
 When you add elements or retrieve elements, pointers are used.  Note that
 onto a linked list, so they can be efficiently reused.  This data type
 is not much used in XEmacs currently, because it's a fairly new
 addition.
-A "block-type object" is used to efficiently allocate and free blocks
+A ``block-type object'' is used to efficiently allocate and free blocks
 of a particular size.  Freed blocks are remembered in a free list and
 are reused as necessary to allocate new blocks, so as to avoid as
-much as possible making calls to malloc() and free().
+much as possible making calls to @code{malloc()} and @code{free()}.
 This is a container object.  Declare a block-type object of a specific type
 as follows:
 struct mytype_blocktype @{
 characters.  No special allocation or garbage collection is necessary
 for such objects.  Lisp objects of these types do not need to be
 @code{GCPRO}ed.
 @end itemize
 In the remaining two categories, the type is stored in the object
 itself.  The tag for all such objects is the generic @dfn{lrecord}
 (Lisp_Type_Record) tag.  The first bytes of the object's structure are an
 integer (actually a char) characterising the object's type and some
 flags, in particular the mark bit used for garbage collection.  A
 structure describing the type is accessible thru the
 @code{this_one_is_unmarkable} in @code{alloc.c}).
 Now, the actual marking is feasible. We do so by once using the macro
 @code{MARK_RECORD_HEADER} to mark the object itself (actually the
 special flag in the lrecord header), and calling its special marker
-"method" @code{marker} if available. The marker method marks every
+``method'' @code{marker} if available. The marker method marks every
 other object that is in reach from our current object. Note, that these
 marker methods should not call @code{mark_object} recursively, but
 instead should return the next object from where further marking has to
 be performed.
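The idea of a marker method that hands back the next object instead of recursing on it might look roughly like the sketch below.  Everything here is an assumption made for illustration: the @code{Obj} structure, its @code{marked} field, and @code{mark_cons} are hypothetical stand-ins and do not reproduce the real lrecord header or @code{alloc.c}.

```c
#include <stddef.h>

/* Hypothetical object layout; the real header is more compact. */
typedef struct Obj Obj;
struct Obj {
    int marked;               /* stand-in for the mark bit */
    Obj *(*marker)(Obj *);    /* may be NULL; returns next object to mark */
    Obj *car, *cdr;
};

/* Mark o, then keep following whatever the marker method returns.
   The loop replaces what would otherwise be a tail-recursive call. */
static void mark_object(Obj *o)
{
    while (o && !o->marked) {
        o->marked = 1;
        o = o->marker ? o->marker(o) : NULL;
    }
}

/* Marker for a cons-like object: the car is marked directly, while the
   cdr is returned so that long chains do not deepen the C stack. */
static Obj *mark_cons(Obj *o)
{
    mark_object(o->car);
    return o->cdr;
}
```

With this shape, marking a long list costs one loop iteration per element rather than one stack frame per element.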
 @code{sweep_conses}, @code{sweep_bit_vectors_1},
 @code{sweep_compiled_functions}, @code{sweep_floats},
 @code{sweep_symbols}, @code{sweep_extents}, @code{sweep_markers} and
 @code{sweep_events}.  They are the fixed-size types cons, floats,
 compiled-functions, symbol, marker, extent, and event stored in
-so-called "frob blocks", and therefore we can basically do the same on
+so-called ``frob blocks'', and therefore we can basically do the same on
 objects of every type, using the same macros, defined specifically to
 handle everything with respect to fixed-size blocks. The only fixed-size
 type that is not handled here is the fixed-size portion of strings,
 because we took special care of them earlier.
 @node Integers and Characters, Allocation from Frob Blocks, Garbage Collection - Step by Step, Allocation of Objects in XEmacs Lisp
 @section Integers and Characters
 @cindex integers and characters
 @cindex characters, integers and
 Integer and character Lisp objects are created from integers using the
-macros @code{XSETINT()} and @code{XSETCHAR()} or the equivalent
 functions @code{make_int()} and @code{make_char()}. (These are actually
 macros on most systems.)  These functions basically just do some moving
 of bits around, since the integral value of the object is stored
 directly in the @code{Lisp_Object}.
-@code{XSETINT()} and the like will truncate values given to them that
-are too big; i.e. you won't get the value you expected but the tag bits
-will at least be correct.
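Storing an integer directly in the object word amounts to shifting the value and setting tag bits.  The sketch below assumes a one-bit tag in the lowest bit; that layout, the @code{Obj} typedef, and the names @code{make_int_sketch}, @code{is_int} and @code{int_value} are hypothetical and chosen only to illustrate the bit-moving, not to document the real @code{make_int()}.

```c
#include <stdint.h>

typedef intptr_t Obj;      /* hypothetical stand-in for Lisp_Object */

#define INT_TAG_BITS 1     /* assumed layout: low bit set means integer */

/* Shift the value left and set the tag bit.  The top bit of n is lost,
   so values that are too big are silently truncated. */
static Obj make_int_sketch(intptr_t n)
{
    return (Obj) (((uintptr_t) n << INT_TAG_BITS) | 1u);
}

/* Test the tag bit. */
static int is_int(Obj o)
{
    return (int) (o & 1);
}

/* Recover the value.  This relies on an arithmetic right shift of a
   negative number, which mainstream compilers provide. */
static intptr_t int_value(Obj o)
{
    return o >> INT_TAG_BITS;
}
```

No allocation happens anywhere in this sketch, which is why such objects need no garbage-collection support.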
 @node Allocation from Frob Blocks, lrecords, Integers and Characters, Allocation of Objects in XEmacs Lisp
 @section Allocation from Frob Blocks
 @cindex allocation from frob blocks
 @cindex frob blocks, allocation from
-The uninitialized memory required by a @code{Lisp_Object} of a particular type
-is allocated using
-@code{ALLOCATE_FIXED_TYPE()}.  This only occurs inside of the
-lowest-level object-creating functions in @file{alloc.c}:
-@code{Fcons()}, @code{make_float()}, @code{Fmake_byte_code()},
-@code{Fmake_symbol()}, @code{allocate_extent()},
-@code{allocate_event()}, @code{Fmake_marker()}, and
-@code{make_uninit_string()}.  The idea is that, for each type, there are
-a number of frob blocks (each 2K in size); each frob block is divided up
-into object-sized chunks.  Each frob block will have some of these
-chunks that are currently assigned to objects, and perhaps some that are
-free. (If a frob block has nothing but free chunks, it is freed at the
-end of the garbage collection cycle.)  The free chunks are stored in a
-free list, which is chained by storing a pointer in the first four bytes
-of the chunk. (Except for the free chunks at the end of the last frob
-block, which are handled using an index which points past the end of the
+The uninitialized memory required by a @code{Lisp_Object} of a
+particular type is allocated using @code{ALLOCATE_FIXED_TYPE()}.  This
+only occurs inside of the lowest-level object-creating functions in
+@file{alloc.c}: @code{Fcons()}, @code{make_float()},
+@code{Fmake_byte_code()}, @code{Fmake_symbol()},
+@code{allocate_extent()}, @code{allocate_event()},
+@code{Fmake_marker()}, and @code{make_uninit_string()}.  The idea is
+that, for each type, there are a number of frob blocks (each 2K in
+size); each frob block is divided up into object-sized chunks.  Each
+frob block will have some of these chunks that are currently assigned
+to objects, and perhaps some that are free. (If a frob block has
+nothing but free chunks, it is freed at the end of the garbage
+collection cycle.)  The free chunks are stored in a free list, which
+is chained by storing a pointer in the first four bytes of the
+chunk. (Except for the free chunks at the end of the last frob block,
+which are handled using an index which points past the end of the
 last-allocated chunk in the last frob block.)
 @code{ALLOCATE_FIXED_TYPE()} first tries to retrieve a chunk from the
 free list; if that fails, it calls
 @code{ALLOCATE_FIXED_TYPE_FROM_BLOCK()}, which looks at the end of the
 last frob block for space, and creates a new frob block if there is
-none. (There are actually two versions of these macros, one of which is
-more defensive but less efficient and is used for error-checking.)
+none. (There are actually two versions of these macros, one of which
+is more defensive but less efficient and is used for error-checking.)
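The allocation order described above (free list first, then the unused tail of the last block, then a new block) can be sketched as follows.  The names @code{Pool}, @code{Chunk}, @code{Block}, @code{pool_alloc} and @code{pool_free} are hypothetical, and the block size of four chunks is deliberately tiny; the real macros work on 2K frob blocks and differ in detail.

```c
#include <stdlib.h>

#define CHUNKS_PER_BLOCK 4   /* tiny on purpose; the text describes 2K blocks */

/* A chunk holds either an object or, while free, the free-list link. */
typedef union Chunk {
    union Chunk *next_free;  /* valid only while the chunk is free */
    double payload;          /* stand-in for a fixed-size object */
} Chunk;

typedef struct Block {
    struct Block *prev;
    Chunk chunks[CHUNKS_PER_BLOCK];
} Block;

typedef struct {
    Chunk *free_list;    /* chunks that were freed and can be reused */
    Block *last_block;   /* most recently created block */
    int next_index;      /* first never-used chunk in last_block */
    int block_count;
} Pool;

static Chunk *pool_alloc(Pool *p)
{
    /* 1. Try the free list. */
    if (p->free_list) {
        Chunk *c = p->free_list;
        p->free_list = c->next_free;
        return c;
    }
    /* 2. Otherwise use the tail of the last block, creating a new
          block when there is none or it is full. */
    if (!p->last_block || p->next_index == CHUNKS_PER_BLOCK) {
        Block *b = malloc(sizeof *b);
        if (!b)
            return NULL;
        b->prev = p->last_block;
        p->last_block = b;
        p->next_index = 0;
        p->block_count++;
    }
    return &p->last_block->chunks[p->next_index++];
}

/* Push a chunk back onto the free list, reusing its own storage
   for the link. */
static void pool_free(Pool *p, Chunk *c)
{
    c->next_free = p->free_list;
    p->free_list = c;
}
```

Returning wholly free blocks to the system at the end of a collection cycle is left out of this sketch.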
 @node lrecords, Low-level allocation, Allocation from Frob Blocks, Allocation of Objects in XEmacs Lisp
 @section lrecords
 @cindex lrecords
 [see @file{lrecord.h}]
 @strong{This node needs updating for the ``new garbage collection
 algorithms'' (KKCC) and the ``incremental'' collector.}
 All lrecords have at the beginning of their structure a @code{struct
 lrecord_header}.  This just contains a type number and some flags,
 including the mark bit.  All builtin type numbers are defined as
 constants in @code{enum lrecord_type}, to allow the compiler to generate
 more efficient code for @code{@var{type}P}.  The type number, through the
 @code{lrecord_implementations_table}, gives access to a @code{struct
 lrecord_implementation}, which is a structure containing method pointers
 and such.  There is one of these for each type, and it is a global,
 constant, statically-declared structure that is declared in the
-@code{DEFINE_LRECORD_IMPLEMENTATION()} macro.
+@code{DEFINE_*_LISP_OBJECT()} macro.
-Simple lrecords (of type (b) above) just have a @code{struct
+Frob-block lrecords just have a @code{struct lrecord_header} at their
-lrecord_header} at their beginning.  lcrecords, however, actually have a
+beginning.  lcrecords, however, actually have a
-@code{struct lcrecord_header}.  This, in turn, has a @code{struct
+@code{struct old_lcrecord_header}.  This, in turn, has a @code{struct
 lrecord_header} at its beginning, so sanity is preserved; but it also
-has a pointer used to chain all lcrecords together, and a special ID
+has a pointer used to chain all lcrecords together.
-field used to distinguish one lcrecord from another. (This field is used
-only for debugging and could be removed, but the space gain is not
-significant.)
 @strong{lcrecords are now obsolete when using the write-barrier-based
 collector.}
-Simple lrecords are created using @code{ALLOCATE_FIXED_TYPE()}, just
+Frob-block objects are created using @code{ALLOC_FROB_BLOCK_LISP_OBJECT()}.
-like for other frob blocks.  The only change is that the implementation
+All this does is call @code{ALLOCATE_FIXED_TYPE()} to allocate an
-pointer must be initialized correctly. (The implementation structure for
+object, and @code{set_lheader_implementation()} to initialize the header.
-an lrecord, or rather the pointer to it, is named @code{lrecord_float},
-@code{lrecord_extent}, @code{lrecord_buffer}, etc.)
+Normal objects (i.e. lcrecords) are created using
+@code{ALLOC_NORMAL_LISP_OBJECT()}, which takes a type name (resolved
-lcrecords are created using @code{alloc_lcrecord()}.  This takes a
+internally to a structure named @code{lrecord_foo} for type
-size to allocate and an implementation pointer. (The size needs to be
+@code{foo}).  If they are of variable size, however, they are created
-passed because some lcrecords, such as window configurations, are of
+with @code{ALLOC_SIZED_LISP_OBJECT()}, which takes a size to allocate
-variable size.) This basically just @code{malloc()}s the storage,
+in addition to a type.  This basically just @code{malloc()}s the
-initializes the @code{struct lcrecord_header}, and chains the lcrecord
+storage, initializes the @code{struct lcrecord_header}, and chains the
-onto the head of the list of all lcrecords, which is stored in the
+lcrecord onto the head of the list of all lcrecords, which is stored
-variable @code{all_lcrecords}.  The calls to @code{alloc_lcrecord()}
+in the variable @code{all_lcrecords}.  The calls to the above
-generally occur in the lowest-level allocation function for each lrecord
+allocation macros generally occur in the lowest-level allocation
-type.
+function for each lrecord type.
-Whenever you create an lrecord, you need to call either
+Whenever you create a normal object, you need to call one of the
-@code{DEFINE_LRECORD_IMPLEMENTATION()} or
+@code{DEFINE_*_LISP_OBJECT()} macros.  This needs to be
-@code{DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION()}.  This needs to be
 specified in a @file{.c} file, at the top level.  What this actually
 does is define and initialize the implementation structure for the
 lrecord. (And possibly declares a function @code{error_check_foo()} that
 implements the @code{XFOO()} macro when error-checking is enabled.)  The
 arguments to the macros are the actual type name (this is used to
 are used to encapsulate type-specific information about the object, such
 as how to print it or mark it for garbage collection, so that it's easy
 to add new object types without having to add a specific case for each
 new type in a bunch of different places.
-The difference between @code{DEFINE_LRECORD_IMPLEMENTATION()} and
+The various macros for defining Lisp objects are as follows:
-@code{DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION()} is that the former is
-used for fixed-size object types and the latter is for variable-size
+@itemize @bullet
-object types.  Most object types are fixed-size; some complex
+@item
-types, however (e.g. window configurations), are variable-size.
+@code{DEFINE_*_LISP_OBJECT} is for objects with constant size. (Either
-Variable-size object types have an extra method, which is called
+@code{DEFINE_DUMPABLE_LISP_OBJECT} for objects that can be saved in a
-to determine the actual size of a particular object of that type.
+dumped executable, or @code{DEFINE_NODUMP_LISP_OBJECT} for objects
-(Currently this is only used for keeping allocation statistics.)
+that cannot be saved -- e.g. that contain pointers to non-persistent
+external objects such as window-system windows.)
-For the purpose of keeping allocation statistics, the allocation
+@item
+@code{DEFINE_*_SIZABLE_LISP_OBJECT} is for objects whose size varies.
+This includes some simple types such as vectors, bit vectors and
+opaque objects, as well as complex types, especially types such as
+specifiers, lstreams or coding systems that have subtypes and include
+subtype-specific data attached to the end of the structure.
+Variable-size objects have an extra method that returns the size of
+the object.  This is not used at allocation (rather, the size is
+specified in the call to the allocation macro), but is used for
+operations such as copying a Lisp object, as well as for keeping
+allocation statistics.
+@item
+@code{DEFINE_*_FROB_BLOCK_LISP_OBJECT} is for objects that are
+allocated in large blocks (``frob blocks''), which are parceled up
+individually.  Such objects need special handling in @file{alloc.c}.
+This does not apply to NEW_GC, because it does this automatically.
+@item
+@code{DEFINE_*_INTERNAL_LISP_OBJECT} is for ``internal'' objects that
+should never be visible on the Lisp level.  This is a shorthand for
+the most common type of internal objects, which have no equal or hash
+method (since they generally won't appear in hash tables), no
+finalizer and @code{internal_object_printer()} as their print method
+(which prints that the object is internal and shouldn't be visible
+externally).  For internal objects needing a finalizer, equal or hash
+method, or wanting to customize the print method, use the normal
+@code{DEFINE_*_LISP_OBJECT} mechanism for defining these objects.
+@item
+@code{DEFINE_*_GENERAL_LISP_OBJECT} is for objects that need to
+provide one of the less common methods that are omitted on most
+objects.  These methods include the methods supporting the unified
+property interface using @code{get}, @code{put}, @code{remprop} and
+@code{object-plist}, and (for dumpable objects only) the
+@code{disksaver} method.
+@item
+@code{DEFINE_MODULE_*} is for objects defined in an external module.
+@end itemize
+@code{MAKE_LISP_OBJECT} and @code{MAKE_MODULE_LISP_OBJECT} are what
+underlies all of these; they define a structure containing pointers to
+object methods and other info such as the size of the structure
+containing the object.
+For the purpose of keeping allocation statistics, the allocation
 engine keeps a list of all the different types that exist.  Note that,
-since @code{DEFINE_LRECORD_IMPLEMENTATION()} is a macro that is
+since @code{DEFINE_*_LISP_OBJECT()} is a macro that is
-specified at top-level, there is no way for it to initialize the global
+specified at top-level, there is no way for it to initialize the
-data structures containing type information, like
+global data structures containing type information, like
 @code{lrecord_implementations_table}.  For this reason a call to
-@code{INIT_LRECORD_IMPLEMENTATION} must be added to the same source file
+@code{INIT_LISP_OBJECT()} must be added to the same source
-containing @code{DEFINE_LRECORD_IMPLEMENTATION}, but instead of to the
+file containing @code{DEFINE_*_LISP_OBJECT()}, but instead of
-top level, to one of the init functions, typically
+to the top level, to one of the init functions, typically
-@code{syms_of_@var{foo}.c}.  @code{INIT_LRECORD_IMPLEMENTATION} must be
+@code{syms_of_@var{foo}.c}.  @code{INIT_LISP_OBJECT()} must
-called before an object of this type is used.
+be called before an object of this type is used.
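The split between a top-level definition and a later init call can be sketched in miniature as follows. This is an illustration of the pattern only; the toy_ names are hypothetical and do not correspond to what the real macros expand to.

```c
#include <stddef.h>

/* Hypothetical miniature of the define-then-init pattern. */
struct toy_implementation {
  const char *name;
  size_t static_size;
  void (*finalizer) (void *obj);   /* may be NULL */
};

enum toy_type { toy_type_widget, toy_type_count };

/* Global table indexed by type number; filled in at init time. */
static const struct toy_implementation *
  toy_implementations_table[toy_type_count];

struct toy_widget { int type; int value; };

/* Top level of the .c file: a static initializer can build the
   structure, but it cannot run code to store the structure's address
   in the global table. */
static const struct toy_implementation toy_widget_impl = {
  "widget", sizeof (struct toy_widget), NULL
};

/* Called from an init function before any widget is created. */
static void
toy_init_widget (void)
{
  toy_implementations_table[toy_type_widget] = &toy_widget_impl;
}
```

Until the init function has run, the table slot is still NULL, which is why the init call has to happen before any object of the type is used.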
 The type number is also used to index into an array holding the number
 of objects of each type and the total memory allocated for objects of
 that type.  The statistics in this array are computed during the sweep
 stage.  These statistics are returned by the call to
 @code{garbage-collect}.
-Note that for every type defined with a @code{DEFINE_LRECORD_*()}
+Note that for every type defined with a @code{DEFINE_*_LISP_OBJECT()}
-macro, there needs to be a @code{DECLARE_LRECORD_IMPLEMENTATION()}
+macro, there needs to be a @code{DECLARE_LISP_OBJECT()} somewhere in a
-somewhere in a @file{.h} file, and this @file{.h} file needs to be
+@file{.h} file, and this @file{.h} file needs to be included by
-included by @file{inline.c}.
+@file{inline.c}.
 Furthermore, there should generally be a set of @code{XFOOBAR()},
-@code{FOOBARP()}, etc. macros in a @file{.h} (or occasionally @file{.c})
+@code{FOOBARP()}, etc. macros in a @file{.h} (or occasionally
-file.  To create one of these, copy an existing model and modify as
+@file{.c}) file.  To create one of these, copy an existing model and
-necessary.
+modify as necessary.
 @strong{Please note:} If you define an lrecord in an external
-dynamically-loaded module, you must use @code{DECLARE_EXTERNAL_LRECORD},
+dynamically-loaded module, you must use
-@code{DEFINE_EXTERNAL_LRECORD_IMPLEMENTATION}, and
+@code{DECLARE_MODULE_LISP_OBJECT()},
-@code{DEFINE_EXTERNAL_LRECORD_SEQUENCE_IMPLEMENTATION} instead of the
+@code{DEFINE_MODULE_*_LISP_OBJECT()}, and
-non-EXTERNAL forms. These macros will dynamically add new type numbers
+@code{INIT_MODULE_LISP_OBJECT()} instead of the non-MODULE
-to the global enum that records them, whereas the non-EXTERNAL forms
+forms. These macros will dynamically add new type numbers to the
-assume that the programmer has already inserted the correct type numbers
+global enum that records them, whereas the non-MODULE forms assume
-into the enum's code at compile-time.
+that the programmer has already inserted the correct type numbers into
+the enum's code at compile-time.
 The various methods in the lrecord implementation structure are:
 @enumerate
 @item
 operating-system and window-system resources associated with the object
 (e.g. pixmaps, fonts), etc.
 The finalize method can be NULL if nothing needs to be done.
-WARNING #1: The finalize method is also called at the end of the dump
-phase; this time with the for_disksave parameter set to non-zero.  The
-object is @emph{not} about to disappear, so you have to make sure to
-@emph{not} free any extra @code{malloc()}ed memory if you're going to
-need it later.  (Also, signal an error if there are any operating-system
-and window-system resources here, because they can't be dumped.)
 Finalize methods should, as a rule, set to zero any pointers after
-they've been freed, and check to make sure pointers are not zero before
+they've been freed, and check to make sure pointers are not zero
-freeing.  Although I'm pretty sure that finalize methods are not called
+before freeing.  Although I'm pretty sure that finalize methods are
-twice on the same object (except for the @code{for_disksave} proviso),
+not called twice on the same object, we've gotten nastily burned in
-we've gotten nastily burned in some cases by not doing this.
+some cases by not doing this.
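A defensive finalizer along the lines just recommended might look like the sketch below. The object type and function name are hypothetical; only the check-then-free-then-zero discipline is the point.

```c
#include <stdlib.h>

struct toy_image { unsigned char *pixels; };   /* hypothetical object */

static void
toy_finalize_image (struct toy_image *img)
{
  if (img->pixels)            /* check before freeing */
    {
      free (img->pixels);
      img->pixels = NULL;     /* zero afterwards, so a second call is harmless */
    }
}
```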
-WARNING #2: The finalize method is @emph{only} called for
+WARNING #1: The finalize method is @emph{only} called for
-lcrecords, @emph{not} for simply lrecords.  If you need a
+normal objects, @emph{not} for frob-block objects.  If you need a
-finalize method for simple lrecords, you have to stick
+finalize method for frob-block objects, you have to stick
 it in the @code{ADDITIONAL_FREE_foo()} macro in @file{alloc.c}.
-WARNING #3: Things are in an @emph{extremely} bizarre state
+WARNING #2: Things are in an @emph{extremely} bizarre state
 when @code{ADDITIONAL_FREE_foo()} is called, so you have to
 be incredibly careful when writing one of these functions.
 See the comment in @code{gc_sweep()}.  If you ever have to add
 one of these, consider using an lcrecord or dealing with
 the problem in a different fashion.
 To hash two or more values together into a single value, use
 @code{HASH2()}, @code{HASH3()}, @code{HASH4()}, etc.
 @item
 @dfn{getprop}, @dfn{putprop}, @dfn{remprop}, and @dfn{plist} methods.
-These are used for object types that have properties.  I don't feel like
+These are used for object types that have properties, and are called
-documenting them here.  If you create one of these objects, you have to
+when @code{get}, @code{put}, @code{remprop}, and @code{object-plist},
-use different macros to define them,
+respectively, are called on the object.  If you create one of these
-i.e. @code{DEFINE_LRECORD_IMPLEMENTATION_WITH_PROPS()} or
+objects, you have to use a different macro to define them,
-@code{DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION_WITH_PROPS()}.
+i.e. @code{DEFINE_*_GENERAL_LISP_OBJECT()}.
 @item
 A @dfn{size_in_bytes} method, when the object is of variable-size.
-(i.e. declared with a @code{_SEQUENCE_IMPLEMENTATION} macro.)  This should
+(i.e. declared with a @code{DEFINE_*_SIZABLE_*_LISP_OBJECT} macro.)
-simply return the object's size in bytes, exactly as you might expect.
+This should simply return the object's size in bytes, exactly as you
-For an example, see the methods for window configurations and opaques.
+might expect.  For an example, see the methods for lstreams and opaques.
+@item
+A @dfn{disksave} method.  This is called at the end of the dump phase.
+It is used for objects that contain pointers or handles to objects
+created in external libraries, such as window-system windows or file
+handles.  Such external objects cannot be dumped, so it is necessary
+to release them at dump time and arrange somehow or other for them to
+be resurrected if necessary later on.
+It seems that even non-dumpable objects may be around at dump time,
+and a disksaver may be provided. (In fact, the only object currently
+with a disksaver, lstream, is non-dumpable.)
+Objects rarely need to provide this method; most of the time it will
+be NULL.  If you want to provide this method, you have to use the
+@code{DEFINE_*_GENERAL_LISP_OBJECT()} macro to define your object.
 @end enumerate
 @node Low-level allocation, Cons, lrecords, Allocation of Objects in XEmacs Lisp
 @section Low-level allocation
 @cindex low-level allocation
 complicated depending on how much information we cache.  In addition to
 the known region, we always cache the correct conversions for point,
 BEGV, and ZV, and in addition to this we cache 16 positions where the
 conversion is known.  We only look in the cache or update it when we
 need to move the known region more than a certain amount (currently 50
-chars), and then we throw away a "random" value and replace it with the
+chars), and then we throw away a ``random'' value and replace it with the
 newly calculated value.
 Finally, we maintain an extra flag that tracks whether the buffer is
 entirely ASCII, to speed up the conversions even more.  This flag is
 actually of dubious value because in an entirely-ASCII buffer the known
 track of a shifter value (0, 1, or 2) indicating how much to shift.
 Multiplying by 3 can be implemented by doubling and then adding the
 original value.  Dividing by 3, alas, cannot be implemented in any
 simple shift/subtract method, as far as I know; so we just do a table
 lookup.  For simplicity, we use a table of size 128K, which indexes the
-"divide-by-3" values for the first 64K non-negative numbers. (Note that
+``divide-by-3'' values for the first 64K non-negative numbers. (Note that
 we can increase the size up to 384K, i.e. indexing the first 192K
 non-negative numbers, while still using shorts in the array.) This also
 means that the size of the known region can be at most 64K for
 width-three characters.
 @end quotation
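The multiply-by-3 and divide-by-3 tricks described above can be sketched as follows. The names are hypothetical and the table is filled at run time here for brevity; the sizes follow the text (64K entries of type short, i.e. 128K bytes).

```c
#include <stddef.h>

#define TOY_TABLE_ENTRIES 65536   /* 64K entries * 2 bytes = 128K */

static short toy_div3_table[TOY_TABLE_ENTRIES];

static void
toy_init_div3_table (void)
{
  size_t i;
  for (i = 0; i < TOY_TABLE_ENTRIES; i++)
    toy_div3_table[i] = (short) (i / 3);   /* at most 21845, fits in a short */
}

/* Multiply by 3 as "double, then add the original value". */
static size_t
toy_times3 (size_t n)
{
  return (n << 1) + n;
}

/* Divide by 3 by table lookup; only valid for n < TOY_TABLE_ENTRIES. */
static size_t
toy_div3 (size_t n)
{
  return (size_t) toy_div3_table[n];
}
```

The table bound is what limits the size of the known region for width-three characters: a byte count outside the table cannot be converted back by lookup.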
 @item
 the position of the gap
 @item
 the last value we computed
 @item
-a set of positions that are "far away" from previously computed positions
+a set of positions that are ``far away'' from previously computed positions
 (5000 chars currently; #### perhaps should be smaller)
 @end itemize
 For each position, we @code{CONSIDER()} it.  This means:
 the simple loop in FSF with the use of @code{bytecount_to_charcount()},
 @code{charcount_to_bytecount()}, @code{bytecount_to_charcount_down()}, or
 @code{charcount_to_bytecount_down()}. (The latter two I added for this purpose.)
 These scan 4 or 8 bytes at a time through purely single-byte characters.
-If the amount we had to scan was more than our "far away" distance (5000
+If the amount we had to scan was more than our ``far away'' distance (5000
 characters, see above), then cache the new position.
 #### Things to do:
 @itemize @bullet
 @item
 Look at the most recent GNU Emacs to see whether anything has changed.
 @item
 Think about whether it makes sense to try to implement some sort of
-known region or list of "known regions", like we had before.  This would
+known region or list of ``known regions'', like we had before.  This would
 be a region of entirely single-byte characters that we can check very
 quickly. (Previously I used a range of same-width characters of any
 size; but this adds extra complexity and slows down the scanning, and is
 probably not worth it.) As part of the scanning process in
 @code{bytecount_to_charcount()} et al, we skip over chunks of entirely
 In terms of reading the actual code, there are five optimizations
 (obfuscations, if you like) that have been done.
 @enumerate
 @item
-An explicit "failure stack" has been substituted for recursion.
+An explicit ``failure stack'' has been substituted for recursion.
 @item
 The @code{match_1_operator}, @code{next_p}, and @code{next_b} functions
 are actually inlined into the @code{match} function for efficiency.
 Then the pointer movement is interspersed with the matching operations.
 If the operator uses buffer context, the buffer pointer movement is
 sometimes implicit in the operations retrieving the context.
 @item
 Some cases are combined into short preparation for individual cases, and
-a "fall-through" into combined code for several cases.
+a ``fall-through'' into combined code for several cases.
 @item
 The @code{pattern} type is not an explicit @samp{struct}.  Instead, the
 data (including, @emph{e.g.}, @samp{range_table}) is inlined into the
 compiled bytecode.  This leads to bizarre code in the interpreter like
 @example
 ..., 'range', count, first_8_flags, second_8_flags, ..., next_op, ...
 @end example
 @end enumerate
-But if you keep your eye on the "switch in a loop" structure, you
+But if you keep your eye on the ``switch in a loop'' structure, you
 should be able to understand the parts you need.
 @node Multilingual Support, Consoles; Devices; Frames; Windows, Text, Top
 @chapter Multilingual Support
 @cindex Mule character sets and encodings
 a simple charset like ASCII, there is only one encoding normally used --
 each character is represented by a single byte, with the same value as
 its code point.  For more complicated charsets, however, things are not
 so obvious.  Unicode version 2, for example, is a large charset with
 thousands of characters, each indexed by a 16-bit number, often
-represented in hex, e.g. 0x05D0 for the Hebrew letter "aleph".  One
+represented in hex, e.g. 0x05D0 for the Hebrew letter ``aleph''.  One
 obvious encoding uses two bytes per character (actually two encodings,
 depending on which of the two possible byte orderings is chosen).  This
 encoding is convenient for internal processing of Unicode text; however,
 it's incompatible with ASCII, so a different encoding, e.g. UTF-8, is
 usually used for external text, for example files or e-mail.  UTF-8
 In an ASCII or single-European-character-set world, life is very simple.
 There are 256 characters, and each character is represented using the
 numbers 0 through 255, which fit into a single byte.  With a few
 exceptions (such as case-changing operations or syntax classes like
-'whitespace'), "text" is simply an array of indices into a font.  You
+@code{whitespace}), ``text'' is simply an array of indices into a font.  You
 can get different languages simply by choosing fonts with different
 8-bit character sets (ISO-8859-1, -2, special-symbol fonts, etc.), and
-everything will "just work" as long as anyone else receiving your text
+everything will ``just work'' as long as anyone else receiving your text
 uses a compatible font.
 In the multi-lingual world, however, it is much more complicated.  There
 are a great number of different characters which are organized in a
 complex fashion into various character sets.  The representation to use
 text as possible.  No operations should ever be performed on text encoded
 in an external representation other than simple copying, because no
 assumptions can reliably be made about the format of this text.  You
 cannot assume, for example, that the end of text is terminated by a null
 byte. (For example, if the text is Unicode, it will have many null bytes
-in it.)  You cannot find the next "slash" character by searching through
+in it.)  You cannot find the next ``slash'' character by searching through
-the bytes until you find a byte that looks like a "slash" character,
+the bytes until you find a byte that looks like a ``slash'' character,
 because it might actually be the second byte of a Kanji character.
 Furthermore, all text in the internal representation must be converted,
 even if it is known to be completely ASCII, because the external
 representation may not be ASCII compatible (for example, if it is
 Unicode).
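The danger of searching external text for a ``slash'' byte can be shown with a small sketch. It assumes ISO-2022-JP as the external encoding, where the escape sequence ESC $ B switches to two-byte mode; the function and array names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* One two-byte character in ISO-2022-JP: ESC $ B switches to two-byte
   mode, 0x30 0x2F is a single character, ESC ( B switches back to ASCII.
   There is no slash character anywhere in this text. */
static const unsigned char toy_jis_text[] = {
  0x1B, '$', 'B', 0x30, 0x2F, 0x1B, '(', 'B'
};

/* Naive byte search: returns the offset of the first 0x2F byte, or -1. */
static long
toy_naive_find_slash (const unsigned char *text, size_t len)
{
  const unsigned char *p = memchr (text, '/', len);
  return p ? (long) (p - text) : -1;
}
```

Calling the naive search on this text reports a match at offset 4, which is the second byte of the two-byte character rather than a real slash. Converting to the internal representation first avoids this kind of false hit.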
 the structures of a particular external encoding and the methods required
 to convert to and from this encoding.  A facility exists to create coding
 system aliases, which in essence gives a single coding system two
 different names.  It is effectively used in XEmacs to provide a layer of
 abstraction on top of the actual coding systems.  For example, the coding
-system alias "file-name" points to whichever coding system is currently
+system alias ``file-name'' points to whichever coding system is currently
 used for encoding and decoding file names as passed to or retrieved from
 system calls.  In general, the actual encoding will differ from system to
 system, and also on the particular locale that the user is in.  The use
 of the file-name alias effectively hides that implementation detail on
 top of that abstract interface layer which provides a unified set of
 C = plain char, when the base type is unsigned
 U = unsigned
 S = signed
 @end example
-(Formerly I had a comment saying that type (e) "should be replaced with
+(Formerly I had a comment saying that type (e) ``should be replaced with
-void *".  However, there are in fact many places where an unsigned char
+void *''.  However, there are in fact many places where an unsigned char
 * might be used -- e.g. for ease in pointer computation, since void *
 doesn't allow this, and for compatibility with external APIs.)
 Note that these typedefs are purely for documentation purposes; from
 the C code's perspective, they are exactly equivalent to @code{char *},
 @node Different Ways of Seeing Internal Text, Buffer Positions, Byte Types, Byte/Character Types; Buffer Positions; Other Typedefs
 @subsection Different Ways of Seeing Internal Text
 @cindex different ways of seeing internal text
 There are various ways of representing internal text.  The two primary
-ways are as an "array" of individual characters; the other is as a
+ways are as an ``array'' of individual characters; the other is as a
-"stream" of bytes.  In the ASCII world, where there are only 255
+``stream'' of bytes.  In the ASCII world, where there are only 256
 characters at most, things are easy because each character fits into a
 byte.  In general, however, this is not true -- see the above discussion
 of characters vs. encodings.
 In some cases, it's also important to distinguish between a stream
 representation as a series of bytes and as a series of textual units.
 This is particularly important wrt Unicode.  The UTF-16 representation
-(sometimes referred to, rather sloppily, as simply the "Unicode" format)
+(sometimes referred to, rather sloppily, as simply the ``Unicode'' format)
 represents text as a series of 16-bit units.  Mostly, each unit
 corresponds to a single character, but not necessarily, as characters
-outside of the range 0-65535 (the BMP or "Basic Multilingual Plane" of
+outside of the range 0-65535 (the BMP or ``Basic Multilingual Plane'' of
 Unicode) require two 16-bit units, through the mechanism of
-"surrogates".  When a series of 16-bit units is serialized into a byte
+``surrogates''.  When a series of 16-bit units is serialized into a byte
 stream, there are at least two possible representations, little-endian
 and big-endian, and which one is used may depend on the native format of
 16-bit integers in the CPU of the machine that XEmacs is running
 on. (Similarly, UTF-32 is logically a representation with 32-bit textual
 units.)
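The distinction between characters, 16-bit units and bytes can be made concrete with a short sketch of UTF-16 encoding. The function names are hypothetical; the surrogate arithmetic is the standard UTF-16 rule, and the input is assumed to be a valid code point.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one code point as UTF-16 units; returns the number of units
   (1 or 2).  Assumes cp is a valid scalar value (not itself a
   surrogate, and at most 0x10FFFF). */
static size_t
toy_utf16_encode (uint32_t cp, uint16_t units[2])
{
  if (cp < 0x10000)
    {
      units[0] = (uint16_t) cp;   /* inside the BMP: one unit */
      return 1;
    }
  cp -= 0x10000;
  units[0] = (uint16_t) (0xD800 + (cp >> 10));     /* high surrogate */
  units[1] = (uint16_t) (0xDC00 + (cp & 0x3FF));   /* low surrogate */
  return 2;
}

/* Serialize one 16-bit unit into two bytes in the requested order. */
static void
toy_put_unit (uint16_t unit, int big_endian, unsigned char out[2])
{
  out[big_endian ? 0 : 1] = (unsigned char) (unit >> 8);
  out[big_endian ? 1 : 0] = (unsigned char) (unit & 0xFF);
}
```

For example, U+05D0 is a single unit that serializes as the bytes 05 D0 in big-endian order and D0 05 in little-endian order, while U+1F600 needs the two units D83D DE00.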
 @item
 UTF-16 has 2-byte (16-bit) units.
 @item
 UTF-32 has 4-byte (32-bit) units.
 @item
-XEmacs-internal encoding (the old "Mule" encoding) has 1-byte (8-bit)
+XEmacs-internal encoding (the old ``Mule'' encoding) has 1-byte (8-bit)
 units.
 @item
-UTF-7 technically has 7-bit units that are within the "mail-safe" range
+UTF-7 technically has 7-bit units that are within the ``mail-safe'' range
 (ASCII 32 - 126 plus a few control characters), but normally is encoded
 in an 8-bit stream. (UTF-7 is also a modal encoding, since it has a
 normal mode where printable ASCII characters represent themselves and a
 shifted mode, introduced with a plus sign, where a base-64 encoding is
 used.)
 @table @code
 @item Ibyte
 The data in a buffer or string is logically made up of Ibyte objects,
 where an Ibyte takes up the same amount of space as a char. (It is
 declared differently, though, to catch invalid usages.) Strings stored
-using Ibytes are said to be in "internal format".  The important
+using Ibytes are said to be in ``internal format''.  The important
 characteristics of internal format are
 @itemize @minus
 @item
 ASCII characters are represented as a single Ibyte, in the range 0 -
 This means that Ichar values are upwardly compatible with the standard
 8-bit representation of ASCII/ISO-8859-1.
 @item Extbyte
-Strings that go in or out of Emacs are in "external format", typedef'ed
+Strings that go in or out of Emacs are in ``external format'', typedef'ed
 as an array of char or a char *.  There is more than one external format
 (JIS, EUC, etc.) but they all have similar properties.  They are modal
 encodings, which is to say that the meaning of particular bytes is not
-fixed but depends on what "mode" the string is currently in (e.g. bytes
+fixed but depends on what ``mode'' the string is currently in (e.g. bytes
 in the range 0 - 0x7f might be interpreted as ASCII, or as Hiragana, or
 as 2-byte Kanji, depending on the current mode).  The mode starts out in
 ASCII/ISO-8859-1 and is switched using escape sequences -- for example,
 in the JIS encoding, 'ESC $ B' switches to a mode where pairs of bytes
 in the range 0 - 0x7f are interpreted as Kanji characters.
 There are three possible ways to specify positions in a buffer.  All
 of these are one-based: the beginning of the buffer is position or
 index 1, and 0 is not a valid position.
-As a "buffer position" (typedef Charbpos):
+As a ``buffer position'' (typedef Charbpos):
 This is an index specifying an offset in characters from the
 beginning of the buffer.  Note that buffer positions are
 logically @strong{between} characters, not on a character.  The
 difference between two buffer positions specifies the number of
 characters between those positions.  Buffer positions are the
 only kind of position externally visible to the user.
-As a "byte index" (typedef Bytebpos):
+As a ``byte index'' (typedef Bytebpos):
 This is an index over the bytes used to represent the characters
 in the buffer.  If there is no Mule support, this is identical
 to a buffer position, because each character is represented
 using one byte.  However, with Mule support, many characters
 require two or more bytes for their representation, and so a
 byte index may be greater than the corresponding buffer
 position.
-As a "memory index" (typedef Membpos):
+As a ``memory index'' (typedef Membpos):
 This is the byte index adjusted for the gap.  For positions
 before the gap, this is identical to the byte index.  For
 positions after the gap, this is the byte index plus the gap
 size.  There are two possible memory indices for the gap
 position; the memory index at the beginning of the gap should
 always be used, except in code that deals with manipulating the
 gap, where both indices may be seen.  The address of the
-character "at" (i.e. following) a particular position can be
+character ``at'' (i.e. following) a particular position can be
 obtained from the formula
 buffer_start_address + memory_index(position) - 1
 except in the case of characters at the gap position.
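The relationship between byte indices, memory indices and addresses can be sketched as follows. The structure and function names are hypothetical, and the special case at the gap position is one reading of the ``except'' clause above: the character following the gap position is taken to be the first one stored past the gap.

```c
#include <stddef.h>

/* Hypothetical gap-buffer description; all indices are one-based. */
struct toy_buffer {
  unsigned char *start;   /* address of the first byte of storage */
  size_t gap_start;       /* byte index of the gap position */
  size_t gap_size;        /* number of bytes in the gap */
};

/* Byte index -> memory index: unchanged up to and including the gap
   position, shifted by the gap size after it. */
static size_t
toy_byte_to_mem (const struct toy_buffer *buf, size_t byte_index)
{
  return byte_index <= buf->gap_start
    ? byte_index
    : byte_index + buf->gap_size;
}

/* Address of the character following a position. */
static unsigned char *
toy_address_at (const struct toy_buffer *buf, size_t byte_index)
{
  size_t mem = toy_byte_to_mem (buf, byte_index);
  if (byte_index == buf->gap_start)
    mem += buf->gap_size;   /* the following character lies past the gap */
  return buf->start + mem - 1;
}
```

With storage holding the bytes a, b, a three-byte gap, then c, d, and the gap position at byte index 3, byte index 4 maps to memory index 7, and the character at the gap position is c.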
 use the buffer-level functions in buffer.h, which automatically know the
 correct format and handle the gap.
 Some terminology:
-"itext" appearing in the macros means "internal-format text" -- type
+"itext" appearing in the macros means "internal-format text" -- type
 @code{Ibyte *}.  Operations on such pointers themselves, rather than on the
 text being pointed to, have "itextptr" instead of "itext" in the macro
 name.  "ichar" in the macro names means an Ichar -- the representation
 of a character as a single integer rather than a series of bytes, as part
 of "itext".  Many of the macros below are for converting between the
 @item
 (c) using the GCC extension (@{ ... @}).
 @end itemize
 Turned out that all of the above had bugs, all caused by GCC (hence the
-comments about "those GCC wankers" and "ream gcc up the ass").  As for
+comments about ``those GCC wankers'' and ``ream gcc up the ass'').  As for
 (a), some versions of GCC (especially on Intel platforms) had
 buggy implementations of @code{alloca()} that couldn't handle being called
 inside of a function call -- they just decremented the stack right in the
 middle of pushing args.  Oops, crash with stack trashing, very bad.  (b)
 was an attempt to fix (a), and that led to further GCC crashes, esp. when
 consistency.  For example, the new Mule workspace contains Ibyte
 versions of the stdlib string functions.
 @item Extbyte, UExtbyte
 Pointer to text in some external format, which can be defined as all
 formats other than the internal one.  The data representing a string
-in "external" format (binary or any external encoding) is logically a
+in ``external'' format (binary or any external encoding) is logically a
 set of Extbytes.  Extbyte is guaranteed to be just a char, so for
 example strlen (Extbyte *) is OK.  Extbyte is only a documentation
 device for referring to external text.
 @item Ascbyte, UAscbyte
 pure ASCII text, consisting of bytes in a string in entirely US-ASCII
 @node Mule-izing Code,  , An Example of Mule-Aware Code, Coding for Mule
 @subsection Mule-izing Code
 A lot of code is written without Mule in mind, and needs to be made
-Mule-correct or "Mule-ized".  There is really no substitute for
+Mule-correct or ``Mule-ized''.  There is really no substitute for
 line-by-line analysis when doing this, but the following checklist can
 help:
 @itemize @bullet
 @item
 @item
 Look in the CRT sources!  They come with VC++.  See win32.c.
 @end enumerate
 @node Locales, More about code pages, Microsoft Documentation, Microsoft Windows-Related Multilingual Issues
-@subsection Locales, code pages, and other concepts of "language"
+@subsection Locales, code pages, and other concepts of ``language''
-@cindex locales, code pages, and other concepts of "language"
+@cindex locales, code pages, and other concepts of ``language''
 First, make sure you clearly understand the difference between the C
 runtime library (CRT) and the Win32 API!  See win32.c.
 There are various different ways of representing the vague concept
-of "language", and it can be very confusing.  So:
+of ``language'', and it can be very confusing.  So:
 @itemize @bullet
 @item
-The CRT library has the concept of "locale", which is a
+The CRT library has the concept of ``locale'', which is a
 combination of language and country, and which controls the way
 currency and dates are displayed, the encoding of data, etc.
 @item
-XEmacs has the concept of "language environment", more or less
+XEmacs has the concept of ``language environment'', more or less
 like a locale; although currently in most cases it just refers to
 the language, and no sub-language distinctions are
 made. (Exceptions are with Chinese, which has different language
 environments for Taiwan and mainland China, due to the different
 encodings and writing systems.)
 @item
 Windows has a number of different language concepts:
 @enumerate
 @item
-There are "languages" and "sublanguages", which correspond to
+There are ``languages'' and ``sublanguages'', which correspond to
 the languages and countries of the C library -- e.g. LANG_ENGLISH
 and SUBLANG_ENGLISH_US.  These are identified by 8-bit integers,
-called the "primary language identifier" and "sublanguage
+called the ``primary language identifier'' and ``sublanguage
-identifier", respectively.  These are combined into a 16-bit
+identifier'', respectively.  These are combined into a 16-bit
-integer or "language identifier" by MAKELANGID().
+integer or ``language identifier'' by @code{MAKELANGID()}.
 @item
-The language identifier in turn is combined with a "sort
+The language identifier in turn is combined with a ``sort
-identifier" (and optionally a "sort version") to yield a 32-bit
+identifier'' (and optionally a ``sort version'') to yield a 32-bit
-integer called a "locale identifier" (type LCID), which identifies
+integer called a ``locale identifier'' (type LCID), which identifies
 locales -- the primary means of distinguishing language/regional
 settings and similar to C library locales.
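The packing of language and locale identifiers can be sketched in a few lines.  This is only an illustration: @code{make_langid} and @code{make_lcid} are hypothetical helper names standing in for the Win32 macros @code{MAKELANGID()} and @code{MAKELCID()}, and the constant values are the ones the Windows SDK headers are believed to use.  Note that, as far as I can tell from those headers, the 16-bit language identifier is split into a 10-bit primary language and a 6-bit sublanguage rather than two 8-bit halves.

```python
# Constant values believed to match the Windows SDK headers (assumption).
LANG_ENGLISH = 0x09
SUBLANG_ENGLISH_US = 0x01
SORT_DEFAULT = 0x0


def make_langid(primary, sub):
    """Hypothetical stand-in for MAKELANGID(): sublanguage in the top 6 bits."""
    return ((sub & 0x3F) << 10) | (primary & 0x3FF)


def make_lcid(langid, sort_id):
    """Hypothetical stand-in for MAKELCID(): sort identifier above bit 16."""
    return ((sort_id & 0xF) << 16) | (langid & 0xFFFF)


def primary_langid(langid):
    """Recover the primary language from a language identifier."""
    return langid & 0x3FF


def sub_langid(langid):
    """Recover the sublanguage from a language identifier."""
    return (langid >> 10) & 0x3F


langid = make_langid(LANG_ENGLISH, SUBLANG_ENGLISH_US)
lcid = make_lcid(langid, SORT_DEFAULT)
print(hex(langid), hex(lcid))
```

With the default sort identifier the upper 16 bits are zero, so the locale identifier and the language identifier have the same numeric value.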
 @item
-A "code page" combines the XEmacs concepts of "charset" and "coding
+A ``code page'' combines the XEmacs concepts of ``charset'' and ``coding
-system".  It logically encompasses
+system''.  It logically encompasses
 @itemize @minus
 @item
 a set of supported characters
 @item
 supported
 @item
 a way of encoding a series of characters into a string of bytes
 @end itemize
-Note that the first two properties correspond to an XEmacs "charset"
+Note that the first two properties correspond to an XEmacs ``charset''
-and the latter an XEmacs "coding system".
+and the latter an XEmacs ``coding system''.
 Traditional encodings are either simple one-byte encodings, or
 combination one-byte/two-byte encodings (aka MBCS encodings, where MBCS
-stands for "Multibyte Character Set") with the following properties:
+stands for ``Multibyte Character Set'') with the following properties:
 @itemize @minus
 @item
 all characters are encoded as a one-byte or two-byte sequence
 @item
 the encoding is stateless (non-modal)
 @item
 the lower 128 bytes are compatible with ASCII
 @item
-in the higher bytes, the value of the first byte ("lead byte")
+in the higher bytes, the value of the first byte (``lead byte'')
 determines whether a second byte follows
 @item
 the values used for second bytes may overlap those used for first
 bytes, and (in some encodings) include values in the low half; thus,
 moving backwards is hard, and pure-ASCII algorithms (e.g. finding the
 Every Windows locale has four associated code pages: ANSI (an
 international standard or some Microsoft-created approximation; the
 native code page under Windows), OEM (a DOS encoding, still used in the
 FAT file system), Mac (an encoding used on the Macintosh) and EBCDIC (a
 non-ASCII-compatible encoding used on IBM mainframes, originally based
-on the BCD or "binary-coded decimal" encoding of numbers).  All code
+on the BCD or ``binary-coded decimal'' encoding of numbers).  All code
 pages associated with a locale follow (as far as I know) the properties
 listed above for traditional code pages.  More than one locale can share
 a code page -- e.g. all the Western European languages, including
 English, do.
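As a rough illustration of why pure-ASCII algorithms break on MBCS text, the sketch below uses Python's @code{cp932} codec (Shift-JIS as used by Windows).  The lead-byte ranges in @code{is_lead_byte} are an assumption about code page 932, and both helper names are hypothetical.  The katakana letter SO is encoded as a lead byte followed by 0x5C, the same value as an ASCII backslash, so a byte-wise search for path separators reports a false hit.

```python
# A path containing KATAKANA LETTER SO, whose trail byte in cp932 is 0x5C.
text = "C:\\dir\\\u30bdfile"
raw = text.encode("cp932")

# Naive search: treat every 0x5C byte as a backslash.
naive_hits = [i for i, b in enumerate(raw) if b == 0x5C]


def is_lead_byte(b):
    """Assumed lead-byte ranges for code page 932."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def real_backslashes(data):
    """Scan forward from the start, skipping the trail byte after each lead byte."""
    hits = []
    i = 0
    while i < len(data):
        if is_lead_byte(data[i]):
            i += 2  # two-byte character: never inspect its trail byte
            continue
        if data[i] == 0x5C:
            hits.append(i)
        i += 1
    return hits


print(naive_hits, real_backslashes(raw))
```

The correct scan has to start from a known character boundary and move forward, which is the practical meaning of "moving backwards is hard".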
 @item
-Windows also has an "input locale identifier" (aka "keyboard
+Windows also has an ``input locale identifier'' (aka ``keyboard
-layout id") or HKL, which is a 32-bit integer composed of the
+layout id'') or HKL, which is a 32-bit integer composed of the
-16-bit language identifier and a 16-bit "device identifier", which
+16-bit language identifier and a 16-bit ``device identifier'', which
 originally specified a particular keyboard layout (e.g. the locale
-"US English" can have the QWERTY layout, the Dvorak layout, etc.),
+``US English'' can have the QWERTY layout, the Dvorak layout, etc.),
 but has been expanded to include speech-to-text converters and
 other non-keyboard ways of inputting text.  Note that both the HKL
 and LCID share the language identifier in the lower 16 bits, and in
-both cases a 0 in the upper 16 bits means "default" (sort order or
+both cases a 0 in the upper 16 bits means ``default'' (sort order or
 device), providing a way to convert between HKL's, LCID's, and
 language identifiers (i.e. language/sublanguage pairs).  The
 default keyboard layout for a language is (as far as I can
 determine) established using the Regional Settings control panel
 applet, where you can add input locales as combinations of language
 @node More about code pages, More about locales, Locales, Microsoft Windows-Related Multilingual Issues
 @subsection More about code pages
 @cindex more about code pages
-Here is what MSDN says about code pages (article "Code Pages"):
+Here is what MSDN says about code pages (article ``Code Pages''):
 @quotation
 A code page is a character set, which can include numbers,
 punctuation marks, and other glyphs. Different languages and locales
 may use different code pages. For example, ANSI code page 1252 is
 -- The "C" locale is defined by ANSI to correspond to the locale in
 which C programs have traditionally executed. The code page for the
 "C" locale (code page) corresponds to the ASCII character
 set. For example, in the "C" locale, islower returns true for the
-values 0x61 ?0x7A only. In another locale, islower may return true
+values 0x61 to 0x7A only. In another locale, islower may return true
 for these as well as other values, as defined by that locale.
-Under "Locale-Dependent Routines" we notice the following setlocale
+Under ``Locale-Dependent Routines'' we notice the following setlocale
 dependencies:
 atof, atoi, atol (LC_NUMERIC)
 is Routines (LC_CTYPE)
 isleadbyte (LC_CTYPE)
 wcstombs (LC_CTYPE)
 wctomb (LC_CTYPE)
 _wtoi/_wtol (LC_NUMERIC)
 @end quotation
-NOTE: The above documentation doesn't clearly explain the "locale code
+NOTE: The above documentation doesn't clearly explain the ``locale code
-page" and "multibyte code page".  These are two different values,
+page'' and ``multibyte code page''.  These are two different values,
 maintained respectively in the CRT global variables __lc_codepage and
 __mbcodepage.  Calling e.g. setlocale (LC_ALL, "JAPANESE") sets @strong{ONLY}
 __lc_codepage to 932 (the code page for Japanese), and leaves
 __mbcodepage unchanged (usually 1252, i.e. Windows-ANSI).  You'd have to
 call _setmbcp() to change __mbcodepage.  Figuring out from the
 documentation which routines use which code page is not so obvious.  But:
 @itemize @bullet
 @item
-from "Interpretation of Multibyte-Character Sequences" it appears that
+from ``Interpretation of Multibyte-Character Sequences'' it appears that
-all "multibyte-character routines" use the multibyte code page except for
+all ``multibyte-character routines'' use the multibyte code page except for
-mblen(), _mbstrlen(), mbstowcs(), mbtowc(), wcstombs(), and wctomb().
+@code{mblen()}, @code{_mbstrlen()}, @code{mbstowcs()}, @code{mbtowc()}, @code{wcstombs()}, and @code{wctomb()}.
 @item
-from "_setmbcp": "The multibyte code page also affects
+from ``_setmbcp'': ``The multibyte code page also affects
 multibyte-character processing by the following run-time library
 routines: _exec functions _mktemp _stat _fullpath _spawn functions
 _tempnam _makepath _splitpath tmpnam.  In addition, all run-time library
 routines that receive multibyte-character argv or envp program arguments
 as parameters (such as the _exec and _spawn families) process these
 strings according to the multibyte code page. Hence these routines are
 also affected by a call to _setmbcp that changes the multibyte code
-page."
+page.''
 @end itemize
 Summary: from looking at the CRT source (which comes with VC++) and
 carefully looking through the docs, it appears that:
 @itemize @bullet
 @item
-the "locale code page" is used by all of the routines listed above
+the ``locale code page'' is used by all of the routines listed above
-under "Locale-Dependent Routines" (EXCEPT _mbccpy() and _mbclen()),
+under ``Locale-Dependent Routines'' (EXCEPT @code{_mbccpy()} and @code{_mbclen()}),
 as well as any other place that converts between multibyte and Unicode
 strings, e.g. the startup code.
 @item
-the "multibyte code page" is used in all of the *mb*() routines
+the ``multibyte code page'' is used in all of the @code{mb*()} routines
-except mblen(), _mbstrlen(), mbstowcs(), mbtowc(), wcstombs(),
+except @code{mblen()}, @code{_mbstrlen()}, @code{mbstowcs()}, @code{mbtowc()}, @code{wcstombs()},
-and wctomb(); also _exec*(), _spawn*(), _mktemp(), _stat(), _fullpath(),
+and @code{wctomb()}; also @code{_exec*()}, @code{_spawn*()}, @code{_mktemp()}, @code{_stat()}, @code{_fullpath()},
-_tempnam(), _makepath(), _splitpath(), tmpnam(), and similar functions
+@code{_tempnam()}, @code{_makepath()}, @code{_splitpath()}, @code{tmpnam()}, and similar functions
 without the leading underscore.
 @end itemize
 @node More about locales, Unicode support under Windows, More about code pages, Microsoft Windows-Related Multilingual Issues
 @subsection More about locales
 In addition to the locale defined by the CRT, Windows (i.e. the Win32 API)
 defines various locales:
 @itemize @bullet
 @item
-The system-default locale is the locale defined under "Language
+The system-default locale is the locale defined under ``Language
-settings for the system" in the "Regional Options" control panel.  This
+settings for the system'' in the ``Regional Options'' control panel.  This
 is NOT user-specific, and changing it requires a reboot (at least under
 Windows 2000).  The ANSI code page of the system-default locale is
-returned by GetACP(), and you can specify this code page in calls
+returned by @code{GetACP()}, and you can specify this code page in calls
 e.g. to MultiByteToWideChar with the constant CP_ACP.
 @item
-The user-default locale is the locale defined under "Settings for the
+The user-default locale is the locale defined under ``Settings for the
-current user" in the "Regional Options" control panel.
+current user'' in the ``Regional Options'' control panel.
 @item
 There is a thread-local locale set by SetThreadLocale. #### What is this
 used for?
 @end itemize
 The Win32 API has a bunch of multibyte functions -- all of those that
-end with ...A(), and on which we spend so much effort in
+end with ...@code{A()}, and on which we spend so much effort in
 intl-encap-win32.c.  These appear to ALWAYS use the ANSI code page of
-the system-default locale (GetACP(), CP_ACP).  Note that this applies
+the system-default locale (@code{GetACP()}, CP_ACP).  Note that this applies
 also, for example, to the encoding of filenames in all file-handling
-routines, including the CRT ones such as open(), because they pass their
+routines, including the CRT ones such as @code{open()}, because they pass their
 args unchanged to the Win32 API.
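One consequence is that the same filename bytes mean different things depending on the system-default ANSI code page.  The sketch below only decodes a fixed byte string with Python's standard @code{cp1252} (Western European) and @code{cp1251} (Cyrillic) codecs; it does not call any Win32 function, and the filename is made up for the example.

```python
# Bytes of a filename as a program running under code page 1252 would write them.
raw = b"r\xe9sum\xe9.txt"

# The same bytes interpreted under two different ANSI code pages.
as_western = raw.decode("cp1252")
as_cyrillic = raw.decode("cp1251")

# The Western European name cannot be represented in the Cyrillic code page.
try:
    as_western.encode("cp1251")
    representable = True
except UnicodeEncodeError:
    representable = False

print(as_western, as_cyrillic, representable)
```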
 @node Unicode support under Windows, The golden rules of writing Unicode-safe code, More about locales, Microsoft Windows-Related Multilingual Issues
 @subsection Unicode support under Windows
 @cindex unicode support under windows
 table to convert the characters of that code page to and from Unicode, and
 the Win32 API itself probably (perhaps always) uses Unicode internally.
 Under Windows there are two different versions of all library routines that
 accept or return text, those that handle Unicode text and those handling
-"multibyte" text, i.e. variable-width ASCII-compatible text in some
+``multibyte'' text, i.e. variable-width ASCII-compatible text in some
 national format such as EUC or Shift-JIS.  Because Windows 95 basically
 doesn't support Unicode but Windows NT does, and Microsoft doesn't provide
 any way of writing a single binary that will work on both systems and still
 use Unicode when it's available (although see below, Microsoft Layer for
 Unicode), we need to provide a way of run-time conditionalizing so you
-could have one binary for both systems.  "Unicode-splitting" refers to
+could have one binary for both systems.  ``Unicode-splitting'' refers to
 writing code that will handle this properly.  This means using
 Qmswindows_tstr as the external conversion format, calling the appropriate
 qxe...() Unicode-split version of library functions, and doing other things
-in certain cases, e.g. when a qxe() function is not present.
+in certain cases, e.g. when a @code{qxe()} function is not present.
 Unicode support also requires that the various Windows APIs be
-"Unicode-encapsulated", so that they automatically call the ANSI or
+``Unicode-encapsulated'', so that they automatically call the ANSI or
 Unicode version of the API call appropriately and handle the size
 differences in structures.  What this means is:
 @itemize @bullet
 @item
 first, note that Windows already provides a sort of encapsulation
 of all APIs that deal with text.  All such APIs are underlyingly
-provided in two versions, with an A or W suffix (ANSI or "wide"
+provided in two versions, with an A or W suffix (ANSI or ``wide''
 i.e. Unicode), and the compile-time constant UNICODE controls which is
 selected by the unsuffixed API.  Same thing happens with structures, and
 also with types, where the generic types have names beginning with T --
 TCHAR, LPTSTR, etc.  Unfortunately, this is compile-time only, not
 run-time, so not sufficient. (Creating the necessary run-time encoding
 such an API available internally.)
 @item
 what we do is provide an encapsulation of each standard Windows API call
 that is split into A and W versions.  Current theory is to avoid all
-preprocessor games; so we name the function with a prefix -- "qxe"
+preprocessor games; so we name the function with a prefix -- ``qxe''
 currently -- and require callers to use the prefixed name.  Callers need
 to explicitly use the W version of all structures, and convert text
 themselves using Qmswindows_tstr.  The qxe encapsulated version will
 automatically call the appropriate A or W version depending on whether
 we're running on 9x or NT (you can force use of the A calls on NT,
 purpose, to make the code easier to follow for someone who's not familiar
 with it.  Until our library is really complete and bug-free, we should
 think twice before doing this.
 According to Microsoft documentation, only the following functions are
-provided under Windows 9x to support Unicode (see MSDN page "Windows
+provided under Windows 9x to support Unicode (see MSDN page ``Windows
-95/98/Me General Limitations"):
+95/98/Me General Limitations''):
 EnumResourceLanguagesW
 EnumResourceNamesW
 EnumResourceTypesW
 ExtTextOutW
 MessageBoxExW
 MultiByteToWideChar
 TextOutW
 WideCharToMultiByte
-also maybe GetTextExtentExPoint? (KB Q125671 "Unicode Functions Supported
+also maybe GetTextExtentExPoint? (KB Q125671 ``Unicode Functions Supported
-by Windows 95")
+by Windows 95'')
 Q210341 says this in addition:
 @quotation
 SUMMARY:
 range beyond the 256 limitation of a one-byte representation.
 The Unicode standard offers application developers an opportunity to
 work with text without the limitations of character set based
 systems. For more information on the Unicode standard see the
-"References" section of this article. Windows NT is a fully Unicode
+``References'' section of this article. Windows NT is a fully Unicode
 capable operating system so it may be desirable to write software that
 supports Unicode on Windows 95.
 Even though Windows 95 and Windows 98 are not Unicode based, they do
 provide some limited Unicode functionality. Drawing of Unicode text is
 @itemize @bullet
 @item
 wmain() is completely supported, and appropriate Unicode-formatted argv
 and envp will always be passed.
 @item
-Likewise, wWinMain() is completely supported. (NOTE: The docs are not at
+Likewise, @code{wWinMain()} is completely supported. (NOTE: The docs are not at
 all clear on how these various entry points interact, and imply that
-a windows-subsystem program "must" use WinMain(), while a console-
+a windows-subsystem program ``must'' use @code{WinMain()}, while a console-
-subsystem program "must" use main(), and a program compiled with UNICODE
+subsystem program ``must'' use @code{main()}, and a program compiled with UNICODE
-(which we don't, see above) "must" use the w*() versions, while a program
+(which we don't, see above) ``must'' use the @code{w*()} versions, while a program
-not compiled this way "must" use the plain versions.  In fact it appears
+not compiled this way ``must'' use the plain versions.  In fact it appears
 that the CRT provides four different compiler entry points, namely
 w?(main|WinMain)CRTStartup, and we simply choose the one we like using
 the appropriate link flag.
 @item
 _wenviron, _wputenv
 | +--------------------------------------------------------------------+ |
 | |                               menubar                              | |
 | ###################################################################### |
 | #                               toolbar                              # |
 | #--------------------------------------------------------------------# |
-| #  |                            gutter                            |  # |
+| #  |                        internal border                       |  # |
-| #  |--------------------------------------------------------------|  # |
+| #  | +----------------------------------------------------------+ |  # |
-| #  | |                  internal border width                   | |  # |
+| #  | |                          gutter                          | |  # |
-| #  | | ******************************************************** | |  # |
+| #  | |-********************************************************-| |  # |
-|w#  | | *                         |s|v*                      |s* | |  #w|
+|w#  | | *@|        scrollbar        |v*                      |s* | |  #w|
-|i#  | | *                         |c|e*                      |c* | |  #i|
+|i#  | | *-+-------------------------|e*                      |c* | |  #i|
-|n#  | | *                         |r|r*                      |r* | |  #n|
+|n#  | | *s|                         |r*                      |r* | |  #n|
-|d#  | | *                         |o|t*                      |o* | |  #d|
+|d#  | | *c|                         |t*                      |o* | |  #d|
-|o#  | | *        text area        |l|.*      text area       |l* | |  #o|
+|o#  | | *r|                         |.*      text area       |l* | |  #o|
-|w#  | |i*                         |l| *                      |l*i| |  #w|
+|w#  |i| *o|                         | *                      |l* |i|  #w|
-|-#  | |n*                         |b|d*                      |b*n| |  #-|
+|-#  |n| *l|        text area        |d*                      |b* |n|  #-|
-|m#  | |t*                         |a|i*                      |a*t| |  #m|
+|m#  |t| *l|                         |i*                      |a* |t|  #m|
-|a#  | |.*                         |r|v*                      |r*.| |  #a|
+|a#  |e| *b|                         |v*                      |r* |e|  #a|
-|n# t| | *-------------------------+-|i*----------------------+-* | |t #n|
+|n# t|r| *a|                         |i*----------------------+-* |r|t #n|
-|a# o|g|b*        scrollbar        | |d*      scrollbar       | *b|g|o #a|
+|a# o|n|g*r|                         |d*      scrollbar       |@*g|n|o #a|
-|g# o|u|o*-------------------------+-|e*----------------------+-*o|u|o #g|
+|g# o|a|u*-+-------------------------|e*----------------------+-*u|a|o #g|
-|e# l|t|r*        modeline           |r*      modeline          *r|t|l #e|
+|e# l|l|t*        modeline           |r*      modeline          *t|l|l #e|
-|r# b|t|d********************************************************d|t|b #r|
+|r# b| |t********************************************************t| |b #r|
-| # a|e|e*   =..texttexttex....=   |s|v*                      |s*e|e|a # |
+| # a|b|e*   =..texttexttex....=   |s|v*                      |s*e|b|a # |
-|d# r|r|r*o m=..texttexttextt..=o m|c|e*                      |c*r|r|r #d|
+|d# r|o|r*o m=..texttexttextt..=o m|c|e*                      |c*r|o|r #d|
-|e#  | | *u a=.exttexttextte...=u a|r|r*                      |r* | |  #e|
+|e#  |r| *u a=.exttexttextte...=u a|r|r*                      |r* |r|  #e|
-|c#  | |w*t r=....texttexttex..=t r|o|t*                      |o*w| |  #c|
+|c#  |d| *t r=....texttexttex..=t r|o|t*                      |o* |d|  #c|
-|o#  | |i*s g=        etc.     =s g|l|.*      text area       |l*i| |  #o|
+|o#  |e| *s g=        etc.     =s g|l|.*      text area       |l* |e|  #o|
-|r#  | |d*i i=                 =i i|l| *                      |l*d| |  #r|
+|r#  |r| *i i=                 =i i|l| *                      |l* |r|  #r|
-|a#  | |t*d n=                 =d n|b|d*                      |b*t| |  #a|
+|a#  | | *d n=                 =d n|b|d*                      |b* | |  #a|
-|t#  | |h*e  = inner text area =e  |a|i*                      |a*h| |  #t|
+|t#  | | *e  = inner text area =e  |a|i*                      |a* | |  #t|
 |i#  | | *   =                 =   |r|v*                      |r* | |  #i|
 |o#  | | *---===================---+-|i*----------------------+-* | |  #o|
-|n#  | | *        scrollbar        | |d*      scrollbar       | * | |  #n|
+|n#  | | *        scrollbar        |@|d*      scrollbar       |@* | |  #n|
 | #  | | *-------------------------+-|e*----------------------+-* | |  # |
 | #  | | *        modeline           |r*      modeline          * | |  # |
-| #  | | ******************************************************** | |  # |
+| #  | |-********************************************************-| |  # |
-| #  | | *                        minibuffer                    * | |  # |
+| #  | |                           gutter                         | |  # |
-| #  | | ******************************************************** | |  # |
+| #  | |-********************************************************-| |  # |
-| #  | |                   internal border width                  | |  # |
+| #  | |@*                       minibuffer                     *@| |  # |
-| #  |--------------------------------------------------------------|  # |
+| #  | +-********************************************************-+ |  # |
-| #  |                             gutter                           |  # |
+| #  |                         internal border                      |  # |
 | #--------------------------------------------------------------------# |
 | #                                toolbar                             # |
 | ###################################################################### |
 |                          window manager decoration                     |
 +------------------------------------------------------------------------+
 # = boundary of client area; * = window boundaries, boundary of paned area
-= = boundary of inner text area; . = inside margin area
+= = boundary of inner text area; . = inside margin area; @ = dead boxes
 @end example
-Note in particular what happens at the corners, where a "corner box"
+Note in particular what happens at the corners, where a ``corner box''
 occurs.  Top and bottom toolbars take precedence over left and right
 toolbars, extending out horizontally into the corner boxes.  Gutters
 work the same way.  The corner box where the scrollbars meet, however,
-is assigned to neither scrollbar, and is known as the "dead box"; it is
+is assigned to neither scrollbar, and is known as the ``dead box''; it is
-an area that must be cleared specially.
+an area that must be cleared specially.  There are similar dead boxes at
+the bottom-right and bottom-left corners where the minibuffer and
+left/right gutters meet, but there is currently a bug in that these dead
+boxes are not explicitly cleared and may contain junk.
 @node The Frame, The Non-Client Area, Intro to Window and Frame Geometry, Window and Frame Geometry
 @section The Frame
-The "top-level window area" is the entire area of a top-level window (or
+The ``top-level window area'' is the entire area of a top-level window (or
-"frame").  The "client area" (a term from MS Windows) is the area of a
+``frame'').  The ``client area'' (a term from MS Windows) is the area of a
 top-level window that XEmacs draws into and manages with redisplay.
 This includes the toolbar, scrollbars, gutters, dividers, text area,
 modeline and minibuffer.  It does not include the menubar, title or
-outer borders.  The "non-client area" is the area of a top-level window
+outer borders.  The ``non-client area'' is the area of a top-level window
 outside of the client area and includes the menubar, title and outer
 borders.  Internally, all frame coordinates are relative to the client
 area.
 @item
 The outer layer is the window-manager decorations: The title and
 borders.  These are controlled by the window manager, a separate process
 that controls the desktop, the location of icons, etc.  When a process
 tries to create a window, the window manager intercepts this action and
-"reparents" the window, placing another window around it which contains
+``reparents'' the window, placing another window around it which contains
 the window decorations, including the title bar, outer borders used for
 resizing, etc.  The window manager also implements any actions involving
 the decorations, such as the ability to resize a window by dragging its
 borders, move a window by dragging its title bar, etc.  If there is no
 window manager or you kill it, windows will have no decorations (and
 will lose them if they previously had any) and you will not be able to
 move or resize them.
 @item
-Inside of the window-manager decorations is the "shell", which is
+Inside of the window-manager decorations is the ``shell'', which is
 managed by the toolkit and widget libraries your program is linked with.
 The code in @file{*-x.c} uses the Xt toolkit and various possible widget
-libraries built on top of Xt, such as Motif, Athena, the "Lucid"
+libraries built on top of Xt, such as Motif, Athena, the ``Lucid''
 widgets, etc.  Another possibility is GTK (@file{*-gtk.c}), which implements
-both the toolkit and widgets.  Under Xt, the "shell" window is an
+both the toolkit and widgets.  Under Xt, the ``shell'' window is an
 EmacsShell widget, containing an EmacsManager widget of the same size,
 which in turn contains a menubar widget and an EmacsFrame widget, inside
 of which is the client area. (The division into EmacsShell and
 EmacsManager is due to the complex and screwy geometry-management system
 in Xt [and X more generally].  The EmacsShell handles negotiation with
 Under Windows, the non-client area is managed by the window system.
 There is no division such as under X.  Part of the window-system API
 (@file{USER.DLL}) of Win32 includes functions to control the menubars, title,
 etc. and implements the move and resize behavior.  There @strong{is} an
-equivalent of the window manager, called the "shell", but it manages
+equivalent of the window manager, called the ``shell'', but it manages
 only the desktop, not the windows themselves.  The normal shell under
 Windows is @file{EXPLORER.EXE}; if you kill this, you will lose the bar
-containing the "Start" menu and tray and such, but the windows
+containing the ``Start'' menu and tray and such, but the windows
 themselves will not be affected or lose their decorations.
 @node The Client Area, The Paned Area, The Non-Client Area, Window and Frame Geometry
 @section The Client Area
 Inside of the client area are the toolbars, the gutters (where the buffer
 tabs are displayed), the minibuffer, the internal border width, and one
-or more non-overlapping "windows" (this is old Emacs terminology, from
+or more non-overlapping ``windows'' (this is old Emacs terminology, from
 before the time when frames existed at all; the standard terminology for
-this would be "pane").  Each window can contain a modeline, horizontal
+this would be ``pane'').  Each window can contain a modeline, horizontal
 and/or vertical scrollbars, and (for non-rightmost windows) a vertical
 divider, surrounding a text area.
 The dimensions of the toolbars and gutters are determined by the formula
-(THICKNESS + 2 * BORDER-THICKNESS), where "thickness" is a cover term
+(THICKNESS + 2 * BORDER-THICKNESS), where ``thickness'' is a cover term
 for height or width, as appropriate.  The height and width come from
 @code{default-toolbar-height} and @code{default-toolbar-width} and the specific
 versions of these (@code{top-toolbar-height}, @code{left-toolbar-width}, etc.).
 The border thickness comes from @code{default-toolbar-border-height} and
 @code{default-toolbar-border-width}, and the specific versions of these.  The
 @node The Paned Area, Text Areas, The Client Area, Window and Frame Geometry
 @section The Paned Area
-The area occupied by the "windows" is called the paned area.  Note that
+The area occupied by the ``windows'' is called the paned area.
-this includes the minibuffer, which is just another window but is
+Unfortunately, because of the presence of the gutter @strong{between} the
-special-cased in XEmacs.  Each window can include a horizontal and/or
+minibuffer and other windows, the bottom of the paned area is not
-vertical scrollbar, a modeline and a vertical divider to its right, as
+well-defined -- does it include the minibuffer (in which case it also
-well as the text area.  Only non-rightmost windows can include a
+includes the bottom gutter, but none others) or does it not include
-vertical divider. (The minibuffer normally does not include either
+the minibuffer? (In which case not all windows are included.) It would
-modeline or scrollbars.)
+be cleaner to put the bottom gutter @strong{below} the minibuffer instead of
+above it.
+Each window can include a horizontal and/or vertical scrollbar, a
+modeline and a vertical divider to its right, as well as the text area.
+Only non-rightmost windows can include a vertical divider. (The
+minibuffer normally does not include either modeline or scrollbars.)
 Note that, because the toolbars and gutters are controlled by
 specifiers, and specifiers can have window-specific and buffer-specific
 values, the size of the paned area can change depending on which window
 is selected: In other words, if the selected window or buffer changes,
 @code{horizontal-scrollbar-visible-p}, @code{vertical-scrollbar-visible-p},
 @code{vertical-divider-always-visible-p}, etc.
 In addition, it is possible to set margins in the text area using the
 specifiers @code{left-margin-width} and @code{right-margin-width}.  When this is
-done, only the "inner text area" (the area inside of the margins) will
+done, only the ``inner text area'' (the area inside of the margins) will
 be used for normal display of text; the margins will be used for glyphs
 with a layout policy of @code{outside-margin} (as set on an extent containing
 the glyph by @code{set-extent-begin-glyph-layout} or
 @code{set-extent-end-glyph-layout}).  However, the calculation of the text
 area size (e.g. in the function @code{window-text-area-width}) includes the
 margins.  Which margin is used depends on whether a glyph has been set
 as the begin-glyph or end-glyph of an extent (@code{set-extent-begin-glyph}
 etc.), using the left and right margins, respectively.
 Technically, the margins outside of the inner text area are known as the
-"outside margins".  The "inside margins" are in the inner text area and
+``outside margins''.  The ``inside margins'' are in the inner text area and
 constitute the whitespace between the outside margins and the first or
 last non-whitespace character in a line; their width can vary from line
 to line.  Glyphs will be placed in the inside margin if their layout
 policy is @code{inside-margin} or @code{whitespace}, with @code{whitespace} glyphs on
 the inside and @code{inside-margin} glyphs on the outside.  Inside-margin
 @node The Displayable Area, Which Functions Use Which?, Text Areas, Window and Frame Geometry
 @section The Displayable Area
-The "displayable area" is not so much an actual area as a convenient
+The ``displayable area'' is not so much an actual area as a convenient
 fiction.  It is the area used to convert between pixel and character
 dimensions for frames.  The character dimensions for a frame (e.g. as
 returned by @code{frame-width} and @code{frame-height} and set by
 @code{set-frame-width} and @code{set-frame-height}) are determined from the
 displayable area by dividing by the pixel size of the default font as
-instantiated in the frame. (For proportional fonts, the "average" width
+instantiated in the frame. (For proportional fonts, the ``average'' width
 is used.  Under Windows, this is a built-in property of the fonts.
 Under X, this is based on the width of the lowercase 'n', or if this is
 zero then the width of the default character. [We prefer 'n' to the
 specified default character because many X fonts have a default
 character with a zero or otherwise non-representative width.])
-The displayable area is essentially the "theoretical" paned area of the
+The displayable area is essentially the ``theoretical'' gutter area of the
-frame excluding the rightmost and bottom-most scrollbars.  In this
+frame, excluding the rightmost and bottom-most scrollbars.  That is, it
-context, "theoretical" means that all calculations on based on
+starts from the client (or ``total'') area and then excludes the
-frame-level values for toolbar, gutter and scrollbar thicknesses.
+``theoretical'' toolbars and bottom-most/rightmost scrollbars, and the
-Because these thicknesses are controlled by specifiers, and specifiers
+internal border width.  In this context, ``theoretical'' means that all
-can have window-specific and buffer-specific values, these calculations
+calculations are based on frame-level values for toolbar and scrollbar
-may or may not reflect the actual size of the paned area or of the
+thicknesses.  Because these thicknesses are controlled by specifiers,
-scrollbars when any particular window is selected.  Note also that the
+and specifiers can have window-specific and buffer-specific values,
-"displayable area" may not even be contiguous!  In particular, if the
+these calculations may or may not reflect the actual size of the paned
-frame-level value of the horizontal scrollbar height is non-zero, then
+area or of the scrollbars when any particular window is selected.  Note
-the displayable area includes the paned area above and below the bottom
+also that the ``displayable area'' may not even be contiguous!  In
-horizontal scrollbar but not the scrollbar itself.
+particular, the gutters are included, but the bottom-most and rightmost
+scrollbars are excluded even though they are inside of the gutters.
+Furthermore, if the frame-level value of the horizontal scrollbar height
+is non-zero, then the displayable area includes the paned area above and
+below the bottom horizontal scrollbar (i.e. the modeline and minibuffer)
+but not the scrollbar itself.
 As a further twist, the character-dimension calculations are adjusted so
 that the truncation and continuation glyphs (see @code{truncation-glyph} and
 @code{continuation-glyph}) count as a single character even if they are wider
 than the default font width. (Technically, the character width is
 width before dividing by the default-font width, and then adding 1 to
 the result.) (The ultimate motivation for this kludge as well as the
 subtraction of the scrollbars, but not the minibuffer or bottom-most
 modeline, is to maintain compatibility with TTY's.)
-Despite all these concerns and kludges, however, the "displayable area"
+Despite all these concerns and kludges, however, the ``displayable area''
 concept works well in practice and mostly ensures that by default the
 frame will actually fit 79 characters + continuation/truncation glyph.
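
The pixel-to-character conversion described in this section can be sketched in C as below. This is only an illustration, not the XEmacs implementation: the structure, its field names, and the function names are all hypothetical. Two further points are assumptions: that the internal border applies on both sides, and that the width subtracted before the division is the truncation/continuation glyph width (the passage above is cut off at that point).

```c
#include <assert.h>

/* Hypothetical frame-level metrics, in pixels.  None of these names
   come from the XEmacs sources. */
struct frame_metrics
{
  int client_width;          /* width of the client ("total") area */
  int left_toolbar_extent;   /* 0 when no left toolbar is shown */
  int right_toolbar_extent;  /* 0 when no right toolbar is shown */
  int scrollbar_width;       /* frame-level vertical scrollbar width */
  int internal_border_width; /* ASSUMPTION: applied on both sides */
  int font_width;            /* default-font (or "average") width */
  int glyph_width;           /* truncation/continuation glyph width */
};

/* Toolbar extent as given in the text:
   THICKNESS + 2 * BORDER-THICKNESS. */
int
toolbar_extent (int thickness, int border_thickness)
{
  return thickness + 2 * border_thickness;
}

/* Client width minus the "theoretical" toolbars, the rightmost
   scrollbar and the internal border. */
int
displayable_pixel_width (const struct frame_metrics *m)
{
  return m->client_width
    - m->left_toolbar_extent - m->right_toolbar_extent
    - m->scrollbar_width
    - 2 * m->internal_border_width;
}

/* ASSUMPTION: the glyph width is removed first, the remainder is
   divided by the font width, and 1 is added so that the glyph counts
   as exactly one character. */
int
frame_char_width (const struct frame_metrics *m)
{
  int usable;

  assert (m->font_width > 0);
  usable = displayable_pixel_width (m) - m->glyph_width;
  if (usable < 0)
    usable = 0;
  return usable / m->font_width + 1;
}
```

For example, take a client width of 857, a left toolbar extent of 32 (thickness 28, border 2), a 15-pixel scrollbar, a 2-pixel internal border, a 10-pixel font and a 16-pixel glyph. The displayable width is then 806 pixels and the frame width is 80 columns, that is, 79 characters plus the glyph.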
 @node Which Functions Use Which?,  , The Displayable Area, Window and Frame Geometry
 @section Event Queues
 @cindex event queues
 @cindex queues, event
 There are two event queues here -- the command event queue (#### which
-should be called "deferred event queue" and is in my glyph ws) and the
+should be called ``deferred event queue'' and is in my glyph ws) and the
 dispatch event queue. (MS Windows actually has an extra dispatch queue
 for non-user events and uses the generic one only for user events.  This
 is because user and non-user events in Windows come through the same
 place -- the window procedure -- but under X, it's possible to
 selectively process events such that we take all the user events before
 @item handle_magic_event_cb
 XEmacs calls this with an event structure which contains window-system
 dependent information that XEmacs doesn't need to know about, but which
 must happen in order.  If the @code{next_event_cb} never returns an
-event of type "magic", this will never be used.
+event of type ``magic'', this will never be used.
 @item format_magic_event_cb
 Called with a magic event; print a representation of the innards of the
 event to @var{PSTREAM}.
 @item select_process_cb
 @item unselect_process_cb
 These callbacks tell the underlying implementation to add or remove a
 file descriptor from the list of fds which are polled for
 inferior-process input.  When input becomes available on the given
-process connection, an event of type "process" should be generated.
+process connection, an event of type ``process'' should be generated.
 @item select_console_cb
 @item unselect_console_cb
 These callbacks tell the underlying implementation to add or remove a
 console from the list of consoles which are polled for user-input.
 @cindex focus handling
 Ben's capsule lecture on focus:
 In GNU Emacs @code{select-frame} never changes the window-manager frame
-focus.  All it does is change the "selected frame".  This is similar to
+focus.  All it does is change the ``selected frame''.  This is similar to
 what happens when we call @code{select-device} or @code{select-console}.
 Whenever an event comes in (including a keyboard event), its frame is
 selected; therefore, evaluating @code{select-frame} in @samp{*scratch*}
 won't cause any effects because the next received event (in the same
 frame) will cause a switch back to the frame displaying
 minibuffer, you essentially want to temporarily switch the WM focus to
 the frame with the minibuffer, and switch it back when you exit the
 minibuffer.
 GNU Emacs solves this with the crockish @code{redirect-frame-focus},
-which says "for keyboard events received from FRAME, act like they're
+which says ``for keyboard events received from FRAME, act like they're
-coming from FOCUS-FRAME".  I think what this means is that, when a
+coming from FOCUS-FRAME''.  I think what this means is that, when a
 keyboard event comes in and the event manager is about to select the
 event's frame, if that frame has its focus redirected, the redirected-to
 frame is selected instead.  That way, if you're in a minibufferless
 frame and enter the minibuffer, then all Lisp functions that run see the
 selected frame as the minibuffer's frame rather than the minibufferless
 There's also some weird logic that switches the redirected frame focus
 from one frame to another if Lisp code explicitly calls
 @code{select-frame} (but not if @code{handle-switch-frame} is called),
 and saves and restores the frame focus in window configurations,
 etc. etc.  All of this logic is heavily @code{#if 0}'d, with lots of
-comments saying "No, this approach doesn't seem to work, so I'm trying
+comments saying ``No, this approach doesn't seem to work, so I'm trying
-this ...  is it reasonable?  Well, I'm not sure ..." that are a red flag
+this ...  is it reasonable?  Well, I'm not sure ...'' that are a red flag
 indicating crockishness.
 Because of our way of doing things, we can avoid all this crock.
 Keyboard events never cause a select-frame (who cares what frame they're
 associated with?  They come from a console, only).  We change the actual
 return value should be an alist consisting of a list of all of the
 defined subtypes for that coding system type along with a level of
 likelihood and a list of additional properties indicating certain
 features detected in the data. The extra properties returned are
 defined entirely by the particular coding system type and are used
-only in the algorithm described below under "user control." However,
+only in the algorithm described below under ``user control.'' However,
 the levels of likelihood have a standard meaning as follows:
-Level 4 means "near certainty" and typically indicates that a
+Level 4 means ``near certainty'' and typically indicates that a
 signature has been detected, usually at the beginning of the data,
 indicating that the data is encoded in this particular coding system
 type. An example of this would be the byte order mark at the beginning
 of UCS2 encoded data or the GZIP mark at the beginning of GZIP data.
-Level 3 means "highly likely" and indicates that tell-tale signs have
+Level 3 means ``highly likely'' and indicates that tell-tale signs have
 been discovered in the data that are characteristic of this particular
 coding system type. Examples of this might be ISO 2022 escape
 sequences or the current Unicode end of line markers at regular
 intervals.
-Level 2 means "strongly statistically likely" indicating that
+Level 2 means ``strongly statistically likely'' indicating that
 statistical analysis concludes that there's a high chance that this
 data is encoded according to this particular type. For example, this
 might mean that for UCS2 data, there is a high proportion of null bytes
 or other repeated bytes in the odd-numbered bytes of the data and a
 high variance in the even-numbered bytes of the data. For Shift-JIS,
 this might indicate that there were no illegal Shift-JIS sequences
 and a fairly high occurrence of common Shift-JIS characters.
-Level 1 means "weak statistical likelihood" meaning that there is some
+Level 1 means ``weak statistical likelihood'' meaning that there is some
 indication that the data is encoded in this coding system type. In
 fact, there is a reasonable chance that it may be some other type as
 well. This means, for example, that no illegal sequences were
 encountered and at least some data was encountered that is purposely
 not in other coding system types. For Shift-JIS data, this might mean
 that some bytes in the range 128 to 159 were encountered in the data.
-Level 0 means "neutral" which is to say that there's either not enough
+Level 0 means ``neutral'' which is to say that there's either not enough
 data to make any decision or that the data could well be interpreted
 as this type (meaning no illegal sequences), but there is little or no
 indication of anything particular to this particular type.
-Level -1 means "weakly unlikely" meaning that some data was
+Level -1 means ``weakly unlikely'' meaning that some data was
 encountered that could conceivably be part of the coding system type
 but is probably not. For example, successively long line-lengths or
 very rarely-encountered sequences.
-Level -2 means "strongly unlikely" meaning that typically a number
+Level -2 means ``strongly unlikely'' meaning that typically a number
 of illegal sequences were encountered.
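
The seven levels above map naturally onto a small enumeration. The following is a sketch only: the identifier names are invented for illustration and are not taken from the XEmacs sources. Only the numeric values and the quoted descriptions come from the text.

```c
/* Hypothetical names; only the numeric values and the descriptions
   come from the text above. */
enum detection_likelihood
{
  LIKELIHOOD_STRONGLY_UNLIKELY  = -2,
  LIKELIHOOD_WEAKLY_UNLIKELY    = -1,
  LIKELIHOOD_NEUTRAL            =  0,
  LIKELIHOOD_WEAK_STATISTICAL   =  1,
  LIKELIHOOD_STRONG_STATISTICAL =  2,
  LIKELIHOOD_HIGHLY_LIKELY      =  3,
  LIKELIHOOD_NEAR_CERTAINTY     =  4
};

/* Return the description used in the text for a given level. */
const char *
likelihood_description (int level)
{
  switch (level)
    {
    case LIKELIHOOD_NEAR_CERTAINTY:     return "near certainty";
    case LIKELIHOOD_HIGHLY_LIKELY:      return "highly likely";
    case LIKELIHOOD_STRONG_STATISTICAL: return "strongly statistically likely";
    case LIKELIHOOD_WEAK_STATISTICAL:   return "weak statistical likelihood";
    case LIKELIHOOD_NEUTRAL:            return "neutral";
    case LIKELIHOOD_WEAKLY_UNLIKELY:    return "weakly unlikely";
    case LIKELIHOOD_STRONGLY_UNLIKELY:  return "strongly unlikely";
    default:                            return "invalid level";
    }
}
```

Keeping the levels as ordered integers means that "at least highly likely" is a plain comparison against 3.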
 The algorithm to determine when to stop and indicate that the data has
 been detected as a particular coding system uses a priority list,
 which is typically specified as part of the language environment
 Japanese-language environment particular subtypes of ISO 2022 will be
 associated with the Japanese coding system version of those
 subtypes). It is perfectly legal and quite common in fact, to list the
 same subtype more than once in the priority list with successively
 lower requirements. Other facts that can be listed in the priority
-list for a subtype are "reject", meaning that the data should never be
+list for a subtype are ``reject'', meaning that the data should never be
-detected as this subtype, or "ask", meaning that if the data is
+detected as this subtype, or ``ask'', meaning that if the data is
 detected to be this subtype, the user will be asked whether they
 actually mean this. This latter property could be used, for example,
 towards the bottom of the priority list.
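
One possible reading of the priority-list walk is sketched below. All type and function names are hypothetical. The matching rule is an assumption: the first entry whose detected likelihood meets its required level wins. The treatment of ``reject'' is also an assumption: a reject entry anywhere in the list excludes that subtype entirely. The text does not spell out either rule in this form.

```c
#include <stddef.h>

/* Hypothetical representation of one priority-list entry. */
enum entry_action { ENTRY_ACCEPT, ENTRY_ASK, ENTRY_REJECT };

struct priority_entry
{
  int subtype;              /* index into the per-subtype levels */
  int min_level;            /* required likelihood, -2 .. 4 */
  enum entry_action action;
};

struct detection_result
{
  int found;                /* nonzero if some entry matched */
  int subtype;              /* matching subtype, or -1 */
  int ask;                  /* nonzero if the user should confirm */
};

/* ASSUMPTION: a "reject" entry anywhere in the list excludes the
   subtype, regardless of its position. */
int
subtype_rejected (const struct priority_entry *list, size_t n, int subtype)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (list[i].subtype == subtype && list[i].action == ENTRY_REJECT)
      return 1;
  return 0;
}

/* Walk the list in order; the first non-rejected entry whose detected
   level reaches its requirement wins.  The same subtype may appear
   several times with successively lower requirements. */
struct detection_result
pick_subtype (const struct priority_entry *list, size_t n, const int *levels)
{
  struct detection_result r = { 0, -1, 0 };
  size_t i;

  for (i = 0; i < n; i++)
    {
      const struct priority_entry *e = &list[i];

      if (subtype_rejected (list, n, e->subtype))
        continue;
      if (levels[e->subtype] >= e->min_level)
        {
          r.found = 1;
          r.subtype = e->subtype;
          r.ask = (e->action == ENTRY_ASK);
          return r;
        }
    }
  return r;
}
```

Listing subtype 0 first with a requirement of 3 and again near the bottom with a requirement of 1 lets a strong match win immediately. A weak match is still accepted, but only after every higher-priority candidate has been tried.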
 In addition there is a global variable which specifies the minimum
 system, the subtype, the coding system and the associated level of
 likelihood will be prominently displayed either in the echo area or in
 a status box somewhere.
 If no positive match is found according to the priority list, or if
-the matches that are found have the "ask" property on them, then the
+the matches that are found have the ``ask'' property on them, then the
 user will be presented with a list of choices of possible encodings
 and asked to choose one. This list is typically sorted first by level
 of likelihood, and then within this, by the order in which the
 subtypes appear in the priority list. This list is displayed in a
 special kind of dialog box or other buffer allowing the user, in
 will be in the form of errors or warnings of various levels, some of
 which may be severe enough to stop the decoding entirely, and some of
 which may either indicate definitely malformed data but from which
 it's possible to recover, or simply data that appears rather
 questionable. If any of these status values are reported during
-decoding, the user will be informed of this and asked "are you sure?"
+decoding, the user will be informed of this and asked ``are you sure?''
-As part of the "are you sure" dialog box or question, the user can
+As part of the ``are you sure'' dialog box or question, the user can
 display the results of the decoding to make sure it's correct. If the
-user says "no, they're not sure," then the same list of choices as
+user says ``no, they're not sure,'' then the same list of choices as
 previously mentioned will be presented.
 @subheading RFC: Autodetection
 Also appeared under heading "Implementation of Coding System Priority
 @enumerate
 @item
 Hopefully a system general enough to handle (2)--(4) will
 handle these, too, but we should watch out for gotchas like
-Unicode "plane 14" tags which (I think _both_ Ben and Olivier
+Unicode ``plane 14'' tags which (I think _both_ Ben and Olivier
 will agree) have no place in the internal representation, and
 thus must be treated as out-of-band control sequences.  I
 don't know if all such gotchas will be as easy to dispose of.
 @item
 sly, it can't be perfect if any autodecoding is done;
 like Hrvoje should have an easily available option to
 to this default (or an optimized approximation which
 t actually read the whole file into a buffer) or simply
-y everything as binary (with the "font" for binary files
+y everything as binary (with the ``font'' for binary files
 a user option).
 @item
 This implies that we should be detecting conditions in the
 tail of the file which violate the implicit assumptions of the
 Date: 11/1/1999 7:24 AM
 Stephen, thank you very much for writing this up.  I think it is a good start,
 and definitely moving in the direction I would like to see things going: more
-proposals, less arguing. (aka "more light, less heat") However, I have some
+proposals, less arguing. (aka ``more light, less heat'') However, I have some
 suggestions for cleaning this up:
 You should try to make it more layered.  For example, you might have one
 section devoted to the workings of autodetection, which starts out like this
 (the section numbers below are totally arbitrary):
