changeset 868:48eed784e93a

[xemacs-hg @ 2002-06-05 12:00:40 by ben] To: xemacs-patches@xemacs.org internals/internals.texi:
author ben
date Wed, 05 Jun 2002 12:01:11 +0000
parents 804517e16990
children a07667553efc
files man/ChangeLog man/internals/internals.texi src/ChangeLog src/README.global-renaming src/README.integral-types
diffstat 5 files changed, 565 insertions(+), 352 deletions(-)
--- a/man/ChangeLog	Wed Jun 05 09:58:45 2002 +0000
+++ b/man/ChangeLog	Wed Jun 05 12:01:11 2002 +0000
@@ -1,3 +1,13 @@
+2002-06-05  Ben Wing  <ben@xemacs.org>
+
+	* internals/internals.texi (Top):
+	* internals/internals.texi (The XEmacs Object System (Abstractly Speaking)):
+	* internals/internals.texi (How Lisp Objects Are Represented in C):
+	* internals/internals.texi (Major Textual Changes):
+	* internals/internals.texi (Great Integral Type Renaming):
+	* internals/internals.texi (Text/Char Type Renaming):
+	* internals/internals.texi (files): New.
+
 2002-05-04  Stephen J. Turnbull  <stephen@xemacs.org>
 
 	* custom.texi (The Init File): Rewrite completely.
--- a/man/internals/internals.texi	Wed Jun 05 09:58:45 2002 +0000
+++ b/man/internals/internals.texi	Wed Jun 05 12:01:11 2002 +0000
@@ -116,6 +116,7 @@
 * XEmacs From the Inside::
 * The XEmacs Object System (Abstractly Speaking)::
 * How Lisp Objects Are Represented in C::
+* Major Textual Changes::
 * Rules When Writing New C Code::
 * CVS Techniques::
 * A Summary of the Various XEmacs Modules::
@@ -1759,7 +1760,7 @@
 nor do most complex objects, which contain too much state to be easily
 initialized through a read syntax.
 
-@node How Lisp Objects Are Represented in C, Rules When Writing New C Code, The XEmacs Object System (Abstractly Speaking), Top
+@node How Lisp Objects Are Represented in C, Major Textual Changes, The XEmacs Object System (Abstractly Speaking), Top
 @chapter How Lisp Objects Are Represented in C
 @cindex Lisp objects are represented in C, how
 @cindex objects are represented in C, how Lisp
@@ -1846,7 +1847,335 @@
 nothing unless the corresponding configure error checking flag was
 specified.
 
-@node Rules When Writing New C Code, CVS Techniques, How Lisp Objects Are Represented in C, Top
+@node Major Textual Changes, Rules When Writing New C Code, How Lisp Objects Are Represented in C, Top
+@chapter Major Textual Changes
+@cindex textual changes, major
+@cindex major textual changes
+
+Sometimes major textual changes are made to the source.  This means that
+a search-and-replace is done to change type names and such.  Some people
+disagree with such changes, and if done without good reason they will
+just lead to headaches.  But it's important to keep the code clean and
+understandable, and consistent naming goes a long way towards this.
+
+An example of the right way to do this was the so-called "great integral
+type renaming".
+
+@menu
+* Great Integral Type Renaming::
+* Text/Char Type Renaming::
+@end menu
+
+@node Great Integral Type Renaming
+@section Great Integral Type Renaming
+@cindex Great Integral Type Renaming
+@cindex integral type renaming, great
+@cindex type renaming, integral
+@cindex renaming, integral types
+
+The purpose of this is to rationalize the names used for various
+integral types, so that they match their intended uses and follow
+consistent conventions, and to eliminate types that were not
+semantically different from each other.
+
+The conventions are:
+
+@itemize @bullet
+@item
+All integral types that measure quantities of anything are signed.  Some
+people disagree vociferously with this, but their arguments are mostly
+theoretical, and are vastly outweighed by the practical headaches of
+mixing signed and unsigned values, and more importantly by the greatly
+increased likelihood of inadvertent bugs: Because of the broken "viral"
+nature of unsigned quantities in C (operations involving mixed
+signed/unsigned are done unsigned, when exactly the opposite is nearly
+always wanted), even a single error in declaring a quantity unsigned
+that should be signed, or the still more subtle error of comparing
+signed and unsigned values and forgetting the necessary cast, can be
+catastrophic, as comparisons will yield wrong results.  -Wsign-compare
+is turned on specifically to catch this, but this tends to result in a
+great number of warnings when mixing signed and unsigned, and the casts
+are annoying.  More has been written on this elsewhere.
+
+@item
+All such quantity types just mentioned boil down to EMACS_INT, which is
+32 bits on 32-bit machines and 64 bits on 64-bit machines.  This is
+guaranteed to be the same size as Lisp objects of type `int', and (as
+far as I can tell) of size_t (unsigned!) and ssize_t.  The only type
+below that is not an EMACS_INT is Hashcode, which is an unsigned value
+of the same size as EMACS_INT.
+
+@item
+Type names should be relatively short (no more than 10 characters or
+so), with the first letter capitalized and no underscores if they can at
+all be avoided.
+
+@item
+"count" == a zero-based measurement of some quantity.  Includes sizes,
+offsets, and indexes.
+
+@item
+"bpos" == a one-based measurement of a position in a buffer.  "Charbpos"
+and "Bytebpos" count text in the buffer, rather than bytes in memory;
+thus Bytebpos does not directly correspond to the memory representation.
+Use "Membpos" for this.
+
+@item
+"Char" refers to internal-format characters, not to the C type "char",
+which is really a byte.
+@end itemize
+
+For the actual name changes, see the script below.
+
+I ran the following script to do the conversion. (NOTE: This script is
+idempotent.  You can safely run it multiple times and it will not screw
+up previous results -- in fact, it will do nothing if nothing has
+changed.  Thus, it can be run repeatedly as necessary to handle patches
+coming in from old workspaces, or old branches.)  There are two tags,
+just before and just after the change: @samp{pre-integral-type-rename}
+and @samp{post-integral-type-rename}.  When merging code from the main
+trunk into a branch, the best thing to do is first merge up to
+@samp{pre-integral-type-rename}, then apply the script and associated
+changes, then merge from @samp{post-integral-type-rename} to the
+present. (Alternatively, just do the merging in one operation; but you
+may then have a lot of conflicts needing to be resolved by hand.)
+
+Script @samp{fixtypes.sh} follows:
+
+@example
+----------------------------------- cut ------------------------------------
+files="*.[ch] s/*.h m/*.h config.h.in ../configure.in Makefile.in.in ../lib-src/*.[ch] ../lwlib/*.[ch]"
+gr Memory_Count Bytecount $files
+gr Lstream_Data_Count Bytecount $files
+gr Element_Count Elemcount $files
+gr Hash_Code Hashcode $files
+gr extcount bytecount $files
+gr bufpos charbpos $files
+gr bytind bytebpos $files
+gr memind membpos $files
+gr bufbyte intbyte $files
+gr Extcount Bytecount $files
+gr Bufpos Charbpos $files
+gr Bytind Bytebpos $files
+gr Memind Membpos $files
+gr Bufbyte Intbyte $files
+gr EXTCOUNT BYTECOUNT $files
+gr BUFPOS CHARBPOS $files
+gr BYTIND BYTEBPOS $files
+gr MEMIND MEMBPOS $files
+gr BUFBYTE INTBYTE $files
+gr MEMORY_COUNT BYTECOUNT $files
+gr LSTREAM_DATA_COUNT BYTECOUNT $files
+gr ELEMENT_COUNT ELEMCOUNT $files
+gr HASH_CODE HASHCODE $files
+----------------------------------- cut ------------------------------------
+@end example
+
+The @samp{gr} script, and the scripts it uses, are documented in
+@file{README.global-renaming}, because if placed in this file they would
+need to have their @@ characters doubled, meaning you couldn't easily
+cut and paste from the source.
+
+In addition to those programs, I needed to fix up a few other
+things, particularly relating to the duplicate definitions of
+types, now that some types merged with others.  Specifically:
+
+@enumerate
+@item
+in lisp.h, removed duplicate declarations of Bytecount.  The changed
+code should now look like this: (In each code snippet below, the first
+and last lines are the same as the original, as are all lines outside of
+those lines.  That allows you to locate the section to be replaced, and
+replace the stuff in that section, verifying that there isn't anything
+new added that would need to be kept.)
+
+@example
+--------------------------------- snip -------------------------------------
+/* Counts of bytes or chars */
+typedef EMACS_INT Bytecount;
+typedef EMACS_INT Charcount;
+
+/* Counts of elements */
+typedef EMACS_INT Elemcount;
+
+/* Hash codes */
+typedef unsigned long Hashcode;
+
+/* ------------------------ dynamic arrays ------------------- */
+--------------------------------- snip -------------------------------------
+@end example
+
+@item 
+in lstream.h, removed duplicate declaration of Bytecount.  Rewrote the
+comment about this type.  The changed code should now look like this:
+
+@example
+--------------------------------- snip -------------------------------------
+#endif
+
+/* There have been some arguments over what the type should be that
+   specifies a count of bytes in a data block to be written out or read in,
+   using Lstream_read(), Lstream_write(), and related functions.
+   Originally it was long, which worked fine; Martin "corrected" these to
+   size_t and ssize_t on the grounds that this is theoretically cleaner and
+   is in keeping with the C standards.  Unfortunately, this practice is
+   horribly error-prone due to design flaws in the way that mixed
+   signed/unsigned arithmetic happens.  In fact, by doing this change,
+   Martin introduced a subtle but fatal error that caused the operation of
+   sending large mail messages to the SMTP server under Windows to fail.
+   By putting all values back to be signed, avoiding any signed/unsigned
+   mixing, the bug immediately went away.  The type then in use was
+   Lstream_Data_Count, so that it could be reverted cleanly if a vote came to
+   that.  Now it is Bytecount.
+
+   Some earlier comments about why the type must be signed: This MUST BE
+   SIGNED, since it also is used in functions that return the number of
+   bytes actually read or written in an operation, and these
+   functions can return -1 to signal error.
+
+   Note that the standard Unix read() and write() functions define the
+   count going in as a size_t, which is UNSIGNED, and the count going
+   out as an ssize_t, which is SIGNED.  This is a horrible design
+   flaw.  Not only is it highly likely to lead to logic errors when a
+   -1 gets interpreted as a large positive number, but operations are
+   bound to fail in all sorts of horrible ways when a number in the
+   upper-half of the size_t range is passed in -- this number is
+   unrepresentable as an ssize_t, so code that checks to see how many
+   bytes are actually written (which is mandatory if you are dealing
+   with certain types of devices) will get completely screwed up.
+
+   --ben
+*/
+
+typedef enum lstream_buffering
+--------------------------------- snip -------------------------------------
+@end example
+
+@item
+in dumper.c, there are four places, all inside of switch() statements,
+where XD_BYTECOUNT appears twice as a case tag.  In each case, the two
+case blocks contain identical code, and you should *REMOVE THE SECOND*
+and leave the first.
+@end enumerate
+
+@node Text/Char Type Renaming
+@section Text/Char Type Renaming
+@cindex Text/Char Type Renaming
+@cindex type renaming, text/char
+@cindex renaming, text/char types
+
+The purpose of this was
+
+@enumerate
+@item
+To distinguish between ``charptr'' when it refers to operations on
+the pointer itself and when it refers to operations on text
+@item
+To use consistent naming for everything referring to internal format, i.e.
+@end enumerate
+
+@example
+	Itext == text in internal format
+	Ibyte == a byte in such text
+	Ichar == a char as represented in internal character format
+@end example
+
+Thus e.g.
+
+@example
+	set_charptr_emchar -> set_itext_ichar
+@end example
+ 
+This was done using a script like this: 
+
+@example
+files="*.[ch] s/*.h m/*.h config.h.in ../configure.in Makefile.in.in ../lib-src/*.[ch] ../lwlib/*.[ch]"
+gr Intbyte Ibyte $files
+gr INTBYTE IBYTE $files
+gr intbyte ibyte $files
+gr EMCHAR ICHAR $files
+gr emchar ichar $files
+gr Emchar Ichar $files
+gr INC_CHARPTR INC_IBYTEPTR $files
+gr DEC_CHARPTR DEC_IBYTEPTR $files
+gr VALIDATE_CHARPTR VALIDATE_IBYTEPTR $files
+gr valid_charptr valid_ibyteptr $files
+gr CHARPTR ITEXT $files
+gr charptr itext $files
+gr Charptr Itext $files
+@end example
+
+See above for the source to @samp{gr}.
+
+As in the integral-types change, there are pre and post tags before and
+after the change:
+
+@example
+	pre-internal-format-textual-renaming
+	post-internal-format-textual-renaming
+@end example
+
+When merging a large branch, follow the same sort of procedure
+documented above, using these tags -- essentially sync up to the pre
+tag, then apply the script yourself, then sync from the post tag to the
+present.  You can probably do the same if you don't have a separate
+workspace, but do have lots of outstanding changes and you'd rather not
+just merge all the textual changes directly.  Use something like this:
+
+(WARNING: I'm not a CVS guru; before trying this, or any large operation
+that might potentially mess things up, *DEFINITELY* make a backup of
+your existing workspace.)
+
+@example
+cup -r pre-internal-format-textual-renaming
+<apply script>
+cup -A -j post-internal-format-textual-renaming -j HEAD
+@end example
+
+This might also work:
+
+@example
+cup -j pre-internal-format-textual-renaming
+<apply script>
+cup -j post-internal-format-textual-renaming -j HEAD
+@end example
+
+ben
+
+The following is a script to go in the opposite direction:
+
+@example
+files="*.[ch] s/*.h m/*.h config.h.in ../configure.in Makefile.in.in ../lib-src/*.[ch] ../lwlib/*.[ch]"
+
+# Evidently Perl considers _ to be a word char for the purposes of \b,
+# even though XEmacs doesn't.  We need to be careful here with ibyte/ichar
+# because of words like Richard, eicharlen(), multibyte, HIBYTE, etc.
+
+gr Ibyte Intbyte $files
+gr '\bIBYTE' INTBYTE $files
+gr '\bibyte' intbyte $files
+gr '\bICHAR' EMCHAR $files
+gr '\bichar' emchar $files
+gr '\bIchar' Emchar $files
+gr '\bIBYTEPTR' CHARPTR $files
+gr '\bibyteptr' charptr $files
+gr '\bITEXT' CHARPTR $files
+gr '\bitext' charptr $files
+gr '\bItext' Charptr $files
+
+gr '_IBYTE' _INTBYTE $files
+gr '_ibyte' _intbyte $files
+gr '_ICHAR' _EMCHAR $files
+gr '_ichar' _emchar $files
+gr '_Ichar' _Emchar $files
+gr '_IBYTEPTR' _CHARPTR $files
+gr '_ibyteptr' _charptr $files
+gr '_ITEXT' _CHARPTR $files
+gr '_itext' _charptr $files
+gr '_Itext' _Charptr $files
+@end example
+
+@node Rules When Writing New C Code, CVS Techniques, Major Textual Changes, Top
 @chapter Rules When Writing New C Code
 @cindex writing new C code, rules when
 @cindex C code, rules when writing new
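The signedness convention in the internals.texi hunk above rests on how C converts operands in a mixed signed/unsigned comparison. The following is a minimal C sketch of that pitfall, not code from XEmacs: the function names are hypothetical, and the explicit cast in less_mixed only spells out the conversion the compiler would otherwise apply implicitly (the case -Wsign-compare warns about). It assumes long long is wider than unsigned int, which holds on common platforms.

```c
/* Compare a signed value with an unsigned one the way C does by default:
   the signed operand is converted to unsigned before the comparison.
   The cast makes that implicit conversion visible. */
int less_mixed(int a, unsigned int b)
{
    return (unsigned int) a < b;
}

/* Compare the same two values in a signed type wide enough to hold both
   (assumption: long long is wider than unsigned int). */
int less_signed(int a, unsigned int b)
{
    return (long long) a < (long long) b;
}
```

With these definitions, less_mixed(-1, 1u) is 0 because -1 becomes the largest unsigned int, while less_signed(-1, 1u) is 1, which is the answer a reader of the code almost always expects.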
--- a/src/ChangeLog	Wed Jun 05 09:58:45 2002 +0000
+++ b/src/ChangeLog	Wed Jun 05 12:01:11 2002 +0000
@@ -1,3 +1,15 @@
+2002-06-05  Ben Wing  <ben@xemacs.org>
+
+	* README.integral-types: Removed.
+	* README.global-renaming: Added.
+
+	Stuff specific to the integral types rename was moved to the
+	Internals Manual.  The general scripts, suitable for any type
+	of global search-and-replace, were moved to README.global-renaming.
+	(In the internals manual, they need to be munged by replacing @
+	with @@, and this precludes just cutting and pasting from the source
+	file, which is what people are naturally going to do.)
+
 2002-06-05  Ben Wing  <ben@xemacs.org>
 
 	* abbrev.c (abbrev_match_mapper):
@@ -6054,7 +6066,12 @@
 	* dumper.c: remove duplicate case tag XD_BYTECOUNT, and the
 	accompanying duplicate code, from 4 switch statements.
 
-	See README.integral-types in this directory for more details.
+	[[See README.integral-types in this directory for more
+	details.]] --invalid.
+
+	See the Internals Manual, under Major Textual Changes, and also
+	README.global-renaming.
+	
 
 2001-09-17  Ben Wing  <ben@xemacs.org>
 
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/src/README.global-renaming	Wed Jun 05 12:01:11 2002 +0000
@@ -0,0 +1,206 @@
+README.global-renaming
+
+This file documents the generic scripts that have been used to implement
+the recent type renamings, e.g. the "great integral type renaming" and the
+"text/char type renaming".  More information about these changes can be
+found in the Internals manual.
+
+A sample script to do such renaming is this (used in the great integral
+type renaming):
+
+----------------------------------- cut ------------------------------------
+files="*.[ch] s/*.h m/*.h config.h.in ../configure.in Makefile.in.in ../lib-src/*.[ch] ../lwlib/*.[ch]"
+gr Memory_Count Bytecount $files
+gr Lstream_Data_Count Bytecount $files
+gr Element_Count Elemcount $files
+gr Hash_Code Hashcode $files
+gr extcount bytecount $files
+gr bufpos charbpos $files
+gr bytind bytebpos $files
+gr memind membpos $files
+gr bufbyte intbyte $files
+gr Extcount Bytecount $files
+gr Bufpos Charbpos $files
+gr Bytind Bytebpos $files
+gr Memind Membpos $files
+gr Bufbyte Intbyte $files
+gr EXTCOUNT BYTECOUNT $files
+gr BUFPOS CHARBPOS $files
+gr BYTIND BYTEBPOS $files
+gr MEMIND MEMBPOS $files
+gr BUFBYTE INTBYTE $files
+gr MEMORY_COUNT BYTECOUNT $files
+gr LSTREAM_DATA_COUNT BYTECOUNT $files
+gr ELEMENT_COUNT ELEMCOUNT $files
+gr HASH_CODE HASHCODE $files
+----------------------------------- cut ------------------------------------
+
+
+`fixtypes.sh' is a Bourne-shell script; it uses 'gr':
+
+
+----------------------------------- cut ------------------------------------
+#!/bin/sh
+
+# Usage is like this:
+
+# gr FROM TO FILES ...
+
+# globally replace FROM with TO in FILES.  FROM is a Perl regular expression
+# and TO its replacement text.  Backup files are stored in `backup.orig'.
+from="$1"
+to="$2"
+shift 2
+echo ${1+"$@"} | xargs global-replace "s/$from/$to/g"
+----------------------------------- cut ------------------------------------
+
+
+`gr' in turn uses a Perl script to do its real work, `global-replace',
+which follows:
+
+
+----------------------------------- cut ------------------------------------
+: #-*- Perl -*-
+
+### global-replace --- modify the contents of a file by a Perl expression
+
+## Copyright (C) 1999 Martin Buchholz.
+## Copyright (C) 2001, 2002 Ben Wing.
+
+## Authors: Martin Buchholz <martin@xemacs.org>, Ben Wing <ben@xemacs.org>
+## Maintainer: Ben Wing <ben@xemacs.org>
+## Current Version: 1.2, March 12, 2002
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2, or (at your option)
+# any later version.
+#
+# This program is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with XEmacs; see the file COPYING.  If not, write to the Free
+# Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
+# 02111-1307, USA.
+
+eval 'exec perl -w -S $0 ${1+"$@"}'
+    if 0;
+
+use strict;
+use FileHandle;
+use Carp;
+use Getopt::Long;
+use File::Basename;
+
+(my $myName = $0) =~ s@.*/@@; my $usage="
+Usage: $myName [--help] [--backup-dir=DIR] [--line-mode] [--hunk-mode]
+       PERLEXPR FILE ...
+
+Globally modify a file, either line by line or in one big hunk.
+
+Typical usage is like this:
+
+[with GNU find, GNU xargs: guaranteed to handle spaces, quotes, etc.
+ in file names]
+
+find . -name '*.[ch]' -print0 | xargs -0 $0 's/\bCONST\b/const/g'\n
+
+[with non-GNU find, xargs]
+
+find . -name '*.[ch]' -print | xargs $0 's/\bCONST\b/const/g'\n
+
+
+The file is read in, either line by line (with --line-mode specified)
+or in one big hunk (with --hunk-mode specified; it's the default), and
+the Perl expression is then evalled with \$_ set to the line or hunk of
+text, including the terminating newline if there is one.  It should
+destructively modify the value there, storing the changed result in \$_.
+
+Files in which any modifications are made are backed up to the directory
+specified using --backup-dir, or to `backup.orig' by default.  To disable
+this, use --backup-dir= with no argument.
+
+Hunk mode is the default because it is MUCH MUCH faster than line-by-line.
+Use line-by-line only when it matters, e.g. you want to do a replacement
+only once per line (the default without the `g' argument).  Conversely,
+when using hunk mode, *ALWAYS* use `g'; otherwise, you will only make one
+replacement in the entire file!
+";
+
+my %options = ();
+$Getopt::Long::ignorecase = 0;
+&GetOptions (
+	     \%options,
+	     'help', 'backup-dir=s', 'line-mode', 'hunk-mode',
+);
+
+
+die $usage if $options{"help"} or @ARGV <= 1;
+my $code = shift;
+
+die $usage if grep (-d || ! -w, @ARGV);
+
+sub SafeOpen {
+  my $fh = new FileHandle;
+  open ($fh, $_[0]) or confess "Can't open $_[0]: $!";
+  return $fh;
+}
+
+sub SafeClose {
+  close $_[0] or confess "Can't close $_[0]: $!";
+}
+
+sub FileContents {
+  my $fh = SafeOpen ("< $_[0]");
+  my $olddollarslash = $/;
+  local $/ = undef;
+  my $contents = <$fh>;
+  $/ = $olddollarslash;
+  return $contents;
+}
+
+sub WriteStringToFile {
+  my $fh = SafeOpen ("> $_[0]");
+  binmode $fh;
+  print $fh $_[1] or confess "$_[0]: $!\n";
+  SafeClose $fh;
+}
+
+foreach my $file (@ARGV) {
+  my $changed_p = 0;
+  my $new_contents = "";
+  if ($options{"line-mode"}) {
+    my $fh = SafeOpen $file;
+    while (<$fh>) {
+      my $save_line = $_;
+      eval $code;
+      $changed_p = 1 if $save_line ne $_;
+      $new_contents .= $_;
+    }
+  } else {
+    my $orig_contents = $_ = FileContents $file;
+    eval $code;
+    if ($_ ne $orig_contents) {
+      $changed_p = 1;
+      $new_contents = $_;
+    }
+  }
+
+  if ($changed_p) {
+    my $backdir = $options{"backup-dir"};
+    $backdir = "backup.orig" if !defined ($backdir);
+    if ($backdir) {
+      my ($name, $path, $suffix) = fileparse ($file, "");
+      my $backfulldir = $path . $backdir;
+      my $backfile = "$backfulldir/$name";
+      mkdir $backfulldir, 0755 unless -d $backfulldir;
+      print "modifying $file (original saved in $backfile)\n";
+      rename $file, $backfile;
+    }
+    WriteStringToFile ($file, $new_contents);
+  }
+}
+----------------------------------- cut ------------------------------------
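The reverse-direction script added to internals.texi in this changeset passes patterns such as '\bichar' and '_ichar' to gr, because Perl's \b treats the underscore as a word character. The C sketch below models only that one rule, as an illustration of why both pattern families are needed; it is not how global-replace works internally, and the function names are hypothetical. It assumes the needle begins with a word character, as all the identifiers in the script do.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Perl's notion of a word character: letter, digit, or underscore. */
static int is_word_char(char c)
{
    return isalnum((unsigned char) c) || c == '_';
}

/* Count the occurrences of NEEDLE in TEXT that a pattern of the form
   \bNEEDLE would match, i.e. those at the start of TEXT or preceded
   by a non-word character. */
int count_at_word_start(const char *text, const char *needle)
{
    int count = 0;
    const char *p;

    assert(text != NULL && needle != NULL);
    if (needle[0] == '\0')
        return 0;
    for (p = strstr(text, needle); p != NULL; p = strstr(p + 1, needle)) {
        if (p == text || !is_word_char(p[-1]))
            count++;
    }
    return count;
}
```

Under this rule "ichar" is correctly skipped inside "Richard" and "eicharlen()", but it is also skipped inside "set_ichar", since the preceding underscore counts as a word character. That second case is what the separate '_ichar' style patterns in the script cover.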
--- a/src/README.integral-types	Wed Jun 05 09:58:45 2002 +0000
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,349 +0,0 @@
-README.integral-types
-
-The great integral types renaming.
-
-#### The content of this file was originally posted as a ChangeLog and
-should be moved to the Internals manual.
-
-The purpose of this is to rationalize the names used for various
-integral types, so that they match their intended uses and follow
-consist conventions, and eliminate types that were not semantically
-different from each other.
-
-The conventions are:
-
--- All integral types that measure quantities of anything are
-   signed.  Some people disagree vociferously with this, but their
-   arguments are mostly theoretical, and are vastly outweighed by
-   the practical headaches of mixing signed and unsigned values,
-   and more importantly by the far increased likelihood of
-   inadvertent bugs: Because of the broken "viral" nature of
-   unsigned quantities in C (operations involving mixed
-   signed/unsigned are done unsigned, when exactly the opposite is
-   nearly always wanted), even a single error in declaring a
-   quantity unsigned that should be signed, or even the even more
-   subtle error of comparing signed and unsigned values and
-   forgetting the necessary cast, can be catastrophic, as
-   comparisons will yield wrong results.  -Wsign-compare is turned
-   on specifically to catch this, but this tends to result in a
-   great number of warnings when mixing signed and unsigned, and
-   the casts are annoying.  More has been written on this
-   elsewhere.
-
--- All such quantity types just mentioned boil down to EMACS_INT,
-   which is 32 bits on 32-bit machines and 64 bits on 64-bit
-   machines.  This is guaranteed to be the same size as Lisp
-   objects of type `int', and (as far as I can tell) of size_t
-   (unsigned!) and ssize_t.  The only type below that is not an
-   EMACS_INT is Hashcode, which is an unsigned value of the same
-   size as EMACS_INT.
-
--- Type names should be relatively short (no more than 10
-   characters or so), with the first letter capitalized and no
-   underscores if they can at all be avoided.
-
--- "count" == a zero-based measurement of some quantity.  Includes
-   sizes, offsets, and indexes.
-
--- "bpos" == a one-based measurement of a position in a buffer.
-   "Charbpos" and "Bytebpos" count text in the buffer, rather than
-   bytes in memory; thus Bytebpos does not directly correspond to
-   the memory representation.  Use "Membpos" for this.
-
--- "Char" refers to internal-format characters, not to the C type
-   "char", which is really a byte.
-
--- For the actual name changes, see the script below.
-
-I ran the following script to do the conversion. (NOTE: This script
-is idempotent.  You can safely run it multiple times and it will
-not screw up previous results -- in fact, it will do nothing if
-nothing has changed.  Thus, it can be run repeatedly as necessary
-to handle patches coming in from old workspaces, or old branches.)
-There are two tags, just before and just after the change:
-`pre-integral-type-rename' and `post-integral-type-rename'.  When
-merging code from the main trunk into a branch, the best thing to
-do is first merge up to `pre-integral-type-rename', then apply the
-script and associated changes, then merge from
-`post-integral-type-change' to the present. (Alternatively, just do
-the merging in one operation; but you may then have a lot of
-conflicts needing to be resolved by hand.)
-
-Script `fixtypes.sh' follows:
-
-
------------------------------------ cut ------------------------------------
-files="*.[ch] s/*.h m/*.h config.h.in ../configure.in Makefile.in.in ../lib-src/*.[ch] ../lwlib/*.[ch]"
-gr Memory_Count Bytecount $files
-gr Lstream_Data_Count Bytecount $files
-gr Element_Count Elemcount $files
-gr Hash_Code Hashcode $files
-gr extcount bytecount $files
-gr bufpos charbpos $files
-gr bytind bytebpos $files
-gr memind membpos $files
-gr bufbyte intbyte $files
-gr Extcount Bytecount $files
-gr Bufpos Charbpos $files
-gr Bytind Bytebpos $files
-gr Memind Membpos $files
-gr Bufbyte Intbyte $files
-gr EXTCOUNT BYTECOUNT $files
-gr BUFPOS CHARBPOS $files
-gr BYTIND BYTEBPOS $files
-gr MEMIND MEMBPOS $files
-gr BUFBYTE INTBYTE $files
-gr MEMORY_COUNT BYTECOUNT $files
-gr LSTREAM_DATA_COUNT BYTECOUNT $files
-gr ELEMENT_COUNT ELEMCOUNT $files
-gr HASH_CODE HASHCODE $files
------------------------------------ cut ------------------------------------
-
-
-	`fixtypes.sh' is a Bourne-shell script; it uses 'gr':
-
-
------------------------------------ cut ------------------------------------
-#!/bin/sh
-
-# Usage is like this:
-
-# gr FROM TO FILES ...
-
-# globally replace FROM with TO in FILES.  FROM and TO are regular expressions.
-# backup files are stored in the `backup' directory.
-from="$1"
-to="$2"
-shift 2
-echo ${1+"$@"} | xargs global-replace "s/$from/$to/g"
------------------------------------ cut ------------------------------------
-
-
-	`gr' in turn uses a Perl script to do its real work,
-	`global-replace', which follows:
-
-
------------------------------------ cut ------------------------------------
-: #-*- Perl -*-
-
-### global-modify --- modify the contents of a file by a Perl expression
-
-## Copyright (C) 1999 Martin Buchholz.
-## Copyright (C) 2001 Ben Wing.
-
-## Authors: Martin Buchholz <martin@xemacs.org>, Ben Wing <ben@xemacs.org>
-## Maintainer: Ben Wing <ben@xemacs.org>
-## Current Version: 1.0, May 5, 2001
-
-# This program is free software; you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation; either version 2, or (at your option)
-# any later version.
-#
-# This program is distributed in the hope that it will be useful, but
-# WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-# General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with XEmacs; see the file COPYING.  If not, write to the Free
-# Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
-# 02111-1307, USA.
-
-eval 'exec perl -w -S $0 ${1+"$@"}'
-    if 0;
-
-use strict;
-use FileHandle;
-use Carp;
-use Getopt::Long;
-use File::Basename;
-
-(my $myName = $0) =~ s@.*/@@; my $usage="
-Usage: $myName [--help] [--backup-dir=DIR] [--line-mode] [--hunk-mode]
-       PERLEXPR FILE ...
-
-Globally modify a file, either line by line or in one big hunk.
-
-Typical usage is like this:
-
-[with GNU print, GNU xargs: guaranteed to handle spaces, quotes, etc.
- in file names]
-
-find . -name '*.[ch]' -print0 | xargs -0 $0 's/\bCONST\b/const/g'\n
-
-[with non-GNU print, xargs]
-
-find . -name '*.[ch]' -print | xargs $0 's/\bCONST\b/const/g'\n
-
-
-The file is read in, either line by line (with --line-mode specified)
-or in one big hunk (with --hunk-mode specified; it's the default), and
-the Perl expression is then evalled with \$_ set to the line or hunk of
-text, including the terminating newline if there is one.  It should
-destructively modify the value there, storing the changed result in \$_.
-
-Files in which any modifications are made are backed up to the directory
-specified using --backup-dir, or to `backup' by default.  To disable this,
-use --backup-dir= with no argument.
-
-Hunk mode is the default because it is MUCH MUCH faster than line-by-line.
-Use line-by-line only when it matters, e.g. you want to do a replacement
-only once per line (the default without the `g' argument).  Conversely,
-when using hunk mode, *ALWAYS* use `g'; otherwise, you will only make one
-replacement in the entire file!
-";
-
-my %options = ();
-$Getopt::Long::ignorecase = 0;
-&GetOptions (
-	     \%options,
-	     'help', 'backup-dir=s', 'line-mode', 'hunk-mode',
-);
-
-
-die $usage if $options{"help"} or @ARGV <= 1;
-my $code = shift;
-
-die $usage if grep (-d || ! -w, @ARGV);
-
-sub SafeOpen {
-  open ((my $fh = new FileHandle), $_[0]);
-  confess "Can't open $_[0]: $!" if ! defined $fh;
-  return $fh;
-}
-
-sub SafeClose {
-  close $_[0] or confess "Can't close $_[0]: $!";
-}
-
-sub FileContents {
-  my $fh = SafeOpen ("< $_[0]");
-  my $olddollarslash = $/;
-  local $/ = undef;
-  my $contents = <$fh>;
-  $/ = $olddollarslash;
-  return $contents;
-}
-
-sub WriteStringToFile {
-  my $fh = SafeOpen ("> $_[0]");
-  binmode $fh;
-  print $fh $_[1] or confess "$_[0]: $!\n";
-  SafeClose $fh;
-}
-
-foreach my $file (@ARGV) {
-  my $changed_p = 0;
-  my $new_contents = "";
-  if ($options{"line-mode"}) {
-    my $fh = SafeOpen $file;
-    while (<$fh>) {
-      my $save_line = $_;
-      eval $code;
-      $changed_p = 1 if $save_line ne $_;
-      $new_contents .= $_;
-    }
-  } else {
-    my $orig_contents = $_ = FileContents $file;
-    eval $code;
-    if ($_ ne $orig_contents) {
-      $changed_p = 1;
-      $new_contents = $_;
-    }
-  }
-
-  if ($changed_p) {
-    my $backdir = $options{"backup-dir"};
-    $backdir = "backup" if !defined ($backdir);
-    if ($backdir) {
-      my ($name, $path, $suffix) = fileparse ($file, "");
-      my $backfulldir = $path . $backdir;
-      my $backfile = "$backfulldir/$name";
-      mkdir $backfulldir, 0755 unless -d $backfulldir;
-      print "modifying $file (original saved in $backfile)\n";
-      rename $file, $backfile;
-    }
-    WriteStringToFile ($file, $new_contents);
-  }
-}
------------------------------------ cut ------------------------------------
-
-
-In addition to those programs, I needed to fix up a few other
-things, particularly relating to the duplicate definitions of
-types, now that some types merged with others.  Specifically:
-
-1. in lisp.h, removed duplicate declarations of Bytecount.  The
-   changed code should now look like this: (In each code snippet
-   below, the first and last lines are the same as the original, as
-   are all lines outside of those lines.  That allows you to locate
-   the section to be replaced, and replace the stuff in that
-   section, verifying that there isn't anything new added that
-   would need to be kept.)
-
---------------------------------- snip -------------------------------------
-/* Counts of bytes or chars */
-typedef EMACS_INT Bytecount;
-typedef EMACS_INT Charcount;
-
-/* Counts of elements */
-typedef EMACS_INT Elemcount;
-
-/* Hash codes */
-typedef unsigned long Hashcode;
-
-/* ------------------------ dynamic arrays ------------------- */
---------------------------------- snip -------------------------------------
-
-2. in lstream.h, removed duplicate declaration of Bytecount.
-   Rewrote the comment about this type.  The changed code should
-   now look like this:
-
-
---------------------------------- snip -------------------------------------
-#endif
-
-/* The have been some arguments over the what the type should be that
-   specifies a count of bytes in a data block to be written out or read in,
-   using Lstream_read(), Lstream_write(), and related functions.
-   Originally it was long, which worked fine; Martin "corrected" these to
-   size_t and ssize_t on the grounds that this is theoretically cleaner and
-   is in keeping with the C standards.  Unfortunately, this practice is
-   horribly error-prone due to design flaws in the way that mixed
-   signed/unsigned arithmetic happens.  In fact, by doing this change,
-   Martin introduced a subtle but fatal error that caused the operation of
-   sending large mail messages to the SMTP server under Windows to fail.
-   By putting all values back to be signed, avoiding any signed/unsigned
-   mixing, the bug immediately went away.  The type then in use was
-   Lstream_Data_Count, so that it be reverted cleanly if a vote came to
-   that.  Now it is Bytecount.
-
-   Some earlier comments about why the type must be signed: This MUST BE
-   SIGNED, since it also is used in functions that return the number of
-   bytes actually read to or written from in an operation, and these
-   functions can return -1 to signal error.
-
-   Note that the standard Unix read() and write() functions define the
-   count going in as a size_t, which is UNSIGNED, and the count going
-   out as an ssize_t, which is SIGNED.  This is a horrible design
-   flaw.  Not only is it highly likely to lead to logic errors when a
-   -1 gets interpreted as a large positive number, but operations are
-   bound to fail in all sorts of horrible ways when a number in the
-   upper-half of the size_t range is passed in -- this number is
-   unrepresentable as an ssize_t, so code that checks to see how many
-   bytes are actually written (which is mandatory if you are dealing
-   with certain types of devices) will get completely screwed up.
-
-   --ben
-*/
-
-typedef enum lstream_buffering
---------------------------------- snip -------------------------------------
-
-
-3. in dumper.c, there are four places, all inside of switch()
-   statements, where XD_BYTECOUNT appears twice as a case tag.  In
-   each case, the two case blocks contain identical code, and you
-   should *REMOVE THE SECOND* and leave the first.
-
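The lstream.h comment quoted in this changeset argues that a byte count must be signed because read()-style functions return -1 on error. A minimal C sketch of that argument follows. Everything in it is an assumption made for illustration: Bytecount is typedef'd to ptrdiff_t as a stand-in for EMACS_INT, and stub_read is a hypothetical function, not an Lstream or POSIX call.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: ptrdiff_t stands in for EMACS_INT here. */
typedef ptrdiff_t Bytecount;

/* Hypothetical read()-style call: returns the number of bytes
   transferred, or -1 to signal an error. */
Bytecount stub_read(int fail, Bytecount want)
{
    return fail ? -1 : want;
}

/* Result kept in a signed type: the error is visible as n < 0. */
int error_seen_signed(void)
{
    Bytecount n = stub_read(1, 10);
    return n < 0;
}

/* Result stored in size_t: -1 converts to SIZE_MAX, so the failure
   looks like an enormous successful transfer. */
int error_hidden_unsigned(void)
{
    size_t n = (size_t) stub_read(1, 10);
    return n == SIZE_MAX;
}
```

Both functions return 1: the signed variant detects the failure, while the unsigned variant ends up holding SIZE_MAX, which any later length arithmetic would treat as a valid count.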