comparison man/internals/internals.texi @ 398:74fd4e045ea6 r21-2-29

Import from CVS: tag r21-2-29
author cvs
date Mon, 13 Aug 2007 11:13:30 +0200
parents aabb7f5b1c81
children a86b2b5e0111
397:f4aeb21a5bad 398:74fd4e045ea6
3 @setfilename ../../info/internals.info 3 @setfilename ../../info/internals.info
4 @settitle XEmacs Internals Manual 4 @settitle XEmacs Internals Manual
5 @c %**end of header 5 @c %**end of header
6 6
7 @ifinfo 7 @ifinfo
8 @dircategory XEmacs Editor
9 @direntry
10 * Internals: (internals). XEmacs Internals Manual.
11 @end direntry
8 12
9 Copyright @copyright{} 1992 - 1996 Ben Wing. 13 Copyright @copyright{} 1992 - 1996 Ben Wing.
10 Copyright @copyright{} 1996, 1997 Sun Microsystems. 14 Copyright @copyright{} 1996, 1997 Sun Microsystems.
11 Copyright @copyright{} 1994 - 1998 Free Software Foundation. 15 Copyright @copyright{} 1994 - 1998 Free Software Foundation.
12 Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois. 16 Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois.
57 @setchapternewpage odd 61 @setchapternewpage odd
58 @finalout 62 @finalout
59 63
60 @titlepage 64 @titlepage
61 @title XEmacs Internals Manual 65 @title XEmacs Internals Manual
62 @subtitle Version 1.2, October 1998 66 @subtitle Version 1.3, August 1999
63 67
64 @author Ben Wing 68 @author Ben Wing
65 @author Martin Buchholz 69 @author Martin Buchholz
66 @author Hrvoje Niksic 70 @author Hrvoje Niksic
71 @author Matthias Neubauer
72 @author Olivier Galibert
67 @page 73 @page
68 @vskip 0pt plus 1fill 74 @vskip 0pt plus 1fill
69 75
70 @noindent 76 @noindent
71 Copyright @copyright{} 1992 - 1996 Ben Wing. @* 77 Copyright @copyright{} 1992 - 1996 Ben Wing. @*
72 Copyright @copyright{} 1996, 1997 Sun Microsystems, Inc. @* 78 Copyright @copyright{} 1996, 1997 Sun Microsystems, Inc. @*
73 Copyright @copyright{} 1994 - 1998 Free Software Foundation. @* 79 Copyright @copyright{} 1994 - 1998 Free Software Foundation. @*
74 Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois. 80 Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois.
75 81
76 @sp 2 82 @sp 2
77 Version 1.2 @* 83 Version 1.3 @*
78 October 1998.@* 84 August 1999.@*
79 85
80 Permission is granted to make and distribute verbatim copies of this 86 Permission is granted to make and distribute verbatim copies of this
81 manual provided the copyright notice and this permission notice are 87 manual provided the copyright notice and this permission notice are
82 preserved on all copies. 88 preserved on all copies.
83 89
111 * The XEmacs Object System (Abstractly Speaking):: 117 * The XEmacs Object System (Abstractly Speaking)::
112 * How Lisp Objects Are Represented in C:: 118 * How Lisp Objects Are Represented in C::
113 * Rules When Writing New C Code:: 119 * Rules When Writing New C Code::
114 * A Summary of the Various XEmacs Modules:: 120 * A Summary of the Various XEmacs Modules::
115 * Allocation of Objects in XEmacs Lisp:: 121 * Allocation of Objects in XEmacs Lisp::
122 * Dumping::
116 * Events and the Event Loop:: 123 * Events and the Event Loop::
117 * Evaluation; Stack Frames; Bindings:: 124 * Evaluation; Stack Frames; Bindings::
118 * Symbols and Variables:: 125 * Symbols and Variables::
119 * Buffers and Textual Representation:: 126 * Buffers and Textual Representation::
120 * MULE Character Sets and Encodings:: 127 * MULE Character Sets and Encodings::
121 * The Lisp Reader and Compiler:: 128 * The Lisp Reader and Compiler::
122 * Lstreams:: 129 * Lstreams::
123 * Consoles; Devices; Frames; Windows:: 130 * Consoles; Devices; Frames; Windows::
124 * The Redisplay Mechanism:: 131 * The Redisplay Mechanism::
125 * Extents:: 132 * Extents::
126 * Faces and Glyphs:: 133 * Faces::
134 * Glyphs::
127 * Specifiers:: 135 * Specifiers::
128 * Menus:: 136 * Menus::
129 * Subprocesses:: 137 * Subprocesses::
130 * Interface to X Windows:: 138 * Interface to X Windows::
131 * Index:: Index including concepts, functions, variables, 139 * Index::
132 and other terms. 140
133 141 @detailmenu --- The Detailed Node Listing ---
134 --- The Detailed Node Listing ---
135
136 Here are other nodes that are inferiors of those already listed,
137 mentioned here so you can get to them in one step:
138 142
139 A History of Emacs 143 A History of Emacs
140 144
141 * Through Version 18:: Unification prevails. 145 * Through Version 18:: Unification prevails.
142 * Lucid Emacs:: One version 19 Emacs. 146 * Lucid Emacs:: One version 19 Emacs.
143 * GNU Emacs 19:: The other version 19 Emacs. 147 * GNU Emacs 19:: The other version 19 Emacs.
148 * GNU Emacs 20:: The other version 20 Emacs.
144 * XEmacs:: The continuation of Lucid Emacs. 149 * XEmacs:: The continuation of Lucid Emacs.
145 150
146 Rules When Writing New C Code 151 Rules When Writing New C Code
147 152
148 * General Coding Rules:: 153 * General Coding Rules::
149 * Writing Lisp Primitives:: 154 * Writing Lisp Primitives::
150 * Adding Global Lisp Variables:: 155 * Adding Global Lisp Variables::
156 * Coding for Mule::
151 * Techniques for XEmacs Developers:: 157 * Techniques for XEmacs Developers::
158
159 Coding for Mule
160
161 * Character-Related Data Types::
162 * Working With Character and Byte Positions::
163 * Conversion to and from External Data::
164 * General Guidelines for Writing Mule-Aware Code::
165 * An Example of Mule-Aware Code::
152 166
153 A Summary of the Various XEmacs Modules 167 A Summary of the Various XEmacs Modules
154 168
155 * Low-Level Modules:: 169 * Low-Level Modules::
156 * Basic Lisp Modules:: 170 * Basic Lisp Modules::
168 Allocation of Objects in XEmacs Lisp 182 Allocation of Objects in XEmacs Lisp
169 183
170 * Introduction to Allocation:: 184 * Introduction to Allocation::
171 * Garbage Collection:: 185 * Garbage Collection::
172 * GCPROing:: 186 * GCPROing::
187 * Garbage Collection - Step by Step::
173 * Integers and Characters:: 188 * Integers and Characters::
174 * Allocation from Frob Blocks:: 189 * Allocation from Frob Blocks::
175 * lrecords:: 190 * lrecords::
176 * Low-level allocation:: 191 * Low-level allocation::
177 * Pure Space:: 192 * Pure Space::
181 * Symbol:: 196 * Symbol::
182 * Marker:: 197 * Marker::
183 * String:: 198 * String::
184 * Compiled Function:: 199 * Compiled Function::
185 200
201 Garbage Collection - Step by Step
202
203 * Invocation::
204 * garbage_collect_1::
205 * mark_object::
206 * gc_sweep::
207 * sweep_lcrecords_1::
208 * compact_string_chars::
209 * sweep_strings::
210 * sweep_bit_vectors_1::
211
212 Dumping
213
214 * Overview::
215 * Data descriptions::
216 * Dumping phase::
217 * Reloading phase::
218
219 Dumping phase
220
221 * Object inventory::
222 * Address allocation::
223 * The header::
224 * Data dumping::
225 * Pointers dumping::
226
186 Events and the Event Loop 227 Events and the Event Loop
187 228
188 * Introduction to Events:: 229 * Introduction to Events::
189 * Main Loop:: 230 * Main Loop::
190 * Specifics of the Event Gathering Mechanism:: 231 * Specifics of the Event Gathering Mechanism::
219 MULE Character Sets and Encodings 260 MULE Character Sets and Encodings
220 261
221 * Character Sets:: 262 * Character Sets::
222 * Encodings:: 263 * Encodings::
223 * Internal Mule Encodings:: 264 * Internal Mule Encodings::
265 * CCL::
224 266
225 Encodings 267 Encodings
226 268
227 * Japanese EUC (Extended Unix Code):: 269 * Japanese EUC (Extended Unix Code)::
228 * JIS7:: 270 * JIS7::
230 Internal Mule Encodings 272 Internal Mule Encodings
231 273
232 * Internal String Encoding:: 274 * Internal String Encoding::
233 * Internal Character Encoding:: 275 * Internal Character Encoding::
234 276
235 The Lisp Reader and Compiler
236
237 Lstreams 277 Lstreams
278
279 * Creating an Lstream:: Creating an lstream object.
280 * Lstream Types:: Different sorts of things that are streamed.
281 * Lstream Functions:: Functions for working with lstreams.
282 * Lstream Methods:: Creating new lstream types.
238 283
239 Consoles; Devices; Frames; Windows 284 Consoles; Devices; Frames; Windows
240 285
241 * Introduction to Consoles; Devices; Frames; Windows:: 286 * Introduction to Consoles; Devices; Frames; Windows::
242 * Point:: 287 * Point::
243 * Window Hierarchy:: 288 * Window Hierarchy::
289 * The Window Object::
244 290
245 The Redisplay Mechanism 291 The Redisplay Mechanism
246 292
247 * Critical Redisplay Sections:: 293 * Critical Redisplay Sections::
248 * Line Start Cache:: 294 * Line Start Cache::
295 * Redisplay Piece by Piece::
249 296
250 Extents 297 Extents
251 298
252 * Introduction to Extents:: Extents are ranges over text, with properties. 299 * Introduction to Extents:: Extents are ranges over text, with properties.
253 * Extent Ordering:: How extents are ordered internally. 300 * Extent Ordering:: How extents are ordered internally.
254 * Format of the Extent Info:: The extent information in a buffer or string. 301 * Format of the Extent Info:: The extent information in a buffer or string.
255 * Zero-Length Extents:: A weird special case. 302 * Zero-Length Extents:: A weird special case.
256 * Mathematics of Extent Ordering:: A rigorous foundation. 303 * Mathematics of Extent Ordering:: A rigorous foundation.
257 * Extent Fragments:: Cached information useful for redisplay. 304 * Extent Fragments:: Cached information useful for redisplay.
258 305
259 Faces and Glyphs 306 @end detailmenu
260
261 Specifiers
262
263 Menus
264
265 Subprocesses
266
267 Interface to X Windows
268
269 @end menu 307 @end menu
270 308
271 @node A History of Emacs, XEmacs From the Outside, Top, Top 309 @node A History of Emacs, XEmacs From the Outside, Top, Top
272 @chapter A History of Emacs 310 @chapter A History of Emacs
273 @cindex history of Emacs 311 @cindex history of Emacs
304 * GNU Emacs 19:: The other version 19 Emacs. 342 * GNU Emacs 19:: The other version 19 Emacs.
305 * GNU Emacs 20:: The other version 20 Emacs. 343 * GNU Emacs 20:: The other version 20 Emacs.
306 * XEmacs:: The continuation of Lucid Emacs. 344 * XEmacs:: The continuation of Lucid Emacs.
307 @end menu 345 @end menu
308 346
309 @node Through Version 18 347 @node Through Version 18, Lucid Emacs, A History of Emacs, A History of Emacs
310 @section Through Version 18 348 @section Through Version 18
311 @cindex Gosling, James 349 @cindex Gosling, James
312 @cindex Great Usenet Renaming 350 @cindex Great Usenet Renaming
313 351
314 Although the history of the early versions of GNU Emacs is unclear, 352 Although the history of the early versions of GNU Emacs is unclear,
417 version 18.58 released ?????. 455 version 18.58 released ?????.
418 @item 456 @item
419 version 18.59 released October 31, 1992. 457 version 18.59 released October 31, 1992.
420 @end itemize 458 @end itemize
421 459
422 @node Lucid Emacs 460 @node Lucid Emacs, GNU Emacs 19, Through Version 18, A History of Emacs
423 @section Lucid Emacs 461 @section Lucid Emacs
424 @cindex Lucid Emacs 462 @cindex Lucid Emacs
425 @cindex Lucid Inc. 463 @cindex Lucid Inc.
426 @cindex Energize 464 @cindex Energize
427 @cindex Epoch 465 @cindex Epoch
505 version 20.3 (the first stable version of XEmacs 20.x) released November 30, 543 version 20.3 (the first stable version of XEmacs 20.x) released November 30,
506 1997. 544 1997.
507 version 20.4 released February 28, 1998. 545 version 20.4 released February 28, 1998.
508 @end itemize 546 @end itemize
509 547
510 @node GNU Emacs 19 548 @node GNU Emacs 19, GNU Emacs 20, Lucid Emacs, A History of Emacs
511 @section GNU Emacs 19 549 @section GNU Emacs 19
512 @cindex GNU Emacs 19 550 @cindex GNU Emacs 19
513 @cindex FSF Emacs 551 @cindex FSF Emacs
514 552
515 About a year after the initial release of Lucid Emacs, the FSF 553 About a year after the initial release of Lucid Emacs, the FSF
582 worse. Lucid soon began incorporating features from GNU Emacs 19 into 620 worse. Lucid soon began incorporating features from GNU Emacs 19 into
583 Lucid Emacs; the work was mostly done by Richard Mlynarik, who had been 621 Lucid Emacs; the work was mostly done by Richard Mlynarik, who had been
584 working on and using GNU Emacs for a long time (back as far as version 622 working on and using GNU Emacs for a long time (back as far as version
585 16 or 17). 623 16 or 17).
586 624
587 @node GNU Emacs 20 625 @node GNU Emacs 20, XEmacs, GNU Emacs 19, A History of Emacs
588 @section GNU Emacs 20 626 @section GNU Emacs 20
589 @cindex GNU Emacs 20 627 @cindex GNU Emacs 20
590 @cindex FSF Emacs 628 @cindex FSF Emacs
591 629
592 On February 2, 1997 work began on GNU Emacs to integrate Mule. The first 630 On February 2, 1997 work began on GNU Emacs to integrate Mule. The first
601 version 20.2 released September 20, 1997. 639 version 20.2 released September 20, 1997.
602 @item 640 @item
603 version 20.3 released August 19, 1998. 641 version 20.3 released August 19, 1998.
604 @end itemize 642 @end itemize
605 643
606 @node XEmacs 644 @node XEmacs, , GNU Emacs 20, A History of Emacs
607 @section XEmacs 645 @section XEmacs
608 @cindex XEmacs 646 @cindex XEmacs
609 647
610 @cindex Sun Microsystems 648 @cindex Sun Microsystems
611 @cindex University of Illinois 649 @cindex University of Illinois
689 windows, frames, events) that are useful for implementing an editor. 727 windows, frames, events) that are useful for implementing an editor.
690 Some of these objects (in particular windows and frames) have 728 Some of these objects (in particular windows and frames) have
691 displayable representations, and XEmacs provides a function 729 displayable representations, and XEmacs provides a function
692 @code{redisplay()} that ensures that the display of all such objects 730 @code{redisplay()} that ensures that the display of all such objects
693 matches their internal state. Most of the time, a standard Lisp 731 matches their internal state. Most of the time, a standard Lisp
694 environment is in a @dfn{read-eval-print} loop -- i.e. ``read some Lisp 732 environment is in a @dfn{read-eval-print} loop---i.e. ``read some Lisp
695 code, execute it, and print the results''. XEmacs has a similar loop: 733 code, execute it, and print the results''. XEmacs has a similar loop:
696 734
697 @itemize @bullet 735 @itemize @bullet
698 @item 736 @item
699 read an event 737 read an event
864 handler for some or all classes of errors. (If no handler is registered, 902 handler for some or all classes of errors. (If no handler is registered,
865 a default handler, generally installed by the top-level event loop, is 903 a default handler, generally installed by the top-level event loop, is
866 executed; this prints out the error and continues.) Routines can also 904 executed; this prints out the error and continues.) Routines can also
867 specify cleanup code (called an @dfn{unwind-protect}) that will be 905 specify cleanup code (called an @dfn{unwind-protect}) that will be
868 called when control exits from a block of code, no matter how that exit 906 called when control exits from a block of code, no matter how that exit
869 occurs -- i.e. even if a function deeply nested below it causes a 907 occurs---i.e. even if a function deeply nested below it causes a
870 non-local exit back to the top level. 908 non-local exit back to the top level.
871 909
872 Note that this facility has appeared in some recent vintages of C, in 910 Note that this facility has appeared in some recent vintages of C, in
873 particular Visual C++ and other PC compilers written for the Microsoft 911 particular Visual C++ and other PC compilers written for the Microsoft
874 Win32 API. 912 Win32 API.
878 that if you declare a local variable in a particular function, and then 916 that if you declare a local variable in a particular function, and then
879 call another function, that subfunction can ``see'' the local variable 917 call another function, that subfunction can ``see'' the local variable
880 you declared. This is actually considered a bug in Emacs Lisp and in 918 you declared. This is actually considered a bug in Emacs Lisp and in
881 all other early dialects of Lisp, and was corrected in Common Lisp. (In 919 all other early dialects of Lisp, and was corrected in Common Lisp. (In
882 Common Lisp, you can still declare dynamically scoped variables if you 920 Common Lisp, you can still declare dynamically scoped variables if you
883 want to -- they are sometimes useful -- but variables by default are 921 want to---they are sometimes useful---but variables by default are
884 @dfn{lexically scoped} as in C.) 922 @dfn{lexically scoped} as in C.)
885 @end enumerate 923 @end enumerate
886 924
887 For those familiar with Lisp, Emacs Lisp is modelled after MacLisp, an 925 For those familiar with Lisp, Emacs Lisp is modelled after MacLisp, an
888 early dialect of Lisp developed at MIT (no relation to the Macintosh 926 early dialect of Lisp developed at MIT (no relation to the Macintosh
1236 most other data structures in Lisp. 1274 most other data structures in Lisp.
1237 @item char 1275 @item char
1238 An object representing a single character of text; chars behave like 1276 An object representing a single character of text; chars behave like
1239 integers in many ways but are logically considered text rather than 1277 integers in many ways but are logically considered text rather than
1240 numbers and have a different read syntax. (the read syntax for a char 1278 numbers and have a different read syntax. (the read syntax for a char
1241 contains the char itself or some textual encoding of it -- for example, 1279 contains the char itself or some textual encoding of it---for example,
1242 a Japanese Kanji character might be encoded as @samp{^[$(B#&^[(B} using the 1280 a Japanese Kanji character might be encoded as @samp{^[$(B#&^[(B} using the
1243 ISO-2022 encoding standard -- rather than the numerical representation 1281 ISO-2022 encoding standard---rather than the numerical representation
1244 of the char; this way, if the mapping between chars and integers 1282 of the char; this way, if the mapping between chars and integers
1245 changes, which is quite possible for Kanji characters and other extended 1283 changes, which is quite possible for Kanji characters and other extended
1246 characters, the same character will still be created. Note that some 1284 characters, the same character will still be created. Note that some
1247 primitives confuse chars and integers. The worst culprit is @code{eq}, 1285 primitives confuse chars and integers. The worst culprit is @code{eq},
1248 which makes a special exception and considers a char to be @code{eq} to 1286 which makes a special exception and considers a char to be @code{eq} to
1456 1494
1457 @example 1495 @example
1458 1.983e-4 1496 1.983e-4
1459 @end example 1497 @end example
1460 1498
1461 converts to a float whose value is 1983.23e-4, or .0001983. 1499 converts to a float whose value is 1.983e-4, or .0001983.
1462 1500
1463 @example 1501 @example
1464 ?b 1502 ?b
1465 @end example 1503 @end example
1466 1504
1590 The tag describes the type of the Lisp object. For integers and chars, 1628 The tag describes the type of the Lisp object. For integers and chars,
1591 the lower 28 bits contain the value of the integer or char; for all 1629 the lower 28 bits contain the value of the integer or char; for all
1592 others, the lower 28 bits contain a pointer. The mark bit is used 1630 others, the lower 28 bits contain a pointer. The mark bit is used
1593 during garbage-collection, and is always 0 when garbage collection is 1631 during garbage-collection, and is always 0 when garbage collection is
1594 not happening. (The way that garbage collection works, basically, is that it 1632 not happening. (The way that garbage collection works, basically, is that it
1595 loops over all places where Lisp objects could exist -- this includes 1633 loops over all places where Lisp objects could exist---this includes
1596 all global variables in C that contain Lisp objects [including 1634 all global variables in C that contain Lisp objects [including
1597 @code{Vobarray}, the C equivalent of @code{obarray}; through this, all 1635 @code{Vobarray}, the C equivalent of @code{obarray}; through this, all
1598 Lisp variables will get marked], plus various other places -- and 1636 Lisp variables will get marked], plus various other places---and
1599 recursively scans through the Lisp objects, marking each object it finds 1637 recursively scans through the Lisp objects, marking each object it finds
1600 by setting the mark bit. Then it goes through the lists of all objects 1638 by setting the mark bit. Then it goes through the lists of all objects
1601 allocated, freeing the ones that are not marked and turning off the mark 1639 allocated, freeing the ones that are not marked and turning off the mark
1602 bit of the ones that are marked.) 1640 bit of the ones that are marked.)
1603 1641
1707 machines/compilers do this, and on the ones that don't, a more 1745 machines/compilers do this, and on the ones that don't, a more
1708 complicated definition is selected by defining 1746 complicated definition is selected by defining
1709 @code{EXPLICIT_SIGN_EXTEND}. 1747 @code{EXPLICIT_SIGN_EXTEND}.
1710 1748
1711 Note that when @code{ERROR_CHECK_TYPECHECK} is defined, the extractor 1749 Note that when @code{ERROR_CHECK_TYPECHECK} is defined, the extractor
1712 macros become more complicated -- they check the tag bits and/or the 1750 macros become more complicated---they check the tag bits and/or the
1713 type field in the first four bytes of a record type to ensure that the 1751 type field in the first four bytes of a record type to ensure that the
1714 object is really of the correct type. This is great for catching places 1752 object is really of the correct type. This is great for catching places
1715 where an incorrect type is being dereferenced -- this typically results 1753 where an incorrect type is being dereferenced---this typically results
1716 in a pointer being dereferenced as the wrong type of structure, with 1754 in a pointer being dereferenced as the wrong type of structure, with
1717 unpredictable (and sometimes not easily traceable) results. 1755 unpredictable (and sometimes not easily traceable) results.
1718 1756
1719 There are similar @code{XSET@var{TYPE}()} macros that construct a Lisp 1757 There are similar @code{XSET@var{TYPE}()} macros that construct a Lisp
1720 object. These macros are of the form @code{XSET@var{TYPE} 1758 object. These macros are of the form @code{XSET@var{TYPE}
1754 * Adding Global Lisp Variables:: 1792 * Adding Global Lisp Variables::
1755 * Coding for Mule:: 1793 * Coding for Mule::
1756 * Techniques for XEmacs Developers:: 1794 * Techniques for XEmacs Developers::
1757 @end menu 1795 @end menu
1758 1796
1759 @node General Coding Rules 1797 @node General Coding Rules, Writing Lisp Primitives, Rules When Writing New C Code, Rules When Writing New C Code
1760 @section General Coding Rules 1798 @section General Coding Rules
1761 1799
1762 The C code is actually written in a dialect of C called @dfn{Clean C}, 1800 The C code is actually written in a dialect of C called @dfn{Clean C},
1763 meaning that it can be compiled, mostly warning-free, with either a C or 1801 meaning that it can be compiled, mostly warning-free, with either a C or
1764 C++ compiler. Coding in Clean C has several advantages over plain C. 1802 C++ compiler. Coding in Clean C has several advantages over plain C.
1787 the same directory as the C sources) and @file{lisp.h}. @file{config.h} 1825 the same directory as the C sources) and @file{lisp.h}. @file{config.h}
1788 must always be included before any other header files (including 1826 must always be included before any other header files (including
1789 system header files) to ensure that certain tricks played by various 1827 system header files) to ensure that certain tricks played by various
1790 @file{s/} and @file{m/} files work out correctly. 1828 @file{s/} and @file{m/} files work out correctly.
1791 1829
1830 When including header files, always use angle brackets, not double
1831 quotes, except when the file to be included is in the same directory as
1832 the including file. If either file is a generated file, then that is
1833 not likely to be the case. In order to understand why we have this
1834 rule, imagine what happens when you do a build in the source directory
1835 using @samp{./configure} and another build in another directory using
1836 @samp{../work/configure}. There will be two different @file{config.h}
1837 files. Which one will be used if you @samp{#include "config.h"}?
1838
1792 @strong{All global and static variables that are to be modifiable must 1839 @strong{All global and static variables that are to be modifiable must
1793 be declared uninitialized.} This means that you may not use the 1840 be declared uninitialized.} This means that you may not use the
1794 ``declare with initializer'' form for these variables, such as @code{int 1841 ``declare with initializer'' form for these variables, such as @code{int
1795 some_variable = 0;}. The reason for this has to do with some kludges 1842 some_variable = 0;}. The reason for this has to do with some kludges
1796 done during the dumping process: If possible, the initialized data 1843 done during the dumping process: If possible, the initialized data
1797 segment is re-mapped so that it becomes part of the (unmodifiable) code 1844 segment is re-mapped so that it becomes part of the (unmodifiable) code
1798 segment in the dumped executable. This allows this memory to be shared 1845 segment in the dumped executable. This allows this memory to be shared
1799 among multiple running XEmacs processes. XEmacs is careful to place as 1846 among multiple running XEmacs processes. XEmacs is careful to place as
1800 much constant data as possible into initialized variables (in 1847 much constant data as possible into initialized variables (in
1801 particular, into what's called the @dfn{pure space} -- see below) during 1848 particular, into what's called the @dfn{pure space}---see below) during
1802 the @file{temacs} phase. 1849 the @file{temacs} phase.
1803 1850
1804 @cindex copy-on-write 1851 @cindex copy-on-write
1805 @strong{Please note:} This kludge only works on a few systems nowadays, 1852 @strong{Please note:} This kludge only works on a few systems nowadays,
1806 and is rapidly becoming irrelevant because most modern operating systems 1853 and is rapidly becoming irrelevant because most modern operating systems
1831 1878
1832 The C source code makes heavy use of C preprocessor macros. One popular 1879 The C source code makes heavy use of C preprocessor macros. One popular
1833 macro style is: 1880 macro style is:
1834 1881
1835 @example 1882 @example
1836 #define FOO(var, value) do @{ \ 1883 #define FOO(var, value) do @{ \
1837 Lisp_Object FOO_value = (value); \ 1884 Lisp_Object FOO_value = (value); \
1838 ... /* compute using FOO_value */ \ 1885 ... /* compute using FOO_value */ \
1839 (var) = bar; \ 1886 (var) = bar; \
1840 @} while (0) 1887 @} while (0)
1841 @end example 1888 @end example
1842 1889
1843 The @code{do @{...@} while (0)} is a standard trick to allow FOO to have 1890 The @code{do @{...@} while (0)} is a standard trick to allow FOO to have
1844 statement semantics, so that it can safely be used within an @code{if} 1891 statement semantics, so that it can safely be used within an @code{if}
1860 case of @code{GET_EXTERNAL_LIST_LENGTH}, validating the properness of 1907 case of @code{GET_EXTERNAL_LIST_LENGTH}, validating the properness of
1861 the list. The macros @code{EXTERNAL_LIST_LOOP_DELETE_IF} and 1908 the list. The macros @code{EXTERNAL_LIST_LOOP_DELETE_IF} and
1862 @code{LIST_LOOP_DELETE_IF} delete elements from a lisp list satisfying some 1909 @code{LIST_LOOP_DELETE_IF} delete elements from a lisp list satisfying some
1863 predicate. 1910 predicate.
1864 1911
1865 @node Writing Lisp Primitives 1912 @node Writing Lisp Primitives, Adding Global Lisp Variables, General Coding Rules, Rules When Writing New C Code
1866 @section Writing Lisp Primitives 1913 @section Writing Lisp Primitives
1867 1914
1868 Lisp primitives are Lisp functions implemented in C. The details of 1915 Lisp primitives are Lisp functions implemented in C. The details of
1869 interfacing the C function so that Lisp can call it are handled by a few 1916 interfacing the C function so that Lisp can call it are handled by a few
1870 C macros. The only way to really understand how to write new C code is 1917 C macros. The only way to really understand how to write new C code is
2104 2151
2105 @file{eval.c} is a very good file to look through for examples; 2152 @file{eval.c} is a very good file to look through for examples;
2106 @file{lisp.h} contains the definitions for important macros and 2153 @file{lisp.h} contains the definitions for important macros and
2107 functions. 2154 functions.
2108 2155
2109 @node Adding Global Lisp Variables 2156 @node Adding Global Lisp Variables, Coding for Mule, Writing Lisp Primitives, Rules When Writing New C Code
2110 @section Adding Global Lisp Variables 2157 @section Adding Global Lisp Variables
2111 2158
2112 Global variables whose names begin with @samp{Q} are constants whose 2159 Global variables whose names begin with @samp{Q} are constants whose
2113 value is a symbol of a particular name. The name of the variable should 2160 value is a symbol of a particular name. The name of the variable should
2114 be derived from the name of the symbol using the same rules as for Lisp 2161 be derived from the name of the symbol using the same rules as for Lisp
2166 garbage-collection mechanism won't know that the object in this variable 2213 garbage-collection mechanism won't know that the object in this variable
2167 is in use, and will happily collect it and reuse its storage for another 2214 is in use, and will happily collect it and reuse its storage for another
2168 Lisp object, and you will be the one who's unhappy when you can't figure 2215 Lisp object, and you will be the one who's unhappy when you can't figure
2169 out how your variable got overwritten. 2216 out how your variable got overwritten.
2170 2217
2171 @node Coding for Mule 2218 @node Coding for Mule, Techniques for XEmacs Developers, Adding Global Lisp Variables, Rules When Writing New C Code
2172 @section Coding for Mule 2219 @section Coding for Mule
2173 @cindex Coding for Mule 2220 @cindex Coding for Mule
2174 2221
2175 Although Mule support is not compiled by default in XEmacs, many people 2222 Although Mule support is not compiled by default in XEmacs, many people
2176 are using it, and we consider it crucial that new code works correctly 2223 are using it, and we consider it crucial that new code works correctly
2189 * Conversion to and from External Data:: 2236 * Conversion to and from External Data::
2190 * General Guidelines for Writing Mule-Aware Code:: 2237 * General Guidelines for Writing Mule-Aware Code::
2191 * An Example of Mule-Aware Code:: 2238 * An Example of Mule-Aware Code::
2192 @end menu 2239 @end menu
2193 2240
2194 @node Character-Related Data Types 2241 @node Character-Related Data Types, Working With Character and Byte Positions, Coding for Mule, Coding for Mule
2195 @subsection Character-Related Data Types 2242 @subsection Character-Related Data Types
2196 2243
2197 First, let's review the basic character-related datatypes used by 2244 First, let's review the basic character-related datatypes used by
2198 XEmacs. Note that the separate @code{typedef}s are not mandatory in the 2245 XEmacs. Note that the separate @code{typedef}s are not mandatory in the
2199 current implementation (all of them boil down to @code{unsigned char} or 2246 current implementation (all of them boil down to @code{unsigned char} or
2263 which are equivalent to @code{unsigned char}. Obviously, an 2310 which are equivalent to @code{unsigned char}. Obviously, an
2264 @code{Extcount} is the distance between two @code{Extbyte}s. Extbytes 2311 @code{Extcount} is the distance between two @code{Extbyte}s. Extbytes
2265 and Extcounts are not all that frequent in XEmacs code. 2312 and Extcounts are not all that frequent in XEmacs code.
2266 @end table 2313 @end table
2267 2314
2268 @node Working With Character and Byte Positions 2315 @node Working With Character and Byte Positions, Conversion to and from External Data, Character-Related Data Types, Coding for Mule
2269 @subsection Working With Character and Byte Positions 2316 @subsection Working With Character and Byte Positions
2270 2317
2271 Now that we have defined the basic character-related types, we can look 2318 Now that we have defined the basic character-related types, we can look
2272 at the macros and functions designed for work with them and for 2319 at the macros and functions designed for work with them and for
2273 conversion between them. Most of these macros are defined in 2320 conversion between them. Most of these macros are defined in
2387 @example 2434 @example
2388 Bufbyte *charptr_n_addr (Bufbyte *p, Charcount cc); 2435 Bufbyte *charptr_n_addr (Bufbyte *p, Charcount cc);
2389 @end example 2436 @end example
2390 @end table 2437 @end table
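
As a rough illustration of what a function with the signature of @code{charptr_n_addr} has to do, here is a minimal sketch in C. It is not the XEmacs implementation. It assumes an internal representation in which the first byte of every character is below 0xA0 and all continuation bytes are 0xA0 or above; the names @code{sketch_first_byte_p} and @code{sketch_charptr_n_addr} are hypothetical and exist only for this example.

```c
#include <stddef.h>

/* Stand-ins for the typedefs described in this section. */
typedef unsigned char Bufbyte;
typedef int Charcount;

/* ASSUMPTION: a byte below 0xA0 starts a character; bytes at or
   above 0xA0 continue the current character. */
static int sketch_first_byte_p(Bufbyte b)
{
    return b < 0xA0;
}

/* Hypothetical sketch: return the address of the character that is
   CC characters after the one P points at.  The caller must make sure
   that CC does not exceed the number of characters left in the
   NUL-terminated string, otherwise the scan runs off its end. */
static const Bufbyte *sketch_charptr_n_addr(const Bufbyte *p, Charcount cc)
{
    while (cc > 0) {
        p++;                              /* leave the current first byte */
        while (!sketch_first_byte_p(*p))  /* skip continuation bytes */
            p++;
        cc--;
    }
    return p;
}
```

The point of the sketch is that moving by a character count is a linear scan over bytes and not pointer arithmetic, which is why character counts and byte counts must never be mixed.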
2391 2438
2392 @node Conversion to and from External Data 2439 @node Conversion to and from External Data, General Guidelines for Writing Mule-Aware Code, Working With Character and Byte Positions, Coding for Mule
2393 @subsection Conversion to and from External Data 2440 @subsection Conversion to and from External Data
2394 2441
2395 When an external function, such as a C library function, returns a 2442 When an external function, such as a C library function, returns a
2396 @code{char} pointer, you should almost never treat it as @code{Bufbyte}. 2443 @code{char} pointer, you should almost never treat it as @code{Bufbyte}.
2397 This is because these returned strings may contain 8bit characters which 2444 This is because these returned strings may contain 8bit characters which
2501 These macros convert external text of a specific format to its internal 2548 These macros convert external text of a specific format to its internal
2502 representation, with the external format being encoded into the name of 2549 representation, with the external format being encoded into the name of
2503 the macro. 2550 the macro.
2504 @end table 2551 @end table
2505 2552
2506 @node General Guidelines for Writing Mule-Aware Code 2553 @node General Guidelines for Writing Mule-Aware Code, An Example of Mule-Aware Code, Conversion to and from External Data, Coding for Mule
2507 @subsection General Guidelines for Writing Mule-Aware Code 2554 @subsection General Guidelines for Writing Mule-Aware Code
2508 2555
2509 This section contains some general guidance on how to write Mule-aware 2556 This section contains some general guidance on how to write Mule-aware
2510 code, as well as some pitfalls you should avoid. 2557 code, as well as some pitfalls you should avoid.
2511 2558
2540 they receive. This increases efficiency because that way external data 2587 they receive. This increases efficiency because that way external data
2541 needs to be decoded only once, when it is read. After that, it is 2588 needs to be decoded only once, when it is read. After that, it is
2542 passed around in internal format. 2589 passed around in internal format.
2543 @end table 2590 @end table
2544 2591
2545 @node An Example of Mule-Aware Code 2592 @node An Example of Mule-Aware Code, , General Guidelines for Writing Mule-Aware Code, Coding for Mule
2546 @subsection An Example of Mule-Aware Code 2593 @subsection An Example of Mule-Aware Code
2547 2594
2548 As an example of Mule-aware code, we shall analyze the 2595 As an example of Mule-aware code, we shall analyze the
2549 @code{string} function, which conses up a Lisp string from the character 2596 @code{string} function, which conses up a Lisp string from the character
2550 arguments it receives. Here is the definition, pasted from 2597 arguments it receives. Here is the definition, pasted from
2591 over the XEmacs code. For starters, I recommend 2638 over the XEmacs code. For starters, I recommend
2592 @code{Fnormalize_menu_item_name} in @file{menubar.c}. After you have 2639 @code{Fnormalize_menu_item_name} in @file{menubar.c}. After you have
2593 understood this section of the manual and studied the examples, you can 2640 understood this section of the manual and studied the examples, you can
2594 proceed writing new Mule-aware code. 2641 proceed writing new Mule-aware code.
2595 2642
2596 @node Techniques for XEmacs Developers 2643 @node Techniques for XEmacs Developers, , Coding for Mule, Rules When Writing New C Code
2597 @section Techniques for XEmacs Developers 2644 @section Techniques for XEmacs Developers
2598 2645
2599 To make a quantified XEmacs, do: @code{make quantmacs}. 2646 To make a quantified XEmacs, do: @code{make quantmacs}.
2600 2647
2601 You simply can't dump Quantified and Purified images. Run the image 2648 You simply can't dump Quantified and Purified images. Run the image
2639 2686
2640 Unfortunately, Emacs Lisp is slow, and is going to stay slow. Function 2687 Unfortunately, Emacs Lisp is slow, and is going to stay slow. Function
2641 calls in elisp are especially expensive. Iterating over a long list is 2688 calls in elisp are especially expensive. Iterating over a long list is
2642 going to be 30 times faster implemented in C than in Elisp. 2689 going to be 30 times faster implemented in C than in Elisp.
2643 2690
2644 To get started debugging XEmacs, take a look at the @file{gdbinit} and 2691 To get started debugging XEmacs, take a look at the @file{.gdbinit} and
2645 @file{dbxrc} files in the @file{src} directory. 2692 @file{.dbxrc} files in the @file{src} directory.
2646 @xref{Q2.1.15 - How to Debug an XEmacs problem with a debugger,,, 2693 @xref{Q2.1.15 - How to Debug an XEmacs problem with a debugger,,,
2647 xemacs-faq, XEmacs FAQ}. 2694 xemacs-faq, XEmacs FAQ}.
2648 2695
2649 After making source code changes, run @code{make check} to ensure that 2696 After making source code changes, run @code{make check} to ensure that
2650 you haven't introduced any regressions. If you're feeling ambitious, 2697 you haven't introduced any regressions. If you're feeling ambitious,
2713 * Modules for Interfacing with the Operating System:: 2760 * Modules for Interfacing with the Operating System::
2714 * Modules for Interfacing with X Windows:: 2761 * Modules for Interfacing with X Windows::
2715 * Modules for Internationalization:: 2762 * Modules for Internationalization::
2716 @end menu 2763 @end menu
2717 2764
2718 @node Low-Level Modules 2765 @node Low-Level Modules, Basic Lisp Modules, A Summary of the Various XEmacs Modules, A Summary of the Various XEmacs Modules
2719 @section Low-Level Modules 2766 @section Low-Level Modules
2720 2767
2721 @example 2768 @example
2722 config.h 2769 config.h
2723 @end example 2770 @end example
2937 2984
2938 This is not currently used. 2985 This is not currently used.
2939 2986
2940 2987
2941 2988
2942 @node Basic Lisp Modules 2989 @node Basic Lisp Modules, Modules for Standard Editing Operations, Low-Level Modules, A Summary of the Various XEmacs Modules
2943 @section Basic Lisp Modules 2990 @section Basic Lisp Modules
2944 2991
2945 @example 2992 @example
2946 emacsfns.h 2993 emacsfns.h
2947 lisp-disunion.h 2994 lisp-disunion.h
2972 declarations (i.e. a simple declaration like @code{struct foo;} where 3019 declarations (i.e. a simple declaration like @code{struct foo;} where
2973 the structure itself is defined elsewhere) should be placed into the 3020 the structure itself is defined elsewhere) should be placed into the
2974 typedefs section as necessary. 3021 typedefs section as necessary.
2975 3022
2976 @file{lrecord.h} contains the basic structures and macros that implement 3023 @file{lrecord.h} contains the basic structures and macros that implement
2977 all record-type Lisp objects -- i.e. all objects whose type is a field 3024 all record-type Lisp objects---i.e. all objects whose type is a field
2978 in their C structure, which includes all objects except the few most 3025 in their C structure, which includes all objects except the few most
2979 basic ones. 3026 basic ones.
2980 3027
2981 @file{lisp.h} contains prototypes for most of the exported functions in 3028 @file{lisp.h} contains prototypes for most of the exported functions in
2982 the various modules. Lisp primitives defined using @code{DEFUN} that 3029 the various modules. Lisp primitives defined using @code{DEFUN} that
3008 not dependent on any particular object type, and interfaces to 3055 not dependent on any particular object type, and interfaces to
3009 particular types of objects using a standardized interface of 3056 particular types of objects using a standardized interface of
3010 type-specific methods. This scheme is a fundamental principle of 3057 type-specific methods. This scheme is a fundamental principle of
3011 object-oriented programming and is heavily used throughout XEmacs. The 3058 object-oriented programming and is heavily used throughout XEmacs. The
3012 great advantage of this is that it allows for a clean separation of 3059 great advantage of this is that it allows for a clean separation of
3013 functionality into different modules -- new classes of Lisp objects, new 3060 functionality into different modules---new classes of Lisp objects, new
3014 event interfaces, new device types, new stream interfaces, etc. can be 3061 event interfaces, new device types, new stream interfaces, etc. can be
3015 added transparently without affecting code anywhere else in XEmacs. 3062 added transparently without affecting code anywhere else in XEmacs.
3016 Because the different subsystems are divided into general and specific 3063 Because the different subsystems are divided into general and specific
3017 code, adding a new subtype within a subsystem will in general not 3064 code, adding a new subtype within a subsystem will in general not
3018 require changes to the generic subsystem code or affect any of the other 3065 require changes to the generic subsystem code or affect any of the other
3099 @end example 3146 @end example
3100 3147
3101 @file{symbols.c} implements the handling of symbols, obarrays, and 3148 @file{symbols.c} implements the handling of symbols, obarrays, and
3102 retrieving the values of symbols. Much of the code is devoted to 3149 retrieving the values of symbols. Much of the code is devoted to
3103 handling the special @dfn{symbol-value-magic} objects that define 3150 handling the special @dfn{symbol-value-magic} objects that define
3104 special types of variables -- this includes buffer-local variables, 3151 special types of variables---this includes buffer-local variables,
3105 variable aliases, variables that forward into C variables, etc. This 3152 variable aliases, variables that forward into C variables, etc. This
3106 module is initialized extremely early (right after @file{alloc.c}), 3153 module is initialized extremely early (right after @file{alloc.c}),
3107 because it is here that the basic symbols @code{t} and @code{nil} are 3154 because it is here that the basic symbols @code{t} and @code{nil} are
3108 created, and those symbols are used everywhere throughout XEmacs. 3155 created, and those symbols are used everywhere throughout XEmacs.
3109 3156
3143 structures. Note that the byte-code @emph{compiler} is written in Lisp. 3190 structures. Note that the byte-code @emph{compiler} is written in Lisp.
3144 3191
3145 3192
3146 3193
3147 3194
3148 @node Modules for Standard Editing Operations 3195 @node Modules for Standard Editing Operations, Editor-Level Control Flow Modules, Basic Lisp Modules, A Summary of the Various XEmacs Modules
3149 @section Modules for Standard Editing Operations 3196 @section Modules for Standard Editing Operations
3150 3197
3151 @example 3198 @example
3152 buffer.c 3199 buffer.c
3153 buffer.h 3200 buffer.h
3313 This module implements the undo mechanism for tracking buffer changes. 3360 This module implements the undo mechanism for tracking buffer changes.
3314 Most of this could be implemented in Lisp. 3361 Most of this could be implemented in Lisp.
3315 3362
3316 3363
3317 3364
3318 @node Editor-Level Control Flow Modules 3365 @node Editor-Level Control Flow Modules, Modules for the Basic Displayable Lisp Objects, Modules for Standard Editing Operations, A Summary of the Various XEmacs Modules
3319 @section Editor-Level Control Flow Modules 3366 @section Editor-Level Control Flow Modules
3320 3367
3321 @example 3368 @example
3322 event-Xt.c 3369 event-Xt.c
3323 event-stream.c 3370 event-stream.c
3378 @example 3425 @example
3379 keyboard.c 3426 keyboard.c
3380 @end example 3427 @end example
3381 3428
3382 @file{keyboard.c} contains functions that implement the actual editor 3429 @file{keyboard.c} contains functions that implement the actual editor
3383 command loop -- i.e. the event loop that cyclically retrieves and 3430 command loop---i.e. the event loop that cyclically retrieves and
3384 dispatches events. This code is also rather tricky, just like 3431 dispatches events. This code is also rather tricky, just like
3385 @file{event-stream.c}. 3432 @file{event-stream.c}.
3386 3433
3387 3434
3388 3435
3411 bootstrapping implementations early in temacs, before the echo-area Lisp 3458 bootstrapping implementations early in temacs, before the echo-area Lisp
3412 code is loaded). 3459 code is loaded).
3413 3460
3414 3461
3415 3462
3416 @node Modules for the Basic Displayable Lisp Objects 3463 @node Modules for the Basic Displayable Lisp Objects, Modules for other Display-Related Lisp Objects, Editor-Level Control Flow Modules, A Summary of the Various XEmacs Modules
3417 @section Modules for the Basic Displayable Lisp Objects 3464 @section Modules for the Basic Displayable Lisp Objects
3418 3465
3419 @example 3466 @example
3420 device-ns.h 3467 device-ns.h
3421 device-stream.c 3468 device-stream.c
3485 is part of the redisplay mechanism or the code for particular object 3532 is part of the redisplay mechanism or the code for particular object
3486 types such as scrollbars. 3533 types such as scrollbars.
3487 3534
3488 3535
3489 3536
3490 @node Modules for other Display-Related Lisp Objects 3537 @node Modules for other Display-Related Lisp Objects, Modules for the Redisplay Mechanism, Modules for the Basic Displayable Lisp Objects, A Summary of the Various XEmacs Modules
3491 @section Modules for other Display-Related Lisp Objects 3538 @section Modules for other Display-Related Lisp Objects
3492 3539
3493 @example 3540 @example
3494 faces.c 3541 faces.c
3495 faces.h 3542 faces.h
3546 3593
3547 @example 3594 @example
3548 font-lock.c 3595 font-lock.c
3549 @end example 3596 @end example
3550 3597
3551 This file provides C support for syntax highlighting -- i.e. 3598 This file provides C support for syntax highlighting---i.e.
3552 highlighting different syntactic constructs of a source file in 3599 highlighting different syntactic constructs of a source file in
3553 different colors, for easy reading. The C support is provided so that 3600 different colors, for easy reading. The C support is provided so that
3554 this is fast. 3601 this is fast.
3555 3602
3556 3603
3564 3611
3565 These modules decode GIF-format image files, for use with glyphs. 3612 These modules decode GIF-format image files, for use with glyphs.
3566 3613
3567 3614
3568 3615
3569 @node Modules for the Redisplay Mechanism 3616 @node Modules for the Redisplay Mechanism, Modules for Interfacing with the File System, Modules for other Display-Related Lisp Objects, A Summary of the Various XEmacs Modules
3570 @section Modules for the Redisplay Mechanism 3617 @section Modules for the Redisplay Mechanism
3571 3618
3572 @example 3619 @example
3573 redisplay-output.c 3620 redisplay-output.c
3574 redisplay-tty.c 3621 redisplay-tty.c
3636 These files provide some miscellaneous TTY-output functions and should 3683 These files provide some miscellaneous TTY-output functions and should
3637 probably be merged into @file{redisplay-tty.c}. 3684 probably be merged into @file{redisplay-tty.c}.
3638 3685
3639 3686
3640 3687
3641 @node Modules for Interfacing with the File System 3688 @node Modules for Interfacing with the File System, Modules for Other Aspects of the Lisp Interpreter and Object System, Modules for the Redisplay Mechanism, A Summary of the Various XEmacs Modules
3642 @section Modules for Interfacing with the File System 3689 @section Modules for Interfacing with the File System
3643 3690
3644 @example 3691 @example
3645 lstream.c 3692 lstream.c
3646 lstream.h 3693 lstream.h
3737 for expanding symbolic links, on systems that don't implement it or have 3784 for expanding symbolic links, on systems that don't implement it or have
3738 a broken implementation. 3785 a broken implementation.
3739 3786
3740 3787
3741 3788
3742 @node Modules for Other Aspects of the Lisp Interpreter and Object System 3789 @node Modules for Other Aspects of the Lisp Interpreter and Object System, Modules for Interfacing with the Operating System, Modules for Interfacing with the File System, A Summary of the Various XEmacs Modules
3743 @section Modules for Other Aspects of the Lisp Interpreter and Object System 3790 @section Modules for Other Aspects of the Lisp Interpreter and Object System
3744 3791
3745 @example 3792 @example
3746 elhash.c 3793 elhash.c
3747 elhash.h 3794 elhash.h
3851 @cindex mark method 3898 @cindex mark method
3852 Opaque objects can also have an arbitrary @dfn{mark method} associated 3899 Opaque objects can also have an arbitrary @dfn{mark method} associated
3853 with them, in case the block of memory contains other Lisp objects that 3900 with them, in case the block of memory contains other Lisp objects that
3854 need to be marked for garbage-collection purposes. (If you need other 3901 need to be marked for garbage-collection purposes. (If you need other
3855 object methods, such as a finalize method, you should just go ahead and 3902 object methods, such as a finalize method, you should just go ahead and
3856 create a new Lisp object type -- it's not hard.) 3903 create a new Lisp object type---it's not hard.)
3857 3904
3858 3905
3859 3906
3860 @example 3907 @example
3861 abbrev.c 3908 abbrev.c
3899 various security applications on the Internet. 3946 various security applications on the Internet.
3900 3947
3901 3948
3902 3949
3903 3950
3904 @node Modules for Interfacing with the Operating System 3951 @node Modules for Interfacing with the Operating System, Modules for Interfacing with X Windows, Modules for Other Aspects of the Lisp Interpreter and Object System, A Summary of the Various XEmacs Modules
3905 @section Modules for Interfacing with the Operating System 3952 @section Modules for Interfacing with the Operating System
3906 3953
3907 @example 3954 @example
3908 callproc.c 3955 callproc.c
3909 process.c 3956 process.c
4138 These modules are used for MS-DOS support, which does not work in 4185 These modules are used for MS-DOS support, which does not work in
4139 XEmacs. 4186 XEmacs.
4140 4187
4141 4188
4142 4189
4143 @node Modules for Interfacing with X Windows 4190 @node Modules for Interfacing with X Windows, Modules for Internationalization, Modules for Interfacing with the Operating System, A Summary of the Various XEmacs Modules
4144 @section Modules for Interfacing with X Windows 4191 @section Modules for Interfacing with X Windows
4145 4192
4146 @example 4193 @example
4147 Emacs.ad.h 4194 Emacs.ad.h
4148 @end example 4195 @end example
4280 4327
4281 Don't touch this code; something is liable to break if you do. 4328 Don't touch this code; something is liable to break if you do.
4282 4329
4283 4330
4284 4331
4285 @node Modules for Internationalization 4332 @node Modules for Internationalization, , Modules for Interfacing with X Windows, A Summary of the Various XEmacs Modules
4286 @section Modules for Internationalization 4333 @section Modules for Internationalization
4287 4334
4288 @example 4335 @example
4289 mule-canna.c 4336 mule-canna.c
4290 mule-ccl.c 4337 mule-ccl.c
4357 Asian-language support, and is not currently used. 4404 Asian-language support, and is not currently used.
4358 4405
4359 4406
4360 4407
4361 4408
4362 @node Allocation of Objects in XEmacs Lisp, Events and the Event Loop, A Summary of the Various XEmacs Modules, Top 4409 @node Allocation of Objects in XEmacs Lisp, Dumping, A Summary of the Various XEmacs Modules, Top
4363 @chapter Allocation of Objects in XEmacs Lisp 4410 @chapter Allocation of Objects in XEmacs Lisp
4364 4411
4365 @menu 4412 @menu
4366 * Introduction to Allocation:: 4413 * Introduction to Allocation::
4367 * Garbage Collection:: 4414 * Garbage Collection::
4368 * GCPROing:: 4415 * GCPROing::
4416 * Garbage Collection - Step by Step::
4369 * Integers and Characters:: 4417 * Integers and Characters::
4370 * Allocation from Frob Blocks:: 4418 * Allocation from Frob Blocks::
4371 * lrecords:: 4419 * lrecords::
4372 * Low-level allocation:: 4420 * Low-level allocation::
4373 * Pure Space:: 4421 * Pure Space::
4378 * Marker:: 4426 * Marker::
4379 * String:: 4427 * String::
4380 * Compiled Function:: 4428 * Compiled Function::
4381 @end menu 4429 @end menu
4382 4430
4383 @node Introduction to Allocation 4431 @node Introduction to Allocation, Garbage Collection, Allocation of Objects in XEmacs Lisp, Allocation of Objects in XEmacs Lisp
4384 @section Introduction to Allocation 4432 @section Introduction to Allocation
4385 4433
4386 Emacs Lisp, like all Lisps, has garbage collection. This means that 4434 Emacs Lisp, like all Lisps, has garbage collection. This means that
4387 the programmer never has to explicitly free (destroy) an object; it 4435 the programmer never has to explicitly free (destroy) an object; it
4388 happens automatically when the object becomes inaccessible. Most 4436 happens automatically when the object becomes inaccessible. Most
4427 format; this includes conses, strings, vectors, and sometimes symbols. 4475 format; this includes conses, strings, vectors, and sometimes symbols.
4428 With the exception of vectors, objects in this category are allocated in 4476 With the exception of vectors, objects in this category are allocated in
4429 @dfn{frob blocks}, i.e. large blocks of memory that are subdivided into 4477 @dfn{frob blocks}, i.e. large blocks of memory that are subdivided into
4430 individual objects. This saves a lot on malloc overhead, since there 4478 individual objects. This saves a lot on malloc overhead, since there
4431 are typically quite a lot of these objects around, and the objects are 4479 are typically quite a lot of these objects around, and the objects are
4432 small. (A cons, for example, occupies 8 bytes on 32-bit machines -- 4 4480 small. (A cons, for example, occupies 8 bytes on 32-bit machines---4
4433 bytes for each of the two objects it contains.) Vectors are individually 4481 bytes for each of the two objects it contains.) Vectors are individually
4434 @code{malloc()}ed since they are of variable size. (It would be 4482 @code{malloc()}ed since they are of variable size. (It would be
4435 possible, and desirable, to allocate vectors of certain small sizes out 4483 possible, and desirable, to allocate vectors of certain small sizes out
4436 of frob blocks, but it isn't currently done.) Strings are handled 4484 of frob blocks, but it isn't currently done.) Strings are handled
4437 specially: Each string is allocated in two parts, a fixed size structure 4485 specially: Each string is allocated in two parts, a fixed size structure
4481 abstraction, the FSF scheme is not nearly as clean or as easy to 4529 abstraction, the FSF scheme is not nearly as clean or as easy to
4482 extend. (FSF calls items of type (c) @code{Lisp_Misc} and items of type 4530 extend. (FSF calls items of type (c) @code{Lisp_Misc} and items of type
4483 (d) @code{Lisp_Vectorlike}, with separate tags for each, although 4531 (d) @code{Lisp_Vectorlike}, with separate tags for each, although
4484 @code{Lisp_Vectorlike} is also used for vectors.) 4532 @code{Lisp_Vectorlike} is also used for vectors.)
4485 4533
4486 @node Garbage Collection 4534 @node Garbage Collection, GCPROing, Introduction to Allocation, Allocation of Objects in XEmacs Lisp
4487 @section Garbage Collection 4535 @section Garbage Collection
4488 @cindex garbage collection 4536 @cindex garbage collection
4489 4537
4490 @cindex mark and sweep 4538 @cindex mark and sweep
4491 Garbage collection is simple in theory but tricky to implement. 4539 Garbage collection is simple in theory but tricky to implement.
4555 Finally, note that garbage collection can be invoked explicitly 4603 Finally, note that garbage collection can be invoked explicitly
4556 by calling @code{garbage-collect} but is also called automatically 4604 by calling @code{garbage-collect} but is also called automatically
4557 by @code{eval}, once a certain amount of memory has been allocated 4605 by @code{eval}, once a certain amount of memory has been allocated
4558 since the last garbage collection (according to @code{gc-cons-threshold}). 4606 since the last garbage collection (according to @code{gc-cons-threshold}).
4559 4607
4560 @node GCPROing 4608 @node GCPROing, Garbage Collection - Step by Step, Garbage Collection, Allocation of Objects in XEmacs Lisp
4561 @section @code{GCPRO}ing 4609 @section @code{GCPRO}ing
4562 4610
4563 @code{GCPRO}ing is one of the ugliest and trickiest parts of Emacs 4611 @code{GCPRO}ing is one of the ugliest and trickiest parts of Emacs
4564 internals. The basic idea is that whenever garbage collection 4612 internals. The basic idea is that whenever garbage collection
4565 occurs, all in-use objects must be reachable somehow or 4613 occurs, all in-use objects must be reachable somehow or
4617 variable @samp{gcprolist} pointing to the head of the list and the nth 4665 variable @samp{gcprolist} pointing to the head of the list and the nth
4618 local @code{gcpro} variable pointing to the first @code{gcpro} variable 4666 local @code{gcpro} variable pointing to the first @code{gcpro} variable
4619 in the next enclosing stack frame. Each @code{GCPRO}ed thing is an 4667 in the next enclosing stack frame. Each @code{GCPRO}ed thing is an
4620 lvalue, and the @code{struct gcpro} local variable contains a pointer to 4668 lvalue, and the @code{struct gcpro} local variable contains a pointer to
4621 this lvalue. This is why things will mess up badly if you don't pair up 4669 this lvalue. This is why things will mess up badly if you don't pair up
4622 the @code{GCPRO}s and @code{UNGCPRO}s -- you will end up with 4670 the @code{GCPRO}s and @code{UNGCPRO}s---you will end up with
4623 @code{gcprolist}s containing pointers to @code{struct gcpro}s or local 4671 @code{gcprolist}s containing pointers to @code{struct gcpro}s or local
4624 @code{Lisp_Object} variables in no-longer-active stack frames. 4672 @code{Lisp_Object} variables in no-longer-active stack frames.
4625 4673
4626 @item 4674 @item
4627 It is actually possible for a single @code{struct gcpro} to 4675 It is actually possible for a single @code{struct gcpro} to
4708 anything that looks like a reference to an object as a reference. This 4756 anything that looks like a reference to an object as a reference. This
4709 will result in a few objects not getting collected when they should, but 4757 will result in a few objects not getting collected when they should, but
4710 it obviates the need for @code{GCPRO}ing, and allows garbage collection 4758 it obviates the need for @code{GCPRO}ing, and allows garbage collection
4711 to happen at any point at all, such as during object allocation. 4759 to happen at any point at all, such as during object allocation.
4712 4760
4713 @node Integers and Characters 4761 @node Garbage Collection - Step by Step, Integers and Characters, GCPROing, Allocation of Objects in XEmacs Lisp
4762 @section Garbage Collection - Step by Step
4763 @cindex garbage collection step by step
4764
4765 @menu
4766 * Invocation::
4767 * garbage_collect_1::
4768 * mark_object::
4769 * gc_sweep::
4770 * sweep_lcrecords_1::
4771 * compact_string_chars::
4772 * sweep_strings::
4773 * sweep_bit_vectors_1::
4774 @end menu
4775
4776 @node Invocation, garbage_collect_1, Garbage Collection - Step by Step, Garbage Collection - Step by Step
4777 @subsection Invocation
4778 @cindex garbage collection, invocation
4779
4780 The first thing that anyone should know about garbage collection is:
4781 when and how the garbage collector is invoked. One might think that this
4782 could happen every time new memory is allocated, e.g. new objects are
4783 created, but this is @emph{not} the case. Instead, we have the following
4784 situation:
4785
4786 The entry point of any process of garbage collection is an invocation
4787 of the function @code{garbage_collect_1} in file @code{alloc.c}. The
4788 invocation can occur @emph{explicitly} by calling the function
4789 @code{Fgarbage_collect} (in addition this function provides information
4790 about the freed memory), or can occur @emph{implicitly} in four different
4791 situations:
4792 @enumerate
4793 @item
4794 In function @code{main_1} in file @code{emacs.c}. This function is called
4795 at each startup of xemacs. The garbage collection is invoked after all
4796 initial creations are completed, but only if a special internal error
4797 checking-constant @code{ERROR_CHECK_GC} is defined.
4798 @item
4799 In function @code{disksave_object_finalization} in file
4800 @code{alloc.c}. The only purpose of this function is to clear the
4801 objects from memory which need not be stored with xemacs when we dump out
4802 an executable. This is only done by @code{Fdump_emacs} or by
4803 @code{Fdump_emacs_data} respectively (both in @code{emacs.c}). The
4804 actual clearing is accomplished by making these objects unreachable and
4805 starting a garbage collection. The function is only used while building
4806 xemacs.
4807 @item
4808 In function @code{Feval / eval} in file @code{eval.c}. Each time the
4809 well-known and often used function @code{eval} is called to evaluate a form,
4810 one of the first things that can happen is a call to
4811 @code{garbage_collect_1}. There exist three global variables,
4812 @code{consing_since_gc} (counts the created cons-cells since the last
4813 garbage collection), @code{gc_cons_threshold} (a specified threshold
4814 after which a garbage collection occurs) and @code{always_gc}. If
4815 @code{always_gc} is set or if the threshold is exceeded, the garbage
4816 collection will start.
4817 @item
4818 In function @code{Ffuncall / funcall} in file @code{eval.c}. This
4819 function evaluates calls of elisp functions and works analogously to
4820 @code{Feval}.
4821 @end enumerate
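
The check described for @code{Feval} and @code{Ffuncall} can be sketched as follows. This is a simplified model, not the code from @code{eval.c}: the three variables reuse the names given above, but their types are assumed, and @code{sketch_garbage_collect}, @code{sketch_maybe_gc} and @code{sketch_note_consing} are hypothetical helpers introduced only for illustration.

```c
#include <stdbool.h>

/* Stand-ins for the three globals named in the text (types assumed). */
static long consing_since_gc = 0;       /* allocation since the last GC */
static long gc_cons_threshold = 500000; /* illustrative value only */
static bool always_gc = false;

static int gc_runs = 0;                 /* bookkeeping for this sketch only */

/* Hypothetical stand-in for garbage_collect_1: it only counts the
   invocation and resets the allocation counter. */
static void sketch_garbage_collect(void)
{
    gc_runs++;
    consing_since_gc = 0;
}

/* Models the test made on entry to eval/funcall: collect if always_gc
   is set or if the threshold has been exceeded. */
static void sketch_maybe_gc(void)
{
    if (always_gc || consing_since_gc > gc_cons_threshold)
        sketch_garbage_collect();
}

/* Models an allocation, which only bumps the counter and never
   triggers a collection by itself. */
static void sketch_note_consing(long amount)
{
    consing_since_gc += amount;
}
```

Note that in this model allocation alone never starts a collection. A collection can only begin at the next call of @code{sketch_maybe_gc}, which mirrors the statement that garbage collection is not triggered every time new memory is allocated.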
4822
4823 The upshot is that garbage collection can occur basically anywhere
4824 @code{Feval} or @code{Ffuncall} is used, either directly or
4825 through another function. Since calls to these two functions are
4826 hidden in various other functions, many calls to
4827 @code{garbage_collect_1} are not obviously foreseeable, and therefore
4828 unexpected. Instances worth remembering are
4829 various elisp commands, for example @code{or},
4830 @code{and}, @code{if}, @code{cond}, @code{while}, @code{setq}, etc.,
4831 miscellaneous @code{gui_item_...} functions, everything related to
4832 @code{eval} (@code{Feval_buffer}, @code{call0}, ...) and code inside
4833 @code{Fsignal}. The latter is used to handle signals, for example the
4834 ones raised by every @code{QUIT} macro triggered after pressing Ctrl-g.
4835
4836 @node garbage_collect_1, mark_object, Invocation, Garbage Collection - Step by Step
4837 @subsection @code{garbage_collect_1}
4838 @cindex @code{garbage_collect_1}
4839
4840 We can now describe exactly what happens after the invocation takes
4841 place.
4842 @enumerate
4843 @item
4844 There are several cases in which the garbage collector is left immediately:
4845 when we are already garbage collecting (@code{gc_in_progress}), when
4846 the garbage collection is somehow forbidden
4847 (@code{gc_currently_forbidden}), when we are currently displaying something
4848 (@code{in_display}) or when we are preparing for the armageddon of the
4849 whole system (@code{preparing_for_armageddon}).
4850 @item
4851 Next the correct frame in which to put
4852 all the output occurring during garbage collecting is determined. In
4853 order to be able to restore the old display's state after displaying the
4854 message, some data about the current cursor position has to be
4855 saved. The variables @code{pre_gc_cursor} and @code{cursor_changed} take
4856 care of that.
4857 @item
4858 The state of @code{gc_currently_forbidden} must be restored after
4859 the garbage collection, no matter what happens during the process. We
4860 accomplish this by @code{record_unwind_protect}ing the suitable function
4861 @code{restore_gc_inhibit} together with the current value of
4862 @code{gc_currently_forbidden}.
4863 @item
4864 If we are concurrently running an interactive xemacs session, the next step
4865 is simply to show the garbage collector's cursor/message.
4866 @item
4867 The following steps are the intrinsic steps of the garbage collector,
4868 therefore @code{gc_in_progress} is set.
4869 @item
4870 For debugging purposes, it is possible to copy the current C stack
4871 frame. However, this seems to be a currently unused feature.
4872 @item
4873 Before actually starting to go over all live objects, references to
4874 objects that are no longer used are pruned. We only have to do this for events
4875 (@code{clear_event_resource}) and for specifiers
4876 (@code{cleanup_specifiers}).
4877 @item
4878 Now the mark phase begins and marks all accessible elements. In order to
4879 start from
4880 all slots that serve as roots of accessibility, the function
4881 @code{mark_object} is called for each root individually to go out from
4882 there to mark all reachable objects. All roots that are traversed are
4883 shown in their processed order:
4884 @itemize @bullet
4885 @item
4886 all constant symbols and static variables that are registered via
4887 @code{staticpro}@ in the array @code{staticvec}.
4888 @xref{Adding Global Lisp Variables}.
4889 @item
4890 all Lisp objects that are created in C functions and that must be
4891 protected from freeing them. They are registered in the global
4892 list @code{gcprolist}.
4893 @xref{GCPROing}.
4894 @item
4895 all local variables (i.e. their name fields @code{symbol} and old
4896 values @code{old_values}) that are bound during the evaluation by the Lisp
4897 engine. They are stored in @code{specbinding} structs pushed on a stack
4898 called @code{specpdl}.
4899 @xref{Dynamic Binding; The specbinding Stack; Unwind-Protects}.
4900 @item
4901 all catch blocks that the Lisp engine encounters during the evaluation
4902 cause the creation of structs @code{catchtag} inserted in the list
4903 @code{catchlist}. Their tag (@code{tag}) and value (@code{val}) fields
4904 are freshly created objects and therefore have to be marked.
4905 @xref{Catch and Throw}.
4906 @item
4907 every function application pushes new structs @code{backtrace}
4908 on the call stack of the Lisp engine (@code{backtrace_list}). The unique
4909 parts that have to be marked are the fields for each function
4910 (@code{function}) and all their arguments (@code{args}).
4911 @xref{Evaluation}.
4912 @item
4913 all objects that are used by the redisplay engine that must not be freed
4914 are marked by a special function called @code{mark_redisplay} (in
4915 @code{redisplay.c}).
4916 @item
4917 all objects created for profiling purposes are allocated by C functions
4918 instead of using the lisp allocation mechanisms. In order to receive the
4919 right ones during the sweep phase, they also have to be marked
4920 manually. That is done by the function @code{mark_profiling_info}.
4921 @end itemize
4922 @item
4923 Hash tables in XEmacs belong to a kind of special objects that
4924 make use of a concept often called 'weak pointers'.
4925 To make a long story short, this kind of pointer is not followed
4926 when determining the live objects during garbage collection.
4927 Any object referenced only by weak pointers is collected
4928 anyway, and the reference to it is cleared. In hash tables there are
4929 different usage patterns of them, manifesting in different types of hash
4930 tables, namely 'non-weak', 'weak', 'key-weak' and 'value-weak'
4931 (internally also 'key-car-weak' and 'value-car-weak') hash tables, each
4932 clearing entries depending on different conditions. More information can
4933 be found in the documentation to the function @code{make-hash-table}.
4934
4935 Because there are complicated dependency rules about when and what to
4936 mark while processing weak hash tables, the standard @code{marker}
4937 method is only active if it is marking non-weak hash tables. As soon as
4938 a weak component is in the table, the hash table entries are ignored
4939 while marking. Instead, their marking is done separately by the
4940 function @code{finish_marking_weak_hash_tables}. This function iterates
4941 over each hash table entry @code{hentries} for each weak hash table in
4942 @code{Vall_weak_hash_tables}. Depending on the type of a table, the
4943 appropriate action is performed.
4944 If a table is acting as @code{HASH_TABLE_KEY_WEAK}, and a key is already marked,
4945 everything reachable from the @code{value} component is marked. If it is
4946 acting as a @code{HASH_TABLE_VALUE_WEAK} and the value component is
4947 already marked, the marking starts only from the
4948 @code{key} component.
4949 If it is a @code{HASH_TABLE_KEY_CAR_WEAK} and the car
4950 of the key entry is already marked, we mark both the @code{key} and
4951 @code{value} components.
4952 Finally, if the table is of the type @code{HASH_TABLE_VALUE_CAR_WEAK}
4953 and the car of the value components is already marked, again both the
4954 @code{key} and the @code{value} components get marked.
4955
4956 Similarly, there are lists with comparable properties called weak
4957 lists. They come in several types, called
4958 @code{simple}, @code{assoc}, @code{key-assoc} and
4959 @code{value-assoc}. You can find further details about them in the
4960 description of the function @code{make-weak-list}. The scheme of their
4961 marking is similar: all weak lists are listed in @code{Vall_weak_lists},
4962 therefore we iterate over them. The marking is advanced until we hit an
4963 already marked pair. Then we know that during a former run all
4964 the rest has been marked completely. Again, depending on the special
4965 type of the weak list, our jobs differ. If it is a @code{WEAK_LIST_SIMPLE}
4966 and the elem is marked, we mark the @code{cons} part. If it is a
4967 @code{WEAK_LIST_ASSOC} and not a pair or a pair with both marked car and
4968 cdr, we mark the @code{cons} and the @code{elem}. If it is a
4969 @code{WEAK_LIST_KEY_ASSOC} and not a pair or a pair with a marked car of
4970 the elem, we mark the @code{cons} and the @code{elem}. Finally, if it is
4971 a @code{WEAK_LIST_VALUE_ASSOC} and not a pair or a pair with a marked
4972 cdr of the elem, we mark both the @code{cons} and the @code{elem}.
4973
4974 Marking objects reachable from weak hash tables and weak lists can
4975 mark further objects, which in turn may require further marking of
4976 other weak objects. Both finishing functions are therefore rerun as long as
4977 previously unmarked objects get freshly marked.
4978
4979 @item
4980 After completing the special marking for the weak hash tables and for the weak
4981 lists, all entries that point to objects that are going to be swept in
4982 the further process are useless, and therefore have to be removed from
4983 the table or the list.
4984
4985 The function @code{prune_weak_hash_tables} does the job for weak hash
4986 tables. Totally unmarked hash tables are removed from the list
4987 @code{Vall_weak_hash_tables}. The other ones are treated more carefully
4988 by scanning over all entries and removing one as soon as one of
4989 the components @code{key} and @code{value} is unmarked.
4990
4991 The same idea applies to the weak lists. It is accomplished by
4992 @code{prune_weak_lists}: An unmarked list is pruned from
4993 @code{Vall_weak_lists} immediately. A marked list is treated more
4994 carefully by going over it and removing just the unmarked pairs.
4995
4996 @item
4997 The function @code{prune_specifiers} checks all listed specifiers held
4998 in @code{Vall_specifiers} and removes the unmarked ones from the
4999 list.
5000
5001 @item
5002 All syntax tables are stored in a list called
5003 @code{Vall_syntax_tables}. The function @code{prune_syntax_tables} walks
5004 through it and unlinks the tables that are unmarked.
5005
5006 @item
5007 Next comes the sweep proper, which is carried out by the function
5008 @code{gc_sweep}.
5009 @item
5010 First, all the variables with respect to garbage collection are
5011 reset. @code{consing_since_gc} - the counter of the created cells since
5012 the last garbage collection - is set back to 0, and
5013 @code{gc_in_progress} is not @code{true} anymore.
5014 @item
5015 In case the session is interactive, the displayed cursor and message are
5016 removed again.
5017 @item
5018 The state of @code{gc_inhibit} is restored to the former value by
5019 unwinding the stack.
5020 @item
5021 A small memory reserve is always held back that can be reached by
5022 @code{breathing_space}. If nothing more is left, we create a new reserve
5023 and exit.
5024 @end enumerate
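The repeated marking of weak hash tables, followed by pruning, can be sketched on a toy heap. This is an illustration under stated assumptions rather than the XEmacs code: it handles only the key-weak case, and the types @code{toy_object} and @code{weak_entry} as well as all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAX_OBJECTS = 8 };

/* Toy heap: a mark bit and at most one strong reference per object. */
struct toy_object {
    bool marked;
    int  ref;            /* index of a strongly referenced object, or -1 */
};

/* One entry of a key-weak table; key and value are heap indices. */
struct weak_entry {
    int  key, value;
    bool live;           /* cleared when the entry is pruned */
};

struct toy_object heap[MAX_OBJECTS];

/* Mark an object and everything reachable through strong references. */
void toy_mark(int i)
{
    while (i >= 0 && !heap[i].marked) {
        assert(i < MAX_OBJECTS);
        heap[i].marked = true;
        i = heap[i].ref;
    }
}

/* One pass over a key-weak table: a marked key keeps its value alive.
   Returns true if anything was newly marked. */
bool finish_marking_key_weak(struct weak_entry *e, size_t n)
{
    bool did_mark = false;
    for (size_t i = 0; i < n; i++) {
        if (e[i].live && heap[e[i].key].marked && !heap[e[i].value].marked) {
            toy_mark(e[i].value);
            did_mark = true;
        }
    }
    return did_mark;
}

/* Drop every entry whose key or value stayed unmarked. */
void prune_weak(struct weak_entry *e, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!heap[e[i].key].marked || !heap[e[i].value].marked)
            e[i].live = false;
}

/* Repeat the marking pass until nothing changes, then prune. */
void process_weak_table(struct weak_entry *e, size_t n)
{
    while (finish_marking_key_weak(e, n))
        ;
    prune_weak(e, n);
}
```

The loop in @code{process_weak_table} is the point of the sketch: an entry whose key becomes marked only as a side effect of another entry is picked up on a later pass.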
5025
5026 @node mark_object, gc_sweep, garbage_collect_1, Garbage Collection - Step by Step
5027 @subsection @code{mark_object}
5028 @cindex @code{mark_object}
5029
5030 The first thing that is checked while marking an object is whether the
5031 object is a real Lisp object @code{Lisp_Type_Record} or just an integer
5032 or a character. Integers and characters are the only two types that are
5033 stored directly - without another level of indirection, and therefore they
5034 don't have to be marked and collected.
5035 @xref{How Lisp Objects Are Represented in C}.
5036
5037 The second case is the one we have to handle: we are
5038 dealing with a pointer to a Lisp object. Even then, there are three
5039 possibilities that prevent us from doing anything while marking. The
5040 object is read only, which prevents it from being garbage collected,
5041 i.e. marked (@code{C_READONLY_RECORD_HEADER}). The object in question is
5042 already marked, and need not be marked a second time (checked by
5043 @code{MARKED_RECORD_HEADER_P}). Or it is a special, unmarkable object
5044 (@code{UNMARKABLE_RECORD_HEADER_P}; apparently, these are objects that
5045 sit in some const space, and can therefore not be marked, see
5046 @code{this_one_is_unmarkable} in @code{alloc.c}).
5047
5048 Now, the actual marking is feasible. We do so by first using the macro
5049 @code{MARK_RECORD_HEADER} to mark the object itself (actually the
5050 special flag in the lrecord header), and then calling its special marker
5051 "method" @code{marker} if available. The marker method marks every
5052 other object that is in reach from our current object. Note that these
5053 marker methods should not call @code{mark_object} recursively, but
5054 instead should return the next object from where further marking has to
5055 be performed.
5056
5057 In case another object was returned, as mentioned before, we reiterate
5058 the whole @code{mark_object} process beginning with this next object.
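The iteration described above, in which a marker method hands back the next object instead of recursing on it, can be sketched like this. The struct @code{toy_obj} and both functions are hypothetical stand-ins, not the real lrecord machinery. The sketch also assumes that an object with two children marks the first through a nested call and returns only the second, so that long chains are walked by the loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_obj;
typedef struct toy_obj *(*marker_fn)(struct toy_obj *);

/* Hypothetical object layout: flags, an optional marker method,
   and two child pointers. */
struct toy_obj {
    bool marked;
    bool readonly;
    marker_fn marker;            /* NULL if the object has no children */
    struct toy_obj *car, *cdr;
};

/* Mark obj, then keep going with whatever the marker method returns. */
void toy_mark_object(struct toy_obj *obj)
{
    while (obj != NULL) {
        if (obj->readonly || obj->marked)
            return;              /* nothing to do for this object */
        obj->marked = true;      /* plays the role of MARK_RECORD_HEADER */
        if (obj->marker == NULL)
            return;
        obj = obj->marker(obj);  /* continue with the returned object */
    }
}

/* Marker for a cons-like object: the car is marked by a nested call,
   the cdr is returned so the caller's loop continues with it. */
struct toy_obj *mark_toy_cons(struct toy_obj *obj)
{
    assert(obj != NULL);
    toy_mark_object(obj->car);
    return obj->cdr;
}
```

With this arrangement the C stack depth depends on how deeply cars are nested, not on the length of the list.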
5059
5060 @node gc_sweep, sweep_lcrecords_1, mark_object, Garbage Collection - Step by Step
5061 @subsection @code{gc_sweep}
5062 @cindex @code{gc_sweep}
5063
5064 The job of this function is to free all unmarked records from memory. As
5065 we know, there are different types of objects implemented and managed, and
5066 consequently different ways to free them from memory.
5067 @xref{Introduction to Allocation}.
5068
5069 We start with all objects stored through @code{lcrecords}. All
5070 bulkier objects are allocated and handled using that scheme of
5071 @code{lcrecords}. Each object is @code{malloc}ed separately
5072 instead of placing it in one of the contiguous frob blocks. All types
5073 that are currently stored
5074 using @code{lcrecords}'s @code{alloc_lcrecord} and
5075 @code{make_lcrecord_list} are the types: vectors, buffers,
5076 char-table, char-table-entry, console, weak-list, database, device,
5077 ldap, hash-table, command-builder, extent-auxiliary, extent-info, face,
5078 coding-system, frame, image-instance, glyph, popup-data, gui-item,
5079 keymap, charset, color_instance, font_instance, opaque, opaque-list,
5080 process, range-table, specifier, symbol-value-buffer-local,
5081 symbol-value-lisp-magic, symbol-value-varalias, toolbar-button,
5082 tooltalk-message, tooltalk-pattern, window, and window-configuration. We
5083 take care of them in the first place
5084 in order to be able to handle and to finalize items stored in them more
5085 easily. The function @code{sweep_lcrecords_1} as described below is
5086 doing the whole job for us.
5087 For a description about the internals: @xref{lrecords}.
5088
5089 Our next candidates are the other objects that behave quite differently
5090 from everything else: the strings. They consist of two parts, a
5091 fixed-size portion (@code{struct Lisp_string}) holding the string's
5092 length, its property list and a pointer to the second part, and the
5093 actual string data, which is stored in string-chars blocks comparable to
5094 frob blocks. In this block, the data is not only freed, but also a
5095 compression of holes is made, i.e. all strings are relocated together.
5096 @xref{String}. This compacting phase is performed by the function
5097 @code{compact_string_chars}; the actual sweeping, done by the function
5098 @code{sweep_strings}, is described below.
5099
5100 After that, the other types are swept step by step using functions
5101 @code{sweep_conses}, @code{sweep_bit_vectors_1},
5102 @code{sweep_compiled_functions}, @code{sweep_floats},
5103 @code{sweep_symbols}, @code{sweep_extents}, @code{sweep_markers} and
5104 @code{sweep_events}. They are the fixed-size types cons, floats,
5105 compiled-functions, symbol, marker, extent, and event stored in
5106 so-called "frob blocks", and therefore we can basically do the same for
5107 objects of every type, using the same macros, defined specifically to
5108 handle everything with respect to fixed-size blocks. The only fixed-size
5109 type that is not handled here is the fixed-size portion of strings,
5110 because we took special care of them earlier.
5111
5112 The only big exception is bit vectors, which are stored differently and
5113 therefore treated differently by the function @code{sweep_bit_vectors_1}
5114 described later.
5115
5116 First, we need some brief information about how
5117 these fixed-size types are managed in general, in order to understand
5118 how the sweeping is done. They all have a fixed size, and are therefore
5119 stored in big blocks of memory - allocated at once - that can hold a
5120 certain number of objects of one type. The macro
5121 @code{DECLARE_FIXED_TYPE_ALLOC} creates the suitable structures for
5122 every type. More precisely, we have the block struct
5123 (holding a pointer to the previous block @code{prev} and the
5124 objects in @code{block[]}), a pointer to the current block
5125 (@code{current_..._block}) and its last index
5126 (@code{current_..._block_index}), and a pointer to the free list that
5127 will be created. There is also a macro @code{FIXED_TYPE_FROM_BLOCK} plus some
5128 related macros that are used to obtain a new object, either from
5129 the free list @code{ALLOCATE_FIXED_TYPE_1} if there is an unused object
5130 of that type stored or by allocating a completely new block using
5131 @code{ALLOCATE_FIXED_TYPE_FROM_BLOCK}.
5132
5133 The rest works as follows: all of them define a
5134 macro @code{UNMARK_...} that is used to unmark the object. They define a
5135 macro @code{ADDITIONAL_FREE_...} that defines additional work that has
5136 to be done when converting an object from in use to not in use (so far,
5137 only markers use it in order to unchain them). Then, they all call
5138 the macro @code{SWEEP_FIXED_TYPE_BLOCK} instantiated with their type name
5139 and their struct name.
5140
5141 This call in particular does the following: we go over all blocks
5142 starting with the current moving towards the oldest.
5143 For each block, we look at every object in it. If the object is already
5144 freed (checked with @code{FREE_STRUCT_P} using the first pointer of the
5145 object), or if it is
5146 set to read only (@code{C_READONLY_RECORD_HEADER_P}), nothing needs to be
5147 done. If it is unmarked (checked with @code{MARKED_RECORD_HEADER_P}), it
5148 is put in the free list and set free (using the macro
5149 @code{FREE_FIXED_TYPE}); otherwise it stays in the block, but is unmarked
5150 (by @code{UNMARK_...}). While going through one block, we note if the
5151 whole block is empty. If so, the whole block is freed (using
5152 @code{xfree}) and the free list state is set to the state it had before
5153 handling this block.
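The sweep over fixed-size blocks can be sketched as below. All names (@code{toy_cell}, @code{toy_block}, @code{toy_sweep} and so on) are hypothetical, and the sketch makes one simplifying assumption that the text does not state: it rebuilds the free list from scratch during the sweep, re-threading cells that were already free, so that releasing an empty block can simply restore the list to its state before that block. Read-only objects are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum { OBJS_PER_BLOCK = 4 };

struct toy_cell {
    bool in_use;
    bool marked;
    struct toy_cell *next_free;   /* link used while on the free list */
};

struct toy_block {
    struct toy_block *prev;       /* older block, or NULL */
    struct toy_cell cells[OBJS_PER_BLOCK];
};

struct toy_block *current_block = NULL;
struct toy_cell *free_list = NULL;

/* Allocate a fresh, zeroed block and make it the current one. */
struct toy_block *toy_new_block(void)
{
    struct toy_block *b = calloc(1, sizeof *b);
    if (b == NULL)
        abort();
    b->prev = current_block;
    current_block = b;
    return b;
}

/* Walk from the current block towards the oldest one. */
void toy_sweep(void)
{
    struct toy_block **link = &current_block;
    free_list = NULL;                           /* rebuilt below */
    while (*link != NULL) {
        struct toy_block *b = *link;
        struct toy_cell *saved_free_list = free_list;
        bool block_empty = true;
        for (int i = 0; i < OBJS_PER_BLOCK; i++) {
            struct toy_cell *c = &b->cells[i];
            if (c->in_use && c->marked) {
                c->marked = false;              /* survivor: just unmark */
                block_empty = false;
            } else {
                c->in_use = false;              /* garbage or already free */
                c->next_free = free_list;
                free_list = c;
            }
        }
        if (block_empty) {
            *link = b->prev;                    /* unlink the block */
            free_list = saved_free_list;        /* forget its cells */
            free(b);
        } else {
            link = &b->prev;
        }
    }
    assert(*link == NULL);
}
```

Saving the free list head before each block is what makes the last step of the description cheap: dropping an empty block needs no search through the list.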
5154
5155 @node sweep_lcrecords_1, compact_string_chars, gc_sweep, Garbage Collection - Step by Step
5156 @subsection @code{sweep_lcrecords_1}
5157 @cindex @code{sweep_lcrecords_1}
5158
5159 After nullifying the complete lcrecord statistics, we go over all
5160 lcrecords two separate times. They are all chained together in a list with
5161 a head called @code{all_lcrecords}.
5162
5163 The first loop calls for each object its @code{finalizer} method, but only
5164 in the case that it is not read only
5165 (@code{C_READONLY_RECORD_HEADER_P}), it is not already marked
5166 (@code{MARKED_RECORD_HEADER_P}), it is not already in a free list (list of
5167 freed objects, field @code{free}) and finally it owns a finalizer
5168 method.
5169
5170 The second loop actually frees the appropriate objects again by iterating
5171 through the whole list. In case an object is read only or marked, it
5172 has to persist, otherwise it is manually freed by calling
5173 @code{xfree}. During this loop, the lcrecord statistics are kept up to
5174 date by calling @code{tick_lcrecord_stats} with the right arguments.
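The two loops can be sketched as follows. The record layout and every name here (@code{toy_lcrecord}, @code{all_records}, @code{toy_sweep_lcrecords}, @code{count_finalizer}) are assumptions for illustration only, and the statistics bookkeeping is reduced to a count of freed records.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_lcrecord {
    struct toy_lcrecord *next;
    bool marked, readonly, on_free_list;
    void (*finalizer)(struct toy_lcrecord *);   /* may be NULL */
};

struct toy_lcrecord *all_records = NULL;   /* head of the chain */
int finalized_count = 0;

/* Example finalizer: it only counts how often it was called. */
void count_finalizer(struct toy_lcrecord *r)
{
    (void)r;
    finalized_count++;
}

/* Allocate a record and push it onto the chain. */
struct toy_lcrecord *toy_make_record(bool marked,
                                     void (*finalizer)(struct toy_lcrecord *))
{
    struct toy_lcrecord *r = calloc(1, sizeof *r);
    if (r == NULL)
        abort();
    r->marked = marked;
    r->finalizer = finalizer;
    r->next = all_records;
    all_records = r;
    return r;
}

/* Returns the number of records that were freed. */
int toy_sweep_lcrecords(void)
{
    struct toy_lcrecord *r;
    struct toy_lcrecord **link;
    int freed = 0;

    /* First loop: finalize everything that is about to die. */
    for (r = all_records; r != NULL; r = r->next)
        if (!r->readonly && !r->marked && !r->on_free_list
            && r->finalizer != NULL)
            r->finalizer(r);

    /* Second loop: keep read-only and marked records, free the rest. */
    link = &all_records;
    while (*link != NULL) {
        r = *link;
        if (r->readonly || r->marked) {
            r->marked = false;
            link = &r->next;
        } else {
            *link = r->next;
            free(r);
            freed++;
        }
    }
    return freed;
}
```

Running all finalizers before freeing anything means a finalizer can still look at other records that are about to be released.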
5175
5176 @node compact_string_chars, sweep_strings, sweep_lcrecords_1, Garbage Collection - Step by Step
5177 @subsection @code{compact_string_chars}
5178 @cindex @code{compact_string_chars}
5179
5180 The purpose of this function is to compact all the data parts of the
5181 strings that are held in so-called @code{string_chars_block}, i.e. the
5182 strings that do not exceed a certain maximal length.
5183
5184 The procedure with which this is done is as follows. We are keeping two
5185 positions in the @code{string_chars_block}s using two pointer/integer
5186 pairs, namely @code{from_sb}/@code{from_pos} and
5187 @code{to_sb}/@code{to_pos}. They stand for the actual positions, from
5188 where to where, to copy the actually handled string.
5189
5190 While going over all chained @code{string_chars_block}s and the
5191 strings they hold, starting at @code{first_string_chars_block}, both pointers
5192 are advanced and, where necessary, a string is copied from @code{from_sb} to
5193 @code{to_sb}, depending on the status of the pointed at strings.
5194
5195 More precisely, we can distinguish between the following actions.
5196 @itemize @bullet
5197 @item
5198 The string at @code{from_sb}'s position could be marked as free, which
5199 is indicated by an invalid pointer to the pointer that should point back
5200 to the fixed size string object, and which is checked by
5201 @code{FREE_STRUCT_P}. In this case, the @code{from_sb}/@code{from_pos}
5202 is advanced to the next string, and nothing has to be copied.
5203 @item
5204 Also, if a string object itself is unmarked, nothing has to be
5205 copied. We likewise advance the @code{from_sb}/@code{from_pos}
5206 pair as described above.
5207 @item
5208 In all other cases, we have a marked string at hand. The string data
5209 must be moved from the from-position to the to-position. In case
5210 there is not enough space in the actual @code{to_sb}-block, we advance
5211 this pointer to the beginning of the next block before copying. In case the
5212 from and to positions are different, we perform the
5213 actual copying using the library function @code{memmove}.
5214 @end itemize
5215
5216 After compacting, the pointer to the current
5217 @code{string_chars_block}, sitting in @code{current_string_chars_block},
5218 is reset to the last block to which we moved a string,
5219 i.e. @code{to_sb}, and all remaining blocks (we know that they just
5220 carry garbage) are explicitly @code{xfree}d.
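The compaction can be sketched with a single block, which is a simplification: the description above walks a chain of blocks, whereas this example keeps only the @code{from_pos} and @code{to_pos} indices. The layout (a small header in front of each string's bytes, pointing back to the fixed-size part) follows the text, but the concrete structs and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { BLOCK_SIZE = 128 };

struct toy_string {            /* the fixed-size part */
    bool marked;
    size_t len;
    char *data;                /* points into the block */
};

struct entry_header {          /* stored in front of each string's bytes */
    struct toy_string *owner;  /* NULL means the entry is already free */
    size_t len;
};

struct chars_block {
    size_t used;
    char bytes[BLOCK_SIZE];
};

/* Append text to the block; returns false if it does not fit. */
bool toy_store(struct chars_block *b, struct toy_string *s, const char *text)
{
    struct entry_header h = { s, strlen(text) };
    if (b->used + sizeof h + h.len > BLOCK_SIZE)
        return false;
    memcpy(b->bytes + b->used, &h, sizeof h);
    s->data = b->bytes + b->used + sizeof h;
    memcpy(s->data, text, h.len);
    s->len = h.len;
    s->marked = false;
    b->used += sizeof h + h.len;
    return true;
}

/* Slide every marked string towards the start of the block. */
void toy_compact(struct chars_block *b)
{
    size_t from_pos = 0, to_pos = 0;
    while (from_pos < b->used) {
        struct entry_header h;
        memcpy(&h, b->bytes + from_pos, sizeof h);   /* avoids alignment issues */
        size_t size = sizeof h + h.len;
        if (h.owner != NULL && h.owner->marked) {
            if (from_pos != to_pos)
                memmove(b->bytes + to_pos, b->bytes + from_pos, size);
            h.owner->data = b->bytes + to_pos + sizeof h;  /* fix back pointer */
            to_pos += size;
        }
        from_pos += size;       /* freed and unmarked entries are skipped */
    }
    b->used = to_pos;
}
```

@code{memmove} is used because source and destination ranges may overlap when a string slides down by less than its own length.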
5221
5222 @node sweep_strings, sweep_bit_vectors_1, compact_string_chars, Garbage Collection - Step by Step
5223 @subsection @code{sweep_strings}
5224 @cindex @code{sweep_strings}
5225
5226 The sweeping for the fixed sized string objects is essentially exactly
5227 the same as it is for all other fixed size types. As before, the freeing
5228 into the suitable free list is done by using the macro
5229 @code{SWEEP_FIXED_TYPE_BLOCK} after defining the right macros
5230 @code{UNMARK_string} and @code{ADDITIONAL_FREE_string}. These two
5231 definitions are a little bit special compared to the ones used
5232 for the other fixed size types.
5233
5234 @code{UNMARK_string} is defined the same way except for some additional code
5235 used for updating the bookkeeping information.
5236
5237 For strings, @code{ADDITIONAL_FREE_string} has to do something in
5238 addition: in case the string was not allocated in a
5239 @code{string_chars_block} because it exceeded the maximal length, and
5240 therefore was @code{malloc}ed separately, we now also @code{xfree}
5241 it explicitly.
5242
5243 @node sweep_bit_vectors_1, , sweep_strings, Garbage Collection - Step by Step
5244 @subsection @code{sweep_bit_vectors_1}
5245 @cindex @code{sweep_bit_vectors_1}
5246
5247 Bit vectors are also one of the rare types that are @code{malloc}ed
5248 individually. Consequently, while sweeping, all bit vectors that are
5249 no longer needed must be freed by hand. This is done
5250 in the expected way: since they are all registered in a list called
5251 @code{all_bit_vectors}, all elements of that list are traversed,
5252 all unmarked bit vectors are unlinked and freed by calling @code{xfree}, and
5253 the remaining ones become unmarked.
5254 In addition, the bookkeeping information used for garbage
5255 collector's output purposes is updated.
5256
5257 @node Integers and Characters, Allocation from Frob Blocks, Garbage Collection - Step by Step, Allocation of Objects in XEmacs Lisp
4714 @section Integers and Characters 5258 @section Integers and Characters
4715 5259
4716 Integer and character Lisp objects are created from integers using the 5260 Integer and character Lisp objects are created from integers using the
4717 macros @code{XSETINT()} and @code{XSETCHAR()} or the equivalent 5261 macros @code{XSETINT()} and @code{XSETCHAR()} or the equivalent
4718 functions @code{make_int()} and @code{make_char()}. (These are actually 5262 functions @code{make_int()} and @code{make_char()}. (These are actually
4722 5266
4723 @code{XSETINT()} and the like will truncate values given to them that 5267 @code{XSETINT()} and the like will truncate values given to them that
4724 are too big; i.e. you won't get the value you expected but the tag bits 5268 are too big; i.e. you won't get the value you expected but the tag bits
4725 will at least be correct. 5269 will at least be correct.
4726 5270
4727 @node Allocation from Frob Blocks 5271 @node Allocation from Frob Blocks, lrecords, Integers and Characters, Allocation of Objects in XEmacs Lisp
4728 @section Allocation from Frob Blocks 5272 @section Allocation from Frob Blocks
4729 5273
4730 The uninitialized memory required by a @code{Lisp_Object} of a particular type 5274 The uninitialized memory required by a @code{Lisp_Object} of a particular type
4731 is allocated using 5275 is allocated using
4732 @code{ALLOCATE_FIXED_TYPE()}. This only occurs inside of the 5276 @code{ALLOCATE_FIXED_TYPE()}. This only occurs inside of the
4749 @code{ALLOCATE_FIXED_TYPE_FROM_BLOCK()}, which looks at the end of the 5293 @code{ALLOCATE_FIXED_TYPE_FROM_BLOCK()}, which looks at the end of the
4750 last frob block for space, and creates a new frob block if there is 5294 last frob block for space, and creates a new frob block if there is
4751 none. (There are actually two versions of these macros, one of which is 5295 none. (There are actually two versions of these macros, one of which is
4752 more defensive but less efficient and is used for error-checking.) 5296 more defensive but less efficient and is used for error-checking.)
4753 5297
4754 @node lrecords 5298 @node lrecords, Low-level allocation, Allocation from Frob Blocks, Allocation of Objects in XEmacs Lisp
4755 @section lrecords 5299 @section lrecords
4756 5300
4757 [see @file{lrecord.h}] 5301 [see @file{lrecord.h}]
4758 5302
4759 All lrecords have at the beginning of their structure a @code{struct 5303 All lrecords have at the beginning of their structure a @code{struct
4988 (i.e. declared with a @code{_SEQUENCE_IMPLEMENTATION} macro.) This should 5532 (i.e. declared with a @code{_SEQUENCE_IMPLEMENTATION} macro.) This should
4989 simply return the object's size in bytes, exactly as you might expect. 5533 simply return the object's size in bytes, exactly as you might expect.
4990 For an example, see the methods for window configurations and opaques. 5534 For an example, see the methods for window configurations and opaques.
4991 @end enumerate 5535 @end enumerate
4992 5536
4993 @node Low-level allocation 5537 @node Low-level allocation, Pure Space, lrecords, Allocation of Objects in XEmacs Lisp
4994 @section Low-level allocation 5538 @section Low-level allocation
4995 5539
4996 Memory that you want to allocate directly should be allocated using 5540 Memory that you want to allocate directly should be allocated using
4997 @code{xmalloc()} rather than @code{malloc()}. This implements 5541 @code{xmalloc()} rather than @code{malloc()}. This implements
4998 error-checking on the return value, and once upon a time did some more 5542 error-checking on the return value, and once upon a time did some more
5060 routines. These routines also call @code{INCREMENT_CONS_COUNTER()} at the 5604 routines. These routines also call @code{INCREMENT_CONS_COUNTER()} at the
5061 appropriate times; this keeps statistics on how much memory is 5605 appropriate times; this keeps statistics on how much memory is
5062 allocated, so that garbage-collection can be invoked when the 5606 allocated, so that garbage-collection can be invoked when the
5063 threshold is reached. 5607 threshold is reached.
5064 5608
5065 @node Pure Space 5609 @node Pure Space, Cons, Low-level allocation, Allocation of Objects in XEmacs Lisp
5066 @section Pure Space 5610 @section Pure Space
5067 5611
5068 Not yet documented. 5612 Not yet documented.
5069 5613
5070 @node Cons 5614 @node Cons, Vector, Pure Space, Allocation of Objects in XEmacs Lisp
5071 @section Cons 5615 @section Cons
5072 5616
5073 Conses are allocated in standard frob blocks. The only thing to 5617 Conses are allocated in standard frob blocks. The only thing to
5074 note is that conses can be explicitly freed using @code{free_cons()} 5618 note is that conses can be explicitly freed using @code{free_cons()}
5075 and associated functions @code{free_list()} and @code{free_alist()}. This 5619 and associated functions @code{free_list()} and @code{free_alist()}. This
5079 generating extra objects and thereby triggering GC sooner. 5623 generating extra objects and thereby triggering GC sooner.
5080 However, you have to be @emph{extremely} careful when doing this. 5624 However, you have to be @emph{extremely} careful when doing this.
5081 If you mess this up, you will get BADLY BURNED, and it has happened 5625 If you mess this up, you will get BADLY BURNED, and it has happened
5082 before. 5626 before.
5083 5627
5084 @node Vector 5628 @node Vector, Bit Vector, Cons, Allocation of Objects in XEmacs Lisp
5085 @section Vector 5629 @section Vector
5086 5630
5087 As mentioned above, each vector is @code{malloc()}ed individually, and 5631 As mentioned above, each vector is @code{malloc()}ed individually, and
5088 all are threaded through the variable @code{all_vectors}. Vectors are 5632 all are threaded through the variable @code{all_vectors}. Vectors are
5089 marked strangely during garbage collection, by kludging the size field. 5633 marked strangely during garbage collection, by kludging the size field.
5090 Note that the @code{struct Lisp_Vector} is declared with its 5634 Note that the @code{struct Lisp_Vector} is declared with its
5091 @code{contents} field being a @emph{stretchy} array of one element. It 5635 @code{contents} field being a @emph{stretchy} array of one element. It
5092 is actually @code{malloc()}ed with the right size, however, and access 5636 is actually @code{malloc()}ed with the right size, however, and access
5093 to any element through the @code{contents} array works fine. 5637 to any element through the @code{contents} array works fine.
5094 5638
5095 @node Bit Vector 5639 @node Bit Vector, Symbol, Vector, Allocation of Objects in XEmacs Lisp
5096 @section Bit Vector 5640 @section Bit Vector
5097 5641
5098 Bit vectors work exactly like vectors, except for more complicated 5642 Bit vectors work exactly like vectors, except for more complicated
5099 code to access an individual bit, and except for the fact that bit 5643 code to access an individual bit, and except for the fact that bit
5100 vectors are lrecords while vectors are not. (The only difference here is 5644 vectors are lrecords while vectors are not. (The only difference here is
5101 that there's an lrecord implementation pointer at the beginning and the 5645 that there's an lrecord implementation pointer at the beginning and the
5102 tag field in bit vector Lisp words is ``lrecord'' rather than 5646 tag field in bit vector Lisp words is ``lrecord'' rather than
5103 ``vector''.) 5647 ``vector''.)
5104 5648
5105 @node Symbol 5649 @node Symbol, Marker, Bit Vector, Allocation of Objects in XEmacs Lisp
5106 @section Symbol 5650 @section Symbol
5107 5651
5108 Symbols are also allocated in frob blocks. Note that the code 5652 Symbols are also allocated in frob blocks. Note that the code
5109 exists for symbols to be either lrecords (category (c) above) 5653 exists for symbols to be either lrecords (category (c) above)
5110 or simple types (category (b) above), and are lrecords by 5654 or simple types (category (b) above), and are lrecords by
5114 chained through their @code{next} field. 5658 chained through their @code{next} field.
5115 5659
5116 Remember that @code{intern} looks up a symbol in an obarray, creating 5660 Remember that @code{intern} looks up a symbol in an obarray, creating
5117 one if necessary. 5661 one if necessary.
5118 5662
5119 @node Marker 5663 @node Marker, String, Symbol, Allocation of Objects in XEmacs Lisp
5120 @section Marker 5664 @section Marker
5121 5665
5122 Markers are allocated in frob blocks, as usual. They are kept 5666 Markers are allocated in frob blocks, as usual. They are kept
5123 in a buffer unordered, but in a doubly-linked list so that they 5667 in a buffer unordered, but in a doubly-linked list so that they
5124 can easily be removed. (Formerly this was a singly-linked list, 5668 can easily be removed. (Formerly this was a singly-linked list,
5125 but in some cases garbage collection took an extraordinarily 5669 but in some cases garbage collection took an extraordinarily
5126 long time due to the O(N^2) time required to remove lots of 5670 long time due to the O(N^2) time required to remove lots of
5127 markers from a buffer.) Markers are removed from a buffer in 5671 markers from a buffer.) Markers are removed from a buffer in
5128 the finalize stage, in @code{ADDITIONAL_FREE_marker()}. 5672 the finalize stage, in @code{ADDITIONAL_FREE_marker()}.
5129 5673
5130 @node String 5674 @node String, Compiled Function, Marker, Allocation of Objects in XEmacs Lisp
5131 @section String 5675 @section String
5132 5676
5133 As mentioned above, strings are a special case. A string is logically 5677 As mentioned above, strings are a special case. A string is logically
5134 two parts, a fixed-size object (containing the length, property list, 5678 two parts, a fixed-size object (containing the length, property list,
5135 and a pointer to the actual data), and the actual data in the string. 5679 and a pointer to the actual data), and the actual data in the string.
5159 Note that there is one situation not handled: a string that is too big 5703 Note that there is one situation not handled: a string that is too big
5160 to fit into a string-chars block. Such strings, called @dfn{big 5704 to fit into a string-chars block. Such strings, called @dfn{big
5161 strings}, are all @code{malloc()}ed as their own block. (#### Although it 5705 strings}, are all @code{malloc()}ed as their own block. (#### Although it
5162 would make more sense for the threshold for big strings to be somewhat 5706 would make more sense for the threshold for big strings to be somewhat
5163 lower, e.g. 1/2 or 1/4 the size of a string-chars block. It seems that 5707 lower, e.g. 1/2 or 1/4 the size of a string-chars block. It seems that
5164 this was indeed the case formerly -- indeed, the threshold was set at 5708 this was indeed the case formerly---indeed, the threshold was set at
5165 1/8 -- but Mly forgot about this when rewriting things for 19.8.) 5709 1/8---but Mly forgot about this when rewriting things for 19.8.)
5166 5710
5167 Note also that the string data in string-chars blocks is padded as 5711 Note also that the string data in string-chars blocks is padded as
5168 necessary so that proper alignment constraints on the @code{struct 5712 necessary so that proper alignment constraints on the @code{struct
5169 Lisp_String} back pointers are maintained. 5713 Lisp_String} back pointers are maintained.
5170 5714
5186 string data (which would normally be obtained from the now-non-existent 5730 string data (which would normally be obtained from the now-non-existent
5187 @code{struct Lisp_String}) at the beginning of the dead string data gap. 5731 @code{struct Lisp_String}) at the beginning of the dead string data gap.
5188 The string compactor recognizes this special 0xFFFFFFFF marker and 5732 The string compactor recognizes this special 0xFFFFFFFF marker and
5189 handles it correctly. 5733 handles it correctly.
5190 5734
5191 @node Compiled Function 5735 @node Compiled Function, , String, Allocation of Objects in XEmacs Lisp
5192 @section Compiled Function 5736 @section Compiled Function
5193 5737
5194 Not yet documented. 5738 Not yet documented.
5195 5739
5196 @node Events and the Event Loop, Evaluation; Stack Frames; Bindings, Allocation of Objects in XEmacs Lisp, Top 5740
5741 @node Dumping, Events and the Event Loop, Allocation of Objects in XEmacs Lisp, Top
5742 @chapter Dumping
5743
5744 @section What is dumping and its justification
5745
5746 The C code of XEmacs is just a Lisp engine with a lot of built-in
5747 primitives useful for writing an editor. The editor itself is written
5748 mostly in Lisp, and represents around 100K lines of code. Loading and
5749 executing the initialization of all this code takes a bit of time (five
5750 to ten times the usual startup time of current xemacs) and requires
5751 having all the lisp source files around. Having to reload them each
5752 time the editor is started would not be acceptable.
5753
5754 The traditional solution to this problem is called dumping: the build
5755 process first creates the lisp engine under the name @file{temacs}, then
5756 runs it until it has finished loading and initializing all the lisp
5757 code, and eventually creates a new executable called @file{xemacs}
5758 including both the object code in @file{temacs} and all the contents of
5759 the memory after the initialization.
5760
5761 This solution, while working, has a huge problem: the creation of the
5762 new executable from the actual contents of memory is an extremely
5763 system-specific process, quite error-prone, and which interferes with a
5764 lot of system libraries (like malloc). It is even getting worse
5765 nowadays with libraries using constructors which are automatically
5766 called when the program is started (even before main()) which tend to
5767 crash when they are called multiple times, once before dumping and once
5768 after (IRIX 6.x libz.so pulls in some C++ image libraries thru
5769 dependencies which have this problem). Writing the dumper is also one
5770 of the most difficult parts of porting XEmacs to a new operating system.
5771 Basically, `dumping' is an operation that is just not officially
5772 supported on many operating systems.
5773
5774 The aim of the portable dumper is to solve the same problem as the
5775 system-specific dumper, that is to be able to reload quickly, using only
5776 a small number of files, the fully initialized lisp part of the editor,
5777 without any system-specific hacks.
5778
5779 @menu
5780 * Overview::
5781 * Data descriptions::
5782 * Dumping phase::
5783 * Reloading phase::
5784 * Remaining issues::
5785 @end menu
5786
5787 @node Overview, Data descriptions, Dumping, Dumping
5788 @section Overview
5789
5790 The portable dumping system has to:
5791
5792 @enumerate
5793 @item
5794 At dump time, write all initialized, non-quickly-rebuildable data to a
5795 file [Note: currently named @file{xemacs.dmp}, but the name will
5796 change], along with all the information needed for reloading.
5797
5798 @item
5799 When starting xemacs, reload the dump file, relocate it to its new
5800 starting address if needed, and reinitialize all pointers to this
5801 data. Also, rebuild all the quickly rebuildable data.
5802 @end enumerate
5803
5804 @node Data descriptions, Dumping phase, Overview, Dumping
5805 @section Data descriptions
5806
5807 The most complex task of the dumper is to be able to write lisp objects
5808 (lrecords) and C structs to disk and reload them at a different address,
5809 updating all the pointers they include in the process. This is done by
5810 using external data descriptions that give information about the layout
5811 of the structures in memory.
5812
5813 The specification of these descriptions is in lrecord.h. A description
5814 of an lrecord is an array of struct lrecord_description. Each of these
5815 structs includes a type, an offset in the structure and some optional
5816 parameters depending on the type. For instance, here is the string
5817 description:
5818
5819 @example
5820 static const struct lrecord_description string_description[] = @{
5821 @{ XD_BYTECOUNT, offsetof (Lisp_String, size) @},
5822 @{ XD_OPAQUE_DATA_PTR, offsetof (Lisp_String, data), XD_INDIRECT(0, 1) @},
5823 @{ XD_LISP_OBJECT, offsetof (Lisp_String, plist) @},
5824 @{ XD_END @}
5825 @};
5826 @end example
5827
5828 The first line indicates a member of type Bytecount, which is used by
5829 the next, indirect directive. The second means ``there is a pointer to
5830 some opaque data in the field @code{data}''. The length of said data is
5831 given by the expression @code{XD_INDIRECT(0, 1)}, which means ``the value
5832 in the 0th line of the description (welcome to C) plus one''. The third
5833 line means ``there is a Lisp_Object member @code{plist} in the Lisp_String
5834 structure''. @code{XD_END} then ends the description.
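
As a rough illustration of how a description can be interpreted, here is
a minimal sketch of resolving an indirect length. It does not use the
real definitions from lrecord.h: the @code{toy_} types, the two explicit
@code{indirect_} fields (standing in for whatever @code{XD_INDIRECT}
actually encodes) and @code{toy_opaque_length()} are all hypothetical
names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the real description types; all names are hypothetical. */
enum toy_xd_type
{
  TOY_XD_BYTECOUNT,
  TOY_XD_OPAQUE_DATA_PTR,
  TOY_XD_LISP_OBJECT,
  TOY_XD_END
};

struct toy_description
{
  enum toy_xd_type type;
  size_t offset;       /* offset of the member in the structure */
  int indirect_line;   /* description line holding the length, or -1 */
  int indirect_delta;  /* constant added to that value */
};

struct toy_string
{
  long size;           /* plays the role of the Bytecount member */
  const char *data;    /* pointer to the opaque data */
  void *plist;         /* plays the role of the Lisp_Object member */
};

const struct toy_description toy_string_description[] = {
  { TOY_XD_BYTECOUNT,       offsetof (struct toy_string, size),  -1, 0 },
  { TOY_XD_OPAQUE_DATA_PTR, offsetof (struct toy_string, data),   0, 1 },
  { TOY_XD_LISP_OBJECT,     offsetof (struct toy_string, plist), -1, 0 },
  { TOY_XD_END, 0, -1, 0 }
};

/* Return the byte length of the opaque data referenced by line POS of DESC
   in the object OBJ: the value stored in the member described by the
   indirect line, plus the delta.  */
long
toy_opaque_length (const void *obj, const struct toy_description *desc,
                   int pos)
{
  const struct toy_description *src;
  long value;

  assert (desc[pos].indirect_line >= 0);
  src = &desc[desc[pos].indirect_line];
  memcpy (&value, (const char *) obj + src->offset, sizeof value);
  return value + desc[pos].indirect_delta;
}
```

For a toy string whose @code{size} member holds 5, asking for the length
of line 1 reads line 0 and adds one, giving 6 bytes to dump.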
5835
5836 This gives us all the information we need to move around what is pointed
5837 to by a structure (C or lrecord) and, by transitivity, everything that
5838 it points to. The only missing information for dumping is the size of
5839 the structure. For lrecords, this is part of the
5840 lrecord_implementation, so we don't need to duplicate it. For C
5841 structures we use a struct struct_description, which includes a size
5842 field and a pointer to an associated array of lrecord_description.
5843
5844 @node Dumping phase, Reloading phase, Data descriptions, Dumping
5845 @section Dumping phase
5846
5847 Dumping is done by calling the function pdump() (in alloc.c) which is
5848 invoked from Fdump_emacs (in emacs.c). This function performs a number
5849 of tasks.
5850
5851 @menu
5852 * Object inventory::
5853 * Address allocation::
5854 * The header::
5855 * Data dumping::
5856 * Pointers dumping::
5857 @end menu
5858
5859 @node Object inventory, Address allocation, Dumping phase, Dumping phase
5860 @subsection Object inventory
5861
5862 The first task is to build the list of the objects to dump. This
5863 includes:
5864
5865 @itemize @bullet
5866 @item lisp objects
5867 @item C structures
5868 @end itemize
5869
5870 We end up with one @code{pdump_entry_list_elmt} per object group (arrays
5871 of C structs are kept together) which includes a pointer to the first
5872 object of the group, the per-object size and the count of objects in the
5873 group, along with some other information which is initialized later.
5874
5875 These entries are linked together in @code{pdump_entry_list} structures
5876 and can be enumerated thru either:
5877
5878 @enumerate
5879 @item
5880 the @code{pdump_object_table}, an array of @code{pdump_entry_list}, one
5881 per lrecord type, indexed by type number.
5882
5883 @item
5884 the @code{pdump_opaque_data_list}, used for the opaque data which does
5885 not include pointers, and hence does not need descriptions.
5886
5887 @item
5888 the @code{pdump_struct_table}, which is a vector of
5889 @code{struct_description}/@code{pdump_entry_list} pairs, used for
5890 non-opaque C structures.
5891 @end enumerate
5892
5893 This uses a marking strategy similar to the garbage collector. Some
5894 differences though:
5895
5896 @enumerate
5897 @item
5898 We do not use the mark bit (which does not exist for C structures
5899 anyway), we use a big hash table instead.
5900
5901 @item
5902 We do not use the mark function of lrecords but instead rely on the
5903 external descriptions. This happens essentially because we need to
5904 follow pointers to C structures and opaque data in addition to
5905 Lisp_Object members.
5906 @end enumerate
5907
5908 This is done by @code{pdump_register_object}, which handles Lisp_Object
5909 variables, and @code{pdump_register_struct}, which handles C structures;
5910 both delegate the description management to @code{pdump_register_sub}.
5911
5912 The hash table doubles as a map from objects to pdump_entry_list_elmt (i.e.
5913 it allows us to look up a pdump_entry_list_elmt from the object it points
5914 to). Entries are added with @code{pdump_add_entry()} and looked up with
5915 @code{pdump_get_entry()}. There is no need for entry removal. The hash
5916 value is computed quite basically from the object pointer by
5917 @code{pdump_make_hash()}.
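
The idea of a pointer-keyed table that both marks objects and maps them
to their entry can be sketched as follows. This is only an illustration
under assumptions: the table size, the linear-probing collision strategy
and every @code{toy_} name are invented here and are not taken from the
actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HASH_SIZE 1024      /* hypothetical fixed capacity */

struct toy_entry
{
  const void *obj;              /* key: address of the registered object */
  size_t size;                  /* payload: per-object size */
};

/* Zero-initialized, so obj == NULL marks an empty slot.  */
struct toy_entry toy_table[TOY_HASH_SIZE];
size_t toy_count;

/* Derive a slot index from the pointer, dropping the low alignment bits.  */
size_t
toy_make_hash (const void *obj)
{
  return (size_t) (((uintptr_t) obj >> 3) % TOY_HASH_SIZE);
}

/* Look up OBJ; return its entry, or NULL if it was never registered.  */
struct toy_entry *
toy_get_entry (const void *obj)
{
  size_t pos = toy_make_hash (obj);

  while (toy_table[pos].obj != NULL)
    {
      if (toy_table[pos].obj == obj)
        return &toy_table[pos];
      pos = (pos + 1) % TOY_HASH_SIZE;  /* linear probing */
    }
  return NULL;
}

/* Register OBJ.  Returns 1 if it was new, 0 if it was already registered
   (the equivalent of finding the mark bit already set).  */
int
toy_add_entry (const void *obj, size_t size)
{
  size_t pos;

  if (toy_get_entry (obj) != NULL)
    return 0;
  /* Always keep one empty slot so that probing terminates.  */
  assert (toy_count < TOY_HASH_SIZE - 1);
  pos = toy_make_hash (obj);
  while (toy_table[pos].obj != NULL)
    pos = (pos + 1) % TOY_HASH_SIZE;
  toy_table[pos].obj = obj;
  toy_table[pos].size = size;
  toy_count++;
  return 1;
}
```

Since entries are never removed, open addressing needs no deletion
markers, which keeps both functions short.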
5918
5919 The roots for the marking are:
5920
5921 @enumerate
5922 @item
5923 the @code{staticpro}'ed variables (there is a special @code{staticpro_nodump()}
5924 call for protected variables we do not want to dump).
5925
5926 @item
5927 the @code{pdump_wire}'d variables (@code{staticpro} is equivalent to
5928 @code{staticpro_nodump()} + @code{pdump_wire()}).
5929
5930 @item
5931 the @code{dumpstruct}'ed variables, which point to C structures.
5932 @end enumerate
5933
5934 This does not include the GCPRO'ed variables, the specbinds, the
5935 catchtags, the backlist, the redisplay or the profiling info, since we
5936 do not want to rebuild the actual chain of lisp calls which lead up to
5937 the dump-emacs call, only the global variables.
5938
5939 Weak lists and weak hash tables are dumped as if they were their
5940 non-weak equivalent (without changing their type, of course). This has
5941 not yet been a problem.
5942
5943 @node Address allocation, The header, Object inventory, Dumping phase
5944 @subsection Address allocation
5945
5946
5947 The next step is to allocate the offsets of each of the objects in the
5948 final dump file. This is done by @code{pdump_allocate_offset()} which
5949 is called indirectly by @code{pdump_scan_by_alignment()}.
5950
5951 The strategy to deal with alignment problems uses these facts:
5952
5953 @enumerate
5954 @item
5955 real world alignment requirements are powers of two.
5956
5957 @item
5958 the C compiler is required to adjust the size of a struct so that you
5959 can have an array of them next to each other. This means you can get an
5960 upper bound on the alignment requirement of a given structure by
5961 finding the largest power of two of which its size is a multiple.
5962
5963 @item
5964 the non-variant part of variable size lrecords has an alignment
5965 requirement of 4.
5966 @end enumerate
5967
5968 Hence, for each lrecord type, C struct type or opaque data block the
5969 alignment requirement is computed as a power of two, with a minimum of
5970 2^2 for lrecords. @code{pdump_scan_by_alignment()} then scans all the
5971 @code{pdump_entry_list_elmt}'s, the ones with the highest requirements
5972 first. This ensures the best packing.
5973
5974 The maximum alignment requirement we take into account is 2^8.
5975
5976 @code{pdump_allocate_offset()} only has to do a linear allocation,
5977 starting at offset 256 (this leaves room for the header and keeps the
5978 alignments happy).
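
The alignment strategy described above can be sketched as follows. This
is a simplified model, not the actual code: the @code{toy_} names and
the @code{toy_elmt} structure are hypothetical, and the minimum of 2^2
for lrecords is not modeled.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ALIGN_SHIFT 8   /* 2^8, the largest alignment considered */
#define TOY_BASE_OFFSET 256     /* leaves room for the header */

/* Upper bound on the alignment of an object of SIZE bytes, as a shift:
   the largest power of two (capped at 2^8) that divides SIZE.  */
int
toy_align_shift (size_t size)
{
  int shift = 0;

  assert (size > 0);
  while (shift < TOY_MAX_ALIGN_SHIFT
         && (size & ((size_t) 1 << shift)) == 0)
    shift++;
  return shift;
}

struct toy_elmt
{
  size_t size;                  /* per-object size */
  size_t count;                 /* number of objects in the group */
  size_t offset;                /* file offset, filled in below */
};

/* Assign file offsets, groups with the strictest alignment first.
   Returns the offset just past the last allocated byte.  */
size_t
toy_allocate_offsets (struct toy_elmt *elmts, size_t n)
{
  size_t cur = TOY_BASE_OFFSET;
  int shift;
  size_t i;

  for (shift = TOY_MAX_ALIGN_SHIFT; shift >= 0; shift--)
    for (i = 0; i < n; i++)
      if (toy_align_shift (elmts[i].size) == shift)
        {
          elmts[i].offset = cur;
          cur += elmts[i].size * elmts[i].count;
        }
  return cur;
}
```

Because every group handled at a given step has a size that is a
multiple of that step's power of two, the running offset stays aligned
for all the less demanding groups that follow, so no padding is ever
inserted.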
5979
5980 @node The header, Data dumping, Address allocation, Dumping phase
5981 @subsection The header
5982
5983 The next step creates the file and writes a header with a signature and
5984 some miscellaneous information in it (number of staticpro, number of assigned
5985 lrecord types, etc.). The reloc_address field, which indicates at
5986 which address the file should be loaded if we want to avoid post-reload
5987 relocation, is set to 0. It then seeks to offset 256 (base offset for
5988 the objects).
5989
5990 @node Data dumping, Pointers dumping, The header, Dumping phase
5991 @subsection Data dumping
5992
5993 The data is dumped by @code{pdump_dump_data()}, called from
5994 @code{pdump_scan_by_alignment()}, in the same order as the addresses were
5995 allocated. This function copies the data to a temporary buffer, relocates
5996 all pointers in the object to the addresses allocated in the address
5997 allocation step, and writes it to the file. Using the same order means that,
5998 if we are careful with lrecords whose size is not a multiple of 4, we
5999 are assured that the object is always written at the offset in the file
6000 allocated in the address allocation step.
6001
6002 @node Pointers dumping, , Data dumping, Dumping phase
6003 @subsection Pointers dumping
6004
6005 A bunch of tables needed to properly reassign the global pointers are
6006 then written. They are:
6007
6008 @enumerate
6009 @item the staticpro array
6010 @item the dumpstruct array
6011 @item the lrecord_implementations_table array
6012 @item a vector of all the offsets to the objects in the file that include a
6013 description (for faster relocation at reload time)
6014 @item the pdump_wired and pdump_wired_list arrays
6015 @end enumerate
6016
6017 For each of the arrays we write both the pointer to the variables and
6018 the relocated offset of the object they point to. Since these variables
6019 are global, the pointers are still valid when restarting the program and
6020 are used to regenerate the global pointers.
6021
6022 The @code{pdump_wired_list} array is a special case. The variables it
6023 points to are the heads of weak linked lists of lisp objects of the same
6024 type. Not all objects of this list are dumped so the relocated pointer
6025 we associate with them points to the first dumped object of the list, or
6026 Qnil if none is available. This is also the reason why they are not
6027 used as roots for the purpose of object enumeration.
6028
6029 This is the end of the dumping part.
6030
6031 @node Reloading phase, Remaining issues, Dumping phase, Dumping
6032 @section Reloading phase
6033
6034 @subsection File loading
6035
6036 The file is mmap'ed into memory (which ensures a PAGESIZE alignment, at
6037 least 4096), or if mmap is unavailable or fails, a 256-byte-aligned
6038 malloc is done and the file is loaded.
6039
6040 Some variables are reinitialized from the values found in the header.
6041
6042 The difference between the actual loading address and the reloc_address
6043 is computed and will be used for all the relocations.
6044
6045
6046 @subsection Putting back the staticvec
6047
6048 The staticvec array is memcpy'd from the file and the variables it
6049 points to are reset to the relocated object addresses.
6050
6051
6052 @subsection Putting back the dumpstructed variables
6053
6054 The variables pointed to by dumpstruct in the dump phase are reset to
6055 the right relocated object addresses.
6056
6057
6058 @subsection lrecord_implementations_table
6059
6060 The lrecord_implementations_table is reset to its dump time state and
6061 the right lrecord_type_index values are put in.
6062
6063
6064 @subsection Object relocation
6065
6066 All the objects are relocated using their description and their offset
6067 by @code{pdump_reloc_one}. This step is unnecessary if the
6068 reloc_address is equal to the file loading address.
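
The relocation itself amounts to adding one delta to every pointer
stored in the image. Here is a minimal sketch, assuming the pointer
positions are already known as a flat list of byte offsets (the real
code derives them from the descriptions). The @code{toy_} functions are
hypothetical, and leaving null pointers untouched is an assumption of
this example rather than something stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add DELTA to the pointer-sized word stored at each of the N_OFFSETS byte
   offsets inside IMAGE.  Null pointers are left untouched (assumption).  */
void
toy_reloc_pointers (char *image, const size_t *offsets, size_t n_offsets,
                    uintptr_t delta)
{
  size_t i;

  for (i = 0; i < n_offsets; i++)
    {
      uintptr_t word;

      memcpy (&word, image + offsets[i], sizeof word);
      if (word != 0)
        {
          word += delta;        /* unsigned arithmetic wraps as intended */
          memcpy (image + offsets[i], &word, sizeof word);
        }
    }
}

/* Relocate an image that was dumped for address RELOC_ADDRESS and has
   actually been loaded at IMAGE.  Nothing to do when the two coincide.  */
void
toy_reloc_image (char *image, uintptr_t reloc_address,
                 const size_t *offsets, size_t n_offsets)
{
  uintptr_t delta;

  assert (image != NULL);
  delta = (uintptr_t) image - reloc_address;
  if (delta == 0)
    return;
  toy_reloc_pointers (image, offsets, n_offsets, delta);
}
```

With a reloc_address of 0, the stored pointers are plain file offsets,
so a stored value of 256 becomes the address of the byte 256 bytes into
the loaded image.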
6069
6070
6071 @subsection Putting back the pdump_wire and pdump_wire_list variables
6072
6073 Same as Putting back the dumpstructed variables.
6074
6075
6076 @subsection Reorganize the hash tables
6077
6078 Since some of the hash values in the lisp hash tables are
6079 address-dependent, their layout is now wrong. So we go through each of
6080 them and have them reorganized by calling @code{pdump_reorganize_hash_table}.
6081
6082 @node Remaining issues, , Reloading phase, Dumping
6083 @section Remaining issues
6084
6085 The build process will have to start a post-dump xemacs, ask it for the
6086 loading address (which will, hopefully, always be the same between
6087 different xemacs invocations) and relocate the file to the new address.
6088 This way the object relocation phase will not have to be done, which
6089 means no writes in the objects and that, because of the use of mmap, the
6090 dumped data will be shared between all the xemacs running on the
6091 computer.
6092
6093 Some executable signature will be necessary to ensure that a given dump
6094 file is really associated with a given executable, or random crashes
6095 will occur. Maybe a random number set at compile or configure time thru
6096 a define. This will also allow for having differently-compiled xemacsen
6097 on the same system (mule and no-mule come to mind).
6098
6099 The DOC file contents should probably end up in the dump file.
6100
6101
6102 @node Events and the Event Loop, Evaluation; Stack Frames; Bindings, Dumping, Top
5197 @chapter Events and the Event Loop 6103 @chapter Events and the Event Loop
5198 6104
5199 @menu 6105 @menu
5200 * Introduction to Events:: 6106 * Introduction to Events::
5201 * Main Loop:: 6107 * Main Loop::
5205 * Other Event Loop Functions:: 6111 * Other Event Loop Functions::
5206 * Converting Events:: 6112 * Converting Events::
5207 * Dispatching Events; The Command Builder:: 6113 * Dispatching Events; The Command Builder::
5208 @end menu 6114 @end menu
5209 6115
5210 @node Introduction to Events 6116 @node Introduction to Events, Main Loop, Events and the Event Loop, Events and the Event Loop
5211 @section Introduction to Events 6117 @section Introduction to Events
5212 6118
5213 An event is an object that encapsulates information about an 6119 An event is an object that encapsulates information about an
5214 interesting occurrence in the operating system. Events are 6120 interesting occurrence in the operating system. Events are
5215 generated either by user action, direct (e.g. typing on the 6121 generated either by user action, direct (e.g. typing on the
5239 XEmacs has its own types of events (called @dfn{Emacs events}), 6145 XEmacs has its own types of events (called @dfn{Emacs events}),
5240 which provides an abstract layer on top of the system-dependent 6146 which provides an abstract layer on top of the system-dependent
5241 nature of the most basic events that are received. Part of the 6147 nature of the most basic events that are received. Part of the
5242 complex nature of the XEmacs event collection process involves 6148 complex nature of the XEmacs event collection process involves
5243 converting from the operating-system events into the proper 6149 converting from the operating-system events into the proper
5244 Emacs events -- there may not be a one-to-one correspondence. 6150 Emacs events---there may not be a one-to-one correspondence.
5245 6151
5246 Emacs events are documented in @file{events.h}; I'll discuss them 6152 Emacs events are documented in @file{events.h}; I'll discuss them
5247 later. 6153 later.
5248 6154
5249 @node Main Loop 6155 @node Main Loop, Specifics of the Event Gathering Mechanism, Introduction to Events, Events and the Event Loop
5250 @section Main Loop 6156 @section Main Loop
5251 6157
5252 The @dfn{command loop} is the top-level loop that the editor is always 6158 The @dfn{command loop} is the top-level loop that the editor is always
5253 running. It loops endlessly, calling @code{next-event} to retrieve an 6159 running. It loops endlessly, calling @code{next-event} to retrieve an
5254 event and @code{dispatch-event} to execute it. @code{dispatch-event} does 6160 event and @code{dispatch-event} to execute it. @code{dispatch-event} does
5264 one console), and the engine that looks up keystrokes and 6170 one console), and the engine that looks up keystrokes and
5265 constructs full key sequences is called the @dfn{command builder}. 6171 constructs full key sequences is called the @dfn{command builder}.
5266 This is documented elsewhere. 6172 This is documented elsewhere.
5267 6173
5268 The guts of the command loop are in @code{command_loop_1()}. This 6174 The guts of the command loop are in @code{command_loop_1()}. This
5269 function doesn't catch errors, though -- that's the job of 6175 function doesn't catch errors, though---that's the job of
5270 @code{command_loop_2()}, which is a condition-case (i.e. error-trapping) 6176 @code{command_loop_2()}, which is a condition-case (i.e. error-trapping)
5271 wrapper around @code{command_loop_1()}. @code{command_loop_1()} never 6177 wrapper around @code{command_loop_1()}. @code{command_loop_1()} never
5272 returns, but may get thrown out of. 6178 returns, but may get thrown out of.
5273 6179
5274 When an error occurs, @code{cmd_error()} is called, which usually 6180 When an error occurs, @code{cmd_error()} is called, which usually
5311 wrapper similar to @code{command_loop_2()}. Note also that 6217 wrapper similar to @code{command_loop_2()}. Note also that
5312 @code{initial_command_loop()} sets up a catch for @code{top-level} when 6218 @code{initial_command_loop()} sets up a catch for @code{top-level} when
5313 invoking @code{top_level_1()}, just like when it invokes 6219 invoking @code{top_level_1()}, just like when it invokes
5314 @code{command_loop_2()}. 6220 @code{command_loop_2()}.
5315 6221
5316 @node Specifics of the Event Gathering Mechanism 6222 @node Specifics of the Event Gathering Mechanism, Specifics About the Emacs Event, Main Loop, Events and the Event Loop
5317 @section Specifics of the Event Gathering Mechanism 6223 @section Specifics of the Event Gathering Mechanism
5318 6224
5319 Here is an approximate diagram of the collection processes 6225 Here is an approximate diagram of the collection processes
5320 at work in XEmacs, under TTY's (TTY's are simpler than X 6226 at work in XEmacs, under TTY's (TTY's are simpler than X
5321 so we'll look at this first): 6227 so we'll look at this first):
5550 which repeatedly calls `next-event' 6456 which repeatedly calls `next-event'
5551 and then dispatches the event 6457 and then dispatches the event
5552 using `dispatch-event' 6458 using `dispatch-event'
5553 @end example 6459 @end example
5554 6460
5555 @node Specifics About the Emacs Event 6461 @node Specifics About the Emacs Event, The Event Stream Callback Routines, Specifics of the Event Gathering Mechanism, Events and the Event Loop
5556 @section Specifics About the Emacs Event 6462 @section Specifics About the Emacs Event
5557 6463
5558 @node The Event Stream Callback Routines 6464 @node The Event Stream Callback Routines, Other Event Loop Functions, Specifics About the Emacs Event, Events and the Event Loop
5559 @section The Event Stream Callback Routines 6465 @section The Event Stream Callback Routines
5560 6466
5561 @node Other Event Loop Functions 6467 @node Other Event Loop Functions, Converting Events, The Event Stream Callback Routines, Events and the Event Loop
5562 @section Other Event Loop Functions 6468 @section Other Event Loop Functions
5563 6469
5564 @code{detect_input_pending()} and @code{input-pending-p} look for 6470 @code{detect_input_pending()} and @code{input-pending-p} look for
5565 input by calling @code{event_stream->event_pending_p} and looking in 6471 input by calling @code{event_stream->event_pending_p} and looking in
5566 @code{[V]unread-command-event} and the @code{command_event_queue} (they 6472 @code{[V]unread-command-event} and the @code{command_event_queue} (they
5578 @code{read-char} calls @code{next-command-event} and uses 6484 @code{read-char} calls @code{next-command-event} and uses
5579 @code{event_to_character()} to return the character equivalent. With 6485 @code{event_to_character()} to return the character equivalent. With
5580 the right kind of input method support, it is possible for (read-char) 6486 the right kind of input method support, it is possible for (read-char)
5581 to return a Kanji character. 6487 to return a Kanji character.
5582 6488
5583 @node Converting Events 6489 @node Converting Events, Dispatching Events; The Command Builder, Other Event Loop Functions, Events and the Event Loop
5584 @section Converting Events 6490 @section Converting Events
5585 6491
5586 @code{character_to_event()}, @code{event_to_character()}, 6492 @code{character_to_event()}, @code{event_to_character()},
5587 @code{event-to-character}, and @code{character-to-event} convert between 6493 @code{event-to-character}, and @code{character-to-event} convert between
5588 characters and keypress events corresponding to the characters. If the 6494 characters and keypress events corresponding to the characters. If the
5589 event was not a keypress, @code{event_to_character()} returns -1 and 6495 event was not a keypress, @code{event_to_character()} returns -1 and
5590 @code{event-to-character} returns @code{nil}. These functions convert 6496 @code{event-to-character} returns @code{nil}. These functions convert
5591 between character representation and the split-up event representation 6497 between character representation and the split-up event representation
5592 (keysym plus mod keys). 6498 (keysym plus mod keys).
5593 6499
5594 @node Dispatching Events; The Command Builder 6500 @node Dispatching Events; The Command Builder, , Converting Events, Events and the Event Loop
5595 @section Dispatching Events; The Command Builder 6501 @section Dispatching Events; The Command Builder
5596 6502
5597 Not yet documented. 6503 Not yet documented.
5598 6504
5599 @node Evaluation; Stack Frames; Bindings, Symbols and Variables, Events and the Event Loop, Top 6505 @node Evaluation; Stack Frames; Bindings, Symbols and Variables, Events and the Event Loop, Top
5604 * Dynamic Binding; The specbinding Stack; Unwind-Protects:: 6510 * Dynamic Binding; The specbinding Stack; Unwind-Protects::
5605 * Simple Special Forms:: 6511 * Simple Special Forms::
5606 * Catch and Throw:: 6512 * Catch and Throw::
5607 @end menu 6513 @end menu
5608 6514
5609 @node Evaluation 6515 @node Evaluation, Dynamic Binding; The specbinding Stack; Unwind-Protects, Evaluation; Stack Frames; Bindings, Evaluation; Stack Frames; Bindings
5610 @section Evaluation 6516 @section Evaluation
5611 6517
5612 @code{Feval()} evaluates the form (a Lisp object) that is passed to 6518 @code{Feval()} evaluates the form (a Lisp object) that is passed to
5613 it. Note that evaluation is only non-trivial for two types of objects: 6519 it. Note that evaluation is only non-trivial for two types of objects:
5614 symbols and conses. A symbol is evaluated simply by calling 6520 symbols and conses. A symbol is evaluated simply by calling
5734 @code{call3()} call a function, passing it the argument(s) given (the 6640 @code{call3()} call a function, passing it the argument(s) given (the
5735 arguments are given as separate C arguments rather than being passed as 6641 arguments are given as separate C arguments rather than being passed as
5736 an array). @code{apply1()} uses @code{Fapply()} while the others use 6642 an array). @code{apply1()} uses @code{Fapply()} while the others use
5737 @code{Ffuncall()} to do the real work. 6643 @code{Ffuncall()} to do the real work.
5738 6644
5739 @node Dynamic Binding; The specbinding Stack; Unwind-Protects 6645 @node Dynamic Binding; The specbinding Stack; Unwind-Protects, Simple Special Forms, Evaluation, Evaluation; Stack Frames; Bindings
5740 @section Dynamic Binding; The specbinding Stack; Unwind-Protects 6646 @section Dynamic Binding; The specbinding Stack; Unwind-Protects
5741 6647
5742 @example 6648 @example
5743 struct specbinding 6649 struct specbinding
5744 @{ 6650 @{
5788 a local-variable binding (@code{func} is 0, @code{symbol} is not 6694 a local-variable binding (@code{func} is 0, @code{symbol} is not
5789 @code{nil}, and @code{old_value} holds the old value, which is stored as 6695 @code{nil}, and @code{old_value} holds the old value, which is stored as
5790 the symbol's value). 6696 the symbol's value).
5791 @end enumerate 6697 @end enumerate
5792 6698
5793 @node Simple Special Forms 6699 @node Simple Special Forms, Catch and Throw, Dynamic Binding; The specbinding Stack; Unwind-Protects, Evaluation; Stack Frames; Bindings
5794 @section Simple Special Forms 6700 @section Simple Special Forms
5795 6701
5796 @code{or}, @code{and}, @code{if}, @code{cond}, @code{progn}, 6702 @code{or}, @code{and}, @code{if}, @code{cond}, @code{progn},
5797 @code{prog1}, @code{prog2}, @code{setq}, @code{quote}, @code{function}, 6703 @code{prog1}, @code{prog2}, @code{setq}, @code{quote}, @code{function},
5798 @code{let*}, @code{let}, @code{while} 6704 @code{let*}, @code{let}, @code{while}
5805 Note that, with the exception of @code{Fprogn}, these functions are 6711 Note that, with the exception of @code{Fprogn}, these functions are
5806 typically called in real life only in interpreted code, since the byte 6712 typically called in real life only in interpreted code, since the byte
5807 compiler knows how to convert calls to these functions directly into 6713 compiler knows how to convert calls to these functions directly into
5808 byte code. 6714 byte code.
5809 6715
5810 @node Catch and Throw 6716 @node Catch and Throw, , Simple Special Forms, Evaluation; Stack Frames; Bindings
5811 @section Catch and Throw 6717 @section Catch and Throw
5812 6718
5813 @example 6719 @example
5814 struct catchtag 6720 struct catchtag
5815 @{ 6721 @{
5873 * Introduction to Symbols:: 6779 * Introduction to Symbols::
5874 * Obarrays:: 6780 * Obarrays::
5875 * Symbol Values:: 6781 * Symbol Values::
5876 @end menu 6782 @end menu
5877 6783
5878 @node Introduction to Symbols 6784 @node Introduction to Symbols, Obarrays, Symbols and Variables, Symbols and Variables
5879 @section Introduction to Symbols 6785 @section Introduction to Symbols
5880 6786
5881 A symbol is basically just an object with four fields: a name (a 6787 A symbol is basically just an object with four fields: a name (a
5882 string), a value (some Lisp object), a function (some Lisp object), and 6788 string), a value (some Lisp object), a function (some Lisp object), and
5883 a property list (usually a list of alternating keyword/value pairs). 6789 a property list (usually a list of alternating keyword/value pairs).
5890 there can be a distinct function and variable with the same name. The 6796 there can be a distinct function and variable with the same name. The
5891 property list is used as a more general mechanism of associating 6797 property list is used as a more general mechanism of associating
5892 additional values with particular names, and once again the namespace is 6798 additional values with particular names, and once again the namespace is
5893 independent of the function and variable namespaces. 6799 independent of the function and variable namespaces.
5894 6800
5895 @node Obarrays 6801 @node Obarrays, Symbol Values, Introduction to Symbols, Symbols and Variables
5896 @section Obarrays 6802 @section Obarrays
5897 6803
5898 The identity of symbols with their names is accomplished through a 6804 The identity of symbols with their names is accomplished through a
5899 structure called an obarray, which is just a poorly-implemented hash 6805 structure called an obarray, which is just a poorly-implemented hash
5900 table mapping from strings to symbols whose name is that string. (I say 6806 table mapping from strings to symbols whose name is that string. (I say
5957 a new one, and @code{unintern} to remove a symbol from an obarray. This 6863 a new one, and @code{unintern} to remove a symbol from an obarray. This
5958 returns the removed symbol. (Remember: You can't put the symbol back 6864 returns the removed symbol. (Remember: You can't put the symbol back
5959 into any obarray.) Finally, @code{mapatoms} maps over all of the symbols 6865 into any obarray.) Finally, @code{mapatoms} maps over all of the symbols
5960 in an obarray. 6866 in an obarray.
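
To make the obarray operations described above concrete, here is a minimal C sketch of a chained hash table mapping names to symbols. It is an illustration only: the names (@code{toy_symbol}, @code{toy_intern} and so on), the bucket count and the hash function are all hypothetical, and the real obarray code is organized differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_BUCKETS 31          /* arbitrary size for this sketch */

struct toy_symbol
{
  char *name;
  struct toy_symbol *next;      /* next symbol in the same bucket */
};

struct toy_obarray
{
  struct toy_symbol *bucket[TOY_BUCKETS];
};

static unsigned toy_hash (const char *s)
{
  unsigned h = 0;
  while (*s)
    h = h * 31u + (unsigned char) *s++;
  return h % TOY_BUCKETS;
}

/* Like intern-soft: look NAME up without creating it; NULL if absent. */
struct toy_symbol *toy_intern_soft (struct toy_obarray *ob, const char *name)
{
  struct toy_symbol *sym;
  assert (ob != NULL && name != NULL);
  for (sym = ob->bucket[toy_hash (name)]; sym; sym = sym->next)
    if (strcmp (sym->name, name) == 0)
      return sym;
  return NULL;
}

/* Like intern: return the symbol named NAME, creating it if necessary.
   Returns NULL only if memory runs out. */
struct toy_symbol *toy_intern (struct toy_obarray *ob, const char *name)
{
  unsigned h = toy_hash (name);
  struct toy_symbol *sym = toy_intern_soft (ob, name);
  if (sym)
    return sym;
  sym = malloc (sizeof *sym);
  if (!sym)
    return NULL;
  sym->name = malloc (strlen (name) + 1);
  if (!sym->name)
    {
      free (sym);
      return NULL;
    }
  strcpy (sym->name, name);
  sym->next = ob->bucket[h];    /* push onto the front of the bucket */
  ob->bucket[h] = sym;
  return sym;
}

/* Like unintern: unlink the symbol named NAME and return it, or NULL. */
struct toy_symbol *toy_unintern (struct toy_obarray *ob, const char *name)
{
  struct toy_symbol **link = &ob->bucket[toy_hash (name)];
  for (; *link; link = &(*link)->next)
    if (strcmp ((*link)->name, name) == 0)
      {
        struct toy_symbol *sym = *link;
        *link = sym->next;
        sym->next = NULL;
        return sym;
      }
  return NULL;
}
```

Interning the same name twice yields the same pointer, which is what gives symbols their identity; after @code{toy_unintern}, a later @code{toy_intern} of that name creates a fresh, distinct symbol.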
5961 6867
5962 @node Symbol Values 6868 @node Symbol Values, , Obarrays, Symbols and Variables
5963 @section Symbol Values 6869 @section Symbol Values
5964 6870
5965 The value field of a symbol normally contains a Lisp object. However, 6871 The value field of a symbol normally contains a Lisp object. However,
5966 a symbol can be @dfn{unbound}, meaning that it logically has no value. 6872 a symbol can be @dfn{unbound}, meaning that it logically has no value.
5967 This is internally indicated by storing a special Lisp object, called 6873 This is internally indicated by storing a special Lisp object, called
6012 * Markers and Extents:: Tagging locations within a buffer. 6918 * Markers and Extents:: Tagging locations within a buffer.
6013 * Bufbytes and Emchars:: Representation of individual characters. 6919 * Bufbytes and Emchars:: Representation of individual characters.
6014 * The Buffer Object:: The Lisp object corresponding to a buffer. 6920 * The Buffer Object:: The Lisp object corresponding to a buffer.
6015 @end menu 6921 @end menu
6016 6922
6017 @node Introduction to Buffers 6923 @node Introduction to Buffers, The Text in a Buffer, Buffers and Textual Representation, Buffers and Textual Representation
6018 @section Introduction to Buffers 6924 @section Introduction to Buffers
6019 6925
6020 A buffer is logically just a Lisp object that holds some text. 6926 A buffer is logically just a Lisp object that holds some text.
6021 In this, it is like a string, but a buffer is optimized for 6927 In this, it is like a string, but a buffer is optimized for
6022 frequent insertion and deletion, while a string is not. Furthermore: 6928 frequent insertion and deletion, while a string is not. Furthermore:
6065 and @dfn{buffer of the selected window}, and the distinction between 6971 and @dfn{buffer of the selected window}, and the distinction between
6066 @dfn{point} of the current buffer and @dfn{window-point} of the selected 6972 @dfn{point} of the current buffer and @dfn{window-point} of the selected
6067 window. (This latter distinction is explained in detail in the section 6973 window. (This latter distinction is explained in detail in the section
6068 on windows.) 6974 on windows.)
6069 6975
6070 @node The Text in a Buffer 6976 @node The Text in a Buffer, Buffer Lists, Introduction to Buffers, Buffers and Textual Representation
6071 @section The Text in a Buffer 6977 @section The Text in a Buffer
6072 6978
6073 The text in a buffer consists of a sequence of zero or more 6979 The text in a buffer consists of a sequence of zero or more
6074 characters. A @dfn{character} is an integer that logically represents 6980 characters. A @dfn{character} is an integer that logically represents
6075 a letter, number, space, or other unit of text. Most of the characters 6981 a letter, number, space, or other unit of text. Most of the characters
6205 Bufbytes underscores the fact that we are working with a string of bytes 7111 Bufbytes underscores the fact that we are working with a string of bytes
6206 in the internal Emacs buffer representation rather than in one of a 7112 in the internal Emacs buffer representation rather than in one of a
6207 number of possible alternative representations (e.g. EUC-encoded text, 7113 number of possible alternative representations (e.g. EUC-encoded text,
6208 etc.). 7114 etc.).
6209 7115
6210 @node Buffer Lists 7116 @node Buffer Lists, Markers and Extents, The Text in a Buffer, Buffers and Textual Representation
6211 @section Buffer Lists 7117 @section Buffer Lists
6212 7118
6213 Recall from earlier that buffers are @dfn{permanent} objects, i.e. that 7119 Recall from earlier that buffers are @dfn{permanent} objects, i.e. that
6214 they remain around until explicitly deleted. This entails that there is 7120 they remain around until explicitly deleted. This entails that there is
6215 a list of all the buffers in existence. This list is actually an 7121 a list of all the buffers in existence. This list is actually an
6241 respectively. You can also force a new buffer to be created using 7147 respectively. You can also force a new buffer to be created using
6242 @code{generate-new-buffer}, which takes a name and (if necessary) makes 7148 @code{generate-new-buffer}, which takes a name and (if necessary) makes
6243 a unique name from this by appending a number, and then creates the 7149 a unique name from this by appending a number, and then creates the
6244 buffer. This is basically like the symbol operation @code{gensym}. 7150 buffer. This is basically like the symbol operation @code{gensym}.
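
The name-uniquifying step of @code{generate-new-buffer} can be sketched in C as follows. This is a hedged illustration, not the actual code: the function names are hypothetical, the list of existing names is passed in as a plain array, and the @samp{<N>} suffix format (starting at 2) is an assumption based on how Emacs conventionally names such buffers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return nonzero if NAME is among the COUNT names in EXISTING. */
static int name_in_use (const char *name, const char *const *existing,
                        int count)
{
  int i;
  for (i = 0; i < count; i++)
    if (strcmp (existing[i], name) == 0)
      return 1;
  return 0;
}

/* Store into OUT (OUTSIZE bytes) a name derived from BASE that does not
   occur in EXISTING.  Returns 0 on success, -1 if OUT is too small. */
int unique_buffer_name (const char *base, const char *const *existing,
                        int count, char *out, size_t outsize)
{
  int suffix;
  int n;
  assert (base != NULL && out != NULL);
  n = snprintf (out, outsize, "%s", base);
  if (n < 0 || (size_t) n >= outsize)
    return -1;
  /* Try BASE itself first, then BASE<2>, BASE<3>, ... */
  for (suffix = 2; name_in_use (out, existing, count); suffix++)
    {
      n = snprintf (out, outsize, "%s<%d>", base, suffix);
      if (n < 0 || (size_t) n >= outsize)
        return -1;
    }
  return 0;
}
```

The loop always terminates because each candidate is distinct and only finitely many names are in use.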
6245 7151
6246 @node Markers and Extents 7152 @node Markers and Extents, Bufbytes and Emchars, Buffer Lists, Buffers and Textual Representation
6247 @section Markers and Extents 7153 @section Markers and Extents
6248 7154
6249 Among the things associated with a buffer are things that are 7155 Among the things associated with a buffer are things that are
6250 logically attached to certain buffer positions. This can be used to 7156 logically attached to certain buffer positions. This can be used to
6251 keep track of a buffer position when text is inserted and deleted, so 7157 keep track of a buffer position when text is inserted and deleted, so
6281 is no way to determine what markers are in a buffer if you are just 7187 is no way to determine what markers are in a buffer if you are just
6282 given the buffer. Extents remain in a buffer until they are detached 7188 given the buffer. Extents remain in a buffer until they are detached
6283 (which could happen as a result of text being deleted) or the buffer is 7189 (which could happen as a result of text being deleted) or the buffer is
6284 deleted, and primitives do exist to enumerate the extents in a buffer. 7190 deleted, and primitives do exist to enumerate the extents in a buffer.
6285 7191
6286 @node Bufbytes and Emchars 7192 @node Bufbytes and Emchars, The Buffer Object, Markers and Extents, Buffers and Textual Representation
6287 @section Bufbytes and Emchars 7193 @section Bufbytes and Emchars
6288 7194
6289 Not yet documented. 7195 Not yet documented.
6290 7196
6291 @node The Buffer Object 7197 @node The Buffer Object, , Bufbytes and Emchars, Buffers and Textual Representation
6292 @section The Buffer Object 7198 @section The Buffer Object
6293 7199
6294 Buffers contain fields not directly accessible by the Lisp programmer. 7200 Buffers contain fields not directly accessible by the Lisp programmer.
6295 We describe them here, naming them by the names used in the C code. 7201 We describe them here, naming them by the names used in the C code.
6296 Many are accessible indirectly in Lisp programs via Lisp primitives. 7202 Many are accessible indirectly in Lisp programs via Lisp primitives.
6405 * Encodings:: 7311 * Encodings::
6406 * Internal Mule Encodings:: 7312 * Internal Mule Encodings::
6407 * CCL:: 7313 * CCL::
6408 @end menu 7314 @end menu
6409 7315
6410 @node Character Sets 7316 @node Character Sets, Encodings, MULE Character Sets and Encodings, MULE Character Sets and Encodings
6411 @section Character Sets 7317 @section Character Sets
6412 7318
6413 A character set (or @dfn{charset}) is an ordered set of characters. A 7319 A character set (or @dfn{charset}) is an ordered set of characters. A
6414 particular character in a charset is indexed using one or more 7320 particular character in a charset is indexed using one or more
6415 @dfn{position codes}, which are non-negative integers. The number of 7321 @dfn{position codes}, which are non-negative integers. The number of
6486 160 - 255 Latin-1 32 - 127 7392 160 - 255 Latin-1 32 - 127
6487 @end example 7393 @end example
6488 7394
6489 This is a bit ad-hoc but gets the job done. 7395 This is a bit ad-hoc but gets the job done.
6490 7396
6491 @node Encodings 7397 @node Encodings, Internal Mule Encodings, Character Sets, MULE Character Sets and Encodings
6492 @section Encodings 7398 @section Encodings
6493 7399
6494 An @dfn{encoding} is a way of numerically representing characters from 7400 An @dfn{encoding} is a way of numerically representing characters from
6495 one or more character sets. If an encoding only encompasses one 7401 one or more character sets. If an encoding only encompasses one
6496 character set, then the position codes for the characters in that 7402 character set, then the position codes for the characters in that
6513 @menu 7419 @menu
6514 * Japanese EUC (Extended Unix Code):: 7420 * Japanese EUC (Extended Unix Code)::
6515 * JIS7:: 7421 * JIS7::
6516 @end menu 7422 @end menu
6517 7423
6518 @node Japanese EUC (Extended Unix Code) 7424 @node Japanese EUC (Extended Unix Code), JIS7, Encodings, Encodings
6519 @subsection Japanese EUC (Extended Unix Code) 7425 @subsection Japanese EUC (Extended Unix Code)
6520 7426
6521 This encompasses the character sets Printing-ASCII, Japanese-JISX0201, 7427 This encompasses the character sets Printing-ASCII, Japanese-JISX0201,
6522 and Japanese-JISX0208-Kana (half-width katakana, the right half of 7428 and Japanese-JISX0208-Kana (half-width katakana, the right half of
6523 JISX0201). It uses 8-bit bytes. 7429 JISX0201). It uses 8-bit bytes.
6535 Japanese-JISX0208 PC1 + 0x80 | PC2 + 0x80 7441 Japanese-JISX0208 PC1 + 0x80 | PC2 + 0x80
6536 Japanese-JISX0212 PC1 + 0x80 | PC2 + 0x80 7442 Japanese-JISX0212 PC1 + 0x80 | PC2 + 0x80
6537 @end example 7443 @end example
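
As a small worked instance of the table above, the following C sketch encodes one Japanese-JISX0208 character in EUC by adding 0x80 to each position code. The function name is hypothetical, and the accepted range 33 - 126 for each position code is an assumption based on JISX0208 being a 94x94 character set.

```c
#include <assert.h>
#include <stddef.h>

/* Encode the Japanese-JISX0208 character with position codes PC1 and
   PC2 into OUT.  Returns the number of bytes written (2), or 0 if a
   position code is outside the assumed range 33..126. */
size_t euc_encode_jisx0208 (unsigned pc1, unsigned pc2, unsigned char *out)
{
  assert (out != NULL);
  if (pc1 < 33 || pc1 > 126 || pc2 < 33 || pc2 > 126)
    return 0;
  out[0] = (unsigned char) (pc1 + 0x80);
  out[1] = (unsigned char) (pc2 + 0x80);
  return 2;
}
```

For example, the character with position codes 0x24 and 0x22 is encoded as the two bytes 0xA4 0xA2.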
6538 7444
6539 7445
6540 @node JIS7 7446 @node JIS7, , Japanese EUC (Extended Unix Code), Encodings
6541 @subsection JIS7 7447 @subsection JIS7
6542 7448
6543 This encompasses the character sets Printing-ASCII, 7449 This encompasses the character sets Printing-ASCII,
6544 Japanese-JISX0201-Roman (the left half of JISX0201; this character set 7450 Japanese-JISX0201-Roman (the left half of JISX0201; this character set
6545 is very similar to Printing-ASCII and is a 94-character charset), 7451 is very similar to Printing-ASCII and is a 94-character charset),
6570 0x1B 0x28 0x42 ESC ( B invoke Printing-ASCII 7476 0x1B 0x28 0x42 ESC ( B invoke Printing-ASCII
6571 @end example 7477 @end example
6572 7478
6573 Initially, Printing-ASCII is invoked. 7479 Initially, Printing-ASCII is invoked.
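
Because JIS7 is modal, a decoder has to carry state from one byte to the next. The C sketch below shows the idea for just two states. It handles @samp{ESC ( B} from the table above, plus @samp{ESC $ B} for Japanese-JISX0208, which is taken from the ISO-2022-JP convention and is an assumption here rather than something shown in this section. All names are hypothetical, and any other escape sequence is simply rejected.

```c
#include <assert.h>
#include <stddef.h>

enum jis7_charset { JIS7_ASCII, JIS7_JISX0208 };

/* Count the characters in SRC[0..LEN), tracking which character set is
   currently invoked.  Stores the final state in *FINAL and returns the
   character count, or -1 for truncated or unrecognized input. */
long jis7_count_chars (const unsigned char *src, size_t len,
                       enum jis7_charset *final)
{
  enum jis7_charset cs = JIS7_ASCII;   /* initially, ASCII is invoked */
  long count = 0;
  size_t i = 0;

  assert (src != NULL && final != NULL);
  while (i < len)
    {
      if (src[i] == 0x1B)              /* ESC starts a 3-byte sequence */
        {
          if (i + 2 >= len)
            return -1;
          if (src[i + 1] == 0x28 && src[i + 2] == 0x42)
            cs = JIS7_ASCII;           /* ESC ( B */
          else if (src[i + 1] == 0x24 && src[i + 2] == 0x42)
            cs = JIS7_JISX0208;        /* ESC $ B (assumed) */
          else
            return -1;
          i += 3;
        }
      else if (cs == JIS7_JISX0208)    /* two bytes per character */
        {
          if (i + 1 >= len)
            return -1;
          i += 2;
          count++;
        }
      else                             /* one byte per character */
        {
          i++;
          count++;
        }
    }
  *final = cs;
  return count;
}
```

The same byte pair means different things depending on the preceding escape sequence, which is exactly what distinguishes a modal encoding from a non-modal one such as EUC.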
6574 7480
6575 @node Internal Mule Encodings 7481 @node Internal Mule Encodings, CCL, Encodings, MULE Character Sets and Encodings
6576 @section Internal Mule Encodings 7482 @section Internal Mule Encodings
6577 7483
6578 In XEmacs/Mule, each character set is assigned a unique number, called a 7484 In XEmacs/Mule, each character set is assigned a unique number, called a
6579 @dfn{leading byte}. This is used in the encodings of a character. 7485 @dfn{leading byte}. This is used in the encodings of a character.
6580 Leading bytes are in the range 0x80 - 0xFF (except for ASCII, which has 7486 Leading bytes are in the range 0x80 - 0xFF (except for ASCII, which has
6616 @menu 7522 @menu
6617 * Internal String Encoding:: 7523 * Internal String Encoding::
6618 * Internal Character Encoding:: 7524 * Internal Character Encoding::
6619 @end menu 7525 @end menu
6620 7526
6621 @node Internal String Encoding 7527 @node Internal String Encoding, Internal Character Encoding, Internal Mule Encodings, Internal Mule Encodings
6622 @subsection Internal String Encoding 7528 @subsection Internal String Encoding
6623 7529
6624 ASCII characters are encoded using their position code directly. Other 7530 ASCII characters are encoded using their position code directly. Other
6625 characters are encoded using their leading byte followed by their 7531 characters are encoded using their leading byte followed by their
6626 position code(s) with the high bit set. Characters in private character 7532 position code(s) with the high bit set. Characters in private character
6666 None of the standard non-modal encodings meet all of these 7572 None of the standard non-modal encodings meet all of these
6667 conditions. For example, EUC satisfies only (2) and (3), while 7573 conditions. For example, EUC satisfies only (2) and (3), while
6668 Shift-JIS and Big5 (not yet described) satisfy only (2). (All 7574 Shift-JIS and Big5 (not yet described) satisfy only (2). (All
6669 non-modal encodings must satisfy (2), in order to be unambiguous.) 7575 non-modal encodings must satisfy (2), in order to be unambiguous.)
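
The basic rule stated at the start of this subsection (ASCII as a bare position code, anything else as a leading byte followed by position codes with the high bit set) can be sketched in C like this. It is a simplified illustration: the function name and the @code{is_ascii} flag are hypothetical, and characters in private character sets, which are treated separately, are deliberately not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Encode one character into OUT and return the number of bytes used.
   For ASCII, POS[0] is emitted unchanged.  Otherwise LEADING_BYTE is
   emitted first, followed by each of the NPOS position codes with the
   high bit set.  OUT must have room for 1 + NPOS bytes. */
size_t toy_encode_char (int is_ascii, unsigned char leading_byte,
                        const unsigned char *pos, size_t npos,
                        unsigned char *out)
{
  size_t n = 0;
  size_t i;

  assert (pos != NULL && out != NULL && npos >= 1);
  if (is_ascii)
    {
      out[0] = pos[0];
      return 1;
    }
  out[n++] = leading_byte;
  for (i = 0; i < npos; i++)
    out[n++] = (unsigned char) (pos[i] | 0x80);
  return n;
}
```

With a placeholder leading byte of 0x92 and position codes 0x24 0x22, this produces the three bytes 0x92 0xA4 0xA2. Every byte of a non-ASCII character has its high bit set, so ASCII bytes never appear inside a multi-byte character.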
6670 7576
6671 @node Internal Character Encoding 7577 @node Internal Character Encoding, , Internal String Encoding, Internal Mule Encodings
6672 @subsection Internal Character Encoding 7578 @subsection Internal Character Encoding
6673 7579
6674 One 19-bit word represents a single character. The word is 7580 One 19-bit word represents a single character. The word is
6675 separated into three fields: 7581 separated into three fields:
6676 7582
6701 @end example 7607 @end example
6702 7608
6703 Note that character codes 0 - 255 are the same as the ``binary encoding'' 7609 Note that character codes 0 - 255 are the same as the ``binary encoding''
6704 described above. 7610 described above.
6705 7611
6706 @node CCL 7612 @node CCL, , Internal Mule Encodings, MULE Character Sets and Encodings
6707 @section CCL 7613 @section CCL
6708 7614
6709 @example 7615 @example
6710 CCL PROGRAM SYNTAX: 7616 CCL PROGRAM SYNTAX:
6711 CCL_PROGRAM := (CCL_MAIN_BLOCK 7617 CCL_PROGRAM := (CCL_MAIN_BLOCK
6892 * Lstream Types:: Different sorts of things that are streamed. 7798 * Lstream Types:: Different sorts of things that are streamed.
6893 * Lstream Functions:: Functions for working with lstreams. 7799 * Lstream Functions:: Functions for working with lstreams.
6894 * Lstream Methods:: Creating new lstream types. 7800 * Lstream Methods:: Creating new lstream types.
6895 @end menu 7801 @end menu
6896 7802
6897 @node Creating an Lstream 7803 @node Creating an Lstream, Lstream Types, Lstreams, Lstreams
6898 @section Creating an Lstream 7804 @section Creating an Lstream
6899 7805
6900 Lstreams come in different types, depending on what is being interfaced 7806 Lstreams come in different types, depending on what is being interfaced
6901 to. Although the primitive for creating new lstreams is 7807 to. Although the primitive for creating new lstreams is
6902 @code{Lstream_new()}, generally you do not call this directly. Instead, 7808 @code{Lstream_new()}, generally you do not call this directly. Instead,
6923 Open for reading, but ``read'' never returns partial MULE characters. 7829 Open for reading, but ``read'' never returns partial MULE characters.
6924 @item "wc" 7830 @item "wc"
6925 Open for writing, but never writes partial MULE characters. 7831 Open for writing, but never writes partial MULE characters.
6926 @end table 7832 @end table
6927 7833
6928 @node Lstream Types 7834 @node Lstream Types, Lstream Functions, Creating an Lstream, Lstreams
6929 @section Lstream Types 7835 @section Lstream Types
6930 7836
6931 @table @asis 7837 @table @asis
6932 @item stdio 7838 @item stdio
6933 7839
6948 @item decoding 7854 @item decoding
6949 7855
6950 @item encoding 7856 @item encoding
6951 @end table 7857 @end table
6952 7858
6953 @node Lstream Functions 7859 @node Lstream Functions, Lstream Methods, Lstream Types, Lstreams
6954 @section Lstream Functions 7860 @section Lstream Functions
6955 7861
6956 @deftypefun {Lstream *} Lstream_new (Lstream_implementation *@var{imp}, CONST char *@var{mode}) 7862 @deftypefun {Lstream *} Lstream_new (Lstream_implementation *@var{imp}, const char *@var{mode})
6957 Allocate and return a new Lstream. This function is not really meant to 7863 Allocate and return a new Lstream. This function is not really meant to
6958 be called directly; rather, each stream type should provide its own 7864 be called directly; rather, each stream type should provide its own
6959 stream creation function, which creates the stream and does any other 7865 stream creation function, which creates the stream and does any other
6960 necessary creation stuff (e.g. opening a file). 7866 necessary creation stuff (e.g. opening a file).
6961 @end deftypefun 7867 @end deftypefun
6984 @end deftypefn 7890 @end deftypefn
6985 7891
6986 @deftypefn Macro void Lstream_ungetc (Lstream *@var{stream}, int @var{c}) 7892 @deftypefn Macro void Lstream_ungetc (Lstream *@var{stream}, int @var{c})
6987 Push one byte back onto the input queue. This will be the next byte 7893 Push one byte back onto the input queue. This will be the next byte
6988 read from the stream. Any number of bytes can be pushed back and will 7894 read from the stream. Any number of bytes can be pushed back and will
6989 be read in the reverse order they were pushed back -- most recent 7895 be read in the reverse order they were pushed back---most recent
6990 first. (This is necessary for consistency -- if there are a number of 7896 first. (This is necessary for consistency---if there are a number of
6991 bytes that have been unread and I read and unread a byte, it needs to be 7897 bytes that have been unread and I read and unread a byte, it needs to be
6992 the first to be read again.) This is a macro and so it is very 7898 the first to be read again.) This is a macro and so it is very
6993 efficient. The @var{c} argument is only evaluated once but the @var{stream} 7899 efficient. The @var{c} argument is only evaluated once but the @var{stream}
6994 argument is evaluated more than once. 7900 argument is evaluated more than once.
6995 @end deftypefn 7901 @end deftypefn
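
The last-in, first-out behavior of pushed-back bytes described for @code{Lstream_ungetc} can be modelled with a small stack. The C sketch below is a toy model with hypothetical names, not the lstream implementation: it reads from a fixed memory buffer and caps the pushback stack at an arbitrary size, whereas the text above allows any number of bytes to be pushed back.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_UNGET_MAX 16        /* arbitrary cap for this sketch */

struct toy_stream
{
  const unsigned char *data;    /* underlying input */
  size_t len;
  size_t pos;
  unsigned char unget[TOY_UNGET_MAX];   /* pushback stack */
  size_t n_unget;
};

/* Return the next byte, preferring pushed-back bytes; -1 at EOF. */
int toy_getc (struct toy_stream *s)
{
  assert (s != NULL);
  if (s->n_unget > 0)
    return s->unget[--s->n_unget];      /* most recent pushback first */
  if (s->pos < s->len)
    return s->data[s->pos++];
  return -1;
}

/* Push C back so that it is the next byte read.  Returns 0, or -1 if
   the toy pushback stack is full. */
int toy_ungetc (struct toy_stream *s, int c)
{
  assert (s != NULL);
  if (s->n_unget == TOY_UNGET_MAX)
    return -1;
  s->unget[s->n_unget++] = (unsigned char) c;
  return 0;
}
```

Pushing back @samp{x} and then @samp{y} makes the next two reads return @samp{y} and then @samp{x}, after which reading resumes from the underlying data.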
6998 @deftypefunx int Lstream_fgetc (Lstream *@var{stream}) 7904 @deftypefunx int Lstream_fgetc (Lstream *@var{stream})
6999 @deftypefunx void Lstream_fungetc (Lstream *@var{stream}, int @var{c}) 7905 @deftypefunx void Lstream_fungetc (Lstream *@var{stream}, int @var{c})
7000 Function equivalents of the above macros. 7906 Function equivalents of the above macros.
7001 @end deftypefun 7907 @end deftypefun
7002 7908
7003 @deftypefun int Lstream_read (Lstream *@var{stream}, void *@var{data}, int @var{size}) 7909 @deftypefun ssize_t Lstream_read (Lstream *@var{stream}, void *@var{data}, size_t @var{size})
7004 Read @var{size} bytes of @var{data} from the stream. Return the number 7910 Read @var{size} bytes of @var{data} from the stream. Return the number
7005 of bytes read. 0 means EOF. -1 means an error occurred and no bytes 7911 of bytes read. 0 means EOF. -1 means an error occurred and no bytes
7006 were read. 7912 were read.
7007 @end deftypefun 7913 @end deftypefun
7008 7914
7009 @deftypefun int Lstream_write (Lstream *@var{stream}, void *@var{data}, int @var{size}) 7915 @deftypefun ssize_t Lstream_write (Lstream *@var{stream}, void *@var{data}, size_t @var{size})
7010 Write @var{size} bytes of @var{data} to the stream. Return the number 7916 Write @var{size} bytes of @var{data} to the stream. Return the number
7011 of bytes written. -1 means an error occurred and no bytes were written. 7917 of bytes written. -1 means an error occurred and no bytes were written.
7012 @end deftypefun 7918 @end deftypefun
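
The return-value convention above (a byte count, 0 for EOF, -1 for an error with nothing transferred) is the familiar one from @code{read()}, and a caller interprets it the same way. The C sketch below shows such a caller against a stand-in read function. Everything in it is hypothetical (@code{toy_read_fn}, @code{toy_read_fully}, the memory-backed stream), and it does not call the real lstream API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>          /* ssize_t */

/* Stand-in for a read routine using the convention described above. */
typedef ssize_t (*toy_read_fn) (void *stream, void *data, size_t size);

/* Keep reading until SIZE bytes have arrived, EOF, or an error.
   Returns the total read, or -1 if an error occurs before any bytes. */
ssize_t toy_read_fully (toy_read_fn read_fn, void *stream,
                        void *data, size_t size)
{
  size_t total = 0;
  assert (read_fn != NULL && data != NULL);
  while (total < size)
    {
      ssize_t n = read_fn (stream, (unsigned char *) data + total,
                           size - total);
      if (n < 0)
        return total > 0 ? (ssize_t) total : -1;
      if (n == 0)               /* EOF */
        break;
      total += (size_t) n;
    }
  return (ssize_t) total;
}

/* Demo stream: a memory buffer that hands out at most 2 bytes a call. */
struct toy_mem
{
  const unsigned char *buf;
  size_t len;
  size_t pos;
};

ssize_t toy_mem_read (void *stream, void *data, size_t size)
{
  struct toy_mem *m = stream;
  size_t n = m->len - m->pos;
  if (n > size)
    n = size;
  if (n > 2)
    n = 2;                      /* deliberately short reads */
  memcpy (data, m->buf + m->pos, n);
  m->pos += n;
  return (ssize_t) n;
}
```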
7013 7919
7014 @deftypefun void Lstream_unread (Lstream *@var{stream}, void *@var{data}, int @var{size}) 7920 @deftypefun void Lstream_unread (Lstream *@var{stream}, void *@var{data}, size_t @var{size})
7015 Push back @var{size} bytes of @var{data} onto the input queue. The next 7921 Push back @var{size} bytes of @var{data} onto the input queue. The next
7016 call to @code{Lstream_read()} with the same size will read the same 7922 call to @code{Lstream_read()} with the same size will read the same
7017 bytes back. Note that this will be the case even if there is other 7923 bytes back. Note that this will be the case even if there is other
7018 pending unread data. 7924 pending unread data.
7019 @end deftypefun 7925 @end deftypefun
7023 @end deftypefun 7929 @end deftypefun
7024 7930
7025 @deftypefun void Lstream_reopen (Lstream *@var{stream}) 7931 @deftypefun void Lstream_reopen (Lstream *@var{stream})
7026 Reopen a closed stream. This enables I/O on it again. This is not 7932 Reopen a closed stream. This enables I/O on it again. This is not
7027 meant to be called except from a wrapper routine that reinitializes 7933 meant to be called except from a wrapper routine that reinitializes
7028 variables and such -- the close routine may well have freed some 7934 variables and such---the close routine may well have freed some
7029 necessary storage structures, for example. 7935 necessary storage structures, for example.
7030 @end deftypefun 7936 @end deftypefun
7031 7937
7032 @deftypefun void Lstream_rewind (Lstream *@var{stream}) 7938 @deftypefun void Lstream_rewind (Lstream *@var{stream})
7033 Rewind the stream to the beginning. 7939 Rewind the stream to the beginning.
7034 @end deftypefun 7940 @end deftypefun
7035 7941
7036 @node Lstream Methods 7942 @node Lstream Methods, , Lstream Functions, Lstreams
7037 @section Lstream Methods 7943 @section Lstream Methods
7038 7944
7039 @deftypefn {Lstream Method} int reader (Lstream *@var{stream}, unsigned char *@var{data}, int @var{size}) 7945 @deftypefn {Lstream Method} ssize_t reader (Lstream *@var{stream}, unsigned char *@var{data}, size_t @var{size})
7040 Read some data from the stream's end and store it into @var{data}, which 7946 Read some data from the stream's end and store it into @var{data}, which
7041 can hold @var{size} bytes. Return the number of bytes read. A return 7947 can hold @var{size} bytes. Return the number of bytes read. A return
7042 value of 0 means no bytes can be read at this time. This may be because 7948 value of 0 means no bytes can be read at this time. This may be because
7043 of an EOF, or because there is a granularity greater than one byte that 7949 of an EOF, or because there is a granularity greater than one byte that
7044 the stream imposes on the returned data, and @var{size} is less than 7950 the stream imposes on the returned data, and @var{size} is less than
7051 calls @code{Lstream_read()} with a very small size. 7957 calls @code{Lstream_read()} with a very small size.
7052 7958
7053 This function can be @code{NULL} if the stream is output-only. 7959 This function can be @code{NULL} if the stream is output-only.
7054 @end deftypefn 7960 @end deftypefn
7055 7961
7056 @deftypefn {Lstream Method} int writer (Lstream *@var{stream}, CONST unsigned char *@var{data}, int @var{size}) 7962 @deftypefn {Lstream Method} ssize_t writer (Lstream *@var{stream}, const unsigned char *@var{data}, size_t @var{size})
7057 Send some data to the stream's end. Data to be sent is in @var{data} 7963 Send some data to the stream's end. Data to be sent is in @var{data}
7058 and is @var{size} bytes. Return the number of bytes sent. This 7964 and is @var{size} bytes. Return the number of bytes sent. This
7059 function can send and return fewer bytes than is passed in; in that 7965 function can send and return fewer bytes than is passed in; in that
7060 case, the function will just be called again until there is no data left 7966 case, the function will just be called again until there is no data left
7061 or 0 is returned. A return value of 0 means that no more data can be 7967 or 0 is returned. A return value of 0 means that no more data can be
7069 @deftypefn {Lstream Method} int rewinder (Lstream *@var{stream}) 7975 @deftypefn {Lstream Method} int rewinder (Lstream *@var{stream})
7070 Rewind the stream. If this is @code{NULL}, the stream is not seekable. 7976 Rewind the stream. If this is @code{NULL}, the stream is not seekable.
7071 @end deftypefn 7977 @end deftypefn
7072 7978
7073 @deftypefn {Lstream Method} int seekable_p (Lstream *@var{stream}) 7979 @deftypefn {Lstream Method} int seekable_p (Lstream *@var{stream})
7074 Indicate whether this stream is seekable -- i.e. it can be rewound. 7980 Indicate whether this stream is seekable---i.e. it can be rewound.
7075 This method is ignored if the stream does not have a rewind method. If 7981 This method is ignored if the stream does not have a rewind method. If
7076 this method is not present, the result is determined by whether a rewind 7982 this method is not present, the result is determined by whether a rewind
7077 method is present. 7983 method is present.
7078 @end deftypefn 7984 @end deftypefn
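
The interaction between the @code{rewinder} and @code{seekable_p} methods can be summarized as a three-way decision. Here is a hedged C sketch of that rule, using a hypothetical two-member method table rather than the real @code{Lstream_implementation} structure:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical method table holding only the two relevant methods. */
struct toy_impl
{
  int (*rewinder) (void *stream);
  int (*seekable_p) (void *stream);
};

/* Decide whether STREAM can be rewound, following the rule above. */
int toy_stream_seekable (const struct toy_impl *imp, void *stream)
{
  assert (imp != NULL);
  if (imp->rewinder == NULL)
    return 0;                   /* no rewinder: seekable_p is ignored */
  if (imp->seekable_p != NULL)
    return imp->seekable_p (stream) != 0;
  return 1;                     /* rewinder present, no seekable_p */
}

/* Trivial methods used only to exercise the rule. */
int toy_noop_rewind (void *stream) { (void) stream; return 0; }
int toy_always_yes (void *stream) { (void) stream; return 1; }
int toy_always_no (void *stream) { (void) stream; return 0; }
```

A stream type that can sometimes be rewound (say, depending on what it is attached to) would supply both methods; one that can always be rewound needs only the rewinder.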
7079 7985
7106 * Point:: 8012 * Point::
7107 * Window Hierarchy:: 8013 * Window Hierarchy::
7108 * The Window Object:: 8014 * The Window Object::
7109 @end menu 8015 @end menu
7110 8016
7111 @node Introduction to Consoles; Devices; Frames; Windows 8017 @node Introduction to Consoles; Devices; Frames; Windows, Point, Consoles; Devices; Frames; Windows, Consoles; Devices; Frames; Windows
7112 @section Introduction to Consoles; Devices; Frames; Windows 8018 @section Introduction to Consoles; Devices; Frames; Windows
7113 8019
7114 A window-system window that you see on the screen is called a 8020 A window-system window that you see on the screen is called a
7115 @dfn{frame} in Emacs terminology. Each frame is subdivided into one or 8021 @dfn{frame} in Emacs terminology. Each frame is subdivided into one or
7116 more non-overlapping panes, called (confusingly) @dfn{windows}. Each 8022 more non-overlapping panes, called (confusingly) @dfn{windows}. Each
7148 window, but every frame remembers the last window in it that was 8054 window, but every frame remembers the last window in it that was
7149 selected, and changing the selected frame causes the remembered window 8055 selected, and changing the selected frame causes the remembered window
7150 within it to become the selected window. Similar relationships apply 8056 within it to become the selected window. Similar relationships apply
7151 for consoles to devices and devices to frames. 8057 for consoles to devices and devices to frames.
7152 8058
7153 @node Point 8059 @node Point, Window Hierarchy, Introduction to Consoles; Devices; Frames; Windows, Consoles; Devices; Frames; Windows
7154 @section Point 8060 @section Point
7155 8061
7156 Recall that every buffer has a current insertion position, called 8062 Recall that every buffer has a current insertion position, called
7157 @dfn{point}. Now, two or more windows may be displaying the same buffer, 8063 @dfn{point}. Now, two or more windows may be displaying the same buffer,
7158 and the text cursor in the two windows (i.e. @code{point}) can be in 8064 and the text cursor in the two windows (i.e. @code{point}) can be in
7169 want to retrieve the correct value of @code{point} for a window, 8075 want to retrieve the correct value of @code{point} for a window,
7170 you must special-case on the selected window and retrieve the 8076 you must special-case on the selected window and retrieve the
7171 buffer's point instead. This is related to why @code{save-window-excursion} 8077 buffer's point instead. This is related to why @code{save-window-excursion}
7172 does not save the selected window's value of @code{point}. 8078 does not save the selected window's value of @code{point}.
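
The special case described above is easy to get wrong, so here is a minimal C sketch of it. The structures are hypothetical and heavily simplified: in particular, positions are plain integers here, whereas a real window stores its stashed point as a marker.

```c
#include <assert.h>
#include <stddef.h>

struct toy_pt_buffer
{
  long point;                   /* point of the buffer itself */
};

struct toy_pt_window
{
  struct toy_pt_buffer *buffer; /* buffer displayed in this window */
  long pointm;                  /* stashed point; stale if selected */
};

/* Return the correct value of point for window W, given which window
   is currently SELECTED. */
long toy_window_point (const struct toy_pt_window *w,
                       const struct toy_pt_window *selected)
{
  assert (w != NULL && w->buffer != NULL);
  if (w == selected)
    return w->buffer->point;    /* selected window: ask the buffer */
  return w->pointm;             /* any other window: use stashed value */
}
```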
7173 8079
7174 @node Window Hierarchy 8080 @node Window Hierarchy, The Window Object, Point, Consoles; Devices; Frames; Windows
7175 @section Window Hierarchy 8081 @section Window Hierarchy
7176 @cindex window hierarchy 8082 @cindex window hierarchy
7177 @cindex hierarchy of windows 8083 @cindex hierarchy of windows
7178 8084
7179 If a frame contains multiple windows (panes), they are always created 8085 If a frame contains multiple windows (panes), they are always created
7238 @dfn{one above the other}. 8144 @dfn{one above the other}.
7239 8145
7240 @item 8146 @item
7241 Leaf windows also have markers in their @code{start} (the 8147 Leaf windows also have markers in their @code{start} (the
7242 first buffer position displayed in the window) and @code{pointm} 8148 first buffer position displayed in the window) and @code{pointm}
7243 (the window's stashed value of @code{point} -- see above) fields, 8149 (the window's stashed value of @code{point}---see above) fields,
7244 while combination windows have nil in these fields. 8150 while combination windows have nil in these fields.
7245 8151
7246 @item 8152 @item
7247 The list of children for a window is threaded through the 8153 The list of children for a window is threaded through the
7248 @code{next} and @code{prev} fields of each child window. 8154 @code{next} and @code{prev} fields of each child window.
7254 does nothing except set a special @code{dead} bit to 1 and clear out the 8160 does nothing except set a special @code{dead} bit to 1 and clear out the
7255 @code{next}, @code{prev}, @code{hchild}, and @code{vchild} fields, for 8161 @code{next}, @code{prev}, @code{hchild}, and @code{vchild} fields, for
7256 GC purposes. 8162 GC purposes.
7257 8163
7258 @item 8164 @item
7259 Most frames actually have two top-level windows -- one for the 8165 Most frames actually have two top-level windows---one for the
7260 minibuffer and one (the @dfn{root}) for everything else. The modeline 8166 minibuffer and one (the @dfn{root}) for everything else. The modeline
7261 (if present) separates these two. The @code{next} field of the root 8167 (if present) separates these two. The @code{next} field of the root
7262 points to the minibuffer, and the @code{prev} field of the minibuffer 8168 points to the minibuffer, and the @code{prev} field of the minibuffer
7263 points to the root. The other @code{next} and @code{prev} fields are 8169 points to the root. The other @code{next} and @code{prev} fields are
7264 @code{nil}, and the frame points to both of these windows. 8170 @code{nil}, and the frame points to both of these windows.
7267 frames have no root window, and the @code{next} of the minibuffer window 8173 frames have no root window, and the @code{next} of the minibuffer window
7268 is @code{nil} but the @code{prev} points to itself. (#### This is an 8174 is @code{nil} but the @code{prev} points to itself. (#### This is an
7269 artifact that should be fixed.) 8175 artifact that should be fixed.)
7270 @end enumerate 8176 @end enumerate
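The top-level linkage described in the list above can be sketched in C. This is a minimal illustration, not the actual XEmacs code: the names `toy_window`, `toy_frame`, `toy_link_top_level` and `toy_mark_deleted` are hypothetical, and raw pointers with `NULL` stand in for the Lisp objects and `nil` used by the real structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins: the real structures hold Lisp
   objects (with nil for "no window"), not raw C pointers. */
struct toy_window {
  struct toy_window *next, *prev;     /* sibling links */
  struct toy_window *hchild, *vchild; /* first child, for combination windows */
  int dead;                           /* set to 1 once the window is deleted */
};

struct toy_frame {
  struct toy_window *root_window;       /* NULL on a minibuffer-only frame */
  struct toy_window *minibuffer_window;
};

/* Link the top-level windows of a frame as the list above describes. */
static void
toy_link_top_level (struct toy_frame *f, struct toy_window *root,
                    struct toy_window *mini)
{
  assert (mini != NULL); /* this sketch only models frames with a minibuffer */
  f->root_window = root;
  f->minibuffer_window = mini;
  mini->next = NULL;
  if (root != NULL)
    {
      root->prev = NULL;
      root->next = mini; /* root's next is the minibuffer */
      mini->prev = root; /* minibuffer's prev is the root */
    }
  else
    mini->prev = mini;   /* the minibuffer-only artifact: prev is itself */
}

/* Deleting a window only flags it and clears its links, for GC purposes. */
static void
toy_mark_deleted (struct toy_window *w)
{
  w->dead = 1;
  w->next = w->prev = w->hchild = w->vchild = NULL;
}
```

Calling `toy_link_top_level` with a `NULL` root reproduces the minibuffer-only case, where `prev` ends up pointing at the minibuffer itself.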
7271 8177
7272 @node The Window Object 8178 @node The Window Object, , Window Hierarchy, Consoles; Devices; Frames; Windows
7273 @section The Window Object 8179 @section The Window Object
7274 8180
7275 Windows have the following accessible fields: 8181 Windows have the following accessible fields:
7276 8182
7277 @table @code 8183 @table @code
7396 @end enumerate 8302 @end enumerate
7397 8303
7398 @menu 8304 @menu
7399 * Critical Redisplay Sections:: 8305 * Critical Redisplay Sections::
7400 * Line Start Cache:: 8306 * Line Start Cache::
8307 * Redisplay Piece by Piece::
7401 @end menu 8308 @end menu
7402 8309
7403 @node Critical Redisplay Sections 8310 @node Critical Redisplay Sections, Line Start Cache, The Redisplay Mechanism, The Redisplay Mechanism
7404 @section Critical Redisplay Sections 8311 @section Critical Redisplay Sections
7405 @cindex critical redisplay sections 8312 @cindex critical redisplay sections
7406 8313
7407 Within this section, we are defenseless and assume that the 8314 Within this section, we are defenseless and assume that the
7408 following cannot happen: 8315 following cannot happen:
7430 we simply return. #### We should abort instead. 8337 we simply return. #### We should abort instead.
7431 8338
7432 #### If a frame-size change does occur we should probably 8339 #### If a frame-size change does occur we should probably
7433 actually be preempting redisplay. 8340 actually be preempting redisplay.
7434 8341
7435 @node Line Start Cache 8342 @node Line Start Cache, Redisplay Piece by Piece, Critical Redisplay Sections, The Redisplay Mechanism
7436 @section Line Start Cache 8343 @section Line Start Cache
7437 @cindex line start cache 8344 @cindex line start cache
7438 8345
7439 The traditional scrolling code in Emacs breaks in a variable height 8346 The traditional scrolling code in Emacs breaks in a variable height
7440 world. It depends on the key assumption that the number of lines that 8347 world. It depends on the key assumption that the number of lines that
7474 information basically for free. In those cases where a user is simply 8381 information basically for free. In those cases where a user is simply
7475 scrolling around viewing a buffer there is a high probability that this 8382 scrolling around viewing a buffer there is a high probability that this
7476 is sufficient to always provide the needed information. The second 8383 is sufficient to always provide the needed information. The second
7477 thing we can do is be smart about invalidating the cache. 8384 thing we can do is be smart about invalidating the cache.
7478 8385
7479 TODO -- Be smart about invalidating the cache. Potential places: 8386 TODO---Be smart about invalidating the cache. Potential places:
7480 8387
7481 @itemize @bullet 8388 @itemize @bullet
7482 @item 8389 @item
7483 Insertions at end-of-line which don't cause line-wraps do not alter the 8390 Insertions at end-of-line which don't cause line-wraps do not alter the
7484 starting positions of any display lines. These types of buffer 8391 starting positions of any display lines. These types of buffer
7491 @end itemize 8398 @end itemize
7492 8399
7493 In case you're wondering, the Second Golden Rule of Redisplay is not 8400 In case you're wondering, the Second Golden Rule of Redisplay is not
7494 applicable. 8401 applicable.
7495 8402
7496 @node Extents, Faces and Glyphs, The Redisplay Mechanism, Top 8403 @node Redisplay Piece by Piece, , Line Start Cache, The Redisplay Mechanism
8404 @section Redisplay Piece by Piece
8405 @cindex Redisplay Piece by Piece
8406
8407 As you can begin to see, redisplay is complex and also not well
8408 documented. Chuck no longer works on XEmacs, so this section is my take
8409 on the workings of redisplay.
8410
8411 Redisplay happens in three phases:
8412
8413 @enumerate
8414 @item
8415 Determine the desired display in the area that needs redisplay.
8416 Implemented by @code{redisplay.c}.
8417 @item
8418 Compare the desired display with the current display.
8419 Implemented by @code{redisplay-output.c}.
8420 @item
8421 Output the changes. Implemented by @code{redisplay-output.c},
8422 @code{redisplay-x.c}, @code{redisplay-msw.c} and @code{redisplay-tty.c}.
8423 @end enumerate
8424
8425 Steps 1 and 2 are device-independent and relatively complex. Step 3 is
8426 mostly device-dependent.
8427
8428 Determining the desired display
8429
8430 Display attributes are stored in @code{display_line} structures. Each
8431 @code{display_line} consists of a set of @code{display_block}'s and each
8432 @code{display_block} contains a number of @code{rune}'s. Generally
8433 dynarr's of @code{display_line}'s are held by each window representing
8434 the current display and the desired display.
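The nesting of lines, blocks and runes, and the comparison of the current display against the desired display, can be sketched as follows. This is an illustration under stated assumptions only: the `toy_` structures and functions are hypothetical, plain arrays replace the dynarrs, and each rune carries just a character, a position and a width, which is far less than the real structures hold.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified versions of the three structures. */
struct toy_rune { int xpos; int width; unsigned char ch; };
struct toy_display_block { const struct toy_rune *runes; size_t n_runes; };
struct toy_display_line { const struct toy_display_block *blocks; size_t n_blocks; };

/* Return 1 if two display lines hold identical blocks and runes. */
static int
toy_lines_equal (const struct toy_display_line *a,
                 const struct toy_display_line *b)
{
  size_t i, j;

  assert (a != NULL && b != NULL);
  if (a->n_blocks != b->n_blocks)
    return 0;
  for (i = 0; i < a->n_blocks; i++)
    {
      const struct toy_display_block *x = &a->blocks[i];
      const struct toy_display_block *y = &b->blocks[i];

      if (x->n_runes != y->n_runes)
        return 0;
      for (j = 0; j < x->n_runes; j++)
        if (x->runes[j].ch != y->runes[j].ch
            || x->runes[j].xpos != y->runes[j].xpos
            || x->runes[j].width != y->runes[j].width)
          return 0;
    }
  return 1;
}

/* Count how many of the first N lines differ between the current and
   the desired display, i.e. how many would need to be output again. */
static size_t
toy_count_changed_lines (const struct toy_display_line *current,
                         const struct toy_display_line *desired, size_t n)
{
  size_t i, changed = 0;

  for (i = 0; i < n; i++)
    if (!toy_lines_equal (&current[i], &desired[i]))
      changed++;
  return changed;
}
```

Only lines for which `toy_lines_equal` returns 0 would be handed to the output phase in this model; the real comparison code is considerably more selective than a whole-line test.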
8435
8436 The @code{display_line} structures are tightly tied to buffers, which
8437 presents a problem for redisplay as this connection is bogus for the
8438 modeline. Hence the @code{display_line} generation routines are
8439 duplicated for generating the modeline. This means that the modeline
8440 display code has many bugs that the standard redisplay code does not.
8441
8442 The guts of @code{display_line} generation are in
8443 @code{create_text_block}, which creates a single display line for the
8444 desired locale. This incrementally parses the characters on the current
8445 line and generates redisplay structures for each.
8446
8447 Gutter redisplay is different. Because the data to display is stored in
8448 a string we cannot use @code{create_text_block}. Instead we use
8449 @code{create_text_string_block} which performs the same function as
8450 @code{create_text_block} but for strings. Many of the complexities of
8451 @code{create_text_block} to do with cursor handling and selective
8452 display have been removed.
8453
8454 @node Extents, Faces, The Redisplay Mechanism, Top
7497 @chapter Extents 8455 @chapter Extents
7498 8456
7499 @menu 8457 @menu
7500 * Introduction to Extents:: Extents are ranges over text, with properties. 8458 * Introduction to Extents:: Extents are ranges over text, with properties.
7501 * Extent Ordering:: How extents are ordered internally. 8459 * Extent Ordering:: How extents are ordered internally.
7502 * Format of the Extent Info:: The extent information in a buffer or string. 8460 * Format of the Extent Info:: The extent information in a buffer or string.
7503 * Zero-Length Extents:: A weird special case. 8461 * Zero-Length Extents:: A weird special case.
7504 * Mathematics of Extent Ordering:: A rigorous foundation. 8462 * Mathematics of Extent Ordering:: A rigorous foundation.
7505 * Extent Fragments:: Cached information useful for redisplay. 8463 * Extent Fragments:: Cached information useful for redisplay.
7506 @end menu 8464 @end menu
7507 8465
7508 @node Introduction to Extents 8466 @node Introduction to Extents, Extent Ordering, Extents, Extents
7509 @section Introduction to Extents 8467 @section Introduction to Extents
7510 8468
7511 Extents are regions over a buffer, with a start and an end position 8469 Extents are regions over a buffer, with a start and an end position
7512 denoting the region of the buffer included in the extent. In 8470 denoting the region of the buffer included in the extent. In
7513 addition, either end can be closed or open, meaning that the endpoint 8471 addition, either end can be closed or open, meaning that the endpoint
7525 automatically go inside or out of extents as necessary with no 8483 automatically go inside or out of extents as necessary with no
7526 further work needing to be done. It didn't work out that way, 8484 further work needing to be done. It didn't work out that way,
7527 however, and just ended up complexifying and buggifying all the 8485 however, and just ended up complexifying and buggifying all the
7528 rest of the code.) 8486 rest of the code.)
7529 8487
7530 @node Extent Ordering 8488 @node Extent Ordering, Format of the Extent Info, Introduction to Extents, Extents
7531 @section Extent Ordering 8489 @section Extent Ordering
7532 8490
7533 Extents are compared using memory indices. There are two orderings 8491 Extents are compared using memory indices. There are two orderings
7534 for extents and both orders are kept current at all times. The normal 8492 for extents and both orders are kept current at all times. The normal
7535 or @dfn{display} order is as follows: 8493 or @dfn{display} order is as follows:
7559 The display order and the e-order are complementary orders: any 8517 The display order and the e-order are complementary orders: any
7560 theorem about the display order also applies to the e-order if you swap 8518 theorem about the display order also applies to the e-order if you swap
7561 all occurrences of ``display order'' and ``e-order'', ``less than'' and 8519 all occurrences of ``display order'' and ``e-order'', ``less than'' and
7562 ``greater than'', and ``extent start'' and ``extent end''. 8520 ``greater than'', and ``extent start'' and ``extent end''.
7563 8521
7564 @node Format of the Extent Info 8522 @node Format of the Extent Info, Zero-Length Extents, Extent Ordering, Extents
7565 @section Format of the Extent Info 8523 @section Format of the Extent Info
7566 8524
7567 An extent-info structure consists of a list of the buffer or string's 8525 An extent-info structure consists of a list of the buffer or string's
7568 extents and a @dfn{stack of extents} that lists all of the extents over 8526 extents and a @dfn{stack of extents} that lists all of the extents over
7569 a particular position. The stack-of-extents info is used for 8527 a particular position. The stack-of-extents info is used for
7570 optimization purposes -- it basically caches some info that might 8528 optimization purposes---it basically caches some info that might
7571 be expensive to compute. Certain otherwise hard computations are easy 8529 be expensive to compute. Certain otherwise hard computations are easy
7572 given the stack of extents over a particular position, and if the 8530 given the stack of extents over a particular position, and if the
7573 stack of extents over a nearby position is known (because it was 8531 stack of extents over a nearby position is known (because it was
7574 calculated at some prior point in time), it's easy to move the stack 8532 calculated at some prior point in time), it's easy to move the stack
7575 of extents to the proper position. 8533 of extents to the proper position.
7593 between two extents. Note also that callers of these functions should 8551 between two extents. Note also that callers of these functions should
7594 not be aware of the fact that the extent list is implemented as an 8552 not be aware of the fact that the extent list is implemented as an
7595 array, except for the fact that positions are integers (this should be 8553 array, except for the fact that positions are integers (this should be
7596 generalized to handle integers and linked list equally well). 8554 generalized to handle integers and linked list equally well).
7597 8555
7598 @node Zero-Length Extents 8556 @node Zero-Length Extents, Mathematics of Extent Ordering, Format of the Extent Info, Extents
7599 @section Zero-Length Extents 8557 @section Zero-Length Extents
7600 8558
7601 Extents can be zero-length, and will end up that way if their endpoints 8559 Extents can be zero-length, and will end up that way if their endpoints
7602 are explicitly set that way or if their detachable property is nil 8560 are explicitly set that way or if their detachable property is nil
7603 and all the text in the extent is deleted. (The exception is open-open 8561 and all the text in the extent is deleted. (The exception is open-open
7622 8580
7623 Note that closed-open, non-detachable zero-length extents behave 8581 Note that closed-open, non-detachable zero-length extents behave
7624 exactly like markers and that open-closed, non-detachable zero-length 8582 exactly like markers and that open-closed, non-detachable zero-length
7625 extents behave like the ``point-type'' marker in Mule. 8583 extents behave like the ``point-type'' marker in Mule.
7626 8584
7627 @node Mathematics of Extent Ordering 8585 @node Mathematics of Extent Ordering, Extent Fragments, Zero-Length Extents, Extents
7628 @section Mathematics of Extent Ordering 8586 @section Mathematics of Extent Ordering
7629 @cindex extent mathematics 8587 @cindex extent mathematics
7630 @cindex mathematics of extents 8588 @cindex mathematics of extents
7631 @cindex extent ordering 8589 @cindex extent ordering
7632 8590
7757 Proof: If @math{F2} does not include @math{I} then its start index is 8715 Proof: If @math{F2} does not include @math{I} then its start index is
7758 greater than @math{I} and thus it is greater than any extent in 8716 greater than @math{I} and thus it is greater than any extent in
7759 @math{S}, including @math{F}. Otherwise, @math{F2} includes @math{I} 8717 @math{S}, including @math{F}. Otherwise, @math{F2} includes @math{I}
7760 and thus is in @math{S}, and thus @math{F2 >= F}. 8718 and thus is in @math{S}, and thus @math{F2 >= F}.
7761 8719
7762 @node Extent Fragments 8720 @node Extent Fragments, , Mathematics of Extent Ordering, Extents
7763 @section Extent Fragments 8721 @section Extent Fragments
7764 @cindex extent fragment 8722 @cindex extent fragment
7765 8723
7766 Imagine that the buffer is divided up into contiguous, non-overlapping 8724 Imagine that the buffer is divided up into contiguous, non-overlapping
7767 @dfn{runs} of text such that no extent starts or ends within a run 8725 @dfn{runs} of text such that no extent starts or ends within a run
7768 (extents that abut the run don't count). 8726 (extents that abut the run don't count).
7769 8727
7770 An extent fragment is a structure that holds data about the run that 8728 An extent fragment is a structure that holds data about the run that
7771 contains a particular buffer position (if the buffer position is at the 8729 contains a particular buffer position (if the buffer position is at the
7772 junction of two runs, the run after the position is used) -- the 8730 junction of two runs, the run after the position is used)---the
7773 beginning and end of the run, a list of all of the extents in that run, 8731 beginning and end of the run, a list of all of the extents in that run,
7774 the @dfn{merged face} that results from merging all of the faces 8732 the @dfn{merged face} that results from merging all of the faces
7775 corresponding to those extents, the begin and end glyphs at the 8733 corresponding to those extents, the begin and end glyphs at the
7776 beginning of the run, etc. This is the information that redisplay needs 8734 beginning of the run, etc. This is the information that redisplay needs
7777 in order to display this run. 8735 in order to display this run.
7779 Extent fragments have to be very quick to update to a new buffer 8737 Extent fragments have to be very quick to update to a new buffer
7780 position when moving linearly through the buffer. They rely on the 8738 position when moving linearly through the buffer. They rely on the
7781 stack-of-extents code, which does the heavy-duty algorithmic work of 8739 stack-of-extents code, which does the heavy-duty algorithmic work of
7782 determining which extents overlie a particular position. 8740 determining which extents overlie a particular position.
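The idea of a run can be made concrete with a naive sketch. The code below is not the stack-of-extents algorithm; it is a linear scan over a hypothetical `toy_extent` array that finds the run containing a position and counts the extents covering it. It assumes half-open extents `[start, end)` and ignores the open/closed endpoint distinction and zero-length extents.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified extent: covers positions start <= p < end. */
struct toy_extent { int start, end; };

/* The run containing a position, plus how many extents cover it. */
struct toy_run { int start, end; size_t n_extents; };

/* Find the run containing POS in a buffer spanning [buf_start, buf_end).
   If POS sits at the junction of two runs, the run after POS is used,
   because an edge exactly at POS becomes the start of the run. */
static struct toy_run
toy_run_at (const struct toy_extent *ex, size_t n,
            int buf_start, int buf_end, int pos)
{
  struct toy_run run;
  size_t i;
  int k;

  assert (buf_start <= pos && pos < buf_end);
  run.start = buf_start;
  run.end = buf_end;
  run.n_extents = 0;

  for (i = 0; i < n; i++)
    {
      int edges[2];

      edges[0] = ex[i].start;
      edges[1] = ex[i].end;
      for (k = 0; k < 2; k++)
        {
          if (edges[k] <= pos && edges[k] > run.start)
            run.start = edges[k]; /* nearest edge at or before pos */
          if (edges[k] > pos && edges[k] < run.end)
            run.end = edges[k];   /* nearest edge after pos */
        }
      if (ex[i].start <= pos && pos < ex[i].end)
        run.n_extents++;
    }
  return run;
}
```

This scan costs time proportional to the number of extents for every query, which is exactly the cost the stack-of-extents cache is meant to avoid when redisplay moves linearly through the buffer.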
7783 8741
7784 @node Faces and Glyphs, Specifiers, Extents, Top 8742 @node Faces, Glyphs, Extents, Top
7785 @chapter Faces and Glyphs 8743 @chapter Faces
7786 8744
7787 Not yet documented. 8745 Not yet documented.
7788 8746
7789 @node Specifiers, Menus, Faces and Glyphs, Top 8747 @node Glyphs, Specifiers, Faces, Top
8748 @chapter Glyphs
8749
8750 Glyphs are graphical elements that can be displayed in XEmacs buffers or
8751 gutters. We use the term graphical element here in the broadest possible
8752 sense, since glyphs can be anything from something as mundane as text to
8753 something as arcane as a native tab widget.
8754
8755 In XEmacs, glyphs represent the uninstantiated state of graphical
8756 elements, i.e. they hold all the information necessary to produce an
8757 image on-screen but the image does not exist at this stage.
8758
8759 Glyphs are lazily instantiated by calling one of the glyph
8760 functions. This usually occurs within redisplay when
8761 @code{Fglyph_height} is called. Instantiation causes an image-instance
8762 to be created and cached. This cache is on a device basis for all glyphs
8763 except widget-glyphs, and on a window basis for widget-glyphs. The
8764 caching is done by @code{image_instantiate} and is necessary because it
8765 is generally possible to display an image-instance in multiple
8766 domains. For instance, if we create a Pixmap, we can actually display
8767 it in multiple windows, even though we only need a single Pixmap
8768 instance to do this. If caching weren't done, it would be necessary
8769 to create an image-instance for every displayable occurrence of a glyph,
8770 and for every usage, which would be extremely memory- and CPU-intensive.
8771
8772 Widget-glyphs (a.k.a. native widgets) are not cached in this way. This is
8773 because widget-glyph image-instances on screen are toolkit windows, and
8774 thus cannot be reused in multiple XEmacs domains. Thus widget-glyphs are
8775 cached on a window basis.
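The two caching granularities can be sketched with a toy cache. Everything here is hypothetical (`toy_cache`, `toy_instantiate`, the integer ids); it does not reproduce `image_instantiate`, only the idea that ordinary glyphs are keyed by device while widget-glyphs are keyed by window. The sketch assumes a given glyph is always a widget-glyph or never one.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_CACHE_MAX = 16 };

struct toy_cache_entry {
  int glyph_id;    /* which glyph was instantiated */
  int domain_key;  /* device id, or window id for widget-glyphs */
  int instance_id; /* stands in for the cached image-instance */
};

struct toy_cache {
  struct toy_cache_entry entries[TOY_CACHE_MAX];
  size_t used;
  int instances_created;
};

/* Return the cached instance for a glyph, creating one on a miss.
   Widget-glyphs are keyed by window, all other glyphs by device. */
static int
toy_instantiate (struct toy_cache *c, int glyph_id, int is_widget,
                 int device_id, int window_id)
{
  int key = is_widget ? window_id : device_id;
  size_t i;

  for (i = 0; i < c->used; i++)
    if (c->entries[i].glyph_id == glyph_id
        && c->entries[i].domain_key == key)
      return c->entries[i].instance_id; /* cache hit: reuse the instance */

  assert (c->used < TOY_CACHE_MAX);     /* the toy cache never evicts */
  c->entries[c->used].glyph_id = glyph_id;
  c->entries[c->used].domain_key = key;
  c->entries[c->used].instance_id = ++c->instances_created;
  c->used++;
  return c->instances_created;
}
```

In this model an ordinary glyph shown in two windows of one device shares a single instance, while a widget-glyph shown in two windows gets two.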
8776
8777 Any action on a glyph first consults the cache before actually
8778 instantiating a widget.
8779
8780 @section Widget-Glyphs in the MS-Windows Environment
8781
8782 To Do
8783
8784 @section Widget-Glyphs in the X Environment
8785
8786 Widget-glyphs under X make heavy use of lwlib for manipulating the
8787 native toolkit objects. This is primarily so that different toolkits can
8788 be supported for widget-glyphs, just as they are supported for features
8789 such as menubars.
8790
8791 Lwlib is extremely poorly documented and quite hairy, so here is my
8792 understanding of what goes on.
8793
8794 Lwlib maintains a set of @code{widget_instance} structures which mirror the
8795 hierarchical state of Xt widgets. I think this is so that widgets can be
8796 updated and manipulated generically by the lwlib library. For instance,
8797 @code{update_one_widget_instance} can cope with multiple types of widget and
8798 multiple types of toolkit. Each element in the widget hierarchy is updated
8799 from its corresponding @code{widget_instance} by walking the
8800 @code{widget_instance} tree recursively.
8801
8802 This has desirable properties, such as @code{lw_modify_all_widgets}, which is
8803 called from @code{glyphs-x.c} and updates all the properties of a widget
8804 without having to know what the widget is or what toolkit it is from.
8805 Unfortunately this also has hairy properties, such as making the lwlib
8806 code quite complex. And of course lwlib has to know at some level what
8807 the widget is and how to set its properties.
8808
8809 @node Specifiers, Menus, Glyphs, Top
7790 @chapter Specifiers 8810 @chapter Specifiers
7791 8811
7792 Not yet documented. 8812 Not yet documented.
7793 8813
7794 @node Menus, Subprocesses, Specifiers, Top 8814 @node Menus, Subprocesses, Specifiers, Top
7914 @item tty_name 8934 @item tty_name
7915 The name of the terminal that the subprocess is using, 8935 The name of the terminal that the subprocess is using,
7916 or @code{nil} if it is using pipes. 8936 or @code{nil} if it is using pipes.
7917 @end table 8937 @end table
7918 8938
7919 @node Interface to X Windows, Index, Subprocesses, Top 8939 @node Interface to X Windows, Index , Subprocesses, Top
7920 @chapter Interface to X Windows 8940 @chapter Interface to X Windows
7921 8941
7922 Not yet documented. 8942 Not yet documented.
7923 8943
7924 @include index.texi 8944 @include index.texi