428
+ − 1 \input texinfo @c -*-texinfo-*-
+ − 2 @c %**start of header
+ − 3 @setfilename ../../info/internals.info
+ − 4 @settitle XEmacs Internals Manual
+ − 5 @c %**end of header
+ − 6
+ − 7 @ifinfo
+ − 8 @dircategory XEmacs Editor
+ − 9 @direntry
440
+ − 10 * Internals: (internals). XEmacs Internals Manual.
428
+ − 11 @end direntry
+ − 12
+ − 13 Copyright @copyright{} 1992 - 1996 Ben Wing.
+ − 14 Copyright @copyright{} 1996, 1997 Sun Microsystems.
+ − 15 Copyright @copyright{} 1994 - 1998 Free Software Foundation.
+ − 16 Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois.
+ − 17
+ − 18
+ − 19 Permission is granted to make and distribute verbatim copies of this
+ − 20 manual provided the copyright notice and this permission notice are
+ − 21 preserved on all copies.
+ − 22
+ − 23 @ignore
+ − 24 Permission is granted to process this file through TeX and print the
+ − 25 results, provided the printed document carries copying permission notice
+ − 26 identical to this one except for the removal of this paragraph (this
+ − 27 paragraph not being relevant to the printed manual).
+ − 28
+ − 29 @end ignore
+ − 30 Permission is granted to copy and distribute modified versions of this
+ − 31 manual under the conditions for verbatim copying, provided that the
+ − 32 entire resulting derived work is distributed under the terms of a
+ − 33 permission notice identical to this one.
+ − 34
+ − 35 Permission is granted to copy and distribute translations of this manual
+ − 36 into another language, under the above conditions for modified versions,
+ − 37 except that this permission notice may be stated in a translation
+ − 38 approved by the Foundation.
+ − 39
+ − 40 Permission is granted to copy and distribute modified versions of this
+ − 41 manual under the conditions for verbatim copying, provided also that the
+ − 42 section entitled ``GNU General Public License'' is included exactly as
+ − 43 in the original, and provided that the entire resulting derived work is
+ − 44 distributed under the terms of a permission notice identical to this
+ − 45 one.
+ − 46
+ − 47 Permission is granted to copy and distribute translations of this manual
+ − 48 into another language, under the above conditions for modified versions,
+ − 49 except that the section entitled ``GNU General Public License'' may be
+ − 50 included in a translation approved by the Free Software Foundation
+ − 51 instead of in the original English.
+ − 52 @end ifinfo
+ − 53
+ − 54 @c Combine indices.
+ − 55 @synindex cp fn
+ − 56 @syncodeindex vr fn
+ − 57 @syncodeindex ky fn
+ − 58 @syncodeindex pg fn
+ − 59 @syncodeindex tp fn
+ − 60
+ − 61 @setchapternewpage odd
+ − 62 @finalout
+ − 63
+ − 64 @titlepage
+ − 65 @title XEmacs Internals Manual
+ − 66 @subtitle Version 1.3, August 1999
+ − 67
+ − 68 @author Ben Wing
+ − 69 @author Martin Buchholz
+ − 70 @author Hrvoje Niksic
+ − 71 @author Matthias Neubauer
442
+ − 72 @author Olivier Galibert
428
+ − 73 @page
+ − 74 @vskip 0pt plus 1fill
+ − 75
+ − 76 @noindent
+ − 77 Copyright @copyright{} 1992 - 1996 Ben Wing. @*
+ − 78 Copyright @copyright{} 1996, 1997 Sun Microsystems, Inc. @*
+ − 79 Copyright @copyright{} 1994 - 1998 Free Software Foundation. @*
+ − 80 Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois.
+ − 81
+ − 82 @sp 2
+ − 83 Version 1.3 @*
+ − 84 August 1999.@*
+ − 85
+ − 86 Permission is granted to make and distribute verbatim copies of this
+ − 87 manual provided the copyright notice and this permission notice are
+ − 88 preserved on all copies.
+ − 89
+ − 90 Permission is granted to copy and distribute modified versions of this
+ − 91 manual under the conditions for verbatim copying, provided also that the
+ − 92 section entitled ``GNU General Public License'' is included
+ − 93 exactly as in the original, and provided that the entire resulting
+ − 94 derived work is distributed under the terms of a permission notice
+ − 95 identical to this one.
+ − 96
+ − 97 Permission is granted to copy and distribute translations of this manual
+ − 98 into another language, under the above conditions for modified versions,
+ − 99 except that the section entitled ``GNU General Public License'' may be
+ − 100 included in a translation approved by the Free Software Foundation
+ − 101 instead of in the original English.
+ − 102 @end titlepage
+ − 103 @page
+ − 104
+ − 105 @node Top, A History of Emacs, (dir), (dir)
+ − 106
+ − 107 @ifinfo
+ − 108 This Info file contains v1.0 of the XEmacs Internals Manual.
+ − 109 @end ifinfo
+ − 110
+ − 111 @menu
+ − 112 * A History of Emacs:: Times, dates, important events.
+ − 113 * XEmacs From the Outside:: A broad conceptual overview.
+ − 114 * The Lisp Language:: An overview.
+ − 115 * XEmacs From the Perspective of Building::
+ − 116 * XEmacs From the Inside::
+ − 117 * The XEmacs Object System (Abstractly Speaking)::
+ − 118 * How Lisp Objects Are Represented in C::
+ − 119 * Rules When Writing New C Code::
+ − 120 * A Summary of the Various XEmacs Modules::
+ − 121 * Allocation of Objects in XEmacs Lisp::
442
+ − 122 * Dumping::
428
+ − 123 * Events and the Event Loop::
+ − 124 * Evaluation; Stack Frames; Bindings::
+ − 125 * Symbols and Variables::
+ − 126 * Buffers and Textual Representation::
+ − 127 * MULE Character Sets and Encodings::
+ − 128 * The Lisp Reader and Compiler::
+ − 129 * Lstreams::
+ − 130 * Consoles; Devices; Frames; Windows::
+ − 131 * The Redisplay Mechanism::
+ − 132 * Extents::
+ − 133 * Faces::
+ − 134 * Glyphs::
+ − 135 * Specifiers::
+ − 136 * Menus::
+ − 137 * Subprocesses::
446
+ − 138 * Interface to the X Window System::
442
+ − 139 * Index::
+ − 140
+ − 141 @detailmenu
+ − 142
+ − 143 --- The Detailed Node Listing ---
428
+ − 144
+ − 145 A History of Emacs
+ − 146
+ − 147 * Through Version 18:: Unification prevails.
+ − 148 * Lucid Emacs:: One version 19 Emacs.
+ − 149 * GNU Emacs 19:: The other version 19 Emacs.
442
+ − 150 * GNU Emacs 20:: The other version 20 Emacs.
428
+ − 151 * XEmacs:: The continuation of Lucid Emacs.
+ − 152
+ − 153 Rules When Writing New C Code
+ − 154
+ − 155 * General Coding Rules::
+ − 156 * Writing Lisp Primitives::
+ − 157 * Adding Global Lisp Variables::
442
+ − 158 * Coding for Mule::
428
+ − 159 * Techniques for XEmacs Developers::
+ − 160
442
+ − 161 Coding for Mule
+ − 162
+ − 163 * Character-Related Data Types::
+ − 164 * Working With Character and Byte Positions::
+ − 165 * Conversion to and from External Data::
+ − 166 * General Guidelines for Writing Mule-Aware Code::
+ − 167 * An Example of Mule-Aware Code::
+ − 168
428
+ − 169 A Summary of the Various XEmacs Modules
+ − 170
+ − 171 * Low-Level Modules::
+ − 172 * Basic Lisp Modules::
+ − 173 * Modules for Standard Editing Operations::
+ − 174 * Editor-Level Control Flow Modules::
+ − 175 * Modules for the Basic Displayable Lisp Objects::
+ − 176 * Modules for other Display-Related Lisp Objects::
+ − 177 * Modules for the Redisplay Mechanism::
+ − 178 * Modules for Interfacing with the File System::
+ − 179 * Modules for Other Aspects of the Lisp Interpreter and Object System::
+ − 180 * Modules for Interfacing with the Operating System::
+ − 181 * Modules for Interfacing with X Windows::
+ − 182 * Modules for Internationalization::
+ − 183
+ − 184 Allocation of Objects in XEmacs Lisp
+ − 185
+ − 186 * Introduction to Allocation::
+ − 187 * Garbage Collection::
+ − 188 * GCPROing::
+ − 189 * Garbage Collection - Step by Step::
+ − 190 * Integers and Characters::
+ − 191 * Allocation from Frob Blocks::
+ − 192 * lrecords::
+ − 193 * Low-level allocation::
+ − 194 * Cons::
+ − 195 * Vector::
+ − 196 * Bit Vector::
+ − 197 * Symbol::
+ − 198 * Marker::
+ − 199 * String::
+ − 200 * Compiled Function::
+ − 201
442
+ − 202 Garbage Collection - Step by Step
+ − 203
+ − 204 * Invocation::
+ − 205 * garbage_collect_1::
+ − 206 * mark_object::
+ − 207 * gc_sweep::
+ − 208 * sweep_lcrecords_1::
+ − 209 * compact_string_chars::
+ − 210 * sweep_strings::
+ − 211 * sweep_bit_vectors_1::
+ − 212
+ − 213 Dumping
+ − 214
+ − 215 * Overview::
+ − 216 * Data descriptions::
+ − 217 * Dumping phase::
+ − 218 * Reloading phase::
+ − 219
+ − 220 Dumping phase
+ − 221
+ − 222 * Object inventory::
+ − 223 * Address allocation::
+ − 224 * The header::
+ − 225 * Data dumping::
+ − 226 * Pointers dumping::
+ − 227
428
+ − 228 Events and the Event Loop
+ − 229
+ − 230 * Introduction to Events::
+ − 231 * Main Loop::
+ − 232 * Specifics of the Event Gathering Mechanism::
+ − 233 * Specifics About the Emacs Event::
+ − 234 * The Event Stream Callback Routines::
+ − 235 * Other Event Loop Functions::
+ − 236 * Converting Events::
+ − 237 * Dispatching Events; The Command Builder::
+ − 238
+ − 239 Evaluation; Stack Frames; Bindings
+ − 240
+ − 241 * Evaluation::
+ − 242 * Dynamic Binding; The specbinding Stack; Unwind-Protects::
+ − 243 * Simple Special Forms::
+ − 244 * Catch and Throw::
+ − 245
+ − 246 Symbols and Variables
+ − 247
+ − 248 * Introduction to Symbols::
+ − 249 * Obarrays::
+ − 250 * Symbol Values::
+ − 251
+ − 252 Buffers and Textual Representation
+ − 253
+ − 254 * Introduction to Buffers:: A buffer holds a block of text such as a file.
+ − 255 * The Text in a Buffer:: Representation of the text in a buffer.
+ − 256 * Buffer Lists:: Keeping track of all buffers.
+ − 257 * Markers and Extents:: Tagging locations within a buffer.
+ − 258 * Bufbytes and Emchars:: Representation of individual characters.
+ − 259 * The Buffer Object:: The Lisp object corresponding to a buffer.
+ − 260
+ − 261 MULE Character Sets and Encodings
+ − 262
+ − 263 * Character Sets::
+ − 264 * Encodings::
+ − 265 * Internal Mule Encodings::
442
+ − 266 * CCL::
428
+ − 267
+ − 268 Encodings
+ − 269
+ − 270 * Japanese EUC (Extended Unix Code)::
+ − 271 * JIS7::
+ − 272
+ − 273 Internal Mule Encodings
+ − 274
+ − 275 * Internal String Encoding::
+ − 276 * Internal Character Encoding::
+ − 277
+ − 278 Lstreams
+ − 279
442
+ − 280 * Creating an Lstream:: Creating an lstream object.
+ − 281 * Lstream Types:: Different sorts of things that are streamed.
+ − 282 * Lstream Functions:: Functions for working with lstreams.
+ − 283 * Lstream Methods:: Creating new lstream types.
+ − 284
428
+ − 285 Consoles; Devices; Frames; Windows
+ − 286
+ − 287 * Introduction to Consoles; Devices; Frames; Windows::
+ − 288 * Point::
+ − 289 * Window Hierarchy::
442
+ − 290 * The Window Object::
428
+ − 291
+ − 292 The Redisplay Mechanism
+ − 293
+ − 294 * Critical Redisplay Sections::
+ − 295 * Line Start Cache::
442
+ − 296 * Redisplay Piece by Piece::
428
+ − 297
+ − 298 Extents
+ − 299
+ − 300 * Introduction to Extents:: Extents are ranges over text, with properties.
+ − 301 * Extent Ordering:: How extents are ordered internally.
+ − 302 * Format of the Extent Info:: The extent information in a buffer or string.
+ − 303 * Zero-Length Extents:: A weird special case.
442
+ − 304 * Mathematics of Extent Ordering:: A rigorous foundation.
428
+ − 305 * Extent Fragments:: Cached information useful for redisplay.
+ − 306
442
+ − 307 @end detailmenu
428
+ − 308 @end menu
+ − 309
+ − 310 @node A History of Emacs, XEmacs From the Outside, Top, Top
+ − 311 @chapter A History of Emacs
+ − 312 @cindex history of Emacs
+ − 313 @cindex Hackers (Steven Levy)
+ − 314 @cindex Levy, Steven
+ − 315 @cindex ITS (Incompatible Timesharing System)
+ − 316 @cindex Stallman, Richard
+ − 317 @cindex RMS
+ − 318 @cindex MIT
+ − 319 @cindex TECO
+ − 320 @cindex FSF
+ − 321 @cindex Free Software Foundation
+ − 322
+ − 323 XEmacs is a powerful, customizable text editor and development
+ − 324 environment. It began as Lucid Emacs, which was in turn derived from
+ − 325 GNU Emacs, a program written by Richard Stallman of the Free Software
+ − 326 Foundation. GNU Emacs dates back to the 1970's, and was modelled
+ − 327 after a package called ``Emacs'', written in 1976, that was a set of
+ − 328 macros on top of TECO, an old, old text editor written at MIT on the
+ − 329 DEC PDP 10 under one of the earliest time-sharing operating systems,
+ − 330 ITS (Incompatible Timesharing System). (ITS dates back well before
+ − 331 Unix.) ITS, TECO, and Emacs were products of a group of people at MIT
+ − 332 who called themselves ``hackers'', who shared an idealistic belief
+ − 333 system about the free exchange of information and were fanatical in
+ − 334 their devotion to and time spent with computers. (The hacker
+ − 335 subculture dates back to the late 1950's at MIT and is described in
+ − 336 detail in Steven Levy's book @cite{Hackers}. This book also includes
+ − 337 a lot of information about Stallman himself and the development of
+ − 338 Lisp, a programming language developed at MIT that underlies Emacs.)
+ − 339
+ − 340 @menu
+ − 341 * Through Version 18:: Unification prevails.
+ − 342 * Lucid Emacs:: One version 19 Emacs.
+ − 343 * GNU Emacs 19:: The other version 19 Emacs.
+ − 344 * GNU Emacs 20:: The other version 20 Emacs.
+ − 345 * XEmacs:: The continuation of Lucid Emacs.
+ − 346 @end menu
+ − 347
442
+ − 348 @node Through Version 18, Lucid Emacs, A History of Emacs, A History of Emacs
428
+ − 349 @section Through Version 18
+ − 350 @cindex Gosling, James
+ − 351 @cindex Great Usenet Renaming
+ − 352
+ − 353 Although the history of the early versions of GNU Emacs is unclear,
+ − 354 the history is well-known from the middle of 1985. A time line is:
+ − 355
+ − 356 @itemize @bullet
+ − 357 @item
+ − 358 GNU Emacs version 15 (15.34) was released sometime in 1984 or 1985 and
+ − 359 shared some code with a version of Emacs written by James Gosling (the
+ − 360 same James Gosling who later created the Java language).
+ − 361 @item
+ − 362 GNU Emacs version 16 (first released version was 16.56) was released on
+ − 363 July 15, 1985. All Gosling code was removed due to potential copyright
+ − 364 problems with the code.
+ − 365 @item
+ − 366 version 16.57: released on September 16, 1985.
+ − 367 @item
+ − 368 versions 16.58, 16.59: released on September 17, 1985.
+ − 369 @item
+ − 370 version 16.60: released on September 19, 1985. These later version 16's
+ − 371 incorporated patches from the net, esp. for getting Emacs to work under
+ − 372 System V.
+ − 373 @item
+ − 374 version 17.36 (first official v17 release) released on December 20,
+ − 375 1985. Included a TeX-able user manual. First official unpatched
+ − 376 version that worked on vanilla System V machines.
+ − 377 @item
+ − 378 version 17.43 (second official v17 release) released on January 25,
+ − 379 1986.
+ − 380 @item
+ − 381 version 17.45 released on January 30, 1986.
+ − 382 @item
+ − 383 version 17.46 released on February 4, 1986.
+ − 384 @item
+ − 385 version 17.48 released on February 10, 1986.
+ − 386 @item
+ − 387 version 17.49 released on February 12, 1986.
+ − 388 @item
+ − 389 version 17.55 released on March 18, 1986.
+ − 390 @item
+ − 391 version 17.57 released on March 27, 1986.
+ − 392 @item
+ − 393 version 17.58 released on April 4, 1986.
+ − 394 @item
+ − 395 version 17.61 released on April 12, 1986.
+ − 396 @item
+ − 397 version 17.63 released on May 7, 1986.
+ − 398 @item
+ − 399 version 17.64 released on May 12, 1986.
+ − 400 @item
+ − 401 version 18.24 (a beta version) released on October 2, 1986.
+ − 402 @item
+ − 403 version 18.30 (a beta version) released on November 15, 1986.
+ − 404 @item
+ − 405 version 18.31 (a beta version) released on November 23, 1986.
+ − 406 @item
+ − 407 version 18.32 (a beta version) released on December 7, 1986.
+ − 408 @item
+ − 409 version 18.33 (a beta version) released on December 12, 1986.
+ − 410 @item
+ − 411 version 18.35 (a beta version) released on January 5, 1987.
+ − 412 @item
+ − 413 version 18.36 (a beta version) released on January 21, 1987.
+ − 414 @item
+ − 415 January 27, 1987: The Great Usenet Renaming. net.emacs is now
+ − 416 comp.emacs.
+ − 417 @item
+ − 418 version 18.37 (a beta version) released on February 12, 1987.
+ − 419 @item
+ − 420 version 18.38 (a beta version) released on March 3, 1987.
+ − 421 @item
+ − 422 version 18.39 (a beta version) released on March 14, 1987.
+ − 423 @item
+ − 424 version 18.40 (a beta version) released on March 18, 1987.
+ − 425 @item
+ − 426 version 18.41 (the first ``official'' release) released on March 22,
+ − 427 1987.
+ − 428 @item
+ − 429 version 18.45 released on June 2, 1987.
+ − 430 @item
+ − 431 version 18.46 released on June 9, 1987.
+ − 432 @item
+ − 433 version 18.47 released on June 18, 1987.
+ − 434 @item
+ − 435 version 18.48 released on September 3, 1987.
+ − 436 @item
+ − 437 version 18.49 released on September 18, 1987.
+ − 438 @item
+ − 439 version 18.50 released on February 13, 1988.
+ − 440 @item
+ − 441 version 18.51 released on May 7, 1988.
+ − 442 @item
+ − 443 version 18.52 released on September 1, 1988.
+ − 444 @item
+ − 445 version 18.53 released on February 24, 1989.
+ − 446 @item
+ − 447 version 18.54 released on April 26, 1989.
+ − 448 @item
+ − 449 version 18.55 released on August 23, 1989. This is the earliest version
+ − 450 that is still available by FTP.
+ − 451 @item
+ − 452 version 18.56 released on January 17, 1991.
+ − 453 @item
+ − 454 version 18.57 released late January, 1991.
+ − 455 @item
+ − 456 version 18.58 released ?????.
+ − 457 @item
+ − 458 version 18.59 released October 31, 1992.
+ − 459 @end itemize
+ − 460
442
+ − 461 @node Lucid Emacs, GNU Emacs 19, Through Version 18, A History of Emacs
428
+ − 462 @section Lucid Emacs
+ − 463 @cindex Lucid Emacs
+ − 464 @cindex Lucid Inc.
+ − 465 @cindex Energize
+ − 466 @cindex Epoch
+ − 467
+ − 468 Lucid Emacs was developed by the (now-defunct) Lucid Inc., a maker of
+ − 469 C++ and Lisp development environments. It began when Lucid decided they
+ − 470 wanted to use Emacs as the editor and cornerstone of their C++
+ − 471 development environment (called ``Energize''). They needed many features
+ − 472 that were not available in the existing version of GNU Emacs (version
+ − 473 18.5something), in particular good and integrated support for GUI
+ − 474 elements such as mouse support, multiple fonts, multiple window-system
+ − 475 windows, etc. A branch of GNU Emacs called Epoch, written at the
+ − 476 University of Illinois, existed that supplied many of these features;
+ − 477 however, Lucid needed more than what existed in Epoch. At the time, the
+ − 478 Free Software Foundation was working on version 19 of Emacs (this was
+ − 479 sometime around 1991), which was planned to have similar features, and
+ − 480 so Lucid decided to work with the Free Software Foundation. Their plan
+ − 481 was to add features that they needed, and coordinate with the FSF so
+ − 482 that the features would get included back into Emacs version 19.
+ − 483
+ − 484 Delays in the release of version 19 occurred, however (resulting in it
+ − 485 finally being released more than a year after what was initially
+ − 486 planned), and Lucid encountered unexpected technical resistance in
+ − 487 getting their changes merged back into version 19, so they decided to
+ − 488 release their own version of Emacs, which became Lucid Emacs 19.0.
+ − 489
+ − 490 @cindex Zawinski, Jamie
+ − 491 @cindex Sexton, Harlan
+ − 492 @cindex Benson, Eric
+ − 493 @cindex Devin, Matthieu
+ − 494 The initial authors of Lucid Emacs were Matthieu Devin, Harlan Sexton,
+ − 495 and Eric Benson, and the work was later taken over by Jamie Zawinski,
+ − 496 who became ``Mr. Lucid Emacs'' for many releases.
+ − 497
+ − 498 A time line for Lucid Emacs/XEmacs is
+ − 499
+ − 500 @itemize @bullet
+ − 501 @item
+ − 502 version 19.0 shipped with Energize 1.0, April 1992.
+ − 503 @item
+ − 504 version 19.1 released June 4, 1992.
+ − 505 @item
+ − 506 version 19.2 released June 19, 1992.
+ − 507 @item
+ − 508 version 19.3 released September 9, 1992.
+ − 509 @item
+ − 510 version 19.4 released January 21, 1993.
+ − 511 @item
+ − 512 version 19.5 was a repackaging of 19.4 with a few bug fixes and
+ − 513 shipped with Energize 2.0. Never released to the net.
+ − 514 @item
+ − 515 version 19.6 released April 9, 1993.
+ − 516 @item
+ − 517 version 19.7 was a repackaging of 19.6 with a few bug fixes and
+ − 518 shipped with Energize 2.1. Never released to the net.
+ − 519 @item
+ − 520 version 19.8 released September 6, 1993.
+ − 521 @item
+ − 522 version 19.9 released January 12, 1994.
+ − 523 @item
+ − 524 version 19.10 released May 27, 1994.
+ − 525 @item
+ − 526 version 19.11 (first XEmacs) released September 13, 1994.
+ − 527 @item
+ − 528 version 19.12 released June 23, 1995.
+ − 529 @item
+ − 530 version 19.13 released September 1, 1995.
+ − 531 @item
+ − 532 version 19.14 released June 23, 1996.
+ − 533 @item
+ − 534 version 20.0 released February 9, 1997.
+ − 535 @item
+ − 536 version 19.15 released March 28, 1997.
+ − 537 @item
+ − 538 version 20.1 (not released to the net) April 15, 1997.
+ − 539 @item
+ − 540 version 20.2 released May 16, 1997.
+ − 541 @item
+ − 542 version 19.16 released October 31, 1997.
+ − 543 @item
+ − 544 version 20.3 (the first stable version of XEmacs 20.x) released November 30,
+ − 545 1997.
+ − 546 version 20.4 released February 28, 1998.
+ − 547 @end itemize
+ − 548
442
+ − 549 @node GNU Emacs 19, GNU Emacs 20, Lucid Emacs, A History of Emacs
428
+ − 550 @section GNU Emacs 19
+ − 551 @cindex GNU Emacs 19
+ − 552 @cindex FSF Emacs
+ − 553
+ − 554 About a year after the initial release of Lucid Emacs, the FSF
+ − 555 released a beta of their version of Emacs 19 (referred to here as ``GNU
+ − 556 Emacs''). By this time, the current version of Lucid Emacs was
+ − 557 19.6. (Strangely, the first released beta from the FSF was GNU Emacs
+ − 558 19.7.) A time line for GNU Emacs version 19 is
+ − 559
+ − 560 @itemize @bullet
+ − 561 @item
+ − 562 version 19.8 (beta) released May 27, 1993.
+ − 563 @item
+ − 564 version 19.9 (beta) released May 27, 1993.
+ − 565 @item
+ − 566 version 19.10 (beta) released May 30, 1993.
+ − 567 @item
+ − 568 version 19.11 (beta) released June 1, 1993.
+ − 569 @item
+ − 570 version 19.12 (beta) released June 2, 1993.
+ − 571 @item
+ − 572 version 19.13 (beta) released June 8, 1993.
+ − 573 @item
+ − 574 version 19.14 (beta) released June 17, 1993.
+ − 575 @item
+ − 576 version 19.15 (beta) released June 19, 1993.
+ − 577 @item
+ − 578 version 19.16 (beta) released July 6, 1993.
+ − 579 @item
+ − 580 version 19.17 (beta) released late July, 1993.
+ − 581 @item
+ − 582 version 19.18 (beta) released August 9, 1993.
+ − 583 @item
+ − 584 version 19.19 (beta) released August 15, 1993.
+ − 585 @item
+ − 586 version 19.20 (beta) released November 17, 1993.
+ − 587 @item
+ − 588 version 19.21 (beta) released November 17, 1993.
+ − 589 @item
+ − 590 version 19.22 (beta) released November 28, 1993.
+ − 591 @item
+ − 592 version 19.23 (beta) released May 17, 1994.
+ − 593 @item
+ − 594 version 19.24 (beta) released May 16, 1994.
+ − 595 @item
+ − 596 version 19.25 (beta) released June 3, 1994.
+ − 597 @item
+ − 598 version 19.26 (beta) released September 11, 1994.
+ − 599 @item
+ − 600 version 19.27 (beta) released September 14, 1994.
+ − 601 @item
+ − 602 version 19.28 (first ``official'' release) released November 1, 1994.
+ − 603 @item
+ − 604 version 19.29 released June 21, 1995.
+ − 605 @item
+ − 606 version 19.30 released November 24, 1995.
+ − 607 @item
+ − 608 version 19.31 released May 25, 1996.
+ − 609 @item
+ − 610 version 19.32 released July 31, 1996.
+ − 611 @item
+ − 612 version 19.33 released August 11, 1996.
+ − 613 @item
+ − 614 version 19.34 released August 21, 1996.
+ − 615 @item
+ − 616 version 19.34b released September 6, 1996.
+ − 617 @end itemize
+ − 618
+ − 619 @cindex Mlynarik, Richard
+ − 620 In some ways, GNU Emacs 19 was better than Lucid Emacs; in some ways,
+ − 621 worse. Lucid soon began incorporating features from GNU Emacs 19 into
+ − 622 Lucid Emacs; the work was mostly done by Richard Mlynarik, who had been
+ − 623 working on and using GNU Emacs for a long time (back as far as version
+ − 624 16 or 17).
+ − 625
442
+ − 626 @node GNU Emacs 20, XEmacs, GNU Emacs 19, A History of Emacs
428
+ − 627 @section GNU Emacs 20
+ − 628 @cindex GNU Emacs 20
+ − 629 @cindex FSF Emacs
+ − 630
+ − 631 On February 2, 1997 work began on GNU Emacs to integrate Mule. The first
+ − 632 release was made in September of that year.
+ − 633
+ − 634 A timeline for Emacs 20 is
+ − 635
+ − 636 @itemize @bullet
+ − 637 @item
+ − 638 version 20.1 released September 17, 1997.
+ − 639 @item
+ − 640 version 20.2 released September 20, 1997.
+ − 641 @item
+ − 642 version 20.3 released August 19, 1998.
+ − 643 @end itemize
+ − 644
442
+ − 645 @node XEmacs, , GNU Emacs 20, A History of Emacs
428
+ − 646 @section XEmacs
+ − 647 @cindex XEmacs
+ − 648
+ − 649 @cindex Sun Microsystems
+ − 650 @cindex University of Illinois
+ − 651 @cindex Illinois, University of
+ − 652 @cindex SPARCWorks
+ − 653 @cindex Andreessen, Marc
+ − 654 @cindex Baur, Steve
+ − 655 @cindex Buchholz, Martin
+ − 656 @cindex Kaplan, Simon
+ − 657 @cindex Wing, Ben
+ − 658 @cindex Thompson, Chuck
+ − 659 @cindex Win-Emacs
+ − 660 @cindex Epoch
+ − 661 @cindex Amdahl Corporation
+ − 662 Around the time that Lucid was developing Energize, Sun Microsystems
+ − 663 was developing their own development environment (called ``SPARCWorks'')
+ − 664 and also decided to use Emacs. They joined forces with the Epoch team
+ − 665 at the University of Illinois and later with Lucid. The maintainer of
+ − 666 the last-released version of Epoch was Marc Andreessen, but he dropped
+ − 667 out and the Epoch project, headed by Simon Kaplan, lured Chuck Thompson
+ − 668 away from a system administration job to become the primary Lucid Emacs
+ − 669 author for Epoch and Sun. Chuck's area of specialty became the
+ − 670 redisplay engine (he replaced the old Lucid Emacs redisplay engine with
+ − 671 a ported version from Epoch and then later rewrote it from scratch).
+ − 672 Sun also hired Ben Wing (the author of Win-Emacs, a port of Lucid Emacs
+ − 673 to Microsoft Windows 3.1) in 1993, for what was initially a one-month
+ − 674 contract to fix some event problems but later became a many-year
+ − 675 involvement, punctuated by a six-month contract with Amdahl Corporation.
+ − 676
+ − 677 @cindex rename to XEmacs
+ − 678 In 1994, Sun and Lucid agreed to rename Lucid Emacs to XEmacs (a name
+ − 679 not favorable to either company); the first release called XEmacs was
+ − 680 version 19.11. In June 1994, Lucid folded and Jamie quit to work for
+ − 681 the newly formed Mosaic Communications Corp., later Netscape
+ − 682 Communications Corp. (co-founded by the same Marc Andreessen, who had
+ − 683 quit his Epoch job to work on a graphical browser for the World Wide
+ − 684 Web). Chuck then become the primary maintainer of XEmacs, and put out
+ − 685 versions 19.11 through 19.14 in conjunction with Ben. For 19.12 and
+ − 686 19.13, Chuck added the new redisplay and many other display improvements
+ − 687 and Ben added MULE support (support for Asian and other languages) and
+ − 688 redesigned most of the internal Lisp subsystems to better support the
+ − 689 MULE work and the various other features being added to XEmacs. After
+ − 690 19.14 Chuck retired as primary maintainer and Steve Baur stepped in.
+ − 691
+ − 692 @cindex MULE merged XEmacs appears
+ − 693 Soon after 19.13 was released, work began in earnest on the MULE
+ − 694 internationalization code and the source tree was divided into two
+ − 695 development paths. The MULE version was initially called 19.20, but was
+ − 696 soon renamed to 20.0. In 1996 Martin Buchholz of Sun Microsystems took
+ − 697 over the care and feeding of it and worked on it in parallel with the
+ − 698 19.14 development that was occurring at the same time. After much work
+ − 699 by Martin, it was decided to release 20.0 ahead of 19.15 in February
+ − 700 1997. The source tree remained divided until 20.2 when the version 19
+ − 701 source was finally retired at version 19.16.
+ − 702
+ − 703 @cindex Baur, Steve
+ − 704 @cindex Buchholz, Martin
+ − 705 @cindex Jones, Kyle
+ − 706 @cindex Niksic, Hrvoje
+ − 707 @cindex XEmacs goes it alone
+ − 708 In 1997, Sun finally dropped all pretense of support for XEmacs and
+ − 709 Martin Buchholz left the company in November. Since then, and mostly
+ − 710 for the previous year, because Steve Baur was never paid to work on
+ − 711 XEmacs, XEmacs has existed solely on the contributions of volunteers
+ − 712 from the Free Software Community. Starting from 1997, Hrvoje Niksic and
+ − 713 Kyle Jones have figured prominently in XEmacs development.
+ − 714
+ − 715 @cindex merging attempts
+ − 716 Many attempts have been made to merge XEmacs and GNU Emacs, but they
+ − 717 have consistently failed.
+ − 718
+ − 719 A more detailed history is contained in the XEmacs About page.
+ − 720
+ − 721 @node XEmacs From the Outside, The Lisp Language, A History of Emacs, Top
+ − 722 @chapter XEmacs From the Outside
+ − 723 @cindex read-eval-print
+ − 724
+ − 725 XEmacs appears to the outside world as an editor, but it is really a
+ − 726 Lisp environment. At its heart is a Lisp interpreter; it also
+ − 727 ``happens'' to contain many specialized object types (e.g. buffers,
+ − 728 windows, frames, events) that are useful for implementing an editor.
+ − 729 Some of these objects (in particular windows and frames) have
+ − 730 displayable representations, and XEmacs provides a function
+ − 731 @code{redisplay()} that ensures that the display of all such objects
+ − 732 matches their internal state. Most of the time, a standard Lisp
440
+ − 733 environment is in a @dfn{read-eval-print} loop---i.e. ``read some Lisp
428
+ − 734 code, execute it, and print the results''. XEmacs has a similar loop:
+ − 735
+ − 736 @itemize @bullet
+ − 737 @item
+ − 738 read an event
+ − 739 @item
+ − 740 dispatch the event (i.e. ``do it'')
+ − 741 @item
+ − 742 redisplay
+ − 743 @end itemize
+ − 744
+ − 745 Reading an event is done using the Lisp function @code{next-event},
+ − 746 which waits for something to happen (typically, the user presses a key
+ − 747 or moves the mouse) and returns an event object describing this.
+ − 748 Dispatching an event is done using the Lisp function
+ − 749 @code{dispatch-event}, which looks up the event in a keymap object (a
+ − 750 particular kind of object that associates an event with a Lisp function)
+ − 751 and calls that function. The function ``does'' what the user has
+ − 752 requested by changing the state of particular frame objects, buffer
+ − 753 objects, etc. Finally, @code{redisplay()} is called, which updates the
+ − 754 display to reflect those changes just made. Thus is an ``editor'' born.
+ − 755
+ − 756 @cindex bridge, playing
+ − 757 @cindex taxes, doing
+ − 758 @cindex pi, calculating
+ − 759 Note that you do not have to use XEmacs as an editor; you could just
+ − 760 as well make it do your taxes, compute pi, play bridge, etc. You'd just
+ − 761 have to write functions to do those operations in Lisp.
+ − 762
+ − 763 @node The Lisp Language, XEmacs From the Perspective of Building, XEmacs From the Outside, Top
+ − 764 @chapter The Lisp Language
+ − 765 @cindex Lisp vs. C
+ − 766 @cindex C vs. Lisp
+ − 767 @cindex Lisp vs. Java
+ − 768 @cindex Java vs. Lisp
+ − 769 @cindex dynamic scoping
+ − 770 @cindex scoping, dynamic
+ − 771 @cindex dynamic types
+ − 772 @cindex types, dynamic
+ − 773 @cindex Java
+ − 774 @cindex Common Lisp
+ − 775 @cindex Gosling, James
+ − 776
+ − 777 Lisp is a general-purpose language that is higher-level than C and in
+ − 778 many ways more powerful than C. Powerful dialects of Lisp such as
+ − 779 Common Lisp are probably much better languages for writing very large
+ − 780 applications than is C. (Unfortunately, for many non-technical
+ − 781 reasons C and its successor C++ have become the dominant languages for
+ − 782 application development. These languages are both inadequate for
+ − 783 extremely large applications, which is evidenced by the fact that newer,
+ − 784 larger programs are becoming ever harder to write and are requiring ever
+ − 785 more programmers despite great increases in C development environments;
+ − 786 and by the fact that, although hardware speeds and reliability have been
+ − 787 growing at an exponential rate, most software is still generally
+ − 788 considered to be slow and buggy.)
+ − 789
+ − 790 The new Java language holds promise as a better general-purpose
+ − 791 development language than C. Java has many features in common with
+ − 792 Lisp that are not shared by C (this is not a coincidence, since
+ − 793 Java was designed by James Gosling, a former Lisp hacker). This
+ − 794 will be discussed more later.
+ − 795
+ − 796 For those used to C, here is a summary of the basic differences between
+ − 797 C and Lisp:
+ − 798
+ − 799 @enumerate
+ − 800 @item
+ − 801 Lisp has an extremely regular syntax. Every function, expression,
+ − 802 and control statement is written in the form
+ − 803
+ − 804 @example
+ − 805 (@var{func} @var{arg1} @var{arg2} ...)
+ − 806 @end example
+ − 807
+ − 808 This is as opposed to C, which writes functions as
+ − 809
+ − 810 @example
+ − 811 func(@var{arg1}, @var{arg2}, ...)
+ − 812 @end example
+ − 813
+ − 814 but writes expressions involving operators as (e.g.)
+ − 815
+ − 816 @example
+ − 817 @var{arg1} + @var{arg2}
+ − 818 @end example
+ − 819
+ − 820 and writes control statements as (e.g.)
+ − 821
+ − 822 @example
+ − 823 while (@var{expr}) @{ @var{statement1}; @var{statement2}; ... @}
+ − 824 @end example
+ − 825
+ − 826 Lisp equivalents of the latter two would be
+ − 827
+ − 828 @example
+ − 829 (+ @var{arg1} @var{arg2} ...)
+ − 830 @end example
+ − 831
+ − 832 and
+ − 833
+ − 834 @example
+ − 835 (while @var{expr} @var{statement1} @var{statement2} ...)
+ − 836 @end example
+ − 837
+ − 838 @item
+ − 839 Lisp is a safe language. Assuming there are no bugs in the Lisp
+ − 840 interpreter/compiler, it is impossible to write a program that ``core
+ − 841 dumps'' or otherwise causes the machine to execute an illegal
+ − 842 instruction. This is very different from C, where perhaps the most
+ − 843 common outcome of a bug is exactly such a crash. A corollary of this is that
+ − 844 the C operation of casting a pointer is impossible (and unnecessary) in
+ − 845 Lisp, and that it is impossible to access memory outside the bounds of
+ − 846 an array.
+ − 847
+ − 848 @item
+ − 849 Programs and data are written in the same form. The
+ − 850 parenthesis-enclosing form described above for statements is the same
+ − 851 form used for the most common data type in Lisp, the list. Thus, it is
+ − 852 possible to represent any Lisp program using Lisp data types, and for
+ − 853 one program to construct Lisp statements and then dynamically
+ − 854 @dfn{evaluate} them, or cause them to execute.
+ − 855
+ − 856 @item
+ − 857 All objects are @dfn{dynamically typed}. This means that part of every
+ − 858 object is an indication of what type it is. A Lisp program can
+ − 859 manipulate an object without knowing what type it is, and can query an
+ − 860 object to determine its type. This means that, correspondingly,
+ − 861 variables and function parameters can hold objects of any type and are
+ − 862 not normally declared as being of any particular type. This is opposed
+ − 863 to the @dfn{static typing} of C, where variables can hold exactly one
+ − 864 type of object and must be declared as such, and objects do not contain
+ − 865 an indication of their type because it's implicit in the variables they
+ − 866 are stored in. It is possible in C to have a variable hold different
+ − 867 types of objects (e.g. through the use of @code{void *} pointers or
+ − 868 variable-argument functions), but the type information must then be
+ − 869 passed explicitly in some other fashion, leading to additional program
+ − 870 complexity.
+ − 871
+ − 872 @item
+ − 873 Allocated memory is automatically reclaimed when it is no longer in use.
+ − 874 This operation is called @dfn{garbage collection} and involves looking
+ − 875 through all variables to see what memory is being pointed to, and
+ − 876 reclaiming any memory that is not pointed to and is thus
+ − 877 ``inaccessible'' and out of use. This is as opposed to C, in which
+ − 878 allocated memory must be explicitly reclaimed using @code{free()}. If
+ − 879 you simply drop all pointers to memory without freeing it, it becomes
+ − 880 ``leaked'' memory that still takes up space. Over a long period of
+ − 881 time, this can cause your program to grow and grow until it runs out of
+ − 882 memory.
+ − 883
+ − 884 @item
+ − 885 Lisp has built-in facilities for handling errors and exceptions. In C,
+ − 886 when an error occurs, usually either the program exits entirely or the
+ − 887 routine in which the error occurs returns a value indicating this. If
+ − 888 an error occurs in a deeply-nested routine, then every routine currently
+ − 889 called must unwind itself normally and return an error value back up to
+ − 890 the next routine. This means that every routine must explicitly check
+ − 891 for an error in all the routines it calls; if it does not do so,
+ − 892 unexpected and often random behavior results. This is an extremely
+ − 893 common source of bugs in C programs. An alternative would be to do a
+ − 894 non-local exit using @code{longjmp()}, but that is often very dangerous
+ − 895 because the routines that were exited past had no opportunity to clean
+ − 896 up after themselves and may leave things in an inconsistent state,
+ − 897 causing a crash shortly afterwards.
+ − 898
+ − 899 Lisp provides mechanisms to make such non-local exits safe. When an
+ − 900 error occurs, a routine simply signals that an error of a particular
+ − 901 class has occurred, and a non-local exit takes place. Any routine can
+ − 902 trap errors occurring in routines it calls by registering an error
+ − 903 handler for some or all classes of errors. (If no handler is registered,
+ − 904 a default handler, generally installed by the top-level event loop, is
+ − 905 executed; this prints out the error and continues.) Routines can also
+ − 906 specify cleanup code (called an @dfn{unwind-protect}) that will be
+ − 907 called when control exits from a block of code, no matter how that exit
440
+ − 908 occurs---i.e. even if a function deeply nested below it causes a
428
+ − 909 non-local exit back to the top level.
+ − 910
+ − 911 Note that this facility has appeared in some recent vintages of C, in
+ − 912 particular Visual C++ and other PC compilers written for the Microsoft
+ − 913 Win32 API.
+ − 914
+ − 915 @item
+ − 916 In Emacs Lisp, local variables are @dfn{dynamically scoped}. This means
+ − 917 that if you declare a local variable in a particular function, and then
+ − 918 call another function, that subfunction can ``see'' the local variable
+ − 919 you declared. This is actually considered a bug in Emacs Lisp and in
+ − 920 all other early dialects of Lisp, and was corrected in Common Lisp. (In
+ − 921 Common Lisp, you can still declare dynamically scoped variables if you
440
+ − 922 want to---they are sometimes useful---but variables by default are
428
+ − 923 @dfn{lexically scoped} as in C.)
+ − 924 @end enumerate
+ − 925
+ − 926 For those familiar with Lisp, Emacs Lisp is modelled after MacLisp, an
+ − 927 early dialect of Lisp developed at MIT (no relation to the Macintosh
+ − 928 computer). There is a Common Lisp compatibility package available for
+ − 929 Emacs that provides many of the features of Common Lisp.
+ − 930
+ − 931 The Java language is derived in many ways from C, and shares a similar
+ − 932 syntax, but has the following features in common with Lisp (and different
+ − 933 from C):
+ − 934
+ − 935 @enumerate
+ − 936 @item
+ − 937 Java is a safe language, like Lisp.
+ − 938 @item
+ − 939 Java provides garbage collection, like Lisp.
+ − 940 @item
+ − 941 Java has built-in facilities for handling errors and exceptions, like
+ − 942 Lisp.
+ − 943 @item
+ − 944 Java has a type system that combines the best advantages of both static
+ − 945 and dynamic typing. Objects (except very simple types) are explicitly
+ − 946 marked with their type, as in dynamic typing; but there is a hierarchy
+ − 947 of types and functions are declared to accept only certain types, thus
+ − 948 providing the increased compile-time error-checking of static typing.
+ − 949 @end enumerate
+ − 950
+ − 951 The Java language also has some negative attributes:
+ − 952
+ − 953 @enumerate
+ − 954 @item
+ − 955 Java uses the edit/compile/run model of software development. This
+ − 956 makes it hard to use interactively. For example, to use Java like
+ − 957 @code{bc} it is necessary to write a special purpose, albeit tiny,
+ − 958 application. In Emacs Lisp, a calculator comes built-in without any
+ − 959 effort - one can always just type an expression in the @code{*scratch*}
+ − 960 buffer.
+ − 961 @item
+ − 962 Java tries too hard to enforce, not merely enable, portability, making
+ − 963 ordinary access to standard OS facilities painful. Java has an
+ − 964 @dfn{agenda}. I think this is why @code{chdir} is not part of standard
+ − 965 Java, which is inexcusable.
+ − 966 @end enumerate
+ − 967
+ − 968 Unfortunately, there is no perfect language. Static typing allows a
+ − 969 compiler to catch programmer errors and produce more efficient code, but
442
+ − 970 makes programming more tedious and less fun. For the foreseeable future,
428
+ − 971 an Ideal Editing and Programming Environment (and that is what XEmacs
+ − 972 aspires to) will be programmable in multiple languages: high level ones
+ − 973 like Lisp for user customization and prototyping, and lower level ones
+ − 974 for infrastructure and industrial strength applications. If I had my
+ − 975 way, XEmacs would be friendly towards the Python, Scheme, C++, ML,
+ − 976 etc... communities. But there are serious technical difficulties to
+ − 977 achieving that goal.
+ − 978
+ − 979 The word @dfn{application} in the previous paragraph was used
+ − 980 intentionally. XEmacs implements an API for programs written in Lisp
+ − 981 that makes it a full-fledged application platform, very much like an OS
+ − 982 inside the real OS.
+ − 983
+ − 984 @node XEmacs From the Perspective of Building, XEmacs From the Inside, The Lisp Language, Top
+ − 985 @chapter XEmacs From the Perspective of Building
+ − 986
+ − 987 The heart of XEmacs is the Lisp environment, which is written in C.
+ − 988 This is contained in the @file{src/} subdirectory. Underneath
+ − 989 @file{src/} are two subdirectories of header files: @file{s/} (header
+ − 990 files for particular operating systems) and @file{m/} (header files for
+ − 991 particular machine types). In practice the distinction between the two
+ − 992 types of header files is blurred. These header files define or undefine
+ − 993 certain preprocessor constants and macros to indicate particular
+ − 994 characteristics of the associated machine or operating system. As part
+ − 995 of the configure process, one @file{s/} file and one @file{m/} file is
+ − 996 identified for the particular environment in which XEmacs is being
+ − 997 built.
+ − 998
+ − 999 XEmacs also contains a great deal of Lisp code. This implements the
+ − 1000 operations that make XEmacs useful as an editor as well as just a Lisp
+ − 1001 environment, and also contains many add-on packages that allow XEmacs to
+ − 1002 browse directories, act as a mail and Usenet news reader, compile Lisp
+ − 1003 code, etc. There is actually more Lisp code than C code associated with
+ − 1004 XEmacs, but much of the Lisp code is peripheral to the actual operation
+ − 1005 of the editor. The Lisp code all lies in subdirectories underneath the
+ − 1006 @file{lisp/} directory.
+ − 1007
+ − 1008 The @file{lwlib/} directory contains C code that implements a
+ − 1009 generalized interface onto different X widget toolkits and also
+ − 1010 implements some widgets of its own that behave like Motif widgets but
+ − 1011 are faster, free, and in some cases more powerful. The code in this
+ − 1012 directory compiles into a library and is mostly independent from XEmacs.
+ − 1013
+ − 1014 The @file{etc/} directory contains various data files associated with
+ − 1015 XEmacs. Some of them are actually read by XEmacs at startup; others
+ − 1016 merely contain useful information of various sorts.
+ − 1017
+ − 1018 The @file{lib-src/} directory contains C code for various auxiliary
+ − 1019 programs that are used in connection with XEmacs. Some of them are used
+ − 1020 during the build process; others are used to perform certain functions
+ − 1021 that cannot conveniently be placed in the XEmacs executable (e.g. the
+ − 1022 @file{movemail} program for fetching mail out of @file{/var/spool/mail},
+ − 1023 which must be setgid to @file{mail} on many systems; and the
+ − 1024 @file{gnuclient} program, which allows an external script to communicate
+ − 1025 with a running XEmacs process).
+ − 1026
+ − 1027 The @file{man/} directory contains the sources for the XEmacs
+ − 1028 documentation. It is mostly in a form called Texinfo, which can be
+ − 1029 converted into either a printed document (by passing it through @TeX{})
+ − 1030 or into on-line documentation called @dfn{info files}.
+ − 1031
+ − 1032 The @file{info/} directory contains the results of formatting the XEmacs
+ − 1033 documentation as @dfn{info files}, for on-line use. These files are
+ − 1034 used when you enter the Info system using @kbd{C-h i} or through the
+ − 1035 Help menu.
+ − 1036
+ − 1037 The @file{dynodump/} directory contains auxiliary code used to build
+ − 1038 XEmacs on Solaris platforms.
+ − 1039
+ − 1040 The other directories contain various miscellaneous code and information
+ − 1041 that is not normally used or needed.
+ − 1042
+ − 1043 The first step of building involves running the @file{configure} program
+ − 1044 and passing it various parameters to specify any optional features you
+ − 1045 want and compiler arguments and such, as described in the @file{INSTALL}
+ − 1046 file. This determines what the build environment is, chooses the
+ − 1047 appropriate @file{s/} and @file{m/} file, and runs a series of tests to
+ − 1048 determine many details about your environment, such as which library
+ − 1049 functions are available and exactly how they work. The reason for
+ − 1050 running these tests is that it allows XEmacs to be compiled on a much
+ − 1051 wider variety of platforms than those that the XEmacs developers happen
+ − 1052 to be familiar with, including various sorts of hybrid platforms. This
+ − 1053 is especially important now that many operating systems give you a great
+ − 1054 deal of control over exactly what features you want installed, and allow
+ − 1055 for easy upgrading of parts of a system without upgrading the rest. It
+ − 1056 would be impossible to pre-determine and pre-specify the information for
+ − 1057 all possible configurations.
+ − 1058
+ − 1059 In fact, the @file{s/} and @file{m/} files are basically @emph{evil},
+ − 1060 since they contain unmaintainable platform-specific hard-coded
+ − 1061 information. XEmacs has been moving in the direction of having all
+ − 1062 system-specific information be determined dynamically by
+ − 1063 @file{configure}. Perhaps someday we can @code{rm -rf src/s src/m}.
+ − 1064
+ − 1065 When configure is done running, it generates @file{Makefile}s and
+ − 1066 @file{GNUmakefile}s and the file @file{src/config.h} (which describes
+ − 1067 the features of your system) from template files. You then run
+ − 1068 @file{make}, which compiles the auxiliary code and programs in
+ − 1069 @file{lib-src/} and @file{lwlib/} and the main XEmacs executable in
+ − 1070 @file{src/}. The result of compiling and linking is an executable
+ − 1071 called @file{temacs}, which is @emph{not} the final XEmacs executable.
+ − 1072 @file{temacs} by itself is not intended to function as an editor or even
+ − 1073 display any windows on the screen, and if you simply run it, it will
+ − 1074 exit immediately. The @file{Makefile} runs @file{temacs} with certain
+ − 1075 options that cause it to initialize itself, read in a number of basic
+ − 1076 Lisp files, and then dump itself out into a new executable called
+ − 1077 @file{xemacs}. This new executable has been pre-initialized and
+ − 1078 contains pre-digested Lisp code that is necessary for the editor to
+ − 1079 function (this includes most basic editing functions,
+ − 1080 e.g. @code{kill-line}, that can be defined in terms of other Lisp
+ − 1081 primitives; some initialization code that is called when certain
+ − 1082 objects, such as frames, are created; and all of the standard
+ − 1083 keybindings and code for the actions they result in). This executable,
+ − 1084 @file{xemacs}, is the executable that you run to use the XEmacs editor.
+ − 1085
+ − 1086 Although @file{temacs} is not intended to be run as an editor, it can,
+ − 1087 by using the incantation @code{temacs -batch -l loadup.el run-temacs}.
+ − 1088 This is useful when the dumping procedure described above is broken, or
+ − 1089 when using certain program debugging tools such as Purify. These tools
+ − 1090 get mighty confused by the tricks played by the XEmacs build process,
+ − 1091 such as allocation memory in one process, and freeing it in the next.
+ − 1092
+ − 1093 @node XEmacs From the Inside, The XEmacs Object System (Abstractly Speaking), XEmacs From the Perspective of Building, Top
+ − 1094 @chapter XEmacs From the Inside
+ − 1095
+ − 1096 Internally, XEmacs is quite complex, and can be very confusing. To
+ − 1097 simplify things, it can be useful to think of XEmacs as containing an
+ − 1098 event loop that ``drives'' everything, and a number of other subsystems,
+ − 1099 such as a Lisp engine and a redisplay mechanism. Each of these other
+ − 1100 subsystems exists simultaneously in XEmacs, and each has a certain
+ − 1101 state. The flow of control continually passes in and out of these
+ − 1102 different subsystems in the course of normal operation of the editor.
+ − 1103
+ − 1104 It is important to keep in mind that, most of the time, the editor is
+ − 1105 ``driven'' by the event loop. Except during initialization and batch
+ − 1106 mode, all subsystems are entered directly or indirectly through the
+ − 1107 event loop, and ultimately, control exits out of all subsystems back up
+ − 1108 to the event loop. This cycle of entering a subsystem, exiting back out
+ − 1109 to the event loop, and starting another iteration of the event loop
+ − 1110 occurs once each keystroke, mouse motion, etc.
+ − 1111
+ − 1112 If you're trying to understand a particular subsystem (other than the
+ − 1113 event loop), think of it as a ``daemon'' process or ``servant'' that is
+ − 1114 responsible for one particular aspect of a larger system, and
+ − 1115 periodically receives commands or environment changes that cause it to
+ − 1116 do something. Ultimately, these commands and environment changes are
+ − 1117 always triggered by the event loop. For example:
+ − 1118
+ − 1119 @itemize @bullet
+ − 1120 @item
+ − 1121 The window and frame mechanism is responsible for keeping track of what
+ − 1122 windows and frames exist, what buffers are in them, etc. It is
+ − 1123 periodically given commands (usually from the user) to make a change to
+ − 1124 the current window/frame state: i.e. create a new frame, delete a
+ − 1125 window, etc.
+ − 1126
+ − 1127 @item
+ − 1128 The buffer mechanism is responsible for keeping track of what buffers
+ − 1129 exist and what text is in them. It is periodically given commands
+ − 1130 (usually from the user) to insert or delete text, create a buffer, etc.
+ − 1131 When it receives a text-change command, it notifies the redisplay
+ − 1132 mechanism.
+ − 1133
+ − 1134 @item
+ − 1135 The redisplay mechanism is responsible for making sure that windows and
+ − 1136 frames are displayed correctly. It is periodically told (by the event
+ − 1137 loop) to actually ``do its job'', i.e. snoop around and see what the
+ − 1138 current state of the environment (mostly of the currently-existing
+ − 1139 windows, frames, and buffers) is, and make sure that that state matches
+ − 1140 what's actually displayed. It keeps lots and lots of information around
+ − 1141 (such as what is actually being displayed currently, and what the
+ − 1142 environment was last time it checked) so that it can minimize the work
+ − 1143 it has to do. It is also helped along in that whenever a relevant
+ − 1144 change to the environment occurs, the redisplay mechanism is told about
+ − 1145 this, so it has a pretty good idea of where it has to look to find
+ − 1146 possible changes and doesn't have to look everywhere.
+ − 1147
+ − 1148 @item
+ − 1149 The Lisp engine is responsible for executing the Lisp code in which most
+ − 1150 user commands are written. It is entered through a call to @code{eval}
+ − 1151 or @code{funcall}, which occurs as a result of dispatching an event from
+ − 1152 the event loop. The functions it calls issue commands to the buffer
+ − 1153 mechanism, the window/frame subsystem, etc.
+ − 1154
+ − 1155 @item
+ − 1156 The Lisp allocation subsystem is responsible for keeping track of Lisp
+ − 1157 objects. It is given commands from the Lisp engine to allocate objects,
+ − 1158 garbage collect, etc.
+ − 1159 @end itemize
+ − 1160
+ − 1161 etc.
+ − 1162
+ − 1163 The important idea here is that there are a number of independent
+ − 1164 subsystems each with its own responsibility and persistent state, just
+ − 1165 like different employees in a company, and each subsystem is
+ − 1166 periodically given commands from other subsystems. Commands can flow
+ − 1167 from any one subsystem to any other, but there is usually some sort of
+ − 1168 hierarchy, with all commands originating from the event subsystem.
+ − 1169
+ − 1170 XEmacs is entered in @code{main()}, which is in @file{emacs.c}. When
+ − 1171 this is called the first time (in a properly-invoked @file{temacs}), it
+ − 1172 does the following:
+ − 1173
+ − 1174 @enumerate
+ − 1175 @item
+ − 1176 It does some very basic environment initializations, such as determining
+ − 1177 where it and its directories (e.g. @file{lisp/} and @file{etc/}) reside
+ − 1178 and setting up signal handlers.
+ − 1179 @item
+ − 1180 It initializes the entire Lisp interpreter.
+ − 1181 @item
+ − 1182 It sets the initial values of many built-in variables (including many
+ − 1183 variables that are visible to Lisp programs), such as the global keymap
+ − 1184 object and the built-in faces (a face is an object that describes the
+ − 1185 display characteristics of text). This involves creating Lisp objects
+ − 1186 and thus is dependent on step (2).
+ − 1187 @item
+ − 1188 It performs various other initializations that are relevant to the
+ − 1189 particular environment it is running in, such as retrieving environment
+ − 1190 variables, determining the current date and the user who is running the
+ − 1191 program, examining its standard input, creating any necessary file
+ − 1192 descriptors, etc.
+ − 1193 @item
+ − 1194 At this point, the C initialization is complete. A Lisp program that
+ − 1195 was specified on the command line (usually @file{loadup.el}) is called
+ − 1196 (temacs is normally invoked as @code{temacs -batch -l loadup.el dump}).
+ − 1197 @file{loadup.el} loads all of the other Lisp files that are needed for
+ − 1198 the operation of the editor, calls the @code{dump-emacs} function to
+ − 1199 write out @file{xemacs}, and then kills the temacs process.
+ − 1200 @end enumerate
+ − 1201
+ − 1202 When @file{xemacs} is then run, it only redoes steps (1) and (4)
+ − 1203 above; all variables already contain the values they were set to when
+ − 1204 the executable was dumped, and all memory that was allocated with
+ − 1205 @code{malloc()} is still around. (XEmacs knows whether it is being run
+ − 1206 as @file{xemacs} or @file{temacs} because it sets the global variable
+ − 1207 @code{initialized} to 1 after step (4) above.) At this point,
+ − 1208 @file{xemacs} calls a Lisp function to do any further initialization,
+ − 1209 which includes parsing the command-line (the C code can only do limited
+ − 1210 command-line parsing, which includes looking for the @samp{-batch} and
+ − 1211 @samp{-l} flags and a few other flags that it needs to know about before
+ − 1212 initialization is complete), creating the first frame (or @dfn{window}
+ − 1213 in standard window-system parlance), running the user's init file
+ − 1214 (usually the file @file{.emacs} in the user's home directory), etc. The
+ − 1215 function to do this is usually called @code{normal-top-level};
+ − 1216 @file{loadup.el} tells the C code about this function by setting its
+ − 1217 name as the value of the Lisp variable @code{top-level}.
+ − 1218
+ − 1219 When the Lisp initialization code is done, the C code enters the event
+ − 1220 loop, and stays there for the duration of the XEmacs process. The code
442
+ − 1221 for the event loop is contained in @file{cmdloop.c}, and is called
428
+ − 1222 @code{Fcommand_loop_1()}. Note that this event loop could very well be
+ − 1223 written in Lisp, and in fact a Lisp version exists; but apparently,
+ − 1224 doing this makes XEmacs run noticeably slower.
+ − 1225
+ − 1226 Notice how much of the initialization is done in Lisp, not in C.
+ − 1227 In general, XEmacs tries to move as much code as is possible
+ − 1228 into Lisp. Code that remains in C is code that implements the
+ − 1229 Lisp interpreter itself, or code that needs to be very fast, or
+ − 1230 code that needs to do system calls or other such stuff that
+ − 1231 needs to be done in C, or code that needs to have access to
+ − 1232 ``forbidden'' structures. (One conscious aspect of the design of
+ − 1233 Lisp under XEmacs is a clean separation between the external
+ − 1234 interface to a Lisp object's functionality and its internal
+ − 1235 implementation. Part of this design is that Lisp programs
+ − 1236 are forbidden from accessing the contents of the object other
+ − 1237 than through using a standard API. In this respect, XEmacs Lisp
+ − 1238 is similar to modern Lisp dialects but differs from GNU Emacs,
+ − 1239 which tends to expose the implementation and allow Lisp
+ − 1240 programs to look at it directly. The major advantage of
+ − 1241 hiding the implementation is that it allows the implementation
+ − 1242 to be redesigned without affecting any Lisp programs, including
+ − 1243 those that might want to be ``clever'' by looking directly at
+ − 1244 the object's contents and possibly manipulating them.)
+ − 1245
+ − 1246 Moving code into Lisp makes the code easier to debug and maintain and
+ − 1247 makes it much easier for people who are not XEmacs developers to
+ − 1248 customize XEmacs, because they can make a change with much less chance
+ − 1249 of obscure and unwanted interactions occurring than if they were to
+ − 1250 change the C code.
+ − 1251
+ − 1252 @node The XEmacs Object System (Abstractly Speaking), How Lisp Objects Are Represented in C, XEmacs From the Inside, Top
+ − 1253 @chapter The XEmacs Object System (Abstractly Speaking)
+ − 1254
+ − 1255 At the heart of the Lisp interpreter is its management of objects.
+ − 1256 XEmacs Lisp contains many built-in objects, some of which are
+ − 1257 simple and others of which can be very complex; and some of which
+ − 1258 are very common, and others of which are rarely used or are only
+ − 1259 used internally. (Since the Lisp allocation system, with its
+ − 1260 automatic reclamation of unused storage, is so much more convenient
+ − 1261 than @code{malloc()} and @code{free()}, the C code makes extensive use of it
+ − 1262 in its internal operations.)
+ − 1263
+ − 1264 The basic Lisp objects are
+ − 1265
+ − 1266 @table @code
+ − 1267 @item integer
+ − 1268 28 or 31 bits of precision, or 60 or 63 bits on 64-bit machines; the
+ − 1269 reason for this is described below when the internal Lisp object
+ − 1270 representation is described.
+ − 1271 @item float
+ − 1272 Same precision as a double in C.
+ − 1273 @item cons
+ − 1274 A simple container for two Lisp objects, used to implement lists and
+ − 1275 most other data structures in Lisp.
+ − 1276 @item char
+ − 1277 An object representing a single character of text; chars behave like
+ − 1278 integers in many ways but are logically considered text rather than
+ − 1279 numbers and have a different read syntax. (the read syntax for a char
440
+ − 1280 contains the char itself or some textual encoding of it---for example,
428
+ − 1281 a Japanese Kanji character might be encoded as @samp{^[$(B#&^[(B} using the
440
+ − 1282 ISO-2022 encoding standard---rather than the numerical representation
428
+ − 1283 of the char; this way, if the mapping between chars and integers
+ − 1284 changes, which is quite possible for Kanji characters and other extended
+ − 1285 characters, the same character will still be created. Note that some
+ − 1286 primitives confuse chars and integers. The worst culprit is @code{eq},
+ − 1287 which makes a special exception and considers a char to be @code{eq} to
+ − 1288 its integer equivalent, even though in no other case are objects of two
+ − 1289 different types @code{eq}. The reason for this monstrosity is
+ − 1290 compatibility with existing code; the separation of char from integer
+ − 1291 came fairly recently.)
+ − 1292 @item symbol
+ − 1293 An object that contains Lisp objects and is referred to by name;
+ − 1294 symbols are used to implement variables and named functions
+ − 1295 and to provide the equivalent of preprocessor constants in C.
+ − 1296 @item vector
+ − 1297 A one-dimensional array of Lisp objects providing constant-time access
+ − 1298 to any of the objects; access to an arbitrary object in a vector is
+ − 1299 faster than for lists, but the operations that can be done on a vector
+ − 1300 are more limited.
+ − 1301 @item string
+ − 1302 Self-explanatory; behaves much like a vector of chars
+ − 1303 but has a different read syntax and is stored and manipulated
+ − 1304 more compactly.
+ − 1305 @item bit-vector
+ − 1306 A vector of bits; similar to a string in spirit.
+ − 1307 @item compiled-function
+ − 1308 An object containing compiled Lisp code, known as @dfn{byte code}.
+ − 1309 @item subr
+ − 1310 A Lisp primitive, i.e. a Lisp-callable function implemented in C.
+ − 1311 @end table
+ − 1312
+ − 1313 @cindex closure
+ − 1314 Note that there is no basic ``function'' type, as in more powerful
+ − 1315 versions of Lisp (where it's called a @dfn{closure}). XEmacs Lisp does
+ − 1316 not provide the closure semantics implemented by Common Lisp and Scheme.
+ − 1317 The guts of a function in XEmacs Lisp are represented in one of four
+ − 1318 ways: a symbol specifying another function (when one function is an
+ − 1319 alias for another), a list (whose first element must be the symbol
+ − 1320 @code{lambda}) containing the function's source code, a
+ − 1321 compiled-function object, or a subr object. (In other words, given a
+ − 1322 symbol specifying the name of a function, calling @code{symbol-function}
+ − 1323 to retrieve the contents of the symbol's function cell will return one
+ − 1324 of these types of objects.)
+ − 1325
+ − 1326 XEmacs Lisp also contains numerous specialized objects used to implement
+ − 1327 the editor:
+ − 1328
+ − 1329 @table @code
+ − 1330 @item buffer
+ − 1331 Stores text like a string, but is optimized for insertion and deletion
+ − 1332 and has certain other properties that can be set.
+ − 1333 @item frame
+ − 1334 An object with various properties whose displayable representation is a
+ − 1335 @dfn{window} in window-system parlance.
+ − 1336 @item window
+ − 1337 A section of a frame that displays the contents of a buffer;
+ − 1338 often called a @dfn{pane} in window-system parlance.
+ − 1339 @item window-configuration
+ − 1340 An object that represents a saved configuration of windows in a frame.
+ − 1341 @item device
+ − 1342 An object representing a screen on which frames can be displayed;
+ − 1343 equivalent to a @dfn{display} in the X Window System and a @dfn{TTY} in
+ − 1344 character mode.
+ − 1345 @item face
+ − 1346 An object specifying the appearance of text or graphics; it has
+ − 1347 properties such as font, foreground color, and background color.
+ − 1348 @item marker
+ − 1349 An object that refers to a particular position in a buffer and moves
+ − 1350 around as text is inserted and deleted to stay in the same relative
+ − 1351 position to the text around it.
+ − 1352 @item extent
+ − 1353 Similar to a marker but covers a range of text in a buffer; can also
+ − 1354 specify properties of the text, such as a face in which the text is to
+ − 1355 be displayed, whether the text is invisible or unmodifiable, etc.
+ − 1356 @item event
+ − 1357 Generated by calling @code{next-event} and contains information
+ − 1358 describing a particular event happening in the system, such as the user
+ − 1359 pressing a key or a process terminating.
+ − 1360 @item keymap
+ − 1361 An object that maps from events (described using lists, vectors, and
+ − 1362 symbols rather than with an event object because the mapping is for
+ − 1363 classes of events, rather than individual events) to functions to
+ − 1364 execute or other events to recursively look up; the functions are
+ − 1365 described by name, using a symbol, or using lists to specify the
+ − 1366 function's code.
+ − 1367 @item glyph
+ − 1368 An object that describes the appearance of an image (e.g. pixmap) on
+ − 1369 the screen; glyphs can be attached to the beginning or end of extents
+ − 1370 and in some future version of XEmacs will be able to be inserted
+ − 1371 directly into a buffer.
+ − 1372 @item process
+ − 1373 An object that describes a connection to an externally-running process.
+ − 1374 @end table
+ − 1375
+ − 1376 There are some other, less-commonly-encountered general objects:
+ − 1377
+ − 1378 @table @code
+ − 1379 @item hash-table
+ − 1380 An object that maps from an arbitrary Lisp object to another arbitrary
+ − 1381 Lisp object, using hashing for fast lookup.
+ − 1382 @item obarray
+ − 1383 A limited form of hash-table that maps from strings to symbols; obarrays
+ − 1384 are used to look up a symbol given its name and are not actually their
+ − 1385 own object type but are kludgily represented using vectors with hidden
+ − 1386 fields (this representation derives from GNU Emacs).
+ − 1387 @item specifier
+ − 1388 A complex object used to specify the value of a display property; a
+ − 1389 default value is given and different values can be specified for
+ − 1390 particular frames, buffers, windows, devices, or classes of device.
+ − 1391 @item char-table
+ − 1392 An object that maps from chars or classes of chars to arbitrary Lisp
+ − 1393 objects; internally char tables use a complex nested-vector
+ − 1394 representation that is optimized to the way characters are represented
+ − 1395 as integers.
+ − 1396 @item range-table
+ − 1397 An object that maps from ranges of integers to arbitrary Lisp objects.
+ − 1398 @end table
+ − 1399
+ − 1400 And some strange special-purpose objects:
+ − 1401
+ − 1402 @table @code
+ − 1403 @item charset
+ − 1404 @itemx coding-system
+ − 1405 Objects used when MULE, or multi-lingual/Asian-language, support is
+ − 1406 enabled.
+ − 1407 @item color-instance
+ − 1408 @itemx font-instance
+ − 1409 @itemx image-instance
+ − 1410 An object that encapsulates a window-system resource; instances are
+ − 1411 mostly used internally but are exposed on the Lisp level for cleanness
+ − 1412 of the specifier model and because it's occasionally useful for Lisp
+ − 1413 program to create or query the properties of instances.
+ − 1414 @item subwindow
+ − 1415 An object that encapsulate a @dfn{subwindow} resource, i.e. a
+ − 1416 window-system child window that is drawn into by an external process;
+ − 1417 this object should be integrated into the glyph system but isn't yet,
+ − 1418 and may change form when this is done.
+ − 1419 @item tooltalk-message
+ − 1420 @itemx tooltalk-pattern
+ − 1421 Objects that represent resources used in the ToolTalk interprocess
+ − 1422 communication protocol.
+ − 1423 @item toolbar-button
+ − 1424 An object used in conjunction with the toolbar.
+ − 1425 @end table
+ − 1426
+ − 1427 And objects that are only used internally:
+ − 1428
+ − 1429 @table @code
+ − 1430 @item opaque
+ − 1431 A generic object for encapsulating arbitrary memory; this allows you the
+ − 1432 generality of @code{malloc()} and the convenience of the Lisp object
+ − 1433 system.
+ − 1434 @item lstream
+ − 1435 A buffering I/O stream, used to provide a unified interface to anything
+ − 1436 that can accept output or provide input, such as a file descriptor, a
+ − 1437 stdio stream, a chunk of memory, a Lisp buffer, a Lisp string, etc.;
+ − 1438 it's a Lisp object to make its memory management more convenient.
+ − 1439 @item char-table-entry
+ − 1440 Subsidiary objects in the internal char-table representation.
+ − 1441 @item extent-auxiliary
+ − 1442 @itemx menubar-data
+ − 1443 @itemx toolbar-data
+ − 1444 Various special-purpose objects that are basically just used to
+ − 1445 encapsulate memory for particular subsystems, similar to the more
+ − 1446 general ``opaque'' object.
+ − 1447 @item symbol-value-forward
+ − 1448 @itemx symbol-value-buffer-local
+ − 1449 @itemx symbol-value-varalias
+ − 1450 @itemx symbol-value-lisp-magic
+ − 1451 Special internal-only objects that are placed in the value cell of a
+ − 1452 symbol to indicate that there is something special with this variable --
+ − 1453 e.g. it has no value, it mirrors another variable, or it mirrors some C
+ − 1454 variable; there is really only one kind of object, called a
+ − 1455 @dfn{symbol-value-magic}, but it is sort-of halfway kludged into
+ − 1456 semi-different object types.
+ − 1457 @end table
+ − 1458
+ − 1459 @cindex permanent objects
+ − 1460 @cindex temporary objects
+ − 1461 Some types of objects are @dfn{permanent}, meaning that once created,
+ − 1462 they do not disappear until explicitly destroyed, using a function such
+ − 1463 as @code{delete-buffer}, @code{delete-window}, @code{delete-frame}, etc.
+ − 1464 Others will disappear once they are not longer used, through the garbage
+ − 1465 collection mechanism. Buffers, frames, windows, devices, and processes
+ − 1466 are among the objects that are permanent. Note that some objects can go
+ − 1467 both ways: Faces can be created either way; extents are normally
+ − 1468 permanent, but detached extents (extents not referring to any text, as
+ − 1469 happens to some extents when the text they are referring to is deleted)
+ − 1470 are temporary. Note that some permanent objects, such as faces and
+ − 1471 coding systems, cannot be deleted. Note also that windows are unique in
+ − 1472 that they can be @emph{undeleted} after having previously been
+ − 1473 deleted. (This happens as a result of restoring a window configuration.)
+ − 1474
+ − 1475 @cindex read syntax
+ − 1476 Note that many types of objects have a @dfn{read syntax}, i.e. a way of
+ − 1477 specifying an object of that type in Lisp code. When you load a Lisp
+ − 1478 file, or type in code to be evaluated, what really happens is that the
+ − 1479 function @code{read} is called, which reads some text and creates an object
+ − 1480 based on the syntax of that text; then @code{eval} is called, which
+ − 1481 possibly does something special; then this loop repeats until there's
+ − 1482 no more text to read. (@code{eval} only actually does something special
+ − 1483 with symbols, which causes the symbol's value to be returned,
+ − 1484 similar to referencing a variable; and with conses [i.e. lists],
+ − 1485 which cause a function invocation. All other values are returned
+ − 1486 unchanged.)
+ − 1487
+ − 1488 The read syntax
+ − 1489
+ − 1490 @example
+ − 1491 17297
+ − 1492 @end example
+ − 1493
+ − 1494 converts to an integer whose value is 17297.
+ − 1495
+ − 1496 @example
+ − 1497 1.983e-4
+ − 1498 @end example
+ − 1499
+ − 1500 converts to a float whose value is 1.983e-4, or .0001983.
+ − 1501
+ − 1502 @example
+ − 1503 ?b
+ − 1504 @end example
+ − 1505
+ − 1506 converts to a char that represents the lowercase letter b.
+ − 1507
+ − 1508 @example
+ − 1509 ?^[$(B#&^[(B
+ − 1510 @end example
+ − 1511
+ − 1512 (where @samp{^[} actually is an @samp{ESC} character) converts to a
+ − 1513 particular Kanji character when using an ISO2022-based coding system for
+ − 1514 input. (To decode this goo: @samp{ESC} begins an escape sequence;
+ − 1515 @samp{ESC $ (} is a class of escape sequences meaning ``switch to a
+ − 1516 94x94 character set''; @samp{ESC $ ( B} means ``switch to Japanese
+ − 1517 Kanji''; @samp{#} and @samp{&} collectively index into a 94-by-94 array
+ − 1518 of characters [subtract 33 from the ASCII value of each character to get
+ − 1519 the corresponding index]; @samp{ESC (} is a class of escape sequences
+ − 1520 meaning ``switch to a 94 character set''; @samp{ESC (B} means ``switch
+ − 1521 to US ASCII''. It is a coincidence that the letter @samp{B} is used to
+ − 1522 denote both Japanese Kanji and US ASCII. If the first @samp{B} were
+ − 1523 replaced with an @samp{A}, you'd be requesting a Chinese Hanzi character
+ − 1524 from the GB2312 character set.)
+ − 1525
+ − 1526 @example
+ − 1527 "foobar"
+ − 1528 @end example
+ − 1529
+ − 1530 converts to a string.
+ − 1531
+ − 1532 @example
+ − 1533 foobar
+ − 1534 @end example
+ − 1535
+ − 1536 converts to a symbol whose name is @code{"foobar"}. This is done by
+ − 1537 looking up the string equivalent in the global variable
+ − 1538 @code{obarray}, whose contents should be an obarray. If no symbol
+ − 1539 is found, a new symbol with the name @code{"foobar"} is automatically
+ − 1540 created and added to @code{obarray}; this process is called
+ − 1541 @dfn{interning} the symbol.
+ − 1542 @cindex interning
+ − 1543
+ − 1544 @example
+ − 1545 (foo . bar)
+ − 1546 @end example
+ − 1547
+ − 1548 converts to a cons cell containing the symbols @code{foo} and @code{bar}.
+ − 1549
+ − 1550 @example
+ − 1551 (1 a 2.5)
+ − 1552 @end example
+ − 1553
+ − 1554 converts to a three-element list containing the specified objects
+ − 1555 (note that a list is actually a set of nested conses; see the
+ − 1556 XEmacs Lisp Reference).
+ − 1557
+ − 1558 @example
+ − 1559 [1 a 2.5]
+ − 1560 @end example
+ − 1561
+ − 1562 converts to a three-element vector containing the specified objects.
+ − 1563
+ − 1564 @example
+ − 1565 #[... ... ... ...]
+ − 1566 @end example
+ − 1567
+ − 1568 converts to a compiled-function object (the actual contents are not
+ − 1569 shown since they are not relevant here; look at a file that ends with
+ − 1570 @file{.elc} for examples).
+ − 1571
+ − 1572 @example
+ − 1573 #*01110110
+ − 1574 @end example
+ − 1575
+ − 1576 converts to a bit-vector.
+ − 1577
+ − 1578 @example
+ − 1579 #s(hash-table ... ...)
+ − 1580 @end example
+ − 1581
+ − 1582 converts to a hash table (the actual contents are not shown).
+ − 1583
+ − 1584 @example
+ − 1585 #s(range-table ... ...)
+ − 1586 @end example
+ − 1587
+ − 1588 converts to a range table (the actual contents are not shown).
+ − 1589
+ − 1590 @example
+ − 1591 #s(char-table ... ...)
+ − 1592 @end example
+ − 1593
+ − 1594 converts to a char table (the actual contents are not shown).
+ − 1595
+ − 1596 Note that the @code{#s()} syntax is the general syntax for structures,
+ − 1597 which are not really implemented in XEmacs Lisp but should be.
+ − 1598
+ − 1599 When an object is printed out (using @code{print} or a related
+ − 1600 function), the read syntax is used, so that the same object can be read
+ − 1601 in again.
+ − 1602
+ − 1603 The other objects do not have read syntaxes, usually because it does not
+ − 1604 really make sense to create them in this fashion (i.e. processes, where
+ − 1605 it doesn't make sense to have a subprocess created as a side effect of
+ − 1606 reading some Lisp code), or because they can't be created at all
+ − 1607 (e.g. subrs). Permanent objects, as a rule, do not have a read syntax;
+ − 1608 nor do most complex objects, which contain too much state to be easily
+ − 1609 initialized through a read syntax.
+ − 1610
+ − 1611 @node How Lisp Objects Are Represented in C, Rules When Writing New C Code, The XEmacs Object System (Abstractly Speaking), Top
+ − 1612 @chapter How Lisp Objects Are Represented in C
+ − 1613
+ − 1614 Lisp objects are represented in C using a 32-bit or 64-bit machine word
+ − 1615 (depending on the processor; i.e. DEC Alphas use 64-bit Lisp objects and
+ − 1616 most other processors use 32-bit Lisp objects). The representation
+ − 1617 stuffs a pointer together with a tag, as follows:
+ − 1618
+ − 1619 @example
+ − 1620 [ 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 ]
+ − 1621 [ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 ]
+ − 1622
442
+ − 1623 <---------------------------------------------------------> <->
+ − 1624 a pointer to a structure, or an integer tag
+ − 1625 @end example
+ − 1626
+ − 1627 A tag of 00 is used for all pointer object types, a tag of 10 is used
+ − 1628 for characters, and the other two tags 01 and 11 are joined together to
+ − 1629 form the integer object type. This representation gives us 31 bit
+ − 1630 integers and 30 bit characters, while pointers are represented directly
+ − 1631 without any bit masking or shifting. This representation, though,
+ − 1632 assumes that pointers to structs are always aligned to multiples of 4,
+ − 1633 so the lower 2 bits are always zero.
428
+ − 1634
+ − 1635 Lisp objects use the typedef @code{Lisp_Object}, but the actual C type
+ − 1636 used for the Lisp object can vary. It can be either a simple type
+ − 1637 (@code{long} on the DEC Alpha, @code{int} on other machines) or a
+ − 1638 structure whose fields are bit fields that line up properly (actually, a
+ − 1639 union of structures is used). Generally the simple integral type is
+ − 1640 preferable because it ensures that the compiler will actually use a
+ − 1641 machine word to represent the object (some compilers will use more
+ − 1642 general and less efficient code for unions and structs even if they can
+ − 1643 fit in a machine word). The union type, however, has the advantage of
442
+ − 1644 stricter type checking. If you accidentally pass an integer where a Lisp
+ − 1645 object is desired, you get a compile error. The choice of which type
+ − 1646 to use is determined by the preprocessor constant @code{USE_UNION_TYPE}
+ − 1647 which is defined via the @code{--use-union-type} option to
+ − 1648 @code{configure}.
+ − 1649
+ − 1650 Various macros are used to convert between Lisp_Objects and the
+ − 1651 corresponding C type. Macros of the form @code{XINT()}, @code{XCHAR()},
+ − 1652 @code{XSTRING()}, @code{XSYMBOL()}, do any required bit shifting and/or
+ − 1653 masking and cast it to the appropriate type. @code{XINT()} needs to be
+ − 1654 a bit tricky so that negative numbers are properly sign-extended. Since
+ − 1655 integers are stored left-shifted, if the right-shift operator does an
+ − 1656 arithmetic shift (i.e. it leaves the most-significant bit as-is rather
+ − 1657 than shifting in a zero, so that it mimics a divide-by-two even for
+ − 1658 negative numbers) the shift to remove the tag bit is enough. This is
+ − 1659 the case on all the systems we support.
+ − 1660
+ − 1661 Note that when @code{ERROR_CHECK_TYPECHECK} is defined, the converter
440
+ − 1662 macros become more complicated---they check the tag bits and/or the
428
+ − 1663 type field in the first four bytes of a record type to ensure that the
+ − 1664 object is really of the correct type. This is great for catching places
440
+ − 1665 where an incorrect type is being dereferenced---this typically results
428
+ − 1666 in a pointer being dereferenced as the wrong type of structure, with
+ − 1667 unpredictable (and sometimes not easily traceable) results.
+ − 1668
+ − 1669 There are similar @code{XSET@var{TYPE}()} macros that construct a Lisp
+ − 1670 object. These macros are of the form @code{XSET@var{TYPE}
442
+ − 1671 (@var{lvalue}, @var{result})}, i.e. they have to be a statement rather
+ − 1672 than just used in an expression. The reason for this is that standard C
+ − 1673 doesn't let you ``construct'' a structure (but GCC does). Granted, this
+ − 1674 sometimes isn't too convenient; for the case of integers, at least, you
+ − 1675 can use the function @code{make_int()}, which constructs and
+ − 1676 @emph{returns} an integer Lisp object. Note that the
+ − 1677 @code{XSET@var{TYPE}()} macros are also affected by
+ − 1678 @code{ERROR_CHECK_TYPECHECK} and make sure that the structure is of the
+ − 1679 right type in the case of record types, where the type is contained in
+ − 1680 the structure.
428
+ − 1681
+ − 1682 The C programmer is responsible for @strong{guaranteeing} that a
442
+ − 1683 Lisp_Object is the correct type before using the @code{X@var{TYPE}}
428
+ − 1684 macros. This is especially important in the case of lists. Use
+ − 1685 @code{XCAR} and @code{XCDR} if a Lisp_Object is certainly a cons cell,
+ − 1686 else use @code{Fcar()} and @code{Fcdr()}. Trust other C code, but not
+ − 1687 Lisp code. On the other hand, if XEmacs has an internal logic error,
442
+ − 1688 it's better to crash immediately, so sprinkle @code{assert()}s and
+ − 1689 ``unreachable'' @code{abort()}s liberally about the source code. Where
+ − 1690 performance is an issue, use @code{type_checking_assert},
+ − 1691 @code{bufpos_checking_assert}, and @code{gc_checking_assert}, which do
+ − 1692 nothing unless the corresponding configure error checking flag was
+ − 1693 specified.
428
+ − 1694
+ − 1695 @node Rules When Writing New C Code, A Summary of the Various XEmacs Modules, How Lisp Objects Are Represented in C, Top
+ − 1696 @chapter Rules When Writing New C Code
+ − 1697
+ − 1698 The XEmacs C Code is extremely complex and intricate, and there are many
+ − 1699 rules that are more or less consistently followed throughout the code.
+ − 1700 Many of these rules are not obvious, so they are explained here. It is
+ − 1701 of the utmost importance that you follow them. If you don't, you may
+ − 1702 get something that appears to work, but which will crash in odd
+ − 1703 situations, often in code far away from where the actual breakage is.
+ − 1704
+ − 1705 @menu
+ − 1706 * General Coding Rules::
+ − 1707 * Writing Lisp Primitives::
+ − 1708 * Adding Global Lisp Variables::
+ − 1709 * Coding for Mule::
+ − 1710 * Techniques for XEmacs Developers::
+ − 1711 @end menu
+ − 1712
442
+ − 1713 @node General Coding Rules, Writing Lisp Primitives, Rules When Writing New C Code, Rules When Writing New C Code
428
+ − 1714 @section General Coding Rules
+ − 1715
+ − 1716 The C code is actually written in a dialect of C called @dfn{Clean C},
+ − 1717 meaning that it can be compiled, mostly warning-free, with either a C or
+ − 1718 C++ compiler. Coding in Clean C has several advantages over plain C.
+ − 1719 C++ compilers are more nit-picking, and a number of coding errors have
+ − 1720 been found by compiling with C++. The ability to use both C and C++
+ − 1721 tools means that a greater variety of development tools are available to
+ − 1722 the developer.
+ − 1723
+ − 1724 Almost every module contains a @code{syms_of_*()} function and a
+ − 1725 @code{vars_of_*()} function. The former declares any Lisp primitives
+ − 1726 you have defined and defines any symbols you will be using. The latter
+ − 1727 declares any global Lisp variables you have added and initializes global
+ − 1728 C variables in the module. For each such function, declare it in
+ − 1729 @file{symsinit.h} and make sure it's called in the appropriate place in
+ − 1730 @file{emacs.c}. @strong{Important}: There are stringent requirements on
+ − 1731 exactly what can go into these functions. See the comment in
+ − 1732 @file{emacs.c}. The reason for this is to avoid obscure unwanted
+ − 1733 interactions during initialization. If you don't follow these rules,
+ − 1734 you'll be sorry! If you want to do anything that isn't allowed, create
+ − 1735 a @code{complex_vars_of_*()} function for it. Doing this is tricky,
+ − 1736 though: You have to make sure your function is called at the right time
+ − 1737 so that all the initialization dependencies work out.
+ − 1738
+ − 1739 Every module includes @file{<config.h>} (angle brackets so that
+ − 1740 @samp{--srcdir} works correctly; @file{config.h} may or may not be in
+ − 1741 the same directory as the C sources) and @file{lisp.h}. @file{config.h}
+ − 1742 must always be included before any other header files (including
+ − 1743 system header files) to ensure that certain tricks played by various
+ − 1744 @file{s/} and @file{m/} files work out correctly.
+ − 1745
440
+ − 1746 When including header files, always use angle brackets, not double
442
+ − 1747 quotes, except when the file to be included is always in the same
+ − 1748 directory as the including file. If either file is a generated file,
+ − 1749 then that is not likely to be the case. In order to understand why we
+ − 1750 have this rule, imagine what happens when you do a build in the source
+ − 1751 directory using @samp{./configure} and another build in another
+ − 1752 directory using @samp{../work/configure}. There will be two different
+ − 1753 @file{config.h} files. Which one will be used if you @samp{#include
+ − 1754 "config.h"}?
440
+ − 1755
428
+ − 1756 @strong{All global and static variables that are to be modifiable must
+ − 1757 be declared uninitialized.} This means that you may not use the
+ − 1758 ``declare with initializer'' form for these variables, such as @code{int
+ − 1759 some_variable = 0;}. The reason for this has to do with some kludges
+ − 1760 done during the dumping process: If possible, the initialized data
+ − 1761 segment is re-mapped so that it becomes part of the (unmodifiable) code
+ − 1762 segment in the dumped executable. This allows this memory to be shared
+ − 1763 among multiple running XEmacs processes. XEmacs is careful to place as
442
+ − 1764 much constant data as possible into initialized variables during the
+ − 1765 @file{temacs} phase.
428
+ − 1766
+ − 1767 @cindex copy-on-write
+ − 1768 @strong{Please note:} This kludge only works on a few systems nowadays,
+ − 1769 and is rapidly becoming irrelevant because most modern operating systems
+ − 1770 provide @dfn{copy-on-write} semantics. All data is initially shared
+ − 1771 between processes, and a private copy is automatically made (on a
+ − 1772 page-by-page basis) when a process first attempts to write to a page of
+ − 1773 memory.
+ − 1774
+ − 1775 Formerly, there was a requirement that static variables not be declared
+ − 1776 inside of functions. This had to do with another hack along the same
+ − 1777 vein as what was just described: old USG systems put statically-declared
+ − 1778 variables in the initialized data space, so those header files had a
+ − 1779 @code{#define static} declaration. (That way, the data-segment remapping
+ − 1780 described above could still work.) This fails badly on static variables
+ − 1781 inside of functions, which suddenly become automatic variables;
+ − 1782 therefore, you weren't supposed to have any of them. This awful kludge
+ − 1783 has been removed in XEmacs because
+ − 1784
+ − 1785 @enumerate
+ − 1786 @item
+ − 1787 almost all of the systems that used this kludge ended up having
+ − 1788 to disable the data-segment remapping anyway;
+ − 1789 @item
+ − 1790 the only systems that didn't were extremely outdated ones;
+ − 1791 @item
+ − 1792 this hack completely messed up inline functions.
+ − 1793 @end enumerate
+ − 1794
+ − 1795 The C source code makes heavy use of C preprocessor macros. One popular
+ − 1796 macro style is:
+ − 1797
+ − 1798 @example
442
+ − 1799 #define FOO(var, value) do @{ \
440
+ − 1800 Lisp_Object FOO_value = (value); \
+ − 1801 ... /* compute using FOO_value */ \
+ − 1802 (var) = bar; \
428
+ − 1803 @} while (0)
+ − 1804 @end example
+ − 1805
+ − 1806 The @code{do @{...@} while (0)} is a standard trick to allow FOO to have
+ − 1807 statement semantics, so that it can safely be used within an @code{if}
+ − 1808 statement in C, for example. Multiple evaluation is prevented by
+ − 1809 copying a supplied argument into a local variable, so that
+ − 1810 @code{FOO(var,fun(1))} only calls @code{fun} once.
+ − 1811
+ − 1812 Lisp lists are popular data structures in the C code as well as in
+ − 1813 Elisp. There are two sets of macros that iterate over lists.
+ − 1814 @code{EXTERNAL_LIST_LOOP_@var{n}} should be used when the list has been
+ − 1815 supplied by the user, and cannot be trusted to be acyclic and
444
+ − 1816 @code{nil}-terminated. A @code{malformed-list} or @code{circular-list} error
428
+ − 1817 will be generated if the list being iterated over is not entirely
+ − 1818 kosher. @code{LIST_LOOP_@var{n}}, on the other hand, is faster and less
+ − 1819 safe, and can be used only on trusted lists.
+ − 1820
+ − 1821 Related macros are @code{GET_EXTERNAL_LIST_LENGTH} and
+ − 1822 @code{GET_LIST_LENGTH}, which calculate the length of a list, and in the
+ − 1823 case of @code{GET_EXTERNAL_LIST_LENGTH}, validating the properness of
+ − 1824 the list. The macros @code{EXTERNAL_LIST_LOOP_DELETE_IF} and
+ − 1825 @code{LIST_LOOP_DELETE_IF} delete elements from a lisp list satisfying some
+ − 1826 predicate.
+ − 1827
442
+ − 1828 @node Writing Lisp Primitives, Adding Global Lisp Variables, General Coding Rules, Rules When Writing New C Code
428
+ − 1829 @section Writing Lisp Primitives
+ − 1830
+ − 1831 Lisp primitives are Lisp functions implemented in C. The details of
+ − 1832 interfacing the C function so that Lisp can call it are handled by a few
+ − 1833 C macros. The only way to really understand how to write new C code is
+ − 1834 to read the source, but we can explain some things here.
+ − 1835
+ − 1836 An example of a special form is the definition of @code{prog1}, from
+ − 1837 @file{eval.c}. (An ordinary function would have the same general
+ − 1838 appearance.)
+ − 1839
+ − 1840 @cindex garbage collection protection
+ − 1841 @smallexample
+ − 1842 @group
+ − 1843 DEFUN ("prog1", Fprog1, 1, UNEVALLED, 0, /*
+ − 1844 Similar to `progn', but the value of the first form is returned.
+ − 1845 \(prog1 FIRST BODY...): All the arguments are evaluated sequentially.
+ − 1846 The value of FIRST is saved during evaluation of the remaining args,
+ − 1847 whose values are discarded.
+ − 1848 */
+ − 1849 (args))
+ − 1850 @{
+ − 1851 /* This function can GC */
+ − 1852 REGISTER Lisp_Object val, form, tail;
+ − 1853 struct gcpro gcpro1;
+ − 1854
+ − 1855 val = Feval (XCAR (args));
+ − 1856
+ − 1857 GCPRO1 (val);
+ − 1858
+ − 1859 LIST_LOOP_3 (form, XCDR (args), tail)
+ − 1860 Feval (form);
+ − 1861
+ − 1862 UNGCPRO;
+ − 1863 return val;
+ − 1864 @}
+ − 1865 @end group
+ − 1866 @end smallexample
+ − 1867
+ − 1868 Let's start with a precise explanation of the arguments to the
+ − 1869 @code{DEFUN} macro. Here is a template for them:
+ − 1870
+ − 1871 @example
+ − 1872 @group
+ − 1873 DEFUN (@var{lname}, @var{fname}, @var{min_args}, @var{max_args}, @var{interactive}, /*
+ − 1874 @var{docstring}
+ − 1875 */
+ − 1876 (@var{arglist}))
+ − 1877 @end group
+ − 1878 @end example
+ − 1879
+ − 1880 @table @var
+ − 1881 @item lname
+ − 1882 This string is the name of the Lisp symbol to define as the function
+ − 1883 name; in the example above, it is @code{"prog1"}.
+ − 1884
+ − 1885 @item fname
+ − 1886 This is the C function name for this function. This is the name that is
+ − 1887 used in C code for calling the function. The name is, by convention,
+ − 1888 @samp{F} prepended to the Lisp name, with all dashes (@samp{-}) in the
+ − 1889 Lisp name changed to underscores. Thus, to call this function from C
+ − 1890 code, call @code{Fprog1}. Remember that the arguments are of type
+ − 1891 @code{Lisp_Object}; various macros and functions for creating values of
+ − 1892 type @code{Lisp_Object} are declared in the file @file{lisp.h}.
+ − 1893
+ − 1894 Primitives whose names are special characters (e.g. @code{+} or
+ − 1895 @code{<}) are named by spelling out, in some fashion, the special
+ − 1896 character: e.g. @code{Fplus()} or @code{Flss()}. Primitives whose names
+ − 1897 begin with normal alphanumeric characters but also contain special
+ − 1898 characters are spelled out in some creative way, e.g. @code{let*}
+ − 1899 becomes @code{FletX()}.
+ − 1900
+ − 1901 Each function also has an associated structure that holds the data for
+ − 1902 the subr object that represents the function in Lisp. This structure
+ − 1903 conveys the Lisp symbol name to the initialization routine that will
+ − 1904 create the symbol and store the subr object as its definition. The C
+ − 1905 variable name of this structure is always @samp{S} prepended to the
+ − 1906 @var{fname}. You hardly ever need to be aware of the existence of this
+ − 1907 structure, since @code{DEFUN} plus @code{DEFSUBR} takes care of all the
+ − 1908 details.
+ − 1909
+ − 1910 @item min_args
+ − 1911 This is the minimum number of arguments that the function requires. The
+ − 1912 function @code{prog1} allows a minimum of one argument.
+ − 1913
+ − 1914 @item max_args
+ − 1915 This is the maximum number of arguments that the function accepts, if
+ − 1916 there is a fixed maximum. Alternatively, it can be @code{UNEVALLED},
+ − 1917 indicating a special form that receives unevaluated arguments, or
+ − 1918 @code{MANY}, indicating an unlimited number of evaluated arguments (the
+ − 1919 C equivalent of @code{&rest}). Both @code{UNEVALLED} and @code{MANY}
+ − 1920 are macros. If @var{max_args} is a number, it may not be less than
+ − 1921 @var{min_args} and it may not be greater than 8. (If you need to add a
+ − 1922 function with more than 8 arguments, use the @code{MANY} form. Resist
+ − 1923 the urge to edit the definition of @code{DEFUN} in @file{lisp.h}. If
+ − 1924 you do it anyways, make sure to also add another clause to the switch
+ − 1925 statement in @code{primitive_funcall().})
+ − 1926
+ − 1927 @item interactive
+ − 1928 This is an interactive specification, a string such as might be used as
+ − 1929 the argument of @code{interactive} in a Lisp function. In the case of
+ − 1930 @code{prog1}, it is 0 (a null pointer), indicating that @code{prog1}
+ − 1931 cannot be called interactively. A value of @code{""} indicates a
+ − 1932 function that should receive no arguments when called interactively.
+ − 1933
+ − 1934 @item docstring
+ − 1935 This is the documentation string. It is written just like a
+ − 1936 documentation string for a function defined in Lisp; in particular, the
+ − 1937 first line should be a single sentence. Note how the documentation
+ − 1938 string is enclosed in a comment, none of the documentation is placed on
+ − 1939 the same lines as the comment-start and comment-end characters, and the
+ − 1940 comment-start characters are on the same line as the interactive
+ − 1941 specification. @file{make-docfile}, which scans the C files for
+ − 1942 documentation strings, is very particular about what it looks for, and
+ − 1943 will not properly extract the doc string if it's not in this exact format.
+ − 1944
+ − 1945 In order to make both @file{etags} and @file{make-docfile} happy, make
+ − 1946 sure that the @code{DEFUN} line contains the @var{lname} and
+ − 1947 @var{fname}, and that the comment-start characters for the doc string
+ − 1948 are on the same line as the interactive specification, and put a newline
+ − 1949 directly after them (and before the comment-end characters).
+ − 1950
+ − 1951 @item arglist
+ − 1952 This is the comma-separated list of arguments to the C function. For a
+ − 1953 function with a fixed maximum number of arguments, provide a C argument
+ − 1954 for each Lisp argument. In this case, unlike regular C functions, the
+ − 1955 types of the arguments are not declared; they are simply always of type
+ − 1956 @code{Lisp_Object}.
+ − 1957
+ − 1958 The names of the C arguments will be used as the names of the arguments
+ − 1959 to the Lisp primitive as displayed in its documentation, modulo the same
+ − 1960 concerns described above for @code{F...} names (in particular,
+ − 1961 underscores in the C arguments become dashes in the Lisp arguments).
+ − 1962
+ − 1963 There is one additional kludge: A trailing `_' on the C argument is
+ − 1964 discarded when forming the Lisp argument. This allows C language
+ − 1965 reserved words (like @code{default}) or global symbols (like
+ − 1966 @code{dirname}) to be used as argument names without compiler warnings
+ − 1967 or errors.
+ − 1968
+ − 1969 A Lisp function with @w{@var{max_args} = @code{UNEVALLED}} is a
+ − 1970 @w{@dfn{special form}}; its arguments are not evaluated. Instead it
+ − 1971 receives one argument of type @code{Lisp_Object}, a (Lisp) list of the
+ − 1972 unevaluated arguments, conventionally named @code{(args)}.
+ − 1973
+ − 1974 When a Lisp function has no upper limit on the number of arguments,
+ − 1975 specify @w{@var{max_args} = @code{MANY}}. In this case its implementation in
+ − 1976 C actually receives exactly two arguments: the number of Lisp arguments
+ − 1977 (an @code{int}) and the address of a block containing their values (a
+ − 1978 @w{@code{Lisp_Object *}}). In this case only are the C types specified
+ − 1979 in the @var{arglist}: @w{@code{(int nargs, Lisp_Object *args)}}.
+ − 1980
+ − 1981 @end table
+ − 1982
+ − 1983 Within the function @code{Fprog1} itself, note the use of the macros
+ − 1984 @code{GCPRO1} and @code{UNGCPRO}. @code{GCPRO1} is used to ``protect''
+ − 1985 a variable from garbage collection---to inform the garbage collector
+ − 1986 that it must look in that variable and regard the object pointed at by
+ − 1987 its contents as an accessible object. This is necessary whenever you
+ − 1988 call @code{Feval} or anything that can directly or indirectly call
+ − 1989 @code{Feval} (this includes the @code{QUIT} macro!). At such a time,
+ − 1990 any Lisp object that you intend to refer to again must be protected
+ − 1991 somehow. @code{UNGCPRO} cancels the protection of the variables that
+ − 1992 are protected in the current function. It is necessary to do this
+ − 1993 explicitly.
+ − 1994
+ − 1995 The macro @code{GCPRO1} protects just one local variable. If you want
+ − 1996 to protect two, use @code{GCPRO2} instead; repeating @code{GCPRO1} will
+ − 1997 not work. Macros @code{GCPRO3} and @code{GCPRO4} also exist.
+ − 1998
+ − 1999 These macros implicitly use local variables such as @code{gcpro1}; you
+ − 2000 must declare these explicitly, with type @code{struct gcpro}. Thus, if
+ − 2001 you use @code{GCPRO2}, you must declare @code{gcpro1} and @code{gcpro2}.
+ − 2002
+ − 2003 @cindex caller-protects (@code{GCPRO} rule)
+ − 2004 Note also that the general rule is @dfn{caller-protects}; i.e. you are
+ − 2005 only responsible for protecting those Lisp objects that you create. Any
+ − 2006 objects passed to you as arguments should have been protected by whoever
+ − 2007 created them, so you don't in general have to protect them.
+ − 2008
+ − 2009 In particular, the arguments to any Lisp primitive are always
+ − 2010 automatically @code{GCPRO}ed, when called ``normally'' from Lisp code or
+ − 2011 bytecode. So only a few Lisp primitives that are called frequently from
+ − 2012 C code, such as @code{Fprogn} protect their arguments as a service to
+ − 2013 their caller. You don't need to protect your arguments when writing a
+ − 2014 new @code{DEFUN}.
+ − 2015
+ − 2016 @code{GCPRO}ing is perhaps the trickiest and most error-prone part of
+ − 2017 XEmacs coding. It is @strong{extremely} important that you get this
+ − 2018 right and use a great deal of discipline when writing this code.
+ − 2019 @xref{GCPROing, ,@code{GCPRO}ing}, for full details on how to do this.
+ − 2020
+ − 2021 What @code{DEFUN} actually does is declare a global structure of type
+ − 2022 @code{Lisp_Subr} whose name begins with capital @samp{SF} and which
+ − 2023 contains information about the primitive (e.g. a pointer to the
+ − 2024 function, its minimum and maximum allowed arguments, a string describing
+ − 2025 its Lisp name); @code{DEFUN} then begins a normal C function declaration
+ − 2026 using the @code{F...} name. The Lisp subr object that is the function
+ − 2027 definition of a primitive (i.e. the object in the function slot of the
+ − 2028 symbol that names the primitive) actually points to this @samp{SF}
+ − 2029 structure; when @code{Feval} encounters a subr, it looks in the
+ − 2030 structure to find out how to call the C function.
+ − 2031
+ − 2032 Defining the C function is not enough to make a Lisp primitive
+ − 2033 available; you must also create the Lisp symbol for the primitive (the
+ − 2034 symbol is @dfn{interned}; @pxref{Obarrays}) and store a suitable subr
+ − 2035 object in its function cell. (If you don't do this, the primitive won't
+ − 2036 be seen by Lisp code.) The code looks like this:
+ − 2037
+ − 2038 @example
+ − 2039 DEFSUBR (@var{fname});
+ − 2040 @end example
+ − 2041
+ − 2042 @noindent
+ − 2043 Here @var{fname} is the same name you used as the second argument to
+ − 2044 @code{DEFUN}.
+ − 2045
+ − 2046 This call to @code{DEFSUBR} should go in the @code{syms_of_*()} function
+ − 2047 at the end of the module. If no such function exists, create it and
+ − 2048 make sure to also declare it in @file{symsinit.h} and call it from the
+ − 2049 appropriate spot in @code{main()}. @xref{General Coding Rules}.
+ − 2050
+ − 2051 Note that C code cannot call functions by name unless they are defined
+ − 2052 in C. The way to call a function written in Lisp from C is to use
+ − 2053 @code{Ffuncall}, which embodies the Lisp function @code{funcall}. Since
+ − 2054 the Lisp function @code{funcall} accepts an unlimited number of
+ − 2055 arguments, in C it takes two: the number of Lisp-level arguments, and a
+ − 2056 one-dimensional array containing their values. The first Lisp-level
+ − 2057 argument is the Lisp function to call, and the rest are the arguments to
+ − 2058 pass to it. Since @code{Ffuncall} can call the evaluator, you must
+ − 2059 protect pointers from garbage collection around the call to
+ − 2060 @code{Ffuncall}. (However, @code{Ffuncall} explicitly protects all of
+ − 2061 its parameters, so you don't have to protect any pointers passed as
+ − 2062 parameters to it.)
+ − 2063
+ − 2064 The C functions @code{call0}, @code{call1}, @code{call2}, and so on,
+ − 2065 provide handy ways to call a Lisp function conveniently with a fixed
+ − 2066 number of arguments. They work by calling @code{Ffuncall}.
+ − 2067
+ − 2068 @file{eval.c} is a very good file to look through for examples;
+ − 2069 @file{lisp.h} contains the definitions for important macros and
+ − 2070 functions.
+ − 2071
442
+ − 2072 @node Adding Global Lisp Variables, Coding for Mule, Writing Lisp Primitives, Rules When Writing New C Code
428
+ − 2073 @section Adding Global Lisp Variables
+ − 2074
+ − 2075 Global variables whose names begin with @samp{Q} are constants whose
+ − 2076 value is a symbol of a particular name. The name of the variable should
+ − 2077 be derived from the name of the symbol using the same rules as for Lisp
+ − 2078 primitives. These variables are initialized using a call to
+ − 2079 @code{defsymbol()} in the @code{syms_of_*()} function. (This call
+ − 2080 interns a symbol, sets the C variable to the resulting Lisp object, and
+ − 2081 calls @code{staticpro()} on the C variable to tell the
+ − 2082 garbage-collection mechanism about this variable. What
+ − 2083 @code{staticpro()} does is add a pointer to the variable to a large
+ − 2084 global array; when garbage-collection happens, all pointers listed in
+ − 2085 the array are used as starting points for marking Lisp objects. This is
+ − 2086 important because it's quite possible that the only current reference to
+ − 2087 the object is the C variable. In the case of symbols, the
+ − 2088 @code{staticpro()} doesn't matter all that much because the symbol is
+ − 2089 contained in @code{obarray}, which is itself @code{staticpro()}ed.
+ − 2090 However, it's possible that a naughty user could do something like
+ − 2091 uninterning the symbol out of @code{obarray} or even setting
+ − 2092 @code{obarray} to a different value [although this is likely to make
+ − 2093 XEmacs crash!].)
+ − 2094
+ − 2095 @strong{Please note:} It is potentially deadly if you declare a
+ − 2096 @samp{Q...} variable in two different modules. The two calls to
+ − 2097 @code{defsymbol()} are no problem, but some linkers will complain about
+ − 2098 multiply-defined symbols. The most insidious aspect of this is that
+ − 2099 often the link will succeed anyway, but then the resulting executable
+ − 2100 will sometimes crash in obscure ways during certain operations! To
+ − 2101 avoid this problem, declare any symbols with common names (such as
+ − 2102 @code{text}) that are not obviously associated with this particular
+ − 2103 module in the module @file{general.c}.
+ − 2104
+ − 2105 Global variables whose names begin with @samp{V} are variables that
+ − 2106 contain Lisp objects. The convention here is that all global variables
+ − 2107 of type @code{Lisp_Object} begin with @samp{V}, and all others don't
+ − 2108 (including integer and boolean variables that have Lisp
+ − 2109 equivalents). Most of the time, these variables have equivalents in
+ − 2110 Lisp, but some don't. Those that do are declared this way by a call to
+ − 2111 @code{DEFVAR_LISP()} in the @code{vars_of_*()} initializer for the
+ − 2112 module. What this does is create a special @dfn{symbol-value-forward}
+ − 2113 Lisp object that contains a pointer to the C variable, intern a symbol
+ − 2114 whose name is as specified in the call to @code{DEFVAR_LISP()}, and set
+ − 2115 its value to the symbol-value-forward Lisp object; it also calls
+ − 2116 @code{staticpro()} on the C variable to tell the garbage-collection
+ − 2117 mechanism about the variable. When @code{eval} (or actually
+ − 2118 @code{symbol-value}) encounters this special object in the process of
+ − 2119 retrieving a variable's value, it follows the indirection to the C
+ − 2120 variable and gets its value. @code{setq} does similar things so that
+ − 2121 the C variable gets changed.
+ − 2122
+ − 2123 Whether or not you @code{DEFVAR_LISP()} a variable, you need to
+ − 2124 initialize it in the @code{vars_of_*()} function; otherwise it will end
+ − 2125 up as all zeroes, which is the integer 0 (@emph{not} @code{nil}), and
+ − 2126 this is probably not what you want. Also, if the variable is not
+ − 2127 @code{DEFVAR_LISP()}ed, @strong{you must call} @code{staticpro()} on the
+ − 2128 C variable in the @code{vars_of_*()} function. Otherwise, the
+ − 2129 garbage-collection mechanism won't know that the object in this variable
+ − 2130 is in use, and will happily collect it and reuse its storage for another
+ − 2131 Lisp object, and you will be the one who's unhappy when you can't figure
+ − 2132 out how your variable got overwritten.
+ − 2133
442
+ − 2134 @node Coding for Mule, Techniques for XEmacs Developers, Adding Global Lisp Variables, Rules When Writing New C Code
428
+ − 2135 @section Coding for Mule
+ − 2136 @cindex Coding for Mule
+ − 2137
+ − 2138 Although Mule support is not compiled by default in XEmacs, many people
+ − 2139 are using it, and we consider it crucial that new code works correctly
+ − 2140 with multibyte characters. This is not hard; it is only a matter of
+ − 2141 following several simple user-interface guidelines. Even if you never
+ − 2142 compile with Mule, with a little practice you will find it quite easy
+ − 2143 to code Mule-correctly.
+ − 2144
+ − 2145 Note that these guidelines are not necessarily tied to the current Mule
+ − 2146 implementation; they are also a good idea to follow on the grounds of
+ − 2147 code generalization for future I18N work.
+ − 2148
+ − 2149 @menu
+ − 2150 * Character-Related Data Types::
+ − 2151 * Working With Character and Byte Positions::
+ − 2152 * Conversion to and from External Data::
+ − 2153 * General Guidelines for Writing Mule-Aware Code::
+ − 2154 * An Example of Mule-Aware Code::
+ − 2155 @end menu
+ − 2156
442
+ − 2157 @node Character-Related Data Types, Working With Character and Byte Positions, Coding for Mule, Coding for Mule
428
+ − 2158 @subsection Character-Related Data Types
+ − 2159
+ − 2160 First, let's review the basic character-related datatypes used by
+ − 2161 XEmacs. Note that the separate @code{typedef}s are not mandatory in the
+ − 2162 current implementation (all of them boil down to @code{unsigned char} or
+ − 2163 @code{int}), but they improve clarity of code a great deal, because one
+ − 2164 glance at the declaration can tell the intended use of the variable.
+ − 2165
+ − 2166 @table @code
+ − 2167 @item Emchar
+ − 2168 @cindex Emchar
+ − 2169 An @code{Emchar} holds a single Emacs character.
+ − 2170
+ − 2171 Obviously, the equality between characters and bytes is lost in the Mule
+ − 2172 world. Characters can be represented by one or more bytes in the
+ − 2173 buffer, and @code{Emchar} is the C type large enough to hold any
+ − 2174 character.
+ − 2175
+ − 2176 Without Mule support, an @code{Emchar} is equivalent to an
+ − 2177 @code{unsigned char}.
+ − 2178
+ − 2179 @item Bufbyte
+ − 2180 @cindex Bufbyte
+ − 2181 The data representing the text in a buffer or string is logically a set
+ − 2182 of @code{Bufbyte}s.
+ − 2183
442
+ − 2184 XEmacs does not work with the same character formats all the time; when
+ − 2185 reading characters from the outside, it decodes them to an internal
+ − 2186 format, and likewise encodes them when writing. @code{Bufbyte} (in fact
428
+ − 2187 @code{unsigned char}) is the basic unit of XEmacs internal buffers and
442
+ − 2188 strings format. A @code{Bufbyte *} is the type that points at text
+ − 2189 encoded in the variable-width internal encoding.
428
+ − 2190
+ − 2191 One character can correspond to one or more @code{Bufbyte}s. In the
442
+ − 2192 current Mule implementation, an ASCII character is represented by the
+ − 2193 same @code{Bufbyte}, and other characters are represented by a sequence
+ − 2194 of two or more @code{Bufbyte}s.
+ − 2195
+ − 2196 Without Mule support, there are exactly 256 characters, implicitly
+ − 2197 Latin-1, and each character is represented using one @code{Bufbyte}, and
+ − 2198 there is a one-to-one correspondence between @code{Bufbyte}s and
+ − 2199 @code{Emchar}s.
428
+ − 2200
+ − 2201 @item Bufpos
+ − 2202 @itemx Charcount
+ − 2203 @cindex Bufpos
+ − 2204 @cindex Charcount
+ − 2205 A @code{Bufpos} represents a character position in a buffer or string.
+ − 2206 A @code{Charcount} represents a number (count) of characters.
+ − 2207 Logically, subtracting two @code{Bufpos} values yields a
+ − 2208 @code{Charcount} value. Although all of these are @code{typedef}ed to
442
+ − 2209 @code{EMACS_INT}, we use them in preference to @code{EMACS_INT} to make
+ − 2210 it clear what sort of position is being used.
428
+ − 2211
+ − 2212 @code{Bufpos} and @code{Charcount} values are the only ones that are
+ − 2213 ever visible to Lisp.
+ − 2214
+ − 2215 @item Bytind
+ − 2216 @itemx Bytecount
+ − 2217 @cindex Bytind
+ − 2218 @cindex Bytecount
+ − 2219 A @code{Bytind} represents a byte position in a buffer or string. A
442
+ − 2220 @code{Bytecount} represents the distance between two positions, in bytes.
428
+ − 2221 The relationship between @code{Bytind} and @code{Bytecount} is the same
+ − 2222 as the relationship between @code{Bufpos} and @code{Charcount}.
+ − 2223
+ − 2224 @item Extbyte
+ − 2225 @itemx Extcount
+ − 2226 @cindex Extbyte
+ − 2227 @cindex Extcount
+ − 2228 When dealing with the outside world, XEmacs works with @code{Extbyte}s,
+ − 2229 which are equivalent to @code{unsigned char}. Obviously, an
+ − 2230 @code{Extcount} is the distance between two @code{Extbyte}s. Extbytes
+ − 2231 and Extcounts are not all that frequent in XEmacs code.
+ − 2232 @end table
+ − 2233
442
+ − 2234 @node Working With Character and Byte Positions, Conversion to and from External Data, Character-Related Data Types, Coding for Mule
428
+ − 2235 @subsection Working With Character and Byte Positions
+ − 2236
+ − 2237 Now that we have defined the basic character-related types, we can look
+ − 2238 at the macros and functions designed for work with them and for
+ − 2239 conversion between them. Most of these macros are defined in
+ − 2240 @file{buffer.h}, and we don't discuss all of them here, but only the
+ − 2241 most important ones. Examining the existing code is the best way to
+ − 2242 learn about them.
+ − 2243
+ − 2244 @table @code
+ − 2245 @item MAX_EMCHAR_LEN
+ − 2246 @cindex MAX_EMCHAR_LEN
442
+ − 2247 This preprocessor constant is the maximum number of buffer bytes to
+ − 2248 represent an Emacs character in the variable width internal encoding.
+ − 2249 It is useful when allocating temporary strings to keep a known number of
+ − 2250 characters. For instance:
428
+ − 2251
+ − 2252 @example
+ − 2253 @group
+ − 2254 @{
+ − 2255 Charcount cclen;
+ − 2256 ...
+ − 2257 @{
+ − 2258 /* Allocate place for @var{cclen} characters. */
+ − 2259 Bufbyte *buf = (Bufbyte *)alloca (cclen * MAX_EMCHAR_LEN);
+ − 2260 ...
+ − 2261 @end group
+ − 2262 @end example
+ − 2263
+ − 2264 If you followed the previous section, you can guess that, logically,
+ − 2265 multiplying a @code{Charcount} value with @code{MAX_EMCHAR_LEN} produces
+ − 2266 a @code{Bytecount} value.
+ − 2267
+ − 2268 In the current Mule implementation, @code{MAX_EMCHAR_LEN} equals 4.
+ − 2269 Without Mule, it is 1.
+ − 2270
+ − 2271 @item charptr_emchar
+ − 2272 @itemx set_charptr_emchar
+ − 2273 @cindex charptr_emchar
+ − 2274 @cindex set_charptr_emchar
+ − 2275 The @code{charptr_emchar} macro takes a @code{Bufbyte} pointer and
+ − 2276 returns the @code{Emchar} stored at that position. If it were a
+ − 2277 function, its prototype would be:
+ − 2278
+ − 2279 @example
+ − 2280 Emchar charptr_emchar (Bufbyte *p);
+ − 2281 @end example
+ − 2282
+ − 2283 @code{set_charptr_emchar} stores an @code{Emchar} to the specified byte
+ − 2284 position. It returns the number of bytes stored:
+ − 2285
+ − 2286 @example
+ − 2287 Bytecount set_charptr_emchar (Bufbyte *p, Emchar c);
+ − 2288 @end example
+ − 2289
+ − 2290 It is important to note that @code{set_charptr_emchar} is safe only for
+ − 2291 appending a character at the end of a buffer, not for overwriting a
+ − 2292 character in the middle. This is because the width of characters
+ − 2293 varies, and @code{set_charptr_emchar} cannot resize the string if it
+ − 2294 writes, say, a two-byte character where a single-byte character used to
+ − 2295 reside.
+ − 2296
+ − 2297 A typical use of @code{set_charptr_emchar} can be demonstrated by this
+ − 2298 example, which copies characters from buffer @var{buf} to a temporary
+ − 2299 string of Bufbytes.
+ − 2300
+ − 2301 @example
+ − 2302 @group
+ − 2303 @{
+ − 2304 Bufpos pos;
+ − 2305 for (pos = beg; pos < end; pos++)
+ − 2306 @{
+ − 2307 Emchar c = BUF_FETCH_CHAR (buf, pos);
+ − 2308 p += set_charptr_emchar (buf, c);
+ − 2309 @}
+ − 2310 @}
+ − 2311 @end group
+ − 2312 @end example
+ − 2313
+ − 2314 Note how @code{set_charptr_emchar} is used to store the @code{Emchar}
+ − 2315 and increment the counter, at the same time.
+ − 2316
+ − 2317 @item INC_CHARPTR
+ − 2318 @itemx DEC_CHARPTR
+ − 2319 @cindex INC_CHARPTR
+ − 2320 @cindex DEC_CHARPTR
+ − 2321 These two macros increment and decrement a @code{Bufbyte} pointer,
+ − 2322 respectively. They will adjust the pointer by the appropriate number of
+ − 2323 bytes according to the byte length of the character stored there. Both
+ − 2324 macros assume that the memory address is located at the beginning of a
+ − 2325 valid character.
+ − 2326
+ − 2327 Without Mule support, @code{INC_CHARPTR (p)} and @code{DEC_CHARPTR (p)}
+ − 2328 simply expand to @code{p++} and @code{p--}, respectively.
+ − 2329
+ − 2330 @item bytecount_to_charcount
+ − 2331 @cindex bytecount_to_charcount
+ − 2332 Given a pointer to a text string and a length in bytes, return the
+ − 2333 equivalent length in characters.
+ − 2334
+ − 2335 @example
+ − 2336 Charcount bytecount_to_charcount (Bufbyte *p, Bytecount bc);
+ − 2337 @end example
+ − 2338
+ − 2339 @item charcount_to_bytecount
+ − 2340 @cindex charcount_to_bytecount
+ − 2341 Given a pointer to a text string and a length in characters, return the
+ − 2342 equivalent length in bytes.
+ − 2343
+ − 2344 @example
+ − 2345 Bytecount charcount_to_bytecount (Bufbyte *p, Charcount cc);
+ − 2346 @end example
+ − 2347
+ − 2348 @item charptr_n_addr
+ − 2349 @cindex charptr_n_addr
+ − 2350 Return a pointer to the beginning of the character offset @var{cc} (in
+ − 2351 characters) from @var{p}.
+ − 2352
+ − 2353 @example
+ − 2354 Bufbyte *charptr_n_addr (Bufbyte *p, Charcount cc);
+ − 2355 @end example
+ − 2356 @end table
+ − 2357
442
+ − 2358 @node Conversion to and from External Data, General Guidelines for Writing Mule-Aware Code, Working With Character and Byte Positions, Coding for Mule
428
+ − 2359 @subsection Conversion to and from External Data
+ − 2360
+ − 2361 When an external function, such as a C library function, returns a
+ − 2362 @code{char} pointer, you should almost never treat it as @code{Bufbyte}.
+ − 2363 This is because these returned strings may contain 8bit characters which
+ − 2364 can be misinterpreted by XEmacs, and cause a crash. Likewise, when
+ − 2365 exporting a piece of internal text to the outside world, you should
+ − 2366 always convert it to an appropriate external encoding, lest the internal
+ − 2367 stuff (such as the infamous \201 characters) leak out.
+ − 2368
+ − 2369 The interface to conversion between the internal and external
+ − 2370 representations of text are the numerous conversion macros defined in
442
+ − 2371 @file{buffer.h}. There used to be a fixed set of external formats
+ − 2372 supported by these macros, but now any coding system can be used with
+ − 2373 these macros. The coding system alias mechanism is used to create the
+ − 2374 following logical coding systems, which replace the fixed external
+ − 2375 formats. The (dontusethis-set-symbol-value-handler) mechanism was
+ − 2376 enhanced to make this possible (more work on that is needed - like
+ − 2377 remove the @code{dontusethis-} prefix).
428
+ − 2378
+ − 2379 @table @code
442
+ − 2380 @item Qbinary
+ − 2381 This is the simplest format and is what we use in the absence of a more
+ − 2382 appropriate format. This converts according to the @code{binary} coding
+ − 2383 system:
+ − 2384
+ − 2385 @enumerate a
+ − 2386 @item
+ − 2387 On input, bytes 0--255 are converted into (implicitly Latin-1)
+ − 2388 characters 0--255. A non-Mule xemacs doesn't really know about
+ − 2389 different character sets and the fonts to display them, so the bytes can
+ − 2390 be treated as text in different 1-byte encodings by simply setting the
+ − 2391 appropriate fonts. So in a sense, non-Mule xemacs is a multi-lingual
+ − 2392 editor if, for example, different fonts are used to display text in
+ − 2393 different buffers, faces, or windows. The specifier mechanism gives the
+ − 2394 user complete control over this kind of behavior.
+ − 2395 @item
+ − 2396 On output, characters 0--255 are converted into bytes 0--255 and other
+ − 2397 characters are converted into `~'.
+ − 2398 @end enumerate
+ − 2399
+ − 2400 @item Qfile_name
+ − 2401 Format used for filenames. This is user-definable via either the
+ − 2402 @code{file-name-coding-system} or @code{pathname-coding-system} (now
+ − 2403 obsolete) variables.
+ − 2404
+ − 2405 @item Qnative
+ − 2406 Format used for the external Unix environment---@code{argv[]}, stuff
+ − 2407 from @code{getenv()}, stuff from the @file{/etc/passwd} file, etc.
+ − 2408 Currently this is the same as Qfile_name. The two should be
+ − 2409 distinguished for clarity and possible future separation.
+ − 2410
+ − 2411 @item Qctext
+ − 2412 Compound--text format. This is the standard X11 format used for data
+ − 2413 stored in properties, selections, and the like. This is an 8-bit
+ − 2414 no-lock-shift ISO2022 coding system. This is a real coding system,
+ − 2415 unlike Qfile_name, which is user-definable.
428
+ − 2416 @end table
+ − 2417
442
+ − 2418 There are two fundamental macros to convert between external and
+ − 2419 internal format.
+ − 2420
+ − 2421 @code{TO_INTERNAL_FORMAT} converts external data to internal format, and
+ − 2422 @code{TO_EXTERNAL_FORMAT} converts the other way around. The arguments
+ − 2423 each of these receives are a source type, a source, a sink type, a sink,
+ − 2424 and a coding system (or a symbol naming a coding system).
+ − 2425
+ − 2426 A typical call looks like
+ − 2427 @example
+ − 2428 TO_EXTERNAL_FORMAT (LISP_STRING, str, C_STRING_MALLOC, ptr, Qfile_name);
+ − 2429 @end example
+ − 2430
+ − 2431 which means that the contents of the lisp string @code{str} are written
+ − 2432 to a malloc'ed memory area which will be pointed to by @code{ptr}, after
+ − 2433 the function returns. The conversion will be done using the
+ − 2434 @code{file-name} coding system, which will be controlled by the user
+ − 2435 indirectly by setting or binding the variable
+ − 2436 @code{file-name-coding-system}.
+ − 2437
+ − 2438 Some sources and sinks require two C variables to specify. We use some
+ − 2439 preprocessor magic to allow different source and sink types, and even
+ − 2440 different numbers of arguments to specify different types of sources and
+ − 2441 sinks.
+ − 2442
+ − 2443 So we can have a call that looks like
+ − 2444 @example
+ − 2445 TO_INTERNAL_FORMAT (DATA, (ptr, len),
+ − 2446 MALLOC, (ptr, len),
+ − 2447 coding_system);
+ − 2448 @end example
+ − 2449
+ − 2450 The parenthesized argument pairs are required to make the preprocessor
+ − 2451 magic work.
+ − 2452
+ − 2453 Here are the different source and sink types:
+ − 2454
+ − 2455 @table @code
+ − 2456 @item @code{DATA, (ptr, len),}
+ − 2457 input data is a fixed buffer of size @var{len} at address @var{ptr}
+ − 2458 @item @code{ALLOCA, (ptr, len),}
+ − 2459 output data is placed in an alloca()ed buffer of size @var{len} pointed to by @var{ptr}
+ − 2460 @item @code{MALLOC, (ptr, len),}
+ − 2461 output data is in a malloc()ed buffer of size @var{len} pointed to by @var{ptr}
+ − 2462 @item @code{C_STRING_ALLOCA, ptr,}
+ − 2463 equivalent to @code{ALLOCA (ptr, len_ignored)} on output.
+ − 2464 @item @code{C_STRING_MALLOC, ptr,}
+ − 2465 equivalent to @code{MALLOC (ptr, len_ignored)} on output
+ − 2466 @item @code{C_STRING, ptr,}
+ − 2467 equivalent to @code{DATA, (ptr, strlen (ptr) + 1)} on input
+ − 2468 @item @code{LISP_STRING, string,}
+ − 2469 input or output is a Lisp_Object of type string
+ − 2470 @item @code{LISP_BUFFER, buffer,}
+ − 2471 output is written to @code{(point)} in lisp buffer @var{buffer}
+ − 2472 @item @code{LISP_LSTREAM, lstream,}
+ − 2473 input or output is a Lisp_Object of type lstream
+ − 2474 @item @code{LISP_OPAQUE, object,}
+ − 2475 input or output is a Lisp_Object of type opaque
+ − 2476 @end table
+ − 2477
+ − 2478 Often, the data is being converted to a '\0'-byte-terminated string,
+ − 2479 which is the format required by many external system C APIs. For these
+ − 2480 purposes, a source type of @code{C_STRING} or a sink type of
+ − 2481 @code{C_STRING_ALLOCA} or @code{C_STRING_MALLOC} is appropriate.
+ − 2482 Otherwise, we should try to keep XEmacs '\0'-byte-clean, which means
+ − 2483 using (ptr, len) pairs.
+ − 2484
+ − 2485 The sinks to be specified must be lvalues, unless they are the lisp
+ − 2486 object types @code{LISP_LSTREAM} or @code{LISP_BUFFER}.
+ − 2487
+ − 2488 For the sink types @code{ALLOCA} and @code{C_STRING_ALLOCA}, the
+ − 2489 resulting text is stored in a stack-allocated buffer, which is
+ − 2490 automatically freed on returning from the function. However, the sink
+ − 2491 types @code{MALLOC} and @code{C_STRING_MALLOC} return @code{xmalloc()}ed
+ − 2492 memory. The caller is responsible for freeing this memory using
+ − 2493 @code{xfree()}.
+ − 2494
+ − 2495 Note that it doesn't make sense for @code{LISP_STRING} to be a source
+ − 2496 for @code{TO_INTERNAL_FORMAT} or a sink for @code{TO_EXTERNAL_FORMAT}.
+ − 2497 You'll get an assertion failure if you try.
+ − 2498
+ − 2499
+ − 2500 @node General Guidelines for Writing Mule-Aware Code, An Example of Mule-Aware Code, Conversion to and from External Data, Coding for Mule
428
+ − 2501 @subsection General Guidelines for Writing Mule-Aware Code
+ − 2502
+ − 2503 This section contains some general guidance on how to write Mule-aware
+ − 2504 code, as well as some pitfalls you should avoid.
+ − 2505
+ − 2506 @table @emph
+ − 2507 @item Never use @code{char} and @code{char *}.
+ − 2508 In XEmacs, the use of @code{char} and @code{char *} is almost always a
+ − 2509 mistake. If you want to manipulate an Emacs character from ``C'', use
+ − 2510 @code{Emchar}. If you want to examine a specific octet in the internal
+ − 2511 format, use @code{Bufbyte}. If you want a Lisp-visible character, use a
+ − 2512 @code{Lisp_Object} and @code{make_char}. If you want a pointer to move
+ − 2513 through the internal text, use @code{Bufbyte *}. Also note that you
+ − 2514 almost certainly do not need @code{Emchar *}.
+ − 2515
+ − 2516 @item Be careful not to confuse @code{Charcount}, @code{Bytecount}, and @code{Bufpos}.
+ − 2517 The whole point of using different types is to avoid confusion about the
+ − 2518 use of certain variables. Lest this effect be nullified, you need to be
+ − 2519 careful about using the right types.
+ − 2520
+ − 2521 @item Always convert external data
+ − 2522 It is extremely important to always convert external data, because
+ − 2523 XEmacs can crash if unexpected 8bit sequences are copied to its internal
+ − 2524 buffers literally.
+ − 2525
+ − 2526 This means that when a system function, such as @code{readdir}, returns
442
+ − 2527 a string, you may need to convert it using one of the conversion macros
428
+ − 2528 described in the previous chapter, before passing it further to Lisp.
442
+ − 2529
+ − 2530 Actually, most of the basic system functions that accept '\0'-terminated
+ − 2531 string arguments, like @code{stat()} and @code{open()}, have been
+ − 2532 @strong{encapsulated} so that they are they @code{always} do internal to
+ − 2533 external conversion themselves. This means you must pass internally
+ − 2534 encoded data, typically the @code{XSTRING_DATA} of a Lisp_String to
+ − 2535 these functions. This is actually a design bug, since it unexpectedly
+ − 2536 changes the semantics of the system functions. A better design would be
+ − 2537 to provide separate versions of these system functions that accepted
+ − 2538 Lisp_Objects which were lisp strings in place of their current
+ − 2539 @code{char *} arguments.
+ − 2540
+ − 2541 @example
+ − 2542 int stat_lisp (Lisp_Object path, struct stat *buf); /* Implement me */
+ − 2543 @end example
428
+ − 2544
+ − 2545 Also note that many internal functions, such as @code{make_string},
+ − 2546 accept Bufbytes, which removes the need for them to convert the data
+ − 2547 they receive. This increases efficiency because that way external data
+ − 2548 needs to be decoded only once, when it is read. After that, it is
+ − 2549 passed around in internal format.
+ − 2550 @end table
+ − 2551
442
+ − 2552 @node An Example of Mule-Aware Code, , General Guidelines for Writing Mule-Aware Code, Coding for Mule
428
+ − 2553 @subsection An Example of Mule-Aware Code
+ − 2554
442
+ − 2555 As an example of Mule-aware code, we will analyze the @code{string}
+ − 2556 function, which conses up a Lisp string from the character arguments it
+ − 2557 receives. Here is the definition, pasted from @code{alloc.c}:
428
+ − 2558
+ − 2559 @example
+ − 2560 @group
+ − 2561 DEFUN ("string", Fstring, 0, MANY, 0, /*
+ − 2562 Concatenate all the argument characters and make the result a string.
+ − 2563 */
+ − 2564 (int nargs, Lisp_Object *args))
+ − 2565 @{
+ − 2566 Bufbyte *storage = alloca_array (Bufbyte, nargs * MAX_EMCHAR_LEN);
+ − 2567 Bufbyte *p = storage;
+ − 2568
+ − 2569 for (; nargs; nargs--, args++)
+ − 2570 @{
+ − 2571 Lisp_Object lisp_char = *args;
+ − 2572 CHECK_CHAR_COERCE_INT (lisp_char);
+ − 2573 p += set_charptr_emchar (p, XCHAR (lisp_char));
+ − 2574 @}
+ − 2575 return make_string (storage, p - storage);
+ − 2576 @}
+ − 2577 @end group
+ − 2578 @end example
+ − 2579
+ − 2580 Now we can analyze the source line by line.
+ − 2581
+ − 2582 Obviously, string will be as long as there are arguments to the
+ − 2583 function. This is why we allocate @code{MAX_EMCHAR_LEN} * @var{nargs}
+ − 2584 bytes on the stack, i.e. the worst-case number of bytes for @var{nargs}
+ − 2585 @code{Emchar}s to fit in the string.
+ − 2586
+ − 2587 Then, the loop checks that each element is a character, converting
+ − 2588 integers in the process. Like many other functions in XEmacs, this
+ − 2589 function silently accepts integers where characters are expected, for
+ − 2590 historical and compatibility reasons. Unless you know what you are
+ − 2591 doing, @code{CHECK_CHAR} will also suffice. @code{XCHAR (lisp_char)}
+ − 2592 extracts the @code{Emchar} from the @code{Lisp_Object}, and
+ − 2593 @code{set_charptr_emchar} stores it to storage, increasing @code{p} in
+ − 2594 the process.
+ − 2595
+ − 2596 Other instructive examples of correct coding under Mule can be found all
+ − 2597 over the XEmacs code. For starters, I recommend
+ − 2598 @code{Fnormalize_menu_item_name} in @file{menubar.c}. After you have
+ − 2599 understood this section of the manual and studied the examples, you can
+ − 2600 proceed writing new Mule-aware code.
+ − 2601
442
+ − 2602 @node Techniques for XEmacs Developers, , Coding for Mule, Rules When Writing New C Code
428
+ − 2603 @section Techniques for XEmacs Developers
+ − 2604
442
+ − 2605 To make a purified XEmacs, do: @code{make puremacs}.
428
+ − 2606 To make a quantified XEmacs, do: @code{make quantmacs}.
+ − 2607
442
+ − 2608 You simply can't dump Quantified and Purified images (unless using the
+ − 2609 portable dumper). Purify gets confused when xemacs frees memory in one
+ − 2610 process that was allocated in a @emph{different} process on a different
+ − 2611 machine!. Run it like so:
+ − 2612 @example
+ − 2613 temacs -batch -l loadup.el run-temacs @var{xemacs-args...}
+ − 2614 @end example
428
+ − 2615
+ − 2616 Before you go through the trouble, are you compiling with all
442
+ − 2617 debugging and error-checking off? If not, try that first. Be warned
428
+ − 2618 that while Quantify is directly responsible for quite a few
+ − 2619 optimizations which have been made to XEmacs, doing a run which
+ − 2620 generates results which can be acted upon is not necessarily a trivial
+ − 2621 task.
+ − 2622
+ − 2623 Also, if you're still willing to do some runs make sure you configure
+ − 2624 with the @samp{--quantify} flag. That will keep Quantify from starting
+ − 2625 to record data until after the loadup is completed and will shut off
+ − 2626 recording right before it shuts down (which generates enough bogus data
+ − 2627 to throw most results off). It also enables three additional elisp
+ − 2628 commands: @code{quantify-start-recording-data},
+ − 2629 @code{quantify-stop-recording-data} and @code{quantify-clear-data}.
+ − 2630
+ − 2631 If you want to make XEmacs faster, target your favorite slow benchmark,
+ − 2632 run a profiler like Quantify, @code{gprof}, or @code{tcov}, and figure
+ − 2633 out where the cycles are going. Specific projects:
+ − 2634
+ − 2635 @itemize @bullet
+ − 2636 @item
+ − 2637 Make the garbage collector faster. Figure out how to write an
+ − 2638 incremental garbage collector.
+ − 2639 @item
+ − 2640 Write a compiler that takes bytecode and spits out C code.
+ − 2641 Unfortunately, you will then need a C compiler and a more fully
+ − 2642 developed module system.
+ − 2643 @item
+ − 2644 Speed up redisplay.
+ − 2645 @item
+ − 2646 Speed up syntax highlighting. Maybe moving some of the syntax
+ − 2647 highlighting capabilities into C would make a difference.
+ − 2648 @item
+ − 2649 Implement tail recursion in Emacs Lisp (hard!).
+ − 2650 @end itemize
+ − 2651
+ − 2652 Unfortunately, Emacs Lisp is slow, and is going to stay slow. Function
+ − 2653 calls in elisp are especially expensive. Iterating over a long list is
+ − 2654 going to be 30 times faster implemented in C than in Elisp.
+ − 2655
442
+ − 2656 Heavily used small code fragments need to be fast. The traditional way
+ − 2657 to implement such code fragments in C is with macros. But macros in C
+ − 2658 are known to be broken.
+ − 2659
+ − 2660 Macro arguments that are repeatedly evaluated may suffer from repeated
+ − 2661 side effects or suboptimal performance.
+ − 2662
+ − 2663 Variable names used in macros may collide with caller's variables,
+ − 2664 causing (at least) unwanted compiler warnings.
+ − 2665
+ − 2666 In order to solve these problems, and maintain statement semantics, one
+ − 2667 should use the @code{do @{ ... @} while (0)} trick while trying to
+ − 2668 reference macro arguments exactly once using local variables.
+ − 2669
+ − 2670 Let's take a look at this poor macro definition:
+ − 2671
+ − 2672 @example
+ − 2673 #define MARK_OBJECT(obj) \
+ − 2674 if (!marked_p (obj)) mark_object (obj), did_mark = 1
+ − 2675 @end example
+ − 2676
+ − 2677 This macro evaluates its argument twice, and also fails if used like this:
+ − 2678 @example
+ − 2679 if (flag) MARK_OBJECT (obj); else do_something();
+ − 2680 @end example
+ − 2681
+ − 2682 A much better definition is
+ − 2683
+ − 2684 @example
+ − 2685 #define MARK_OBJECT(obj) do @{ \
+ − 2686 Lisp_Object mo_obj = (obj); \
+ − 2687 if (!marked_p (mo_obj)) \
+ − 2688 @{ \
+ − 2689 mark_object (mo_obj); \
+ − 2690 did_mark = 1; \
+ − 2691 @} \
+ − 2692 @} while (0)
+ − 2693 @end example
+ − 2694
+ − 2695 Notice the elimination of double evaluation by using the local variable
+ − 2696 with the obscure name. Writing safe and efficient macros requires great
+ − 2697 care. The one problem with macros that cannot be portably worked around
+ − 2698 is, since a C block has no value, a macro used as an expression rather
+ − 2699 than a statement cannot use the techniques just described to avoid
+ − 2700 multiple evaluation.
+ − 2701
+ − 2702 In most cases where a macro has function semantics, an inline function
+ − 2703 is a better implementation technique. Modern compiler optimizers tend
+ − 2704 to inline functions even if they have no @code{inline} keyword, and
+ − 2705 configure magic ensures that the @code{inline} keyword can be safely
+ − 2706 used as an additional compiler hint. Inline functions used in a single
+ − 2707 .c files are easy. The function must already be defined to be
+ − 2708 @code{static}. Just add another @code{inline} keyword to the
+ − 2709 definition.
+ − 2710
+ − 2711 @example
+ − 2712 inline static int
+ − 2713 heavily_used_small_function (int arg)
+ − 2714 @{
+ − 2715 ...
+ − 2716 @}
+ − 2717 @end example
+ − 2718
+ − 2719 Inline functions in header files are trickier, because we would like to
+ − 2720 make the following optimization if the function is @emph{not} inlined
+ − 2721 (for example, because we're compiling for debugging). We would like the
+ − 2722 function to be defined externally exactly once, and each calling
+ − 2723 translation unit would create an external reference to the function,
+ − 2724 instead of including a definition of the inline function in the object
+ − 2725 code of every translation unit that uses it. This optimization is
+ − 2726 currently only available for gcc. But you don't have to worry about the
+ − 2727 trickiness; just define your inline functions in header files using this
+ − 2728 pattern:
+ − 2729
+ − 2730 @example
+ − 2731 INLINE_HEADER int
+ − 2732 i_used_to_be_a_crufty_macro_but_look_at_me_now (int arg);
+ − 2733 INLINE_HEADER int
+ − 2734 i_used_to_be_a_crufty_macro_but_look_at_me_now (int arg)
+ − 2735 @{
+ − 2736 ...
+ − 2737 @}
+ − 2738 @end example
+ − 2739
+ − 2740 The declaration right before the definition is to prevent warnings when
+ − 2741 compiling with @code{gcc -Wmissing-declarations}. I consider issuing
+ − 2742 this warning for inline functions a gcc bug, but the gcc maintainers disagree.
+ − 2743
+ − 2744 Every header which contains inline functions, either directly by using
+ − 2745 @code{INLINE_HEADER} or indirectly by using @code{DECLARE_LRECORD} must
+ − 2746 be added to @file{inline.c}'s includes to make the optimization
+ − 2747 described above work. (Optimization note: if all INLINE_HEADER
+ − 2748 functions are in fact inlined in all translation units, then the linker
+ − 2749 can just discard @code{inline.o}, since it contains only unreferenced code).
+ − 2750
438
+ − 2751 To get started debugging XEmacs, take a look at the @file{.gdbinit} and
442
+ − 2752 @file{.dbxrc} files in the @file{src} directory. See the section in the
+ − 2753 XEmacs FAQ on How to Debug an XEmacs problem with a debugger.
428
+ − 2754
+ − 2755 After making source code changes, run @code{make check} to ensure that
442
+ − 2756 you haven't introduced any regressions. If you want to make xemacs more
+ − 2757 reliable, please improve the test suite in @file{tests/automated}.
+ − 2758
+ − 2759 Did you make sure you didn't introduce any new compiler warnings?
+ − 2760
+ − 2761 Before submitting a patch, please try compiling at least once with
+ − 2762
+ − 2763 @example
+ − 2764 configure --with-mule --with-union-type --error-checking=all
+ − 2765 @end example
428
+ − 2766
+ − 2767 Here are things to know when you create a new source file:
+ − 2768
+ − 2769 @itemize @bullet
+ − 2770 @item
+ − 2771 All @file{.c} files should @code{#include <config.h>} first. Almost all
+ − 2772 @file{.c} files should @code{#include "lisp.h"} second.
+ − 2773
+ − 2774 @item
+ − 2775 Generated header files should be included using the @code{#include <...>} syntax,
+ − 2776 not the @code{#include "..."} syntax. The generated headers are:
+ − 2777
442
+ − 2778 @file{config.h sheap-adjust.h paths.h Emacs.ad.h}
428
+ − 2779
+ − 2780 The basic rule is that you should assume builds using @code{--srcdir}
+ − 2781 and the @code{#include <...>} syntax needs to be used when the
+ − 2782 to-be-included generated file is in a potentially different directory
+ − 2783 @emph{at compile time}. The non-obvious C rule is that @code{#include "..."}
+ − 2784 means to search for the included file in the same directory as the
+ − 2785 including file, @emph{not} in the current directory.
+ − 2786
+ − 2787 @item
+ − 2788 Header files should @emph{not} include @code{<config.h>} and
+ − 2789 @code{"lisp.h"}. It is the responsibility of the @file{.c} files that
+ − 2790 use it to do so.
+ − 2791
+ − 2792 @end itemize
+ − 2793
442
+ − 2794 Here is a checklist of things to do when creating a new lisp object type
+ − 2795 named @var{foo}:
+ − 2796
+ − 2797 @enumerate
+ − 2798 @item
+ − 2799 create @var{foo}.h
+ − 2800 @item
+ − 2801 create @var{foo}.c
+ − 2802 @item
+ − 2803 add definitions of @code{syms_of_@var{foo}}, etc. to @file{@var{foo}.c}
+ − 2804 @item
+ − 2805 add declarations of @code{syms_of_@var{foo}}, etc. to @file{symsinit.h}
+ − 2806 @item
+ − 2807 add calls to @code{syms_of_@var{foo}}, etc. to @file{emacs.c}
+ − 2808 @item
+ − 2809 add definitions of macros like @code{CHECK_@var{FOO}} and
+ − 2810 @code{@var{FOO}P} to @file{@var{foo}.h}
+ − 2811 @item
+ − 2812 add the new type index to @code{enum lrecord_type}
+ − 2813 @item
+ − 2814 add a DEFINE_LRECORD_IMPLEMENTATION call to @file{@var{foo}.c}
+ − 2815 @item
+ − 2816 add an INIT_LRECORD_IMPLEMENTATION call to @code{syms_of_@var{foo}.c}
+ − 2817 @end enumerate
428
+ − 2818
+ − 2819 @node A Summary of the Various XEmacs Modules, Allocation of Objects in XEmacs Lisp, Rules When Writing New C Code, Top
+ − 2820 @chapter A Summary of the Various XEmacs Modules
+ − 2821
+ − 2822 This is accurate as of XEmacs 20.0.
+ − 2823
+ − 2824 @menu
+ − 2825 * Low-Level Modules::
+ − 2826 * Basic Lisp Modules::
+ − 2827 * Modules for Standard Editing Operations::
+ − 2828 * Editor-Level Control Flow Modules::
+ − 2829 * Modules for the Basic Displayable Lisp Objects::
+ − 2830 * Modules for other Display-Related Lisp Objects::
+ − 2831 * Modules for the Redisplay Mechanism::
+ − 2832 * Modules for Interfacing with the File System::
+ − 2833 * Modules for Other Aspects of the Lisp Interpreter and Object System::
+ − 2834 * Modules for Interfacing with the Operating System::
+ − 2835 * Modules for Interfacing with X Windows::
+ − 2836 * Modules for Internationalization::
+ − 2837 @end menu
+ − 2838
442
+ − 2839 @node Low-Level Modules, Basic Lisp Modules, A Summary of the Various XEmacs Modules, A Summary of the Various XEmacs Modules
428
+ − 2840 @section Low-Level Modules
+ − 2841
+ − 2842 @example
+ − 2843 config.h
+ − 2844 @end example
+ − 2845
+ − 2846 This is automatically generated from @file{config.h.in} based on the
+ − 2847 results of configure tests and user-selected optional features and
+ − 2848 contains preprocessor definitions specifying the nature of the
+ − 2849 environment in which XEmacs is being compiled.
+ − 2850
+ − 2851
+ − 2852
+ − 2853 @example
+ − 2854 paths.h
+ − 2855 @end example
+ − 2856
+ − 2857 This is automatically generated from @file{paths.h.in} based on supplied
+ − 2858 configure values, and allows for non-standard installed configurations
+ − 2859 of the XEmacs directories. It's currently broken, though.
+ − 2860
+ − 2861
+ − 2862
+ − 2863 @example
+ − 2864 emacs.c
+ − 2865 signal.c
+ − 2866 @end example
+ − 2867
+ − 2868 @file{emacs.c} contains @code{main()} and other code that performs the most
+ − 2869 basic environment initializations and handles shutting down the XEmacs
+ − 2870 process (this includes @code{kill-emacs}, the normal way that XEmacs is
+ − 2871 exited; @code{dump-emacs}, which is used during the build process to
+ − 2872 write out the XEmacs executable; @code{run-emacs-from-temacs}, which can
+ − 2873 be used to start XEmacs directly when temacs has finished loading all
+ − 2874 the Lisp code; and emergency code to handle crashes [XEmacs tries to
+ − 2875 auto-save all files before it crashes]).
+ − 2876
+ − 2877 Low-level code that directly interacts with the Unix signal mechanism,
+ − 2878 however, is in @file{signal.c}. Note that this code does not handle system
+ − 2879 dependencies in interfacing to signals; that is handled using the
+ − 2880 @file{syssignal.h} header file, described in section J below.
+ − 2881
+ − 2882
+ − 2883
+ − 2884 @example
+ − 2885 unexaix.c
+ − 2886 unexalpha.c
+ − 2887 unexapollo.c
+ − 2888 unexconvex.c
+ − 2889 unexec.c
+ − 2890 unexelf.c
+ − 2891 unexelfsgi.c
+ − 2892 unexencap.c
+ − 2893 unexenix.c
+ − 2894 unexfreebsd.c
+ − 2895 unexfx2800.c
+ − 2896 unexhp9k3.c
+ − 2897 unexhp9k800.c
+ − 2898 unexmips.c
+ − 2899 unexnext.c
+ − 2900 unexsol2.c
+ − 2901 unexsunos4.c
+ − 2902 @end example
+ − 2903
+ − 2904 These modules contain code dumping out the XEmacs executable on various
+ − 2905 different systems. (This process is highly machine-specific and
+ − 2906 requires intimate knowledge of the executable format and the memory map
+ − 2907 of the process.) Only one of these modules is actually used; this is
+ − 2908 chosen by @file{configure}.
+ − 2909
+ − 2910
+ − 2911
+ − 2912 @example
442
+ − 2913 ecrt0.c
428
+ − 2914 lastfile.c
+ − 2915 pre-crt0.c
+ − 2916 @end example
+ − 2917
+ − 2918 These modules are used in conjunction with the dump mechanism. On some
+ − 2919 systems, an alternative version of the C startup code (the actual code
+ − 2920 that receives control from the operating system when the process is
+ − 2921 started, and which calls @code{main()}) is required so that the dumping
+ − 2922 process works properly; @file{crt0.c} provides this.
+ − 2923
+ − 2924 @file{pre-crt0.c} and @file{lastfile.c} should be the very first and
+ − 2925 very last file linked, respectively. (Actually, this is not really true.
+ − 2926 @file{lastfile.c} should be after all Emacs modules whose initialized
+ − 2927 data should be made constant, and before all other Emacs files and all
+ − 2928 libraries. In particular, the allocation modules @file{gmalloc.c},
+ − 2929 @file{alloca.c}, etc. are normally placed past @file{lastfile.c}, and
+ − 2930 all of the files that implement Xt widget classes @emph{must} be placed
+ − 2931 after @file{lastfile.c} because they contain various structures that
+ − 2932 must be statically initialized and into which Xt writes at various
+ − 2933 times.) @file{pre-crt0.c} and @file{lastfile.c} contain exported symbols
+ − 2934 that are used to determine the start and end of XEmacs' initialized
+ − 2935 data space when dumping.
+ − 2936
+ − 2937
+ − 2938
+ − 2939 @example
+ − 2940 alloca.c
+ − 2941 free-hook.c
+ − 2942 getpagesize.h
+ − 2943 gmalloc.c
+ − 2944 malloc.c
+ − 2945 mem-limits.h
+ − 2946 ralloc.c
+ − 2947 vm-limit.c
+ − 2948 @end example
+ − 2949
+ − 2950 These handle basic C allocation of memory. @file{alloca.c} is an emulation of
+ − 2951 the stack allocation function @code{alloca()} on machines that lack
+ − 2952 this. (XEmacs makes extensive use of @code{alloca()} in its code.)
+ − 2953
+ − 2954 @file{gmalloc.c} and @file{malloc.c} are two implementations of the standard C
+ − 2955 functions @code{malloc()}, @code{realloc()} and @code{free()}. They are
+ − 2956 often used in place of the standard system-provided @code{malloc()}
+ − 2957 because they usually provide a much faster implementation, at the
+ − 2958 expense of additional memory use. @file{gmalloc.c} is a newer implementation
+ − 2959 that is much more memory-efficient for large allocations than @file{malloc.c},
+ − 2960 and should always be preferred if it works. (At one point, @file{gmalloc.c}
+ − 2961 didn't work on some systems where @file{malloc.c} worked; but this should be
+ − 2962 fixed now.)
+ − 2963
+ − 2964 @cindex relocating allocator
+ − 2965 @file{ralloc.c} is the @dfn{relocating allocator}. It provides
+ − 2966 functions similar to @code{malloc()}, @code{realloc()} and @code{free()}
+ − 2967 that allocate memory that can be dynamically relocated in memory. The
+ − 2968 advantage of this is that allocated memory can be shuffled around to
+ − 2969 place all the free memory at the end of the heap, and the heap can then
+ − 2970 be shrunk, releasing the memory back to the operating system. The use
+ − 2971 of this can be controlled with the configure option @code{--rel-alloc};
+ − 2972 if enabled, memory allocated for buffers will be relocatable, so that if
+ − 2973 a very large file is visited and the buffer is later killed, the memory
+ − 2974 can be released to the operating system. (The disadvantage of this
+ − 2975 mechanism is that it can be very slow. On systems with the
+ − 2976 @code{mmap()} system call, the XEmacs version of @file{ralloc.c} uses
+ − 2977 this to move memory around without actually having to block-copy it,
+ − 2978 which can speed things up; but it can still cause noticeable performance
+ − 2979 degradation.)
+ − 2980
+ − 2981 @file{free-hook.c} contains some debugging functions for checking for invalid
+ − 2982 arguments to @code{free()}.
+ − 2983
+ − 2984 @file{vm-limit.c} contains some functions that warn the user when memory is
+ − 2985 getting low. These are callback functions that are called by @file{gmalloc.c}
+ − 2986 and @file{malloc.c} at appropriate times.
+ − 2987
+ − 2988 @file{getpagesize.h} provides a uniform interface for retrieving the size of a
+ − 2989 page in virtual memory. @file{mem-limits.h} provides a uniform interface for
+ − 2990 retrieving the total amount of available virtual memory. Both are
+ − 2991 similar in spirit to the @file{sys*.h} files described in section J, below.
+ − 2992
+ − 2993
+ − 2994
+ − 2995 @example
+ − 2996 blocktype.c
+ − 2997 blocktype.h
+ − 2998 dynarr.c
+ − 2999 @end example
+ − 3000
+ − 3001 These implement a couple of basic C data types to facilitate memory
+ − 3002 allocation. The @code{Blocktype} type efficiently manages the
+ − 3003 allocation of fixed-size blocks by minimizing the number of times that
+ − 3004 @code{malloc()} and @code{free()} are called. It allocates memory in
+ − 3005 large chunks, subdivides the chunks into blocks of the proper size, and
+ − 3006 returns the blocks as requested. When blocks are freed, they are placed
+ − 3007 onto a linked list, so they can be efficiently reused. This data type
+ − 3008 is not much used in XEmacs currently, because it's a fairly new
+ − 3009 addition.
+ − 3010
+ − 3011 @cindex dynamic array
+ − 3012 The @code{Dynarr} type implements a @dfn{dynamic array}, which is
+ − 3013 similar to a standard C array but has no fixed limit on the number of
+ − 3014 elements it can contain. Dynamic arrays can hold elements of any type,
+ − 3015 and when you add a new element, the array automatically resizes itself
+ − 3016 if it isn't big enough. Dynarrs are extensively used in the redisplay
+ − 3017 mechanism.
+ − 3018
+ − 3019
+ − 3020
+ − 3021 @example
+ − 3022 inline.c
+ − 3023 @end example
+ − 3024
+ − 3025 This module is used in connection with inline functions (available in
+ − 3026 some compilers). Often, inline functions need to have a corresponding
+ − 3027 non-inline function that does the same thing. This module is where they
+ − 3028 reside. It contains no actual code, but defines some special flags that
+ − 3029 cause inline functions defined in header files to be rendered as actual
+ − 3030 functions. It then includes all header files that contain any inline
+ − 3031 function definitions, so that each one gets a real function equivalent.
+ − 3032
+ − 3033
+ − 3034
+ − 3035 @example
+ − 3036 debug.c
+ − 3037 debug.h
+ − 3038 @end example
+ − 3039
+ − 3040 These functions provide a system for doing internal consistency checks
+ − 3041 during code development. This system is not currently used; instead the
+ − 3042 simpler @code{assert()} macro is used along with the various checks
+ − 3043 provided by the @samp{--error-check-*} configuration options.
+ − 3044
+ − 3045
+ − 3046
+ − 3047 @example
+ − 3048 universe.h
+ − 3049 @end example
+ − 3050
+ − 3051 This is not currently used.
+ − 3052
+ − 3053
+ − 3054
442
+ − 3055 @node Basic Lisp Modules, Modules for Standard Editing Operations, Low-Level Modules, A Summary of the Various XEmacs Modules
428
+ − 3056 @section Basic Lisp Modules
+ − 3057
+ − 3058 @example
+ − 3059 lisp-disunion.h
+ − 3060 lisp-union.h
+ − 3061 lisp.h
+ − 3062 lrecord.h
+ − 3063 symsinit.h
+ − 3064 @end example
+ − 3065
+ − 3066 These are the basic header files for all XEmacs modules. Each module
+ − 3067 includes @file{lisp.h}, which brings the other header files in.
+ − 3068 @file{lisp.h} contains the definitions of the structures and extractor
+ − 3069 and constructor macros for the basic Lisp objects and various other
+ − 3070 basic definitions for the Lisp environment, as well as some
+ − 3071 general-purpose definitions (e.g. @code{min()} and @code{max()}).
+ − 3072 @file{lisp.h} includes either @file{lisp-disunion.h} or
+ − 3073 @file{lisp-union.h}, depending on whether @code{USE_UNION_TYPE} is
+ − 3074 defined. These files define the typedef of the Lisp object itself (as
+ − 3075 described above) and the low-level macros that hide the actual
+ − 3076 implementation of the Lisp object. All extractor and constructor macros
+ − 3077 for particular types of Lisp objects are defined in terms of these
+ − 3078 low-level macros.
+ − 3079
+ − 3080 As a general rule, all typedefs should go into the typedefs section of
+ − 3081 @file{lisp.h} rather than into a module-specific header file even if the
+ − 3082 structure is defined elsewhere. This allows function prototypes that
+ − 3083 use the typedef to be placed into other header files. Forward structure
+ − 3084 declarations (i.e. a simple declaration like @code{struct foo;} where
+ − 3085 the structure itself is defined elsewhere) should be placed into the
+ − 3086 typedefs section as necessary.
+ − 3087
+ − 3088 @file{lrecord.h} contains the basic structures and macros that implement
440
+ − 3089 all record-type Lisp objects---i.e. all objects whose type is a field
428
+ − 3090 in their C structure, which includes all objects except the few most
+ − 3091 basic ones.
+ − 3092
+ − 3093 @file{lisp.h} contains prototypes for most of the exported functions in
+ − 3094 the various modules. Lisp primitives defined using @code{DEFUN} that
+ − 3095 need to be called by C code should be declared using @code{EXFUN}.
+ − 3096 Other function prototypes should be placed either into the appropriate
+ − 3097 section of @code{lisp.h}, or into a module-specific header file,
+ − 3098 depending on how general-purpose the function is and whether it has
+ − 3099 special-purpose argument types requiring definitions not in
+ − 3100 @file{lisp.h}.) All initialization functions are prototyped in
+ − 3101 @file{symsinit.h}.
+ − 3102
+ − 3103
+ − 3104
+ − 3105 @example
+ − 3106 alloc.c
+ − 3107 @end example
+ − 3108
+ − 3109 The large module @file{alloc.c} implements all of the basic allocation and
+ − 3110 garbage collection for Lisp objects. The most commonly used Lisp
+ − 3111 objects are allocated in chunks, similar to the Blocktype data type
+ − 3112 described above; others are allocated in individually @code{malloc()}ed
+ − 3113 blocks. This module provides the foundation on which all other aspects
+ − 3114 of the Lisp environment sit, and is the first module initialized at
+ − 3115 startup.
+ − 3116
+ − 3117 Note that @file{alloc.c} provides a series of generic functions that are
+ − 3118 not dependent on any particular object type, and interfaces to
+ − 3119 particular types of objects using a standardized interface of
+ − 3120 type-specific methods. This scheme is a fundamental principle of
+ − 3121 object-oriented programming and is heavily used throughout XEmacs. The
+ − 3122 great advantage of this is that it allows for a clean separation of
440
+ − 3123 functionality into different modules---new classes of Lisp objects, new
428
+ − 3124 event interfaces, new device types, new stream interfaces, etc. can be
+ − 3125 added transparently without affecting code anywhere else in XEmacs.
+ − 3126 Because the different subsystems are divided into general and specific
+ − 3127 code, adding a new subtype within a subsystem will in general not
+ − 3128 require changes to the generic subsystem code or affect any of the other
+ − 3129 subtypes in the subsystem; this provides a great deal of robustness to
+ − 3130 the XEmacs code.
+ − 3131
+ − 3132
+ − 3133 @example
+ − 3134 eval.c
+ − 3135 backtrace.h
+ − 3136 @end example
+ − 3137
+ − 3138 This module contains all of the functions to handle the flow of control.
+ − 3139 This includes the mechanisms of defining functions, calling functions,
+ − 3140 traversing stack frames, and binding variables; the control primitives
+ − 3141 and other special forms such as @code{while}, @code{if}, @code{eval},
+ − 3142 @code{let}, @code{and}, @code{or}, @code{progn}, etc.; handling of
+ − 3143 non-local exits, unwind-protects, and exception handlers; entering the
+ − 3144 debugger; methods for the subr Lisp object type; etc. It does
+ − 3145 @emph{not} include the @code{read} function, the @code{print} function,
+ − 3146 or the handling of symbols and obarrays.
+ − 3147
+ − 3148 @file{backtrace.h} contains some structures related to stack frames and the
+ − 3149 flow of control.
+ − 3150
+ − 3151
+ − 3152
+ − 3153 @example
+ − 3154 lread.c
+ − 3155 @end example
+ − 3156
+ − 3157 This module implements the Lisp reader and the @code{read} function,
+ − 3158 which converts text into Lisp objects, according to the read syntax of
+ − 3159 the objects, as described above. This is similar to the parser that is
+ − 3160 a part of all compilers.
+ − 3161
+ − 3162
+ − 3163
+ − 3164 @example
+ − 3165 print.c
+ − 3166 @end example
+ − 3167
+ − 3168 This module implements the Lisp print mechanism and the @code{print}
+ − 3169 function and related functions. This is the inverse of the Lisp reader
+ − 3170 -- it converts Lisp objects to a printed, textual representation.
+ − 3171 (Hopefully something that can be read back in using @code{read} to get
+ − 3172 an equivalent object.)
+ − 3173
+ − 3174
+ − 3175
+ − 3176 @example
+ − 3177 general.c
+ − 3178 symbols.c
+ − 3179 symeval.h
+ − 3180 @end example
+ − 3181
+ − 3182 @file{symbols.c} implements the handling of symbols, obarrays, and
+ − 3183 retrieving the values of symbols. Much of the code is devoted to
+ − 3184 handling the special @dfn{symbol-value-magic} objects that define
440
+ − 3185 special types of variables---this includes buffer-local variables,
428
+ − 3186 variable aliases, variables that forward into C variables, etc. This
+ − 3187 module is initialized extremely early (right after @file{alloc.c}),
+ − 3188 because it is here that the basic symbols @code{t} and @code{nil} are
+ − 3189 created, and those symbols are used everywhere throughout XEmacs.
+ − 3190
+ − 3191 @file{symeval.h} contains the definitions of symbol structures and the
+ − 3192 @code{DEFVAR_LISP()} and related macros for declaring variables.
+ − 3193
+ − 3194
+ − 3195
+ − 3196 @example
+ − 3197 data.c
+ − 3198 floatfns.c
+ − 3199 fns.c
+ − 3200 @end example
+ − 3201
+ − 3202 These modules implement the methods and standard Lisp primitives for all
+ − 3203 the basic Lisp object types other than symbols (which are described
+ − 3204 above). @file{data.c} contains all the predicates (primitives that return
+ − 3205 whether an object is of a particular type); the integer arithmetic
+ − 3206 functions; and the basic accessor and mutator primitives for the various
+ − 3207 object types. @file{fns.c} contains all the standard predicates for working
+ − 3208 with sequences (where, abstractly speaking, a sequence is an ordered set
+ − 3209 of objects, and can be represented by a list, string, vector, or
+ − 3210 bit-vector); it also contains @code{equal}, perhaps on the grounds that
+ − 3211 bulk of the operation of @code{equal} is comparing sequences.
+ − 3212 @file{floatfns.c} contains methods and primitives for floats and floating-point
+ − 3213 arithmetic.
+ − 3214
+ − 3215
+ − 3216
+ − 3217 @example
+ − 3218 bytecode.c
+ − 3219 bytecode.h
+ − 3220 @end example
+ − 3221
+ − 3222 @file{bytecode.c} implements the byte-code interpreter and
+ − 3223 compiled-function objects, and @file{bytecode.h} contains associated
+ − 3224 structures. Note that the byte-code @emph{compiler} is written in Lisp.
+ − 3225
+ − 3226
+ − 3227
+ − 3228
442
+ − 3229 @node Modules for Standard Editing Operations, Editor-Level Control Flow Modules, Basic Lisp Modules, A Summary of the Various XEmacs Modules
428
+ − 3230 @section Modules for Standard Editing Operations
+ − 3231
+ − 3232 @example
+ − 3233 buffer.c
+ − 3234 buffer.h
+ − 3235 bufslots.h
+ − 3236 @end example
+ − 3237
+ − 3238 @file{buffer.c} implements the @dfn{buffer} Lisp object type. This
+ − 3239 includes functions that create and destroy buffers; retrieve buffers by
+ − 3240 name or by other properties; manipulate lists of buffers (remember that
+ − 3241 buffers are permanent objects and stored in various ordered lists);
+ − 3242 retrieve or change buffer properties; etc. It also contains the
+ − 3243 definitions of all the built-in buffer-local variables (which can be
+ − 3244 viewed as buffer properties). It does @emph{not} contain code to
+ − 3245 manipulate buffer-local variables (that's in @file{symbols.c}, described
+ − 3246 above); or code to manipulate the text in a buffer.
+ − 3247
+ − 3248 @file{buffer.h} defines the structures associated with a buffer and the various
+ − 3249 macros for retrieving text from a buffer and special buffer positions
+ − 3250 (e.g. @code{point}, the default location for text insertion). It also
+ − 3251 contains macros for working with buffer positions and converting between
+ − 3252 their representations as character offsets and as byte offsets (under
+ − 3253 MULE, they are different, because characters can be multi-byte). It is
+ − 3254 one of the largest header files.
+ − 3255
+ − 3256 @file{bufslots.h} defines the fields in the buffer structure that correspond to
+ − 3257 the built-in buffer-local variables. It is its own header file because
+ − 3258 it is included many times in @file{buffer.c}, as a way of iterating over all
+ − 3259 the built-in buffer-local variables.
+ − 3260
+ − 3261
+ − 3262
+ − 3263 @example
+ − 3264 insdel.c
+ − 3265 insdel.h
+ − 3266 @end example
+ − 3267
+ − 3268 @file{insdel.c} contains low-level functions for inserting and deleting text in
+ − 3269 a buffer, keeping track of changed regions for use by redisplay, and
+ − 3270 calling any before-change and after-change functions that may have been
+ − 3271 registered for the buffer. It also contains the actual functions that
+ − 3272 convert between byte offsets and character offsets.
+ − 3273
+ − 3274 @file{insdel.h} contains associated headers.
+ − 3275
+ − 3276
+ − 3277
+ − 3278 @example
+ − 3279 marker.c
+ − 3280 @end example
+ − 3281
+ − 3282 This module implements the @dfn{marker} Lisp object type, which
+ − 3283 conceptually is a pointer to a text position in a buffer that moves
+ − 3284 around as text is inserted and deleted, so as to remain in the same
+ − 3285 relative position. This module doesn't actually move the markers around
+ − 3286 -- that's handled in @file{insdel.c}. This module just creates them and
+ − 3287 implements the primitives for working with them. As markers are simple
+ − 3288 objects, this does not entail much.
+ − 3289
+ − 3290 Note that the standard arithmetic primitives (e.g. @code{+}) accept
+ − 3291 markers in place of integers and automatically substitute the value of
+ − 3292 @code{marker-position} for the marker, i.e. an integer describing the
+ − 3293 current buffer position of the marker.
+ − 3294
+ − 3295
+ − 3296
+ − 3297 @example
+ − 3298 extents.c
+ − 3299 extents.h
+ − 3300 @end example
+ − 3301
+ − 3302 This module implements the @dfn{extent} Lisp object type, which is like
+ − 3303 a marker that works over a range of text rather than a single position.
+ − 3304 Extents are also much more complex and powerful than markers and have a
+ − 3305 more efficient (and more algorithmically complex) implementation. The
+ − 3306 implementation is described in detail in comments in @file{extents.c}.
+ − 3307
+ − 3308 The code in @file{extents.c} works closely with @file{insdel.c} so that
+ − 3309 extents are properly moved around as text is inserted and deleted.
+ − 3310 There is also code in @file{extents.c} that provides information needed
+ − 3311 by the redisplay mechanism for efficient operation. (Remember that
+ − 3312 extents can have display properties that affect [sometimes drastically,
+ − 3313 as in the @code{invisible} property] the display of the text they
+ − 3314 cover.)
+ − 3315
+ − 3316
+ − 3317
+ − 3318 @example
+ − 3319 editfns.c
+ − 3320 @end example
+ − 3321
+ − 3322 @file{editfns.c} contains the standard Lisp primitives for working with
+ − 3323 a buffer's text, and calls the low-level functions in @file{insdel.c}.
+ − 3324 It also contains primitives for working with @code{point} (the default
+ − 3325 buffer insertion location).
+ − 3326
+ − 3327 @file{editfns.c} also contains functions for retrieving various
+ − 3328 characteristics from the external environment: the current time, the
+ − 3329 process ID of the running XEmacs process, the name of the user who ran
+ − 3330 this XEmacs process, etc. It's not clear why this code is in
+ − 3331 @file{editfns.c}.
+ − 3332
+ − 3333
+ − 3334
+ − 3335 @example
+ − 3336 callint.c
+ − 3337 cmds.c
+ − 3338 commands.h
+ − 3339 @end example
+ − 3340
+ − 3341 @cindex interactive
+ − 3342 These modules implement the basic @dfn{interactive} commands,
+ − 3343 i.e. user-callable functions. Commands, as opposed to other functions,
+ − 3344 have special ways of getting their parameters interactively (by querying
+ − 3345 the user), as opposed to having them passed in a normal function
+ − 3346 invocation. Many commands are not really meant to be called from other
+ − 3347 Lisp functions, because they modify global state in a way that's often
+ − 3348 undesired as part of other Lisp functions.
+ − 3349
+ − 3350 @file{callint.c} implements the mechanism for querying the user for
+ − 3351 parameters and calling interactive commands. The bulk of this module is
+ − 3352 code that parses the interactive spec that is supplied with an
+ − 3353 interactive command.
+ − 3354
+ − 3355 @file{cmds.c} implements the basic, most commonly used editing commands:
+ − 3356 commands to move around the current buffer and insert and delete
+ − 3357 characters. These commands are implemented using the Lisp primitives
+ − 3358 defined in @file{editfns.c}.
+ − 3359
+ − 3360 @file{commands.h} contains associated structure definitions and prototypes.
+ − 3361
+ − 3362
+ − 3363
+ − 3364 @example
+ − 3365 regex.c
+ − 3366 regex.h
+ − 3367 search.c
+ − 3368 @end example
+ − 3369
+ − 3370 @file{search.c} implements the Lisp primitives for searching for text in
+ − 3371 a buffer, and some of the low-level algorithms for doing this. In
+ − 3372 particular, the fast fixed-string Boyer-Moore search algorithm is
+ − 3373 implemented in @file{search.c}. The low-level algorithms for doing
+ − 3374 regular-expression searching, however, are implemented in @file{regex.c}
+ − 3375 and @file{regex.h}. These two modules are largely independent of
+ − 3376 XEmacs, and are similar to (and based upon) the regular-expression
+ − 3377 routines used in @file{grep} and other GNU utilities.
+ − 3378
+ − 3379
+ − 3380
+ − 3381 @example
+ − 3382 doprnt.c
+ − 3383 @end example
+ − 3384
+ − 3385 @file{doprnt.c} implements formatted-string processing, similar to
+ − 3386 @code{printf()} command in C.
+ − 3387
+ − 3388
+ − 3389
+ − 3390 @example
+ − 3391 undo.c
+ − 3392 @end example
+ − 3393
+ − 3394 This module implements the undo mechanism for tracking buffer changes.
+ − 3395 Most of this could be implemented in Lisp.
+ − 3396
+ − 3397
+ − 3398
442
+ − 3399 @node Editor-Level Control Flow Modules, Modules for the Basic Displayable Lisp Objects, Modules for Standard Editing Operations, A Summary of the Various XEmacs Modules
428
+ − 3400 @section Editor-Level Control Flow Modules
+ − 3401
+ − 3402 @example
+ − 3403 event-Xt.c
442
+ − 3404 event-msw.c
428
+ − 3405 event-stream.c
+ − 3406 event-tty.c
442
+ − 3407 events-mod.h
+ − 3408 gpmevent.c
+ − 3409 gpmevent.h
428
+ − 3410 events.c
+ − 3411 events.h
+ − 3412 @end example
+ − 3413
+ − 3414 These implement the handling of events (user input and other system
+ − 3415 notifications).
+ − 3416
+ − 3417 @file{events.c} and @file{events.h} define the @dfn{event} Lisp object
+ − 3418 type and primitives for manipulating it.
+ − 3419
+ − 3420 @file{event-stream.c} implements the basic functions for working with
+ − 3421 event queues, dispatching an event by looking it up in relevant keymaps
+ − 3422 and such, and handling timeouts; this includes the primitives
+ − 3423 @code{next-event} and @code{dispatch-event}, as well as related
+ − 3424 primitives such as @code{sit-for}, @code{sleep-for}, and
+ − 3425 @code{accept-process-output}. (@file{event-stream.c} is one of the
+ − 3426 hairiest and trickiest modules in XEmacs. Beware! You can easily mess
+ − 3427 things up here.)
+ − 3428
+ − 3429 @file{event-Xt.c} and @file{event-tty.c} implement the low-level
+ − 3430 interfaces onto retrieving events from Xt (the X toolkit) and from TTY's
+ − 3431 (using @code{read()} and @code{select()}), respectively. The event
+ − 3432 interface enforces a clean separation between the specific code for
+ − 3433 interfacing with the operating system and the generic code for working
+ − 3434 with events, by defining an API of basic, low-level event methods;
+ − 3435 @file{event-Xt.c} and @file{event-tty.c} are two different
+ − 3436 implementations of this API. To add support for a new operating system
+ − 3437 (e.g. NeXTstep), one merely needs to provide another implementation of
+ − 3438 those API functions.
+ − 3439
+ − 3440 Note that the choice of whether to use @file{event-Xt.c} or
+ − 3441 @file{event-tty.c} is made at compile time! Or at the very latest, it
+ − 3442 is made at startup time. @file{event-Xt.c} handles events for
+ − 3443 @emph{both} X and TTY frames; @file{event-tty.c} is only used when X
+ − 3444 support is not compiled into XEmacs. The reason for this is that there
+ − 3445 is only one event loop in XEmacs: thus, it needs to be able to receive
+ − 3446 events from all different kinds of frames.
+ − 3447
+ − 3448
+ − 3449
+ − 3450 @example
+ − 3451 keymap.c
+ − 3452 keymap.h
+ − 3453 @end example
+ − 3454
+ − 3455 @file{keymap.c} and @file{keymap.h} define the @dfn{keymap} Lisp object
+ − 3456 type and associated methods and primitives. (Remember that keymaps are
+ − 3457 objects that associate event descriptions with functions to be called to
+ − 3458 ``execute'' those events; @code{dispatch-event} looks up events in the
+ − 3459 relevant keymaps.)
+ − 3460
+ − 3461
+ − 3462
+ − 3463 @example
442
+ − 3464 cmdloop.c
+ − 3465 @end example
+ − 3466
+ − 3467 @file{cmdloop.c} contains functions that implement the actual editor
440
+ − 3468 command loop---i.e. the event loop that cyclically retrieves and
428
+ − 3469 dispatches events. This code is also rather tricky, just like
+ − 3470 @file{event-stream.c}.
+ − 3471
+ − 3472
+ − 3473
+ − 3474 @example
+ − 3475 macros.c
+ − 3476 macros.h
+ − 3477 @end example
+ − 3478
+ − 3479 These two modules contain the basic code for defining keyboard macros.
+ − 3480 These functions don't actually do much; most of the code that handles keyboard
+ − 3481 macros is mixed in with the event-handling code in @file{event-stream.c}.
+ − 3482
+ − 3483
+ − 3484
+ − 3485 @example
+ − 3486 minibuf.c
+ − 3487 @end example
+ − 3488
+ − 3489 This contains some miscellaneous code related to the minibuffer (most of
+ − 3490 the minibuffer code was moved into Lisp by Richard Mlynarik). This
+ − 3491 includes the primitives for completion (although filename completion is
+ − 3492 in @file{dired.c}), the lowest-level interface to the minibuffer (if the
+ − 3493 command loop were cleaned up, this too could be in Lisp), and code for
+ − 3494 dealing with the echo area (this, too, was mostly moved into Lisp, and
+ − 3495 the only code remaining is code to call out to Lisp or provide simple
+ − 3496 bootstrapping implementations early in temacs, before the echo-area Lisp
+ − 3497 code is loaded).
+ − 3498
+ − 3499
+ − 3500
442
+ − 3501 @node Modules for the Basic Displayable Lisp Objects, Modules for other Display-Related Lisp Objects, Editor-Level Control Flow Modules, A Summary of the Various XEmacs Modules
428
+ − 3502 @section Modules for the Basic Displayable Lisp Objects
+ − 3503
+ − 3504 @example
442
+ − 3505 console-msw.c
+ − 3506 console-msw.h
+ − 3507 console-stream.c
+ − 3508 console-stream.h
+ − 3509 console-tty.c
+ − 3510 console-tty.h
+ − 3511 console-x.c
+ − 3512 console-x.h
+ − 3513 console.c
+ − 3514 console.h
+ − 3515 @end example
+ − 3516
+ − 3517 These modules implement the @dfn{console} Lisp object type. A console
+ − 3518 contains multiple display devices, but only one keyboard and mouse.
+ − 3519 Most of the time, a console will contain exactly one device.
+ − 3520
+ − 3521 Consoles are the top of a lisp object inclusion hierarchy. Consoles
+ − 3522 contain devices, which contain frames, which contain windows.
+ − 3523
+ − 3524
+ − 3525
+ − 3526 @example
+ − 3527 device-msw.c
428
+ − 3528 device-tty.c
+ − 3529 device-x.c
+ − 3530 device.c
+ − 3531 device.h
+ − 3532 @end example
+ − 3533
+ − 3534 These modules implement the @dfn{device} Lisp object type. This
+ − 3535 abstracts a particular screen or connection on which frames are
+ − 3536 displayed. As with Lisp objects, event interfaces, and other
+ − 3537 subsystems, the device code is separated into a generic component that
+ − 3538 contains a standardized interface (in the form of a set of methods) onto
+ − 3539 particular device types.
+ − 3540
+ − 3541 The device subsystem defines all the methods and provides method
+ − 3542 services for not only device operations but also for the frame, window,
+ − 3543 menubar, scrollbar, toolbar, and other displayable-object subsystems.
+ − 3544 The reason for this is that all of these subsystems have the same
+ − 3545 subtypes (X, TTY, NeXTstep, Microsoft Windows, etc.) as devices do.
+ − 3546
+ − 3547
+ − 3548
+ − 3549 @example
442
+ − 3550 frame-msw.c
428
+ − 3551 frame-tty.c
+ − 3552 frame-x.c
+ − 3553 frame.c
+ − 3554 frame.h
+ − 3555 @end example
+ − 3556
+ − 3557 Each device contains one or more frames in which objects (e.g. text) are
+ − 3558 displayed. A frame corresponds to a window in the window system;
+ − 3559 usually this is a top-level window but it could potentially be one of a
+ − 3560 number of overlapping child windows within a top-level window, using the
+ − 3561 MDI (Multiple Document Interface) protocol in Microsoft Windows or a
+ − 3562 similar scheme.
+ − 3563
+ − 3564 The @file{frame-*} files implement the @dfn{frame} Lisp object type and
+ − 3565 provide the generic and device-type-specific operations on frames
+ − 3566 (e.g. raising, lowering, resizing, moving, etc.).
+ − 3567
+ − 3568
+ − 3569
+ − 3570 @example
+ − 3571 window.c
+ − 3572 window.h
+ − 3573 @end example
+ − 3574
+ − 3575 @cindex window (in Emacs)
+ − 3576 @cindex pane
+ − 3577 Each frame consists of one or more non-overlapping @dfn{windows} (better
+ − 3578 known as @dfn{panes} in standard window-system terminology) in which a
+ − 3579 buffer's text can be displayed. Windows can also have scrollbars
+ − 3580 displayed around their edges.
+ − 3581
+ − 3582 @file{window.c} and @file{window.h} implement the @dfn{window} Lisp
+ − 3583 object type and provide code to manage windows. Since windows have no
+ − 3584 associated resources in the window system (the window system knows only
+ − 3585 about the frame; no child windows or anything are used for XEmacs
+ − 3586 windows), there is no device-type-specific code here; all of that code
+ − 3587 is part of the redisplay mechanism or the code for particular object
+ − 3588 types such as scrollbars.
+ − 3589
+ − 3590
+ − 3591
442
+ − 3592 @node Modules for other Display-Related Lisp Objects, Modules for the Redisplay Mechanism, Modules for the Basic Displayable Lisp Objects, A Summary of the Various XEmacs Modules
428
+ − 3593 @section Modules for other Display-Related Lisp Objects
+ − 3594
+ − 3595 @example
+ − 3596 faces.c
+ − 3597 faces.h
+ − 3598 @end example
+ − 3599
+ − 3600
+ − 3601
+ − 3602 @example
+ − 3603 bitmaps.h
442
+ − 3604 glyphs-eimage.c
+ − 3605 glyphs-msw.c
+ − 3606 glyphs-msw.h
+ − 3607 glyphs-widget.c
428
+ − 3608 glyphs-x.c
+ − 3609 glyphs-x.h
+ − 3610 glyphs.c
+ − 3611 glyphs.h
+ − 3612 @end example
+ − 3613
+ − 3614
+ − 3615
+ − 3616 @example
442
+ − 3617 objects-msw.c
+ − 3618 objects-msw.h
428
+ − 3619 objects-tty.c
+ − 3620 objects-tty.h
+ − 3621 objects-x.c
+ − 3622 objects-x.h
+ − 3623 objects.c
+ − 3624 objects.h
+ − 3625 @end example
+ − 3626
+ − 3627
+ − 3628
+ − 3629 @example
442
+ − 3630 menubar-msw.c
+ − 3631 menubar-msw.h
428
+ − 3632 menubar-x.c
+ − 3633 menubar.c
442
+ − 3634 menubar.h
+ − 3635 @end example
+ − 3636
+ − 3637
+ − 3638
+ − 3639 @example
+ − 3640 scrollbar-msw.c
+ − 3641 scrollbar-msw.h
428
+ − 3642 scrollbar-x.c
+ − 3643 scrollbar-x.h
+ − 3644 scrollbar.c
+ − 3645 scrollbar.h
+ − 3646 @end example
+ − 3647
+ − 3648
+ − 3649
+ − 3650 @example
442
+ − 3651 toolbar-msw.c
428
+ − 3652 toolbar-x.c
+ − 3653 toolbar.c
+ − 3654 toolbar.h
+ − 3655 @end example
+ − 3656
+ − 3657
+ − 3658
+ − 3659 @example
+ − 3660 font-lock.c
+ − 3661 @end example
+ − 3662
440
+ − 3663 This file provides C support for syntax highlighting---i.e.
428
+ − 3664 highlighting different syntactic constructs of a source file in
+ − 3665 different colors, for easy reading. The C support is provided so that
+ − 3666 this is fast.
+ − 3667
+ − 3668
+ − 3669
+ − 3670 @example
+ − 3671 dgif_lib.c
+ − 3672 gif_err.c
+ − 3673 gif_lib.h
+ − 3674 gifalloc.c
+ − 3675 @end example
+ − 3676
+ − 3677 These modules decode GIF-format image files, for use with glyphs.
442
+ − 3678 These files were removed due to Unisys patent infringement concerns.
+ − 3679
+ − 3680
+ − 3681
+ − 3682 @node Modules for the Redisplay Mechanism, Modules for Interfacing with the File System, Modules for other Display-Related Lisp Objects, A Summary of the Various XEmacs Modules
428
+ − 3683 @section Modules for the Redisplay Mechanism
+ − 3684
+ − 3685 @example
+ − 3686 redisplay-output.c
442
+ − 3687 redisplay-msw.c
428
+ − 3688 redisplay-tty.c
+ − 3689 redisplay-x.c
+ − 3690 redisplay.c
+ − 3691 redisplay.h
+ − 3692 @end example
+ − 3693
+ − 3694 These files provide the redisplay mechanism. As with many other
+ − 3695 subsystems in XEmacs, there is a clean separation between the general
+ − 3696 and device-specific support.
+ − 3697
+ − 3698 @file{redisplay.c} contains the bulk of the redisplay engine. These
+ − 3699 functions update the redisplay structures (which describe how the screen
+ − 3700 is to appear) to reflect any changes made to the state of any
+ − 3701 displayable objects (buffer, frame, window, etc.) since the last time
+ − 3702 that redisplay was called. These functions are highly optimized to
+ − 3703 avoid doing more work than necessary (since redisplay is called
+ − 3704 extremely often and is potentially a huge time sink), and depend heavily
+ − 3705 on notifications from the objects themselves that changes have occurred,
+ − 3706 so that redisplay doesn't explicitly have to check each possible object.
+ − 3707 The redisplay mechanism also contains a great deal of caching to further
+ − 3708 speed things up; some of this caching is contained within the various
+ − 3709 displayable objects.
+ − 3710
+ − 3711 @file{redisplay-output.c} goes through the redisplay structures and converts
+ − 3712 them into calls to device-specific methods to actually output the screen
+ − 3713 changes.
+ − 3714
+ − 3715 @file{redisplay-x.c} and @file{redisplay-tty.c} are two implementations
+ − 3716 of these redisplay output methods, for X frames and TTY frames,
+ − 3717 respectively.
+ − 3718
+ − 3719
+ − 3720
+ − 3721 @example
+ − 3722 indent.c
+ − 3723 @end example
+ − 3724
+ − 3725 This module contains various functions and Lisp primitives for
+ − 3726 converting between buffer positions and screen positions. These
+ − 3727 functions call the redisplay mechanism to do most of the work, and then
+ − 3728 examine the redisplay structures to get the necessary information. This
+ − 3729 module needs work.
+ − 3730
+ − 3731
+ − 3732
+ − 3733 @example
+ − 3734 termcap.c
+ − 3735 terminfo.c
+ − 3736 tparam.c
+ − 3737 @end example
+ − 3738
+ − 3739 These files contain functions for working with the termcap (BSD-style)
+ − 3740 and terminfo (System V style) databases of terminal capabilities and
+ − 3741 escape sequences, used when XEmacs is displaying in a TTY.
+ − 3742
+ − 3743
+ − 3744
+ − 3745 @example
+ − 3746 cm.c
+ − 3747 cm.h
+ − 3748 @end example
+ − 3749
+ − 3750 These files provide some miscellaneous TTY-output functions and should
+ − 3751 probably be merged into @file{redisplay-tty.c}.
+ − 3752
+ − 3753
+ − 3754
442
+ − 3755 @node Modules for Interfacing with the File System, Modules for Other Aspects of the Lisp Interpreter and Object System, Modules for the Redisplay Mechanism, A Summary of the Various XEmacs Modules
428
+ − 3756 @section Modules for Interfacing with the File System
+ − 3757
+ − 3758 @example
+ − 3759 lstream.c
+ − 3760 lstream.h
+ − 3761 @end example
+ − 3762
+ − 3763 These modules implement the @dfn{stream} Lisp object type. This is an
+ − 3764 internal-only Lisp object that implements a generic buffering stream.
+ − 3765 The idea is to provide a uniform interface onto all sources and sinks of
+ − 3766 data, including file descriptors, stdio streams, chunks of memory, Lisp
+ − 3767 buffers, Lisp strings, etc. That way, I/O functions can be written to
+ − 3768 the stream interface and can transparently handle all possible sources
+ − 3769 and sinks. (For example, the @code{read} function can read data from a
+ − 3770 file, a string, a buffer, or even a function that is called repeatedly
+ − 3771 to return data, without worrying about where the data is coming from or
+ − 3772 what-size chunks it is returned in.)
+ − 3773
+ − 3774 @cindex lstream
+ − 3775 Note that in the C code, streams are called @dfn{lstreams} (for ``Lisp
+ − 3776 streams'') to distinguish them from other kinds of streams, e.g. stdio
+ − 3777 streams and C++ I/O streams.
+ − 3778
+ − 3779 Similar to other subsystems in XEmacs, lstreams are separated into
+ − 3780 generic functions and a set of methods for the different types of
+ − 3781 lstreams. @file{lstream.c} provides implementations of many different
442
+ − 3782 types of streams; others are provided, e.g., in @file{file-coding.c}.
428
+ − 3783
+ − 3784
+ − 3785
+ − 3786 @example
+ − 3787 fileio.c
+ − 3788 @end example
+ − 3789
+ − 3790 This implements the basic primitives for interfacing with the file
+ − 3791 system. This includes primitives for reading files into buffers,
+ − 3792 writing buffers into files, checking for the presence or accessibility
+ − 3793 of files, canonicalizing file names, etc. Note that these primitives
+ − 3794 are usually not invoked directly by the user: There is a great deal of
+ − 3795 higher-level Lisp code that implements the user commands such as
+ − 3796 @code{find-file} and @code{save-buffer}. This is similar to the
+ − 3797 distinction between the lower-level primitives in @file{editfns.c} and
+ − 3798 the higher-level user commands in @file{commands.c} and
+ − 3799 @file{simple.el}.
+ − 3800
+ − 3801
+ − 3802
+ − 3803 @example
+ − 3804 filelock.c
+ − 3805 @end example
+ − 3806
+ − 3807 This file provides functions for detecting clashes between different
+ − 3808 processes (e.g. XEmacs and some external process, or two different
+ − 3809 XEmacs processes) modifying the same file. (XEmacs can optionally use
+ − 3810 the @file{lock/} subdirectory to provide a form of ``locking'' between
+ − 3811 different XEmacs processes.) This module is also used by the low-level
+ − 3812 functions in @file{insdel.c} to ensure that, if the first modification
+ − 3813 is being made to a buffer whose corresponding file has been externally
+ − 3814 modified, the user is made aware of this so that the buffer can be
+ − 3815 synched up with the external changes if necessary.
+ − 3816
+ − 3817
+ − 3818 @example
+ − 3819 filemode.c
+ − 3820 @end example
+ − 3821
+ − 3822 This file provides some miscellaneous functions that construct a
+ − 3823 @samp{rwxr-xr-x}-type permissions string (as might appear in an
+ − 3824 @file{ls}-style directory listing) given the information returned by the
+ − 3825 @code{stat()} system call.
+ − 3826
+ − 3827
+ − 3828
+ − 3829 @example
+ − 3830 dired.c
+ − 3831 ndir.h
+ − 3832 @end example
+ − 3833
+ − 3834 These files implement the XEmacs interface to directory searching. This
+ − 3835 includes a number of primitives for determining the files in a directory
+ − 3836 and for doing filename completion. (Remember that generic completion is
+ − 3837 handled by a different mechanism, in @file{minibuf.c}.)
+ − 3838
+ − 3839 @file{ndir.h} is a header file used for the directory-searching
+ − 3840 emulation functions provided in @file{sysdep.c} (see section J below),
+ − 3841 for systems that don't provide any directory-searching functions. (On
+ − 3842 those systems, directories can be read directly as files, and parsed.)
+ − 3843
+ − 3844
+ − 3845
+ − 3846 @example
+ − 3847 realpath.c
+ − 3848 @end example
+ − 3849
+ − 3850 This file provides an implementation of the @code{realpath()} function
+ − 3851 for expanding symbolic links, on systems that don't implement it or have
+ − 3852 a broken implementation.
+ − 3853
+ − 3854
+ − 3855
442
+ − 3856 @node Modules for Other Aspects of the Lisp Interpreter and Object System, Modules for Interfacing with the Operating System, Modules for Interfacing with the File System, A Summary of the Various XEmacs Modules
428
+ − 3857 @section Modules for Other Aspects of the Lisp Interpreter and Object System
+ − 3858
+ − 3859 @example
+ − 3860 elhash.c
+ − 3861 elhash.h
+ − 3862 hash.c
+ − 3863 hash.h
+ − 3864 @end example
+ − 3865
+ − 3866 These files provide two implementations of hash tables. Files
+ − 3867 @file{hash.c} and @file{hash.h} provide a generic C implementation of
+ − 3868 hash tables which can stand independently of XEmacs. Files
+ − 3869 @file{elhash.c} and @file{elhash.h} provide a separate implementation of
+ − 3870 hash tables that can store only Lisp objects, and knows about Lispy
+ − 3871 things like garbage collection, and implement the @dfn{hash-table} Lisp
+ − 3872 object type.
+ − 3873
+ − 3874
+ − 3875 @example
+ − 3876 specifier.c
+ − 3877 specifier.h
+ − 3878 @end example
+ − 3879
+ − 3880 This module implements the @dfn{specifier} Lisp object type. This is
+ − 3881 primarily used for displayable properties, and allows for values that
+ − 3882 are specific to a particular buffer, window, frame, device, or device
+ − 3883 class, as well as a default value existing. This is used, for example,
+ − 3884 to control the height of the horizontal scrollbar or the appearance of
+ − 3885 the @code{default}, @code{bold}, or other faces. The specifier object
+ − 3886 consists of a number of specifications, each of which maps from a
+ − 3887 buffer, window, etc. to a value. The function @code{specifier-instance}
+ − 3888 looks up a value given a window (from which a buffer, frame, and device
+ − 3889 can be derived).
+ − 3890
+ − 3891
+ − 3892 @example
+ − 3893 chartab.c
+ − 3894 chartab.h
+ − 3895 casetab.c
+ − 3896 @end example
+ − 3897
+ − 3898 @file{chartab.c} and @file{chartab.h} implement the @dfn{char table}
+ − 3899 Lisp object type, which maps from characters or certain sorts of
+ − 3900 character ranges to Lisp objects. The implementation of this object
+ − 3901 type is optimized for the internal representation of characters. Char
+ − 3902 tables come in different types, which affect the allowed object types to
+ − 3903 which a character can be mapped and also dictate certain other
+ − 3904 properties of the char table.
+ − 3905
+ − 3906 @cindex case table
+ − 3907 @file{casetab.c} implements one sort of char table, the @dfn{case
+ − 3908 table}, which maps characters to other characters of possibly different
+ − 3909 case. These are used by XEmacs to implement case-changing primitives
+ − 3910 and to do case-insensitive searching.
+ − 3911
+ − 3912
+ − 3913
+ − 3914 @example
+ − 3915 syntax.c
+ − 3916 syntax.h
+ − 3917 @end example
+ − 3918
+ − 3919 @cindex scanner
+ − 3920 This module implements @dfn{syntax tables}, another sort of char table
+ − 3921 that maps characters into syntax classes that define the syntax of these
+ − 3922 characters (e.g. a parenthesis belongs to a class of @samp{open}
+ − 3923 characters that have corresponding @samp{close} characters and can be
+ − 3924 nested). This module also implements the Lisp @dfn{scanner}, a set of
+ − 3925 primitives for scanning over text based on syntax tables. This is used,
+ − 3926 for example, to find the matching parenthesis in a command such as
+ − 3927 @code{forward-sexp}, and by @file{font-lock.c} to locate quoted strings,
+ − 3928 comments, etc.
+ − 3929
+ − 3930
+ − 3931
+ − 3932 @example
+ − 3933 casefiddle.c
+ − 3934 @end example
+ − 3935
+ − 3936 This module implements various Lisp primitives for upcasing, downcasing
+ − 3937 and capitalizing strings or regions of buffers.
+ − 3938
+ − 3939
+ − 3940
+ − 3941 @example
+ − 3942 rangetab.c
+ − 3943 @end example
+ − 3944
+ − 3945 This module implements the @dfn{range table} Lisp object type, which
+ − 3946 provides for a mapping from ranges of integers to arbitrary Lisp
+ − 3947 objects.
+ − 3948
+ − 3949
+ − 3950
+ − 3951 @example
+ − 3952 opaque.c
+ − 3953 opaque.h
+ − 3954 @end example
+ − 3955
+ − 3956 This module implements the @dfn{opaque} Lisp object type, an
+ − 3957 internal-only Lisp object that encapsulates an arbitrary block of memory
+ − 3958 so that it can be managed by the Lisp allocation system. To create an
+ − 3959 opaque object, you call @code{make_opaque()}, passing a pointer to a
+ − 3960 block of memory. An object is created that is big enough to hold the
+ − 3961 memory, which is copied into the object's storage. The object will then
+ − 3962 stick around as long as you keep pointers to it, after which it will be
+ − 3963 automatically reclaimed.
+ − 3964
+ − 3965 @cindex mark method
+ − 3966 Opaque objects can also have an arbitrary @dfn{mark method} associated
+ − 3967 with them, in case the block of memory contains other Lisp objects that
+ − 3968 need to be marked for garbage-collection purposes. (If you need other
+ − 3969 object methods, such as a finalize method, you should just go ahead and
440
+ − 3970 create a new Lisp object type---it's not hard.)
428
+ − 3971
+ − 3972
+ − 3973
+ − 3974 @example
+ − 3975 abbrev.c
+ − 3976 @end example
+ − 3977
+ − 3978 This function provides a few primitives for doing dynamic abbreviation
+ − 3979 expansion. In XEmacs, most of the code for this has been moved into
+ − 3980 Lisp. Some C code remains for speed and because the primitive
+ − 3981 @code{self-insert-command} (which is executed for all self-inserting
+ − 3982 characters) hooks into the abbrev mechanism. (@code{self-insert-command}
+ − 3983 is itself in C only for speed.)
+ − 3984
+ − 3985
+ − 3986
+ − 3987 @example
+ − 3988 doc.c
+ − 3989 @end example
+ − 3990
+ − 3991 This function provides primitives for retrieving the documentation
+ − 3992 strings of functions and variables. These documentation strings contain
+ − 3993 certain special markers that get dynamically expanded (e.g. a
+ − 3994 reverse-lookup is performed on some named functions to retrieve their
+ − 3995 current key bindings). Some documentation strings (in particular, for
+ − 3996 the built-in primitives and pre-loaded Lisp functions) are stored
+ − 3997 externally in a file @file{DOC} in the @file{lib-src/} directory and
+ − 3998 need to be fetched from that file. (Part of the build stage involves
+ − 3999 building this file, and another part involves constructing an index for
+ − 4000 this file and embedding it into the executable, so that the functions in
+ − 4001 @file{doc.c} do not have to search the entire @file{DOC} file to find
+ − 4002 the appropriate documentation string.)
+ − 4003
+ − 4004
+ − 4005
+ − 4006 @example
+ − 4007 md5.c
+ − 4008 @end example
+ − 4009
+ − 4010 This function provides a Lisp primitive that implements the MD5 secure
+ − 4011 hashing scheme, used to create a large hash value of a string of data such that
+ − 4012 the data cannot be derived from the hash value. This is used for
+ − 4013 various security applications on the Internet.
+ − 4014
+ − 4015
+ − 4016
+ − 4017
442
+ − 4018 @node Modules for Interfacing with the Operating System, Modules for Interfacing with X Windows, Modules for Other Aspects of the Lisp Interpreter and Object System, A Summary of the Various XEmacs Modules
428
+ − 4019 @section Modules for Interfacing with the Operating System
+ − 4020
+ − 4021 @example
+ − 4022 callproc.c
+ − 4023 process.c
+ − 4024 process.h
+ − 4025 @end example
+ − 4026
+ − 4027 These modules allow XEmacs to spawn and communicate with subprocesses
+ − 4028 and network connections.
+ − 4029
+ − 4030 @cindex synchronous subprocesses
+ − 4031 @cindex subprocesses, synchronous
+ − 4032 @file{callproc.c} implements (through the @code{call-process}
+ − 4033 primitive) what are called @dfn{synchronous subprocesses}. This means
+ − 4034 that XEmacs runs a program, waits till it's done, and retrieves its
+ − 4035 output. A typical example might be calling the @file{ls} program to get
+ − 4036 a directory listing.
+ − 4037
+ − 4038 @cindex asynchronous subprocesses
+ − 4039 @cindex subprocesses, asynchronous
+ − 4040 @file{process.c} and @file{process.h} implement @dfn{asynchronous
+ − 4041 subprocesses}. This means that XEmacs starts a program and then
+ − 4042 continues normally, not waiting for the process to finish. Data can be
+ − 4043 sent to the process or retrieved from it as it's running. This is used
+ − 4044 for the @code{shell} command (which provides a front end onto a shell
+ − 4045 program such as @file{csh}), the mail and news readers implemented in
+ − 4046 XEmacs, etc. The result of calling @code{start-process} to start a
+ − 4047 subprocess is a process object, a particular kind of object used to
+ − 4048 communicate with the subprocess. You can send data to the process by
+ − 4049 passing the process object and the data to @code{send-process}, and you
+ − 4050 can specify what happens to data retrieved from the process by setting
+ − 4051 properties of the process object. (When the process sends data, XEmacs
+ − 4052 receives a process event, which says that there is data ready. When
+ − 4053 @code{dispatch-event} is called on this event, it reads the data from
+ − 4054 the process and does something with it, as specified by the process
+ − 4055 object's properties. Typically, this means inserting the data into a
+ − 4056 buffer or calling a function.) Another property of the process object is
+ − 4057 called the @dfn{sentinel}, which is a function that is called when the
+ − 4058 process terminates.
+ − 4059
+ − 4060 @cindex network connections
+ − 4061 Process objects are also used for network connections (connections to a
+ − 4062 process running on another machine). Network connections are started
+ − 4063 with @code{open-network-stream} but otherwise work just like
+ − 4064 subprocesses.
+ − 4065
+ − 4066
+ − 4067
+ − 4068 @example
+ − 4069 sysdep.c
+ − 4070 sysdep.h
+ − 4071 @end example
+ − 4072
+ − 4073 These modules implement most of the low-level, messy operating-system
+ − 4074 interface code. This includes various device control (ioctl) operations
+ − 4075 for file descriptors, TTY's, pseudo-terminals, etc. (usually this stuff
+ − 4076 is fairly system-dependent; thus the name of this module), and emulation
+ − 4077 of standard library functions and system calls on systems that don't
+ − 4078 provide them or have broken versions.
+ − 4079
+ − 4080
+ − 4081
+ − 4082 @example
+ − 4083 sysdir.h
+ − 4084 sysfile.h
+ − 4085 sysfloat.h
+ − 4086 sysproc.h
+ − 4087 syspwd.h
+ − 4088 syssignal.h
+ − 4089 systime.h
+ − 4090 systty.h
+ − 4091 syswait.h
+ − 4092 @end example
+ − 4093
+ − 4094 These header files provide consistent interfaces onto system-dependent
+ − 4095 header files and system calls. The idea is that, instead of including a
+ − 4096 standard header file like @file{<sys/param.h>} (which may or may not
+ − 4097 exist on various systems) or having to worry about whether all system
+ − 4098 provide a particular preprocessor constant, or having to deal with the
+ − 4099 four different paradigms for manipulating signals, you just include the
+ − 4100 appropriate @file{sys*.h} header file, which includes all the right
+ − 4101 system header files, defines and missing preprocessor constants,
+ − 4102 provides a uniform interface onto system calls, etc.
+ − 4103
+ − 4104 @file{sysdir.h} provides a uniform interface onto directory-querying
+ − 4105 functions. (In some cases, this is in conjunction with emulation
+ − 4106 functions in @file{sysdep.c}.)
+ − 4107
+ − 4108 @file{sysfile.h} includes all the necessary header files for standard
+ − 4109 system calls (e.g. @code{read()}), ensures that all necessary
+ − 4110 @code{open()} and @code{stat()} preprocessor constants are defined, and
+ − 4111 possibly (usually) substitutes sugared versions of @code{read()},
+ − 4112 @code{write()}, etc. that automatically restart interrupted I/O
+ − 4113 operations.
+ − 4114
+ − 4115 @file{sysfloat.h} includes the necessary header files for floating-point
+ − 4116 operations.
+ − 4117
+ − 4118 @file{sysproc.h} includes the necessary header files for calling
+ − 4119 @code{select()}, @code{fork()}, @code{execve()}, socket operations, and
+ − 4120 the like, and ensures that the @code{FD_*()} macros for descriptor-set
+ − 4121 manipulations are available.
+ − 4122
+ − 4123 @file{syspwd.h} includes the necessary header files for obtaining
+ − 4124 information from @file{/etc/passwd} (the functions are emulated under
+ − 4125 VMS).
+ − 4126
+ − 4127 @file{syssignal.h} includes the necessary header files for
+ − 4128 signal-handling and provides a uniform interface onto the different
+ − 4129 signal-handling and signal-blocking paradigms.
+ − 4130
+ − 4131 @file{systime.h} includes the necessary header files and provides
+ − 4132 uniform interfaces for retrieving the time of day, setting file
+ − 4133 access/modification times, getting the amount of time used by the XEmacs
+ − 4134 process, etc.
+ − 4135
+ − 4136 @file{systty.h} buffers against the infinitude of different ways of
+ − 4137 controlling TTY's.
+ − 4138
+ − 4139 @file{syswait.h} provides a uniform way of retrieving the exit status
+ − 4140 from a @code{wait()}ed-on process (some systems use a union, others use
+ − 4141 an int).
+ − 4142
+ − 4143
+ − 4144
+ − 4145 @example
+ − 4146 hpplay.c
+ − 4147 libsst.c
+ − 4148 libsst.h
+ − 4149 libst.h
+ − 4150 linuxplay.c
+ − 4151 nas.c
+ − 4152 sgiplay.c
+ − 4153 sound.c
+ − 4154 sunplay.c
+ − 4155 @end example
+ − 4156
+ − 4157 These files implement the ability to play various sounds on some types
+ − 4158 of computers. You have to configure your XEmacs with sound support in
+ − 4159 order to get this capability.
+ − 4160
+ − 4161 @file{sound.c} provides the generic interface. It implements various
+ − 4162 Lisp primitives and variables that let you specify which sounds should
+ − 4163 be played in certain conditions. (The conditions are identified by
+ − 4164 symbols, which are passed to @code{ding} to make a sound. Various
+ − 4165 standard functions call this function at certain times; if sound support
+ − 4166 does not exist, a simple beep results.
+ − 4167
+ − 4168 @cindex native sound
+ − 4169 @cindex sound, native
+ − 4170 @file{sgiplay.c}, @file{sunplay.c}, @file{hpplay.c}, and
+ − 4171 @file{linuxplay.c} interface to the machine's speaker for various
+ − 4172 different kind of machines. This is called @dfn{native} sound.
+ − 4173
+ − 4174 @cindex sound, network
+ − 4175 @cindex network sound
+ − 4176 @cindex NAS
+ − 4177 @file{nas.c} interfaces to a computer somewhere else on the network
+ − 4178 using the NAS (Network Audio Server) protocol, playing sounds on that
+ − 4179 machine. This allows you to run XEmacs on a remote machine, with its
+ − 4180 display set to your local machine, and have the sounds be made on your
+ − 4181 local machine, provided that you have a NAS server running on your local
+ − 4182 machine.
+ − 4183
+ − 4184 @file{libsst.c}, @file{libsst.h}, and @file{libst.h} provide some
+ − 4185 additional functions for playing sound on a Sun SPARC but are not
+ − 4186 currently in use.
+ − 4187
+ − 4188
+ − 4189
+ − 4190 @example
+ − 4191 tooltalk.c
+ − 4192 tooltalk.h
+ − 4193 @end example
+ − 4194
+ − 4195 These two modules implement an interface to the ToolTalk protocol, which
+ − 4196 is an interprocess communication protocol implemented on some versions
+ − 4197 of Unix. ToolTalk is a high-level protocol that allows processes to
+ − 4198 register themselves as providers of particular services; other processes
+ − 4199 can then request a service without knowing or caring exactly who is
+ − 4200 providing the service. It is similar in spirit to the DDE protocol
+ − 4201 provided under Microsoft Windows. ToolTalk is a part of the new CDE
+ − 4202 (Common Desktop Environment) specification and is used to connect the
+ − 4203 parts of the SPARCWorks development environment.
+ − 4204
+ − 4205
+ − 4206
+ − 4207 @example
+ − 4208 getloadavg.c
+ − 4209 @end example
+ − 4210
+ − 4211 This module provides the ability to retrieve the system's current load
+ − 4212 average. (The way to do this is highly system-specific, unfortunately,
+ − 4213 and requires a lot of special-case code.)
+ − 4214
+ − 4215
+ − 4216
+ − 4217 @example
+ − 4218 sunpro.c
+ − 4219 @end example
+ − 4220
+ − 4221 This module provides a small amount of code used internally at Sun to
+ − 4222 keep statistics on the usage of XEmacs.
+ − 4223
+ − 4224
+ − 4225
+ − 4226 @example
+ − 4227 broken-sun.h
+ − 4228 strcmp.c
+ − 4229 strcpy.c
+ − 4230 sunOS-fix.c
+ − 4231 @end example
+ − 4232
+ − 4233 These files provide replacement functions and prototypes to fix numerous
+ − 4234 bugs in early releases of SunOS 4.1.
+ − 4235
+ − 4236
+ − 4237
+ − 4238 @example
+ − 4239 hftctl.c
+ − 4240 @end example
+ − 4241
+ − 4242 This module provides some terminal-control code necessary on versions of
+ − 4243 AIX prior to 4.1.
+ − 4244
+ − 4245
+ − 4246
442
+ − 4247 @node Modules for Interfacing with X Windows, Modules for Internationalization, Modules for Interfacing with the Operating System, A Summary of the Various XEmacs Modules
428
+ − 4248 @section Modules for Interfacing with X Windows
+ − 4249
+ − 4250 @example
+ − 4251 Emacs.ad.h
+ − 4252 @end example
+ − 4253
+ − 4254 A file generated from @file{Emacs.ad}, which contains XEmacs-supplied
+ − 4255 fallback resources (so that XEmacs has pretty defaults).
+ − 4256
+ − 4257
+ − 4258
+ − 4259 @example
+ − 4260 EmacsFrame.c
+ − 4261 EmacsFrame.h
+ − 4262 EmacsFrameP.h
+ − 4263 @end example
+ − 4264
+ − 4265 These modules implement an Xt widget class that encapsulates a frame.
+ − 4266 This is for ease in integrating with Xt. The EmacsFrame widget covers
+ − 4267 the entire X window except for the menubar; the scrollbars are
+ − 4268 positioned on top of the EmacsFrame widget.
+ − 4269
+ − 4270 @strong{Warning:} Abandon hope, all ye who enter here. This code took
+ − 4271 an ungodly amount of time to get right, and is likely to fall apart
+ − 4272 mercilessly at the slightest change. Such is life under Xt.
+ − 4273
+ − 4274
+ − 4275
+ − 4276 @example
+ − 4277 EmacsManager.c
+ − 4278 EmacsManager.h
+ − 4279 EmacsManagerP.h
+ − 4280 @end example
+ − 4281
+ − 4282 These modules implement a simple Xt manager (i.e. composite) widget
+ − 4283 class that simply lets its children set whatever geometry they want.
+ − 4284 It's amazing that Xt doesn't provide this standardly, but on second
+ − 4285 thought, it makes sense, considering how amazingly broken Xt is.
+ − 4286
+ − 4287
+ − 4288 @example
+ − 4289 EmacsShell-sub.c
+ − 4290 EmacsShell.c
+ − 4291 EmacsShell.h
+ − 4292 EmacsShellP.h
+ − 4293 @end example
+ − 4294
+ − 4295 These modules implement two Xt widget classes that are subclasses of
+ − 4296 the TopLevelShell and TransientShell classes. This is necessary to deal
+ − 4297 with more brokenness that Xt has sadistically thrust onto the backs of
+ − 4298 developers.
+ − 4299
+ − 4300
+ − 4301
+ − 4302 @example
+ − 4303 xgccache.c
+ − 4304 xgccache.h
+ − 4305 @end example
+ − 4306
+ − 4307 These modules provide functions for maintenance and caching of GC's
+ − 4308 (graphics contexts) under the X Window System. This code is junky and
+ − 4309 needs to be rewritten.
+ − 4310
+ − 4311
+ − 4312
+ − 4313 @example
442
+ − 4314 select-msw.c
+ − 4315 select-x.c
+ − 4316 select.c
+ − 4317 select.h
428
+ − 4318 @end example
+ − 4319
+ − 4320 @cindex selections
+ − 4321 This module provides an interface to the X Window System's concept of
+ − 4322 @dfn{selections}, the standard way for X applications to communicate
+ − 4323 with each other.
+ − 4324
+ − 4325
+ − 4326
+ − 4327 @example
+ − 4328 xintrinsic.h
+ − 4329 xintrinsicp.h
+ − 4330 xmmanagerp.h
+ − 4331 xmprimitivep.h
+ − 4332 @end example
+ − 4333
+ − 4334 These header files are similar in spirit to the @file{sys*.h} files and buffer
+ − 4335 against different implementations of Xt and Motif.
+ − 4336
+ − 4337 @itemize @bullet
+ − 4338 @item
+ − 4339 @file{xintrinsic.h} should be included in place of @file{<Intrinsic.h>}.
+ − 4340 @item
+ − 4341 @file{xintrinsicp.h} should be included in place of @file{<IntrinsicP.h>}.
+ − 4342 @item
+ − 4343 @file{xmmanagerp.h} should be included in place of @file{<XmManagerP.h>}.
+ − 4344 @item
+ − 4345 @file{xmprimitivep.h} should be included in place of @file{<XmPrimitiveP.h>}.
+ − 4346 @end itemize
+ − 4347
+ − 4348
+ − 4349
+ − 4350 @example
+ − 4351 xmu.c
+ − 4352 xmu.h
+ − 4353 @end example
+ − 4354
+ − 4355 These files provide an emulation of the Xmu library for those systems
+ − 4356 (i.e. HPUX) that don't provide it as a standard part of X.
+ − 4357
+ − 4358
+ − 4359
+ − 4360 @example
+ − 4361 ExternalClient-Xlib.c
+ − 4362 ExternalClient.c
+ − 4363 ExternalClient.h
+ − 4364 ExternalClientP.h
+ − 4365 ExternalShell.c
+ − 4366 ExternalShell.h
+ − 4367 ExternalShellP.h
+ − 4368 extw-Xlib.c
+ − 4369 extw-Xlib.h
+ − 4370 extw-Xt.c
+ − 4371 extw-Xt.h
+ − 4372 @end example
+ − 4373
+ − 4374 @cindex external widget
+ − 4375 These files provide the @dfn{external widget} interface, which allows an
+ − 4376 XEmacs frame to appear as a widget in another application. To do this,
+ − 4377 you have to configure with @samp{--external-widget}.
+ − 4378
+ − 4379 @file{ExternalShell*} provides the server (XEmacs) side of the
+ − 4380 connection.
+ − 4381
+ − 4382 @file{ExternalClient*} provides the client (other application) side of
+ − 4383 the connection. These files are not compiled into XEmacs but are
+ − 4384 compiled into libraries that are then linked into your application.
+ − 4385
+ − 4386 @file{extw-*} is common code that is used for both the client and server.
+ − 4387
+ − 4388 Don't touch this code; something is liable to break if you do.
+ − 4389
+ − 4390
+ − 4391
442
+ − 4392 @node Modules for Internationalization, , Modules for Interfacing with X Windows, A Summary of the Various XEmacs Modules
428
+ − 4393 @section Modules for Internationalization
+ − 4394
+ − 4395 @example
+ − 4396 mule-canna.c
+ − 4397 mule-ccl.c
+ − 4398 mule-charset.c
+ − 4399 mule-charset.h
442
+ − 4400 file-coding.c
+ − 4401 file-coding.h
428
+ − 4402 mule-mcpath.c
+ − 4403 mule-mcpath.h
+ − 4404 mule-wnnfns.c
+ − 4405 mule.c
+ − 4406 @end example
+ − 4407
+ − 4408 These files implement the MULE (Asian-language) support. Note that MULE
+ − 4409 actually provides a general interface for all sorts of languages, not
+ − 4410 just Asian languages (although they are generally the most complicated
+ − 4411 to support). This code is still in beta.
+ − 4412
442
+ − 4413 @file{mule-charset.*} and @file{file-coding.*} provide the heart of the
428
+ − 4414 XEmacs MULE support. @file{mule-charset.*} implements the @dfn{charset}
+ − 4415 Lisp object type, which encapsulates a character set (an ordered one- or
+ − 4416 two-dimensional set of characters, such as US ASCII or JISX0208 Japanese
+ − 4417 Kanji).
+ − 4418
442
+ − 4419 @file{file-coding.*} implements the @dfn{coding-system} Lisp object
428
+ − 4420 type, which encapsulates a method of converting between different
+ − 4421 encodings. An encoding is a representation of a stream of characters,
+ − 4422 possibly from multiple character sets, using a stream of bytes or words,
+ − 4423 and defines (e.g.) which escape sequences are used to specify particular
+ − 4424 character sets, how the indices for a character are converted into bytes
+ − 4425 (sometimes this involves setting the high bit; sometimes complicated
+ − 4426 rearranging of the values takes place, as in the Shift-JIS encoding),
+ − 4427 etc.
+ − 4428
+ − 4429 @file{mule-ccl.c} provides the CCL (Code Conversion Language)
+ − 4430 interpreter. CCL is similar in spirit to Lisp byte code and is used to
+ − 4431 implement converters for custom encodings.
+ − 4432
+ − 4433 @file{mule-canna.c} and @file{mule-wnnfns.c} implement interfaces to
+ − 4434 external programs used to implement the Canna and WNN input methods,
+ − 4435 respectively. This is currently in beta.
+ − 4436
+ − 4437 @file{mule-mcpath.c} provides some functions to allow for pathnames
+ − 4438 containing extended characters. This code is fragmentary, obsolete, and
+ − 4439 completely non-working. Instead, @var{pathname-coding-system} is used
+ − 4440 to specify conversions of names of files and directories. The standard
+ − 4441 C I/O functions like @samp{open()} are wrapped so that conversion occurs
+ − 4442 automatically.
+ − 4443
+ − 4444 @file{mule.c} provides a few miscellaneous things that should probably
+ − 4445 be elsewhere.
+ − 4446
+ − 4447
+ − 4448
+ − 4449 @example
+ − 4450 intl.c
+ − 4451 @end example
+ − 4452
+ − 4453 This provides some miscellaneous internationalization code for
+ − 4454 implementing message translation and interfacing to the Ximp input
+ − 4455 method. None of this code is currently working.
+ − 4456
+ − 4457
+ − 4458
+ − 4459 @example
+ − 4460 iso-wide.h
+ − 4461 @end example
+ − 4462
+ − 4463 This contains leftover code from an earlier implementation of
+ − 4464 Asian-language support, and is not currently used.
+ − 4465
+ − 4466
+ − 4467
+ − 4468
442
+ − 4469 @node Allocation of Objects in XEmacs Lisp, Dumping, A Summary of the Various XEmacs Modules, Top
428
+ − 4470 @chapter Allocation of Objects in XEmacs Lisp
+ − 4471
+ − 4472 @menu
+ − 4473 * Introduction to Allocation::
+ − 4474 * Garbage Collection::
+ − 4475 * GCPROing::
+ − 4476 * Garbage Collection - Step by Step::
+ − 4477 * Integers and Characters::
+ − 4478 * Allocation from Frob Blocks::
+ − 4479 * lrecords::
+ − 4480 * Low-level allocation::
+ − 4481 * Cons::
+ − 4482 * Vector::
+ − 4483 * Bit Vector::
+ − 4484 * Symbol::
+ − 4485 * Marker::
+ − 4486 * String::
+ − 4487 * Compiled Function::
+ − 4488 @end menu
+ − 4489
442
+ − 4490 @node Introduction to Allocation, Garbage Collection, Allocation of Objects in XEmacs Lisp, Allocation of Objects in XEmacs Lisp
428
+ − 4491 @section Introduction to Allocation
+ − 4492
+ − 4493 Emacs Lisp, like all Lisps, has garbage collection. This means that
+ − 4494 the programmer never has to explicitly free (destroy) an object; it
+ − 4495 happens automatically when the object becomes inaccessible. Most
+ − 4496 experts agree that garbage collection is a necessity in a modern,
+ − 4497 high-level language. Its omission from C stems from the fact that C was
+ − 4498 originally designed to be a nice abstract layer on top of assembly
+ − 4499 language, for writing kernels and basic system utilities rather than
+ − 4500 large applications.
+ − 4501
+ − 4502 Lisp objects can be created by any of a number of Lisp primitives.
+ − 4503 Most object types have one or a small number of basic primitives
+ − 4504 for creating objects. For conses, the basic primitive is @code{cons};
+ − 4505 for vectors, the primitives are @code{make-vector} and @code{vector}; for
+ − 4506 symbols, the primitives are @code{make-symbol} and @code{intern}; etc.
+ − 4507 Some Lisp objects, especially those that are primarily used internally,
+ − 4508 have no corresponding Lisp primitives. Every Lisp object, though,
+ − 4509 has at least one C primitive for creating it.
+ − 4510
442
+ − 4511 Recall from section (VII) that a Lisp object, as stored in a 32-bit or
+ − 4512 64-bit word, has a few tag bits, and a ``value'' that occupies the
+ − 4513 remainder of the bits. We can separate the different Lisp object types
+ − 4514 into three broad categories:
428
+ − 4515
+ − 4516 @itemize @bullet
+ − 4517 @item
+ − 4518 (a) Those for whom the value directly represents the contents of the
+ − 4519 Lisp object. Only two types are in this category: integers and
+ − 4520 characters. No special allocation or garbage collection is necessary
+ − 4521 for such objects. Lisp objects of these types do not need to be
+ − 4522 @code{GCPRO}ed.
+ − 4523 @end itemize
+ − 4524
442
+ − 4525 In the remaining two categories, the type is stored in the object
+ − 4526 itself. The tag for all such objects is the generic @dfn{lrecord}
+ − 4527 (Lisp_Type_Record) tag. The first bytes of the object's structure are an
+ − 4528 integer (actually a char) characterising the object's type and some
+ − 4529 flags, in particular the mark bit used for garbage collection. A
+ − 4530 structure describing the type is accessible thru the
+ − 4531 lrecord_implementation_table indexed with said integer. This structure
+ − 4532 includes the method pointers and a pointer to a string naming the type.
428
+ − 4533
+ − 4534 @itemize @bullet
+ − 4535 @item
442
+ − 4536 (b) Those lrecords that are allocated in frob blocks (see above). This
428
+ − 4537 includes the objects that are most common and relatively small, and
442
+ − 4538 includes conses, strings, subrs, floats, compiled functions, symbols,
428
+ − 4539 extents, events, and markers. With the cleanup of frob blocks done in
+ − 4540 19.12, it's not terribly hard to add more objects to this category, but
442
+ − 4541 it's a bit trickier than adding an object type to type (c) (esp. if the
428
+ − 4542 object needs a finalization method), and is not likely to save much
+ − 4543 space unless the object is small and there are many of them. (In fact,
+ − 4544 if there are very few of them, it might actually waste space.)
+ − 4545 @item
442
+ − 4546 (c) Those lrecords that are individually @code{malloc()}ed. These are
428
+ − 4547 called @dfn{lcrecords}. All other types are in this category. Adding a
+ − 4548 new type to this category is comparatively easy, and all types added
+ − 4549 since 19.8 (when the current allocation scheme was devised, by Richard
+ − 4550 Mlynarik), with the exception of the character type, have been in this
+ − 4551 category.
+ − 4552 @end itemize
+ − 4553
+ − 4554 Note that bit vectors are a bit of a special case. They are
442
+ − 4555 simple lrecords as in category (b), but are individually @code{malloc()}ed
428
+ − 4556 like vectors. You can basically view them as exactly like vectors
+ − 4557 except that their type is stored in lrecord fashion rather than
+ − 4558 in directly-tagged fashion.
+ − 4559
442
+ − 4560
+ − 4561 @node Garbage Collection, GCPROing, Introduction to Allocation, Allocation of Objects in XEmacs Lisp
428
+ − 4562 @section Garbage Collection
+ − 4563 @cindex garbage collection
+ − 4564
+ − 4565 @cindex mark and sweep
+ − 4566 Garbage collection is simple in theory but tricky to implement.
+ − 4567 Emacs Lisp uses the oldest garbage collection method, called
+ − 4568 @dfn{mark and sweep}. Garbage collection begins by starting with
+ − 4569 all accessible locations (i.e. all variables and other slots where
+ − 4570 Lisp objects might occur) and recursively traversing all objects
+ − 4571 accessible from those slots, marking each one that is found.
+ − 4572 We then go through all of memory and free each object that is
+ − 4573 not marked, and unmarking each object that is marked. Note
+ − 4574 that ``all of memory'' means all currently allocated objects.
+ − 4575 Traversing all these objects means traversing all frob blocks,
+ − 4576 all vectors (which are chained in one big list), and all
+ − 4577 lcrecords (which are likewise chained).
+ − 4578
442
+ − 4579 Garbage collection can be invoked explicitly by calling
+ − 4580 @code{garbage-collect} but is also called automatically by @code{eval},
+ − 4581 once a certain amount of memory has been allocated since the last
+ − 4582 garbage collection (according to @code{gc-cons-threshold}).
+ − 4583
+ − 4584
+ − 4585 @node GCPROing, Garbage Collection - Step by Step, Garbage Collection, Allocation of Objects in XEmacs Lisp
428
+ − 4586 @section @code{GCPRO}ing
+ − 4587
+ − 4588 @code{GCPRO}ing is one of the ugliest and trickiest parts of Emacs
+ − 4589 internals. The basic idea is that whenever garbage collection
+ − 4590 occurs, all in-use objects must be reachable somehow or
+ − 4591 other from one of the roots of accessibility. The roots
+ − 4592 of accessibility are:
+ − 4593
+ − 4594 @enumerate
+ − 4595 @item
442
+ − 4596 All objects that have been @code{staticpro()}d or
+ − 4597 @code{staticpro_nodump()}ed. This is used for any global C variables
+ − 4598 that hold Lisp objects. A call to @code{staticpro()} happens implicitly
+ − 4599 as a result of any symbols declared with @code{defsymbol()} and any
+ − 4600 variables declared with @code{DEFVAR_FOO()}. You need to explicitly
+ − 4601 call @code{staticpro()} (in the @code{vars_of_foo()} method of a module)
+ − 4602 for other global C variables holding Lisp objects. (This typically
+ − 4603 includes internal lists and such things.). Use
+ − 4604 @code{staticpro_nodump()} only in the rare cases when you do not want
+ − 4605 the pointed variable to be saved at dump time but rather recompute it at
+ − 4606 startup.
428
+ − 4607
+ − 4608 Note that @code{obarray} is one of the @code{staticpro()}d things.
+ − 4609 Therefore, all functions and variables get marked through this.
+ − 4610 @item
+ − 4611 Any shadowed bindings that are sitting on the @code{specpdl} stack.
+ − 4612 @item
+ − 4613 Any objects sitting in currently active (Lisp) stack frames,
+ − 4614 catches, and condition cases.
+ − 4615 @item
+ − 4616 A couple of special-case places where active objects are
+ − 4617 located.
+ − 4618 @item
+ − 4619 Anything currently marked with @code{GCPRO}.
+ − 4620 @end enumerate
+ − 4621
+ − 4622 Marking with @code{GCPRO} is necessary because some C functions (quite
+ − 4623 a lot, in fact), allocate objects during their operation. Quite
+ − 4624 frequently, there will be no other pointer to the object while the
+ − 4625 function is running, and if a garbage collection occurs and the object
+ − 4626 needs to be referenced again, bad things will happen. The solution is
+ − 4627 to mark those objects with @code{GCPRO}. Unfortunately this is easy to
+ − 4628 forget, and there is basically no way around this problem. Here are
+ − 4629 some rules, though:
+ − 4630
+ − 4631 @enumerate
+ − 4632 @item
+ − 4633 For every @code{GCPRO@var{n}}, there have to be declarations of
+ − 4634 @code{struct gcpro gcpro1, gcpro2}, etc.
+ − 4635
+ − 4636 @item
+ − 4637 You @emph{must} @code{UNGCPRO} anything that's @code{GCPRO}ed, and you
+ − 4638 @emph{must not} @code{UNGCPRO} if you haven't @code{GCPRO}ed. Getting
+ − 4639 either of these wrong will lead to crashes, often in completely random
+ − 4640 places unrelated to where the problem lies.
+ − 4641
+ − 4642 @item
+ − 4643 The way this actually works is that all currently active @code{GCPRO}s
+ − 4644 are chained through the @code{struct gcpro} local variables, with the
+ − 4645 variable @samp{gcprolist} pointing to the head of the list and the nth
+ − 4646 local @code{gcpro} variable pointing to the first @code{gcpro} variable
+ − 4647 in the next enclosing stack frame. Each @code{GCPRO}ed thing is an
+ − 4648 lvalue, and the @code{struct gcpro} local variable contains a pointer to
+ − 4649 this lvalue. This is why things will mess up badly if you don't pair up
440
+ − 4650 the @code{GCPRO}s and @code{UNGCPRO}s---you will end up with
428
+ − 4651 @code{gcprolist}s containing pointers to @code{struct gcpro}s or local
+ − 4652 @code{Lisp_Object} variables in no-longer-active stack frames.
+ − 4653
+ − 4654 @item
+ − 4655 It is actually possible for a single @code{struct gcpro} to
+ − 4656 protect a contiguous array of any number of values, rather than
+ − 4657 just a single lvalue. To effect this, call @code{GCPRO@var{n}} as usual on
+ − 4658 the first object in the array and then set @code{gcpro@var{n}.nvars}.
+ − 4659
+ − 4660 @item
+ − 4661 @strong{Strings are relocated.} What this means in practice is that the
+ − 4662 pointer obtained using @code{XSTRING_DATA()} is liable to change at any
+ − 4663 time, and you should never keep it around past any function call, or
+ − 4664 pass it as an argument to any function that might cause a garbage
+ − 4665 collection. This is why a number of functions accept either a
+ − 4666 ``non-relocatable'' @code{char *} pointer or a relocatable Lisp string,
+ − 4667 and only access the Lisp string's data at the very last minute. In some
+ − 4668 cases, you may end up having to @code{alloca()} some space and copy the
+ − 4669 string's data into it.
+ − 4670
+ − 4671 @item
+ − 4672 By convention, if you have to nest @code{GCPRO}'s, use @code{NGCPRO@var{n}}
+ − 4673 (along with @code{struct gcpro ngcpro1, ngcpro2}, etc.), @code{NNGCPRO@var{n}},
+ − 4674 etc. This avoids compiler warnings about shadowed locals.
+ − 4675
+ − 4676 @item
+ − 4677 It is @emph{always} better to err on the side of extra @code{GCPRO}s
+ − 4678 rather than too few. The extra cycles spent on this are
+ − 4679 almost never going to make a whit of difference in the
+ − 4680 speed of anything.
+ − 4681
+ − 4682 @item
+ − 4683 The general rule to follow is that caller, not callee, @code{GCPRO}s.
+ − 4684 That is, you should not have to explicitly @code{GCPRO} any Lisp objects
+ − 4685 that are passed in as parameters.
+ − 4686
+ − 4687 One exception from this rule is if you ever plan to change the parameter
+ − 4688 value, and store a new object in it. In that case, you @emph{must}
+ − 4689 @code{GCPRO} the parameter, because otherwise the new object will not be
+ − 4690 protected.
+ − 4691
+ − 4692 So, if you create any Lisp objects (remember, this happens in all sorts
+ − 4693 of circumstances, e.g. with @code{Fcons()}, etc.), you are responsible
+ − 4694 for @code{GCPRO}ing them, unless you are @emph{absolutely sure} that
+ − 4695 there's no possibility that a garbage-collection can occur while you
+ − 4696 need to use the object. Even then, consider @code{GCPRO}ing.
+ − 4697
+ − 4698 @item
+ − 4699 A garbage collection can occur whenever anything calls @code{Feval}, or
+ − 4700 whenever a QUIT can occur where execution can continue past
+ − 4701 this. (Remember, this is almost anywhere.)
+ − 4702
+ − 4703 @item
+ − 4704 If you have the @emph{least smidgeon of doubt} about whether
+ − 4705 you need to @code{GCPRO}, you should @code{GCPRO}.
+ − 4706
+ − 4707 @item
+ − 4708 Beware of @code{GCPRO}ing something that is uninitialized. If you have
+ − 4709 any shade of doubt about this, initialize all your variables to @code{Qnil}.
+ − 4710
+ − 4711 @item
+ − 4712 Be careful of traps, like calling @code{Fcons()} in the argument to
+ − 4713 another function. By the ``caller protects'' law, you should be
+ − 4714 @code{GCPRO}ing the newly-created cons, but you aren't. A certain
+ − 4715 number of functions that are commonly called on freshly created stuff
+ − 4716 (e.g. @code{nconc2()}, @code{Fsignal()}), break the ``caller protects''
+ − 4717 law and go ahead and @code{GCPRO} their arguments so as to simplify
+ − 4718 things, but make sure and check if it's OK whenever doing something like
+ − 4719 this.
+ − 4720
+ − 4721 @item
+ − 4722 Once again, remember to @code{GCPRO}! Bugs resulting from insufficient
+ − 4723 @code{GCPRO}ing are intermittent and extremely difficult to track down,
+ − 4724 often showing up in crashes inside of @code{garbage-collect} or in
+ − 4725 weirdly corrupted objects or even in incorrect values in a totally
+ − 4726 different section of code.
+ − 4727 @end enumerate
+ − 4728
+ − 4729 @cindex garbage collection, conservative
+ − 4730 @cindex conservative garbage collection
+ − 4731 Given the extremely error-prone nature of the @code{GCPRO} scheme, and
+ − 4732 the difficulties in tracking down, it should be considered a deficiency
+ − 4733 in the XEmacs code. A solution to this problem would involve
+ − 4734 implementing so-called @dfn{conservative} garbage collection for the C
+ − 4735 stack. That involves looking through all of stack memory and treating
+ − 4736 anything that looks like a reference to an object as a reference. This
+ − 4737 will result in a few objects not getting collected when they should, but
+ − 4738 it obviates the need for @code{GCPRO}ing, and allows garbage collection
+ − 4739 to happen at any point at all, such as during object allocation.
+ − 4740
442
+ − 4741 @node Garbage Collection - Step by Step, Integers and Characters, GCPROing, Allocation of Objects in XEmacs Lisp
428
+ − 4742 @section Garbage Collection - Step by Step
+ − 4743 @cindex garbage collection step by step
+ − 4744
+ − 4745 @menu
+ − 4746 * Invocation::
+ − 4747 * garbage_collect_1::
+ − 4748 * mark_object::
+ − 4749 * gc_sweep::
+ − 4750 * sweep_lcrecords_1::
+ − 4751 * compact_string_chars::
+ − 4752 * sweep_strings::
+ − 4753 * sweep_bit_vectors_1::
+ − 4754 @end menu
+ − 4755
442
+ − 4756 @node Invocation, garbage_collect_1, Garbage Collection - Step by Step, Garbage Collection - Step by Step
428
+ − 4757 @subsection Invocation
+ − 4758 @cindex garbage collection, invocation
+ − 4759
+ − 4760 The first thing that anyone should know about garbage collection is:
442
+ − 4761 when and how the garbage collector is invoked. One might think that this
428
+ − 4762 could happen every time new memory is allocated, e.g. new objects are
+ − 4763 created, but this is @emph{not} the case. Instead, we have the following
+ − 4764 situation:
+ − 4765
+ − 4766 The entry point of any process of garbage collection is an invocation
+ − 4767 of the function @code{garbage_collect_1} in file @code{alloc.c}. The
+ − 4768 invocation can occur @emph{explicitly} by calling the function
+ − 4769 @code{Fgarbage_collect} (in addition this function provides information
442
+ − 4770 about the freed memory), or can occur @emph{implicitly} in four different
428
+ − 4771 situations:
+ − 4772 @enumerate
+ − 4773 @item
+ − 4774 In function @code{main_1} in file @code{emacs.c}. This function is called
+ − 4775 at each startup of xemacs. The garbage collection is invoked after all
+ − 4776 initial creations are completed, but only if a special internal error
+ − 4777 checking-constant @code{ERROR_CHECK_GC} is defined.
+ − 4778 @item
+ − 4779 In function @code{disksave_object_finalization} in file
+ − 4780 @code{alloc.c}. The only purpose of this function is to clear the
442
+ − 4781 objects from memory which need not be stored with xemacs when we dump out
428
+ − 4782 an executable. This is only done by @code{Fdump_emacs} or by
+ − 4783 @code{Fdump_emacs_data} respectively (both in @code{emacs.c}). The
+ − 4784 actual clearing is accomplished by making these objects unreachable and
+ − 4785 starting a garbage collection. The function is only used while building
+ − 4786 xemacs.
+ − 4787 @item
+ − 4788 In function @code{Feval / eval} in file @code{eval.c}. Each time the
+ − 4789 well known and often used function eval is called to evaluate a form,
+ − 4790 one of the first things that could happen, is a potential call of
+ − 4791 @code{garbage_collect_1}. There exist three global variables,
+ − 4792 @code{consing_since_gc} (counts the created cons-cells since the last
+ − 4793 garbage collection), @code{gc_cons_threshold} (a specified threshold
+ − 4794 after which a garbage collection occurs) and @code{always_gc}. If
+ − 4795 @code{always_gc} is set or if the threshold is exceeded, the garbage
+ − 4796 collection will start.
+ − 4797 @item
+ − 4798 In function @code{Ffuncall / funcall} in file @code{eval.c}. This
+ − 4799 function evaluates calls of elisp functions and works according to
+ − 4800 @code{Feval}.
+ − 4801 @end enumerate
+ − 4802
+ − 4803 The upshot is that garbage collection can basically occur everywhere
+ − 4804 @code{Feval}, respectively @code{Ffuncall}, is used - either directly or
442
+ − 4805 through another function. Since calls to these two functions are hidden
+ − 4806 in various other functions, many calls to @code{garbage_collect_1} are
+ − 4807 not obviously foreseeable, and therefore unexpected. Instances where
+ − 4808 they are used that are worth remembering are various elisp commands, as
+ − 4809 for example @code{or}, @code{and}, @code{if}, @code{cond}, @code{while},
+ − 4810 @code{setq}, etc., miscellaneous @code{gui_item_...} functions,
+ − 4811 everything related to @code{eval} (@code{Feval_buffer}, @code{call0},
+ − 4812 ...) and inside @code{Fsignal}. The latter is used to handle signals, as
444
+ − 4813 for example the ones raised by every @code{QUIT}-macro triggered after
442
+ − 4814 pressing Ctrl-g.
+ − 4815
+ − 4816 @node garbage_collect_1, mark_object, Invocation, Garbage Collection - Step by Step
428
+ − 4817 @subsection @code{garbage_collect_1}
+ − 4818 @cindex @code{garbage_collect_1}
+ − 4819
+ − 4820 We can now describe exactly what happens after the invocation takes
+ − 4821 place.
+ − 4822 @enumerate
+ − 4823 @item
442
+ − 4824 There are several cases in which the garbage collector is left immediately:
428
+ − 4825 when we are already garbage collecting (@code{gc_in_progress}), when
+ − 4826 the garbage collection is somehow forbidden
+ − 4827 (@code{gc_currently_forbidden}), when we are currently displaying something
+ − 4828 (@code{in_display}) or when we are preparing for the armageddon of the
+ − 4829 whole system (@code{preparing_for_armageddon}).
+ − 4830 @item
+ − 4831 Next the correct frame in which to put
+ − 4832 all the output occurring during garbage collecting is determined. In
+ − 4833 order to be able to restore the old display's state after displaying the
+ − 4834 message, some data about the current cursor position has to be
442
+ − 4835 saved. The variables @code{pre_gc_cursor} and @code{cursor_changed} take
428
+ − 4836 care of that.
+ − 4837 @item
+ − 4838 The state of @code{gc_currently_forbidden} must be restored after
+ − 4839 the garbage collection, no matter what happens during the process. We
+ − 4840 accomplish this by @code{record_unwind_protect}ing the suitable function
+ − 4841 @code{restore_gc_inhibit} together with the current value of
442
+ − 4842 @code{gc_currently_forbidden}.
428
+ − 4843 @item
+ − 4844 If we are concurrently running an interactive xemacs session, the next step
+ − 4845 is simply to show the garbage collector's cursor/message.
+ − 4846 @item
+ − 4847 The following steps are the intrinsic steps of the garbage collector,
+ − 4848 therefore @code{gc_in_progress} is set.
+ − 4849 @item
+ − 4850 For debugging purposes, it is possible to copy the current C stack
+ − 4851 frame. However, this seems to be a currently unused feature.
+ − 4852 @item
+ − 4853 Before actually starting to go over all live objects, references to
+ − 4854 objects that are no longer used are pruned. We only have to do this for events
+ − 4855 (@code{clear_event_resource}) and for specifiers
442
+ − 4856 (@code{cleanup_specifiers}).
428
+ − 4857 @item
+ − 4858 Now the mark phase begins and marks all accessible elements. In order to
+ − 4859 start from
+ − 4860 all slots that serve as roots of accessibility, the function
+ − 4861 @code{mark_object} is called for each root individually to go out from
+ − 4862 there to mark all reachable objects. All roots that are traversed are
+ − 4863 shown in their processed order:
+ − 4864 @itemize @bullet
+ − 4865 @item
+ − 4866 all constant symbols and static variables that are registered via
+ − 4867 @code{staticpro}@ in the array @code{staticvec}.
442
+ − 4868 @xref{Adding Global Lisp Variables}.
428
+ − 4869 @item
+ − 4870 all Lisp objects that are created in C functions and that must be
+ − 4871 protected from freeing them. They are registered in the global
+ − 4872 list @code{gcprolist}.
+ − 4873 @xref{GCPROing}.
442
+ − 4874 @item
428
+ − 4875 all local variables (i.e. their name fields @code{symbol} and old
+ − 4876 values @code{old_values}) that are bound during the evaluation by the Lisp
+ − 4877 engine. They are stored in @code{specbinding} structs pushed on a stack
+ − 4878 called @code{specpdl}.
+ − 4879 @xref{Dynamic Binding; The specbinding Stack; Unwind-Protects}.
+ − 4880 @item
+ − 4881 all catch blocks that the Lisp engine encounters during the evaluation
+ − 4882 cause the creation of structs @code{catchtag} inserted in the list
+ − 4883 @code{catchlist}. Their tag (@code{tag}) and value (@code{val} fields
+ − 4884 are freshly created objects and therefore have to be marked.
+ − 4885 @xref{Catch and Throw}.
+ − 4886 @item
442
+ − 4887 every function application pushes new structs @code{backtrace}
+ − 4888 on the call stack of the Lisp engine (@code{backtrace_list}). The unique
428
+ − 4889 parts that have to be marked are the fields for each function
+ − 4890 (@code{function}) and all their arguments (@code{args}).
+ − 4891 @xref{Evaluation}.
+ − 4892 @item
442
+ − 4893 all objects that are used by the redisplay engine that must not be freed
428
+ − 4894 are marked by a special function called @code{mark_redisplay} (in
+ − 4895 @code{redisplay.c}).
+ − 4896 @item
+ − 4897 all objects created for profiling purposes are allocated by C functions
+ − 4898 instead of using the lisp allocation mechanisms. In order to receive the
+ − 4899 right ones during the sweep phase, they also have to be marked
+ − 4900 manually. That is done by the function @code{mark_profiling_info}
+ − 4901 @end itemize
+ − 4902 @item
436
+ − 4903 Hash tables in XEmacs belong to a kind of special objects that
428
+ − 4904 make use of a concept often called 'weak pointers'.
+ − 4905 To make a long story short, these kind of pointers are not followed
+ − 4906 during the estimation of the live objects during garbage collection.
+ − 4907 Any object referenced only by weak pointers is collected
+ − 4908 anyway, and the reference to it is cleared. In hash tables there are
+ − 4909 different usage patterns of them, manifesting in different types of hash
+ − 4910 tables, namely 'non-weak', 'weak', 'key-weak' and 'value-weak'
442
+ − 4911 (internally also 'key-car-weak' and 'value-car-weak') hash tables, each
+ − 4912 clearing entries depending on different conditions. More information can
428
+ − 4913 be found in the documentation to the function @code{make-hash-table}.
+ − 4914
+ − 4915 Because there are complicated dependency rules about when and what to
+ − 4916 mark while processing weak hash tables, the standard @code{marker}
+ − 4917 method is only active if it is marking non-weak hash tables. As soon as
+ − 4918 a weak component is in the table, the hash table entries are ignored
+ − 4919 while marking. Instead their marking is done each separately by the
+ − 4920 function @code{finish_marking_weak_hash_tables}. This function iterates
+ − 4921 over each hash table entry @code{hentries} for each weak hash table in
+ − 4922 @code{Vall_weak_hash_tables}. Depending on the type of a table, the
442
+ − 4923 appropriate action is performed.
428
+ − 4924 If a table is acting as @code{HASH_TABLE_KEY_WEAK}, and a key already marked,
442
+ − 4925 everything reachable from the @code{value} component is marked. If it is
428
+ − 4926 acting as a @code{HASH_TABLE_VALUE_WEAK} and the value component is
442
+ − 4927 already marked, the marking starts beginning only from the
428
+ − 4928 @code{key} component.
442
+ − 4929 If it is a @code{HASH_TABLE_KEY_CAR_WEAK} and the car
428
+ − 4930 of the key entry is already marked, we mark both the @code{key} and
+ − 4931 @code{value} components.
+ − 4932 Finally, if the table is of the type @code{HASH_TABLE_VALUE_CAR_WEAK}
+ − 4933 and the car of the value components is already marked, again both the
+ − 4934 @code{key} and the @code{value} components get marked.
+ − 4935
+ − 4936 Again, there are lists with comparable properties called weak
+ − 4937 lists. There exist different peculiarities of their types called
+ − 4938 @code{simple}, @code{assoc}, @code{key-assoc} and
+ − 4939 @code{value-assoc}. You can find further details about them in the
+ − 4940 description to the function @code{make-weak-list}. The scheme of their
442
+ − 4941 marking is similar: all weak lists are listed in @code{Qall_weak_lists},
428
+ − 4942 therefore we iterate over them. The marking is advanced until we hit an
442
+ − 4943 already marked pair. Then we know that during a former run all
428
+ − 4944 the rest has been marked completely. Again, depending on the special
+ − 4945 type of the weak list, our jobs differ. If it is a @code{WEAK_LIST_SIMPLE}
+ − 4946 and the elem is marked, we mark the @code{cons} part. If it is a
+ − 4947 @code{WEAK_LIST_ASSOC} and not a pair or a pair with both marked car and
+ − 4948 cdr, we mark the @code{cons} and the @code{elem}. If it is a
+ − 4949 @code{WEAK_LIST_KEY_ASSOC} and not a pair or a pair with a marked car of
+ − 4950 the elem, we mark the @code{cons} and the @code{elem}. Finally, if it is
+ − 4951 a @code{WEAK_LIST_VALUE_ASSOC} and not a pair or a pair with a marked
+ − 4952 cdr of the elem, we mark both the @code{cons} and the @code{elem}.
+ − 4953
+ − 4954 Since, by marking objects in reach from weak hash tables and weak lists,
+ − 4955 other objects could get marked, this perhaps implies further marking of
442
+ − 4956 other weak objects, both finishing functions are redone as long as
428
+ − 4957 yet unmarked objects get freshly marked.
+ − 4958
+ − 4959 @item
+ − 4960 After completing the special marking for the weak hash tables and for the weak
+ − 4961 lists, all entries that point to objects that are going to be swept in
+ − 4962 the further process are useless, and therefore have to be removed from
+ − 4963 the table or the list.
+ − 4964
+ − 4965 The function @code{prune_weak_hash_tables} does the job for weak hash
+ − 4966 tables. Totally unmarked hash tables are removed from the list
+ − 4967 @code{Vall_weak_hash_tables}. The other ones are treated more carefully
442
+ − 4968 by scanning over all entries and removing one as soon as one of
428
+ − 4969 the components @code{key} and @code{value} is unmarked.
+ − 4970
+ − 4971 The same idea applies to the weak lists. It is accomplished by
+ − 4972 @code{prune_weak_lists}: An unmarked list is pruned from
+ − 4973 @code{Vall_weak_lists} immediately. A marked list is treated more
+ − 4974 carefully by going over it and removing just the unmarked pairs.
+ − 4975
+ − 4976 @item
+ − 4977 The function @code{prune_specifiers} checks all listed specifiers held
442
+ − 4978 in @code{Vall_specifiers} and removes the ones from the lists that are
428
+ − 4979 unmarked.
+ − 4980
+ − 4981 @item
+ − 4982 All syntax tables are stored in a list called
442
+ − 4983 @code{Vall_syntax_tables}. The function @code{prune_syntax_tables} walks
428
+ − 4984 through it and unlinks the tables that are unmarked.
+ − 4985
+ − 4986 @item
+ − 4987 Next, we will attack the complete sweeping - the function
+ − 4988 @code{gc_sweep} which holds the predominance.
+ − 4989 @item
+ − 4990 First, all the variables with respect to garbage collection are
442
+ − 4991 reset. @code{consing_since_gc} - the counter of the created cells since
428
+ − 4992 the last garbage collection - is set back to 0, and
+ − 4993 @code{gc_in_progress} is not @code{true} anymore.
+ − 4994 @item
442
+ − 4995 In case the session is interactive, the displayed cursor and message are
428
+ − 4996 removed again.
+ − 4997 @item
+ − 4998 The state of @code{gc_inhibit} is restored to the former value by
+ − 4999 unwinding the stack.
+ − 5000 @item
+ − 5001 A small memory reserve is always held back that can be reached by
+ − 5002 @code{breathing_space}. If nothing more is left, we create a new reserve
442
+ − 5003 and exit.
428
+ − 5004 @end enumerate
+ − 5005
442
+ − 5006 @node mark_object, gc_sweep, garbage_collect_1, Garbage Collection - Step by Step
428
+ − 5007 @subsection @code{mark_object}
+ − 5008 @cindex @code{mark_object}
+ − 5009
+ − 5010 The first thing that is checked while marking an object is whether the
+ − 5011 object is a real Lisp object @code{Lisp_Type_Record} or just an integer
+ − 5012 or a character. Integers and characters are the only two types that are
+ − 5013 stored directly - without another level of indirection, and therefore they
442
+ − 5014 don't have to be marked and collected.
428
+ − 5015 @xref{How Lisp Objects Are Represented in C}.
+ − 5016
+ − 5017 The second case is the one we have to handle. It is the one when we are
+ − 5018 dealing with a pointer to a Lisp object. But, there exist also three
+ − 5019 possibilities, that prevent us from doing anything while marking: The
+ − 5020 object is read only which prevents it from being garbage collected,
+ − 5021 i.e. marked (@code{C_READONLY_RECORD_HEADER}). The object in question is
+ − 5022 already marked, and need not be marked for the second time (checked by
+ − 5023 @code{MARKED_RECORD_HEADER_P}). If it is a special, unmarkable object
+ − 5024 (@code{UNMARKABLE_RECORD_HEADER_P}, apparently, these are objects that
442
+ − 5025 sit in some const space, and can therefore not be marked, see
428
+ − 5026 @code{this_one_is_unmarkable} in @code{alloc.c}).
+ − 5027
+ − 5028 Now, the actual marking is feasible. We do so by once using the macro
+ − 5029 @code{MARK_RECORD_HEADER} to mark the object itself (actually the
+ − 5030 special flag in the lrecord header), and calling its special marker
+ − 5031 "method" @code{marker} if available. The marker method marks every
442
+ − 5032 other object that is in reach from our current object. Note, that these
428
+ − 5033 marker methods should not call @code{mark_object} recursively, but
+ − 5034 instead should return the next object from where further marking has to
+ − 5035 be performed.
+ − 5036
+ − 5037 In case another object was returned, as mentioned before, we reiterate
+ − 5038 the whole @code{mark_object} process beginning with this next object.
+ − 5039
442
+ − 5040 @node gc_sweep, sweep_lcrecords_1, mark_object, Garbage Collection - Step by Step
428
+ − 5041 @subsection @code{gc_sweep}
+ − 5042 @cindex @code{gc_sweep}
+ − 5043
442
+ − 5044 The job of this function is to free all unmarked records from memory. As
428
+ − 5045 we know, there are different types of objects implemented and managed, and
+ − 5046 consequently different ways to free them from memory.
+ − 5047 @xref{Introduction to Allocation}.
+ − 5048
+ − 5049 We start with all objects stored through @code{lcrecords}. All
+ − 5050 bulkier objects are allocated and handled using that scheme of
+ − 5051 @code{lcrecords}. Each object is @code{malloc}ed separately
+ − 5052 instead of placing it in one of the contiguous frob blocks. All types
442
+ − 5053 that are currently stored
438
+ − 5054 using @code{lcrecords}'s @code{alloc_lcrecord} and
428
+ − 5055 @code{make_lcrecord_list} are the types: vectors, buffers,
+ − 5056 char-table, char-table-entry, console, weak-list, database, device,
+ − 5057 ldap, hash-table, command-builder, extent-auxiliary, extent-info, face,
+ − 5058 coding-system, frame, image-instance, glyph, popup-data, gui-item,
+ − 5059 keymap, charset, color_instance, font_instance, opaque, opaque-list,
+ − 5060 process, range-table, specifier, symbol-value-buffer-local,
+ − 5061 symbol-value-lisp-magic, symbol-value-varalias, toolbar-button,
+ − 5062 tooltalk-message, tooltalk-pattern, window, and window-configuration. We
+ − 5063 take care of them in the fist place
+ − 5064 in order to be able to handle and to finalize items stored in them more
+ − 5065 easily. The function @code{sweep_lcrecords_1} as described below is
+ − 5066 doing the whole job for us.
+ − 5067 For a description about the internals: @xref{lrecords}.
+ − 5068
+ − 5069 Our next candidates are the other objects that behave quite differently
+ − 5070 than everything else: the strings. They consists of two parts, a
442
+ − 5071 fixed-size portion (@code{struct Lisp_String}) holding the string's
428
+ − 5072 length, its property list and a pointer to the second part, and the
+ − 5073 actual string data, which is stored in string-chars blocks comparable to
+ − 5074 frob blocks. In this block, the data is not only freed, but also a
+ − 5075 compression of holes is made, i.e. all strings are relocated together.
+ − 5076 @xref{String}. This compacting phase is performed by the function
+ − 5077 @code{compact_string_chars}, the actual sweeping by the function
+ − 5078 @code{sweep_strings} is described below.
+ − 5079
+ − 5080 After that, the other types are swept step by step using functions
+ − 5081 @code{sweep_conses}, @code{sweep_bit_vectors_1},
+ − 5082 @code{sweep_compiled_functions}, @code{sweep_floats},
+ − 5083 @code{sweep_symbols}, @code{sweep_extents}, @code{sweep_markers} and
+ − 5084 @code{sweep_extents}. They are the fixed-size types cons, floats,
+ − 5085 compiled-functions, symbol, marker, extent, and event stored in
+ − 5086 so-called "frob blocks", and therefore we can basically do the same on
+ − 5087 every type objects, using the same macros, especially defined only to
442
+ − 5088 handle everything with respect to fixed-size blocks. The only fixed-size
428
+ − 5089 type that is not handled here are the fixed-size portion of strings,
+ − 5090 because we took special care of them earlier.
+ − 5091
+ − 5092 The only big exceptions are bit vectors stored differently and
442
+ − 5093 therefore treated differently by the function @code{sweep_bit_vectors_1}
428
+ − 5094 described later.
+ − 5095
+ − 5096 At first, we need some brief information about how
+ − 5097 these fixed-size types are managed in general, in order to understand
+ − 5098 how the sweeping is done. They have all a fixed size, and are therefore
+ − 5099 stored in big blocks of memory - allocated at once - that can hold a
+ − 5100 certain amount of objects of one type. The macro
+ − 5101 @code{DECLARE_FIXED_TYPE_ALLOC} creates the suitable structures for
442
+ − 5102 every type. More precisely, we have the block struct
428
+ − 5103 (holding a pointer to the previous block @code{prev} and the
+ − 5104 objects in @code{block[]}), a pointer to current block
+ − 5105 (@code{current_..._block)}) and its last index
+ − 5106 (@code{current_..._block_index}), and a pointer to the free list that
+ − 5107 will be created. Also a macro @code{FIXED_TYPE_FROM_BLOCK} plus some
+ − 5108 related macros exists that are used to obtain a new object, either from
+ − 5109 the free list @code{ALLOCATE_FIXED_TYPE_1} if there is an unused object
+ − 5110 of that type stored or by allocating a completely new block using
+ − 5111 @code{ALLOCATE_FIXED_TYPE_FROM_BLOCK}.
+ − 5112
+ − 5113 The rest works as follows: all of them define a
+ − 5114 macro @code{UNMARK_...} that is used to unmark the object. They define a
+ − 5115 macro @code{ADDITIONAL_FREE_...} that defines additional work that has
+ − 5116 to be done when converting an object from in use to not in use (so far,
+ − 5117 only markers use it in order to unchain them). Then, they all call
442
+ − 5118 the macro @code{SWEEP_FIXED_TYPE_BLOCK} instantiated with their type name
428
+ − 5119 and their struct name.
+ − 5120
+ − 5121 This call in particular does the following: we go over all blocks
+ − 5122 starting with the current moving towards the oldest.
+ − 5123 For each block, we look at every object in it. If the object already
+ − 5124 freed (checked with @code{FREE_STRUCT_P} using the first pointer of the
442
+ − 5125 object), or if it is
428
+ − 5126 set to read only (@code{C_READONLY_RECORD_HEADER_P}, nothing must be
+ − 5127 done. If it is unmarked (checked with @code{MARKED_RECORD_HEADER_P}), it
+ − 5128 is put in the free list and set free (using the macro
442
+ − 5129 @code{FREE_FIXED_TYPE}, otherwise it stays in the block, but is unmarked
428
+ − 5130 (by @code{UNMARK_...}). While going through one block, we note if the
+ − 5131 whole block is empty. If so, the whole block is freed (using
+ − 5132 @code{xfree}) and the free list state is set to the state it had before
+ − 5133 handling this block.
+ − 5134
442
+ − 5135 @node sweep_lcrecords_1, compact_string_chars, gc_sweep, Garbage Collection - Step by Step
428
+ − 5136 @subsection @code{sweep_lcrecords_1}
+ − 5137 @cindex @code{sweep_lcrecords_1}
+ − 5138
+ − 5139 After nullifying the complete lcrecord statistics, we go over all
442
+ − 5140 lcrecords two separate times. They are all chained together in a list with
+ − 5141 a head called @code{all_lcrecords}.
+ − 5142
+ − 5143 The first loop calls for each object its @code{finalizer} method, but only
428
+ − 5144 in the case that it is not read only
+ − 5145 (@code{C_READONLY_RECORD_HEADER_P)}, it is not already marked
+ − 5146 (@code{MARKED_RECORD_HEADER_P}), it is not already in a free list (list of
+ − 5147 freed objects, field @code{free}) and finally it owns a finalizer
+ − 5148 method.
442
+ − 5149
+ − 5150 The second loop actually frees the appropriate objects again by iterating
+ − 5151 through the whole list. In case an object is read only or marked, it
428
+ − 5152 has to persist, otherwise it is manually freed by calling
+ − 5153 @code{xfree}. During this loop, the lcrecord statistics are kept up to
442
+ − 5154 date by calling @code{tick_lcrecord_stats} with the right arguments,
+ − 5155
+ − 5156 @node compact_string_chars, sweep_strings, sweep_lcrecords_1, Garbage Collection - Step by Step
428
+ − 5157 @subsection @code{compact_string_chars}
+ − 5158 @cindex @code{compact_string_chars}
+ − 5159
+ − 5160 The purpose of this function is to compact all the data parts of the
+ − 5161 strings that are held in so-called @code{string_chars_block}, i.e. the
+ − 5162 strings that do not exceed a certain maximal length.
+ − 5163
+ − 5164 The procedure with which this is done is as follows. We are keeping two
+ − 5165 positions in the @code{string_chars_block}s using two pointer/integer
+ − 5166 pairs, namely @code{from_sb}/@code{from_pos} and
+ − 5167 @code{to_sb}/@code{to_pos}. They stand for the actual positions, from
442
+ − 5168 where to where, to copy the actually handled string.
428
+ − 5169
+ − 5170 While going over all chained @code{string_char_block}s and their held
+ − 5171 strings, staring at @code{first_string_chars_block}, both pointers
+ − 5172 are advanced and eventually a string is copied from @code{from_sb} to
+ − 5173 @code{to_sb}, depending on the status of the pointed at strings.
+ − 5174
+ − 5175 More precisely, we can distinguish between the following actions.
+ − 5176 @itemize @bullet
+ − 5177 @item
+ − 5178 The string at @code{from_sb}'s position could be marked as free, which
442
+ − 5179 is indicated by an invalid pointer to the pointer that should point back
428
+ − 5180 to the fixed size string object, and which is checked by
+ − 5181 @code{FREE_STRUCT_P}. In this case, the @code{from_sb}/@code{from_pos}
+ − 5182 is advanced to the next string, and nothing has to be copied.
+ − 5183 @item
+ − 5184 Also, if a string object itself is unmarked, nothing has to be
+ − 5185 copied. We likewise advance the @code{from_sb}/@code{from_pos}
+ − 5186 pair as described above.
+ − 5187 @item
442
+ − 5188 In all other cases, we have a marked string at hand. The string data
428
+ − 5189 must be moved from the from-position to the to-position. In case
+ − 5190 there is not enough space in the actual @code{to_sb}-block, we advance
+ − 5191 this pointer to the beginning of the next block before copying. In case the
+ − 5192 from and to positions are different, we perform the
+ − 5193 actual copying using the library function @code{memmove}.
+ − 5194 @end itemize
+ − 5195
+ − 5196 After compacting, the pointer to the current
+ − 5197 @code{string_chars_block}, sitting in @code{current_string_chars_block},
+ − 5198 is reset on the last block to which we moved a string,
+ − 5199 i.e. @code{to_block}, and all remaining blocks (we know that they just
+ − 5200 carry garbage) are explicitly @code{xfree}d.
+ − 5201
442
+ − 5202 @node sweep_strings, sweep_bit_vectors_1, compact_string_chars, Garbage Collection - Step by Step
428
+ − 5203 @subsection @code{sweep_strings}
+ − 5204 @cindex @code{sweep_strings}
+ − 5205
+ − 5206 The sweeping for the fixed sized string objects is essentially exactly
+ − 5207 the same as it is for all other fixed size types. As before, the freeing
+ − 5208 into the suitable free list is done by using the macro
+ − 5209 @code{SWEEP_FIXED_SIZE_BLOCK} after defining the right macros
+ − 5210 @code{UNMARK_string} and @code{ADDITIONAL_FREE_string}. These two
+ − 5211 definitions are a little bit special compared to the ones used
+ − 5212 for the other fixed size types.
+ − 5213
442
+ − 5214 @code{UNMARK_string} is defined the same way except some additional code
428
+ − 5215 used for updating the bookkeeping information.
+ − 5216
+ − 5217 For strings, @code{ADDITIONAL_FREE_string} has to do something in
+ − 5218 addition: in case, the string was not allocated in a
+ − 5219 @code{string_chars_block} because it exceeded the maximal length, and
+ − 5220 therefore it was @code{malloc}ed separately, we know also @code{xfree}
+ − 5221 it explicitly.
+ − 5222
442
+ − 5223 @node sweep_bit_vectors_1, , sweep_strings, Garbage Collection - Step by Step
428
+ − 5224 @subsection @code{sweep_bit_vectors_1}
+ − 5225 @cindex @code{sweep_bit_vectors_1}
+ − 5226
+ − 5227 Bit vectors are also one of the rare types that are @code{malloc}ed
+ − 5228 individually. Consequently, while sweeping, all further needless
+ − 5229 bit vectors must be freed by hand. This is done, as one might imagine,
+ − 5230 the expected way: since they are all registered in a list called
+ − 5231 @code{all_bit_vectors}, all elements of that list are traversed,
442
+ − 5232 all unmarked bit vectors are unlinked by calling @code{xfree} and all of
428
+ − 5233 them become unmarked.
442
+ − 5234 In addition, the bookkeeping information used for garbage
428
+ − 5235 collector's output purposes is updated.
+ − 5236
442
+ − 5237 @node Integers and Characters, Allocation from Frob Blocks, Garbage Collection - Step by Step, Allocation of Objects in XEmacs Lisp
428
+ − 5238 @section Integers and Characters
+ − 5239
+ − 5240 Integer and character Lisp objects are created from integers using the
+ − 5241 macros @code{XSETINT()} and @code{XSETCHAR()} or the equivalent
+ − 5242 functions @code{make_int()} and @code{make_char()}. (These are actually
+ − 5243 macros on most systems.) These functions basically just do some moving
+ − 5244 of bits around, since the integral value of the object is stored
+ − 5245 directly in the @code{Lisp_Object}.
+ − 5246
+ − 5247 @code{XSETINT()} and the like will truncate values given to them that
+ − 5248 are too big; i.e. you won't get the value you expected but the tag bits
+ − 5249 will at least be correct.
+ − 5250
442
+ − 5251 @node Allocation from Frob Blocks, lrecords, Integers and Characters, Allocation of Objects in XEmacs Lisp
428
+ − 5252 @section Allocation from Frob Blocks
+ − 5253
+ − 5254 The uninitialized memory required by a @code{Lisp_Object} of a particular type
+ − 5255 is allocated using
+ − 5256 @code{ALLOCATE_FIXED_TYPE()}. This only occurs inside of the
+ − 5257 lowest-level object-creating functions in @file{alloc.c}:
+ − 5258 @code{Fcons()}, @code{make_float()}, @code{Fmake_byte_code()},
+ − 5259 @code{Fmake_symbol()}, @code{allocate_extent()},
+ − 5260 @code{allocate_event()}, @code{Fmake_marker()}, and
+ − 5261 @code{make_uninit_string()}. The idea is that, for each type, there are
+ − 5262 a number of frob blocks (each 2K in size); each frob block is divided up
+ − 5263 into object-sized chunks. Each frob block will have some of these
+ − 5264 chunks that are currently assigned to objects, and perhaps some that are
+ − 5265 free. (If a frob block has nothing but free chunks, it is freed at the
+ − 5266 end of the garbage collection cycle.) The free chunks are stored in a
+ − 5267 free list, which is chained by storing a pointer in the first four bytes
+ − 5268 of the chunk. (Except for the free chunks at the end of the last frob
+ − 5269 block, which are handled using an index which points past the end of the
+ − 5270 last-allocated chunk in the last frob block.)
+ − 5271 @code{ALLOCATE_FIXED_TYPE()} first tries to retrieve a chunk from the
+ − 5272 free list; if that fails, it calls
+ − 5273 @code{ALLOCATE_FIXED_TYPE_FROM_BLOCK()}, which looks at the end of the
+ − 5274 last frob block for space, and creates a new frob block if there is
+ − 5275 none. (There are actually two versions of these macros, one of which is
+ − 5276 more defensive but less efficient and is used for error-checking.)
+ − 5277
442
+ − 5278 @node lrecords, Low-level allocation, Allocation from Frob Blocks, Allocation of Objects in XEmacs Lisp
428
+ − 5279 @section lrecords
+ − 5280
+ − 5281 [see @file{lrecord.h}]
+ − 5282
+ − 5283 All lrecords have at the beginning of their structure a @code{struct
442
+ − 5284 lrecord_header}. This just contains a type number and some flags,
+ − 5285 including the mark bit. All builtin type numbers are defined as
+ − 5286 constants in @code{enum lrecord_type}, to allow the compiler to generate
+ − 5287 more efficient code for @code{@var{type}P}. The type number, thru the
+ − 5288 @code{lrecord_implementation_table}, gives access to a @code{struct
428
+ − 5289 lrecord_implementation}, which is a structure containing method pointers
+ − 5290 and such. There is one of these for each type, and it is a global,
+ − 5291 constant, statically-declared structure that is declared in the
442
+ − 5292 @code{DEFINE_LRECORD_IMPLEMENTATION()} macro.
+ − 5293
+ − 5294 Simple lrecords (of type (b) above) just have a @code{struct
428
+ − 5295 lrecord_header} at their beginning. lcrecords, however, actually have a
+ − 5296 @code{struct lcrecord_header}. This, in turn, has a @code{struct
+ − 5297 lrecord_header} at its beginning, so sanity is preserved; but it also
+ − 5298 has a pointer used to chain all lcrecords together, and a special ID
+ − 5299 field used to distinguish one lcrecord from another. (This field is used
+ − 5300 only for debugging and could be removed, but the space gain is not
+ − 5301 significant.)
+ − 5302
+ − 5303 Simple lrecords are created using @code{ALLOCATE_FIXED_TYPE()}, just
+ − 5304 like for other frob blocks. The only change is that the implementation
+ − 5305 pointer must be initialized correctly. (The implementation structure for
+ − 5306 an lrecord, or rather the pointer to it, is named @code{lrecord_float},
+ − 5307 @code{lrecord_extent}, @code{lrecord_buffer}, etc.)
+ − 5308
+ − 5309 lcrecords are created using @code{alloc_lcrecord()}. This takes a
+ − 5310 size to allocate and an implementation pointer. (The size needs to be
+ − 5311 passed because some lcrecords, such as window configurations, are of
+ − 5312 variable size.) This basically just @code{malloc()}s the storage,
+ − 5313 initializes the @code{struct lcrecord_header}, and chains the lcrecord
+ − 5314 onto the head of the list of all lcrecords, which is stored in the
+ − 5315 variable @code{all_lcrecords}. The calls to @code{alloc_lcrecord()}
+ − 5316 generally occur in the lowest-level allocation function for each lrecord
+ − 5317 type.
+ − 5318
+ − 5319 Whenever you create an lrecord, you need to call either
+ − 5320 @code{DEFINE_LRECORD_IMPLEMENTATION()} or
+ − 5321 @code{DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION()}. This needs to be
442
+ − 5322 specified in a @file{.c} file, at the top level. What this actually
+ − 5323 does is define and initialize the implementation structure for the
+ − 5324 lrecord. (And possibly declares a function @code{error_check_foo()} that
+ − 5325 implements the @code{XFOO()} macro when error-checking is enabled.) The
+ − 5326 arguments to the macros are the actual type name (this is used to
+ − 5327 construct the C variable name of the lrecord implementation structure
+ − 5328 and related structures using the @samp{##} macro concatenation
+ − 5329 operator), a string that names the type on the Lisp level (this may not
+ − 5330 be the same as the C type name; typically, the C type name has
+ − 5331 underscores, while the Lisp string has dashes), various method pointers,
+ − 5332 and the name of the C structure that contains the object. The methods
+ − 5333 are used to encapsulate type-specific information about the object, such
+ − 5334 as how to print it or mark it for garbage collection, so that it's easy
+ − 5335 to add new object types without having to add a specific case for each
+ − 5336 new type in a bunch of different places.
428
+ − 5337
+ − 5338 The difference between @code{DEFINE_LRECORD_IMPLEMENTATION()} and
+ − 5339 @code{DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION()} is that the former is
+ − 5340 used for fixed-size object types and the latter is for variable-size
+ − 5341 object types. Most object types are fixed-size; some complex
+ − 5342 types, however (e.g. window configurations), are variable-size.
+ − 5343 Variable-size object types have an extra method, which is called
+ − 5344 to determine the actual size of a particular object of that type.
+ − 5345 (Currently this is only used for keeping allocation statistics.)
+ − 5346
+ − 5347 For the purpose of keeping allocation statistics, the allocation
+ − 5348 engine keeps a list of all the different types that exist. Note that,
+ − 5349 since @code{DEFINE_LRECORD_IMPLEMENTATION()} is a macro that is
442
+ − 5350 specified at top-level, there is no way for it to initialize the global
+ − 5351 data structures containing type information, like
+ − 5352 @code{lrecord_implementations_table}. For this reason a call to
+ − 5353 @code{INIT_LRECORD_IMPLEMENTATION} must be added to the same source file
+ − 5354 containing @code{DEFINE_LRECORD_IMPLEMENTATION}, but instead of to the
+ − 5355 top level, to one of the init functions, typically
+ − 5356 @code{syms_of_@var{foo}.c}. @code{INIT_LRECORD_IMPLEMENTATION} must be
+ − 5357 called before an object of this type is used.
+ − 5358
+ − 5359 The type number is also used to index into an array holding the number
+ − 5360 of objects of each type and the total memory allocated for objects of
+ − 5361 that type. The statistics in this array are computed during the sweep
+ − 5362 stage. These statistics are returned by the call to
+ − 5363 @code{garbage-collect}.
428
+ − 5364
+ − 5365 Note that for every type defined with a @code{DEFINE_LRECORD_*()}
+ − 5366 macro, there needs to be a @code{DECLARE_LRECORD_IMPLEMENTATION()}
+ − 5367 somewhere in a @file{.h} file, and this @file{.h} file needs to be
+ − 5368 included by @file{inline.c}.
+ − 5369
+ − 5370 Furthermore, there should generally be a set of @code{XFOOBAR()},
+ − 5371 @code{FOOBARP()}, etc. macros in a @file{.h} (or occasionally @file{.c})
+ − 5372 file. To create one of these, copy an existing model and modify as
+ − 5373 necessary.
+ − 5374
442
+ − 5375 @strong{Please note:} If you define an lrecord in an external
+ − 5376 dynamically-loaded module, you must use @code{DECLARE_EXTERNAL_LRECORD},
+ − 5377 @code{DEFINE_EXTERNAL_LRECORD_IMPLEMENTATION}, and
+ − 5378 @code{DEFINE_EXTERNAL_LRECORD_SEQUENCE_IMPLEMENTATION} instead of the
+ − 5379 non-EXTERNAL forms. These macros will dynamically add new type numbers
+ − 5380 to the global enum that records them, whereas the non-EXTERNAL forms
+ − 5381 assume that the programmer has already inserted the correct type numbers
+ − 5382 into the enum's code at compile-time.
+ − 5383
428
+ − 5384 The various methods in the lrecord implementation structure are:
+ − 5385
+ − 5386 @enumerate
+ − 5387 @item
+ − 5388 @cindex mark method
+ − 5389 A @dfn{mark} method. This is called during the marking stage and passed
+ − 5390 a function pointer (usually the @code{mark_object()} function), which is
+ − 5391 used to mark an object. All Lisp objects that are contained within the
+ − 5392 object need to be marked by applying this function to them. The mark
444
+ − 5393 method should also return a Lisp object, which should be either @code{nil} or
428
+ − 5394 an object to mark. (This can be used in lieu of calling
+ − 5395 @code{mark_object()} on the object, to reduce the recursion depth, and
+ − 5396 consequently should be the most heavily nested sub-object, such as a
+ − 5397 long list.)
+ − 5398
+ − 5399 @strong{Please note:} When the mark method is called, garbage collection
+ − 5400 is in progress, and special precautions need to be taken when accessing
+ − 5401 objects; see section (B) above.
+ − 5402
+ − 5403 If your mark method does not need to do anything, it can be
+ − 5404 @code{NULL}.
+ − 5405
+ − 5406 @item
+ − 5407 A @dfn{print} method. This is called to create a printed representation
+ − 5408 of the object, whenever @code{princ}, @code{prin1}, or the like is
+ − 5409 called. It is passed the object, a stream to which the output is to be
+ − 5410 directed, and an @code{escapeflag} which indicates whether the object's
+ − 5411 printed representation should be @dfn{escaped} so that it is
+ − 5412 readable. (This corresponds to the difference between @code{princ} and
+ − 5413 @code{prin1}.) Basically, @dfn{escaped} means that strings will have
+ − 5414 quotes around them and confusing characters in the strings such as
+ − 5415 quotes, backslashes, and newlines will be backslashed; and that special
+ − 5416 care will be taken to make symbols print in a readable fashion
+ − 5417 (e.g. symbols that look like numbers will be backslashed). Other
+ − 5418 readable objects should perhaps pass @code{escapeflag} on when
+ − 5419 sub-objects are printed, so that readability is preserved when necessary
+ − 5420 (or if not, always pass in a 1 for @code{escapeflag}). Non-readable
+ − 5421 objects should in general ignore @code{escapeflag}, except that some use
+ − 5422 it as an indication that more verbose output should be given.
+ − 5423
+ − 5424 Sub-objects are printed using @code{print_internal()}, which takes
+ − 5425 exactly the same arguments as are passed to the print method.
+ − 5426
+ − 5427 Literal C strings should be printed using @code{write_c_string()},
+ − 5428 or @code{write_string_1()} for non-null-terminated strings.
+ − 5429
+ − 5430 Functions that do not have a readable representation should check the
+ − 5431 @code{print_readably} flag and signal an error if it is set.
+ − 5432
+ − 5433 If you specify NULL for the print method, the
+ − 5434 @code{default_object_printer()} will be used.
+ − 5435
+ − 5436 @item
+ − 5437 A @dfn{finalize} method. This is called at the beginning of the sweep
+ − 5438 stage on lcrecords that are about to be freed, and should be used to
+ − 5439 perform any extra object cleanup. This typically involves freeing any
+ − 5440 extra @code{malloc()}ed memory associated with the object, releasing any
+ − 5441 operating-system and window-system resources associated with the object
+ − 5442 (e.g. pixmaps, fonts), etc.
+ − 5443
+ − 5444 The finalize method can be NULL if nothing needs to be done.
+ − 5445
+ − 5446 WARNING #1: The finalize method is also called at the end of the dump
+ − 5447 phase; this time with the for_disksave parameter set to non-zero. The
+ − 5448 object is @emph{not} about to disappear, so you have to make sure to
+ − 5449 @emph{not} free any extra @code{malloc()}ed memory if you're going to
+ − 5450 need it later. (Also, signal an error if there are any operating-system
+ − 5451 and window-system resources here, because they can't be dumped.)
+ − 5452
+ − 5453 Finalize methods should, as a rule, set to zero any pointers after
+ − 5454 they've been freed, and check to make sure pointers are not zero before
+ − 5455 freeing. Although I'm pretty sure that finalize methods are not called
+ − 5456 twice on the same object (except for the @code{for_disksave} proviso),
+ − 5457 we've gotten nastily burned in some cases by not doing this.
+ − 5458
+ − 5459 WARNING #2: The finalize method is @emph{only} called for
+ − 5460 lcrecords, @emph{not} for simply lrecords. If you need a
+ − 5461 finalize method for simple lrecords, you have to stick
+ − 5462 it in the @code{ADDITIONAL_FREE_foo()} macro in @file{alloc.c}.
+ − 5463
+ − 5464 WARNING #3: Things are in an @emph{extremely} bizarre state
+ − 5465 when @code{ADDITIONAL_FREE_foo()} is called, so you have to
+ − 5466 be incredibly careful when writing one of these functions.
+ − 5467 See the comment in @code{gc_sweep()}. If you ever have to add
+ − 5468 one of these, consider using an lcrecord or dealing with
+ − 5469 the problem in a different fashion.
+ − 5470
+ − 5471 @item
+ − 5472 An @dfn{equal} method. This compares the two objects for similarity,
+ − 5473 when @code{equal} is called. It should compare the contents of the
+ − 5474 objects in some reasonable fashion. It is passed the two objects and a
+ − 5475 @dfn{depth} value, which is used to catch circular objects. To compare
+ − 5476 sub-Lisp-objects, call @code{internal_equal()} and bump the depth value
+ − 5477 by one. If this value gets too high, a @code{circular-object} error
+ − 5478 will be signaled.
+ − 5479
+ − 5480 If this is NULL, objects are @code{equal} only when they are @code{eq},
+ − 5481 i.e. identical.
+ − 5482
+ − 5483 @item
+ − 5484 A @dfn{hash} method. This is used to hash objects when they are to be
+ − 5485 compared with @code{equal}. The rule here is that if two objects are
+ − 5486 @code{equal}, they @emph{must} hash to the same value; i.e. your hash
+ − 5487 function should use some subset of the sub-fields of the object that are
+ − 5488 compared in the ``equal'' method. If you specify this method as
+ − 5489 @code{NULL}, the object's pointer will be used as the hash, which will
+ − 5490 @emph{fail} if the object has an @code{equal} method, so don't do this.
+ − 5491
+ − 5492 To hash a sub-Lisp-object, call @code{internal_hash()}. Bump the
+ − 5493 depth by one, just like in the ``equal'' method.
+ − 5494
+ − 5495 To convert a Lisp object directly into a hash value (using
+ − 5496 its pointer), use @code{LISP_HASH()}. This is what happens when
+ − 5497 the hash method is NULL.
+ − 5498
+ − 5499 To hash two or more values together into a single value, use
+ − 5500 @code{HASH2()}, @code{HASH3()}, @code{HASH4()}, etc.
+ − 5501
+ − 5502 @item
+ − 5503 @dfn{getprop}, @dfn{putprop}, @dfn{remprop}, and @dfn{plist} methods.
+ − 5504 These are used for object types that have properties. I don't feel like
+ − 5505 documenting them here. If you create one of these objects, you have to
+ − 5506 use different macros to define them,
+ − 5507 i.e. @code{DEFINE_LRECORD_IMPLEMENTATION_WITH_PROPS()} or
+ − 5508 @code{DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION_WITH_PROPS()}.
+ − 5509
+ − 5510 @item
+ − 5511 A @dfn{size_in_bytes} method, when the object is of variable-size.
+ − 5512 (i.e. declared with a @code{_SEQUENCE_IMPLEMENTATION} macro.) This should
+ − 5513 simply return the object's size in bytes, exactly as you might expect.
+ − 5514 For an example, see the methods for window configurations and opaques.
+ − 5515 @end enumerate
+ − 5516
442
+ − 5517 @node Low-level allocation, Cons, lrecords, Allocation of Objects in XEmacs Lisp
428
+ − 5518 @section Low-level allocation
+ − 5519
+ − 5520 Memory that you want to allocate directly should be allocated using
+ − 5521 @code{xmalloc()} rather than @code{malloc()}. This implements
+ − 5522 error-checking on the return value, and once upon a time did some more
+ − 5523 vital stuff (i.e. @code{BLOCK_INPUT}, which is no longer necessary).
+ − 5524 Free using @code{xfree()}, and realloc using @code{xrealloc()}. Note
+ − 5525 that @code{xmalloc()} will do a non-local exit if the memory can't be
+ − 5526 allocated. (Many functions, however, do not expect this, and thus XEmacs
+ − 5527 will likely crash if this happens. @strong{This is a bug.} If you can,
+ − 5528 you should strive to make your function handle this OK. However, it's
+ − 5529 difficult in the general circumstance, perhaps requiring extra
+ − 5530 unwind-protects and such.)
+ − 5531
+ − 5532 Note that XEmacs provides two separate replacements for the standard
+ − 5533 @code{malloc()} library function. These are called @dfn{old GNU malloc}
+ − 5534 (@file{malloc.c}) and @dfn{new GNU malloc} (@file{gmalloc.c}),
+ − 5535 respectively. New GNU malloc is better in pretty much every way than
+ − 5536 old GNU malloc, and should be used if possible. (It used to be that on
+ − 5537 some systems, the old one worked but the new one didn't. I think this
+ − 5538 was due specifically to a bug in SunOS, which the new one now works
+ − 5539 around; so I don't think the old one ever has to be used any more.) The
+ − 5540 primary difference between both of these mallocs and the standard system
+ − 5541 malloc is that they are much faster, at the expense of increased space.
+ − 5542 The basic idea is that memory is allocated in fixed chunks of powers of
+ − 5543 two. This allows for basically constant malloc time, since the various
+ − 5544 chunks can just be kept on a number of free lists. (The standard system
+ − 5545 malloc typically allocates arbitrary-sized chunks and has to spend some
+ − 5546 time, sometimes a significant amount of time, walking the heap looking
+ − 5547 for a free block to use and cleaning things up.) The new GNU malloc
+ − 5548 improves on things by allocating large objects in chunks of 4096 bytes
+ − 5549 rather than in ever larger powers of two, which results in ever larger
+ − 5550 wastage. There is a slight speed loss here, but it's of doubtful
+ − 5551 significance.
+ − 5552
+ − 5553 NOTE: Apparently there is a third-generation GNU malloc that is
+ − 5554 significantly better than the new GNU malloc, and should probably
+ − 5555 be included in XEmacs.
+ − 5556
+ − 5557 There is also the relocating allocator, @file{ralloc.c}. This actually
+ − 5558 moves blocks of memory around so that the @code{sbrk()} pointer shrunk
+ − 5559 and virtual memory released back to the system. On some systems,
+ − 5560 this is a big win. On all systems, it causes a noticeable (and
+ − 5561 sometimes huge) speed penalty, so I turn it off by default.
+ − 5562 @file{ralloc.c} only works with the new GNU malloc in @file{gmalloc.c}.
+ − 5563 There are also two versions of @file{ralloc.c}, one that uses @code{mmap()}
+ − 5564 rather than block copies to move data around. This purports to
+ − 5565 be faster, although that depends on the amount of data that would
+ − 5566 have had to be block copied and the system-call overhead for
+ − 5567 @code{mmap()}. I don't know exactly how this works, except that the
+ − 5568 relocating-allocation routines are pretty much used only for
+ − 5569 the memory allocated for a buffer, which is the biggest consumer
+ − 5570 of space, esp. of space that may get freed later.
+ − 5571
+ − 5572 Note that the GNU mallocs have some ``memory warning'' facilities.
+ − 5573 XEmacs taps into them and issues a warning through the standard
+ − 5574 warning system, when memory gets to 75%, 85%, and 95% full.
+ − 5575 (On some systems, the memory warnings are not functional.)
+ − 5576
+ − 5577 Allocated memory that is going to be used to make a Lisp object
442
+ − 5578 is created using @code{allocate_lisp_storage()}. This just calls
+ − 5579 @code{xmalloc()}. It used to verify that the pointer to the memory can
+ − 5580 fit into a Lisp word, before the current Lisp object representation was
+ − 5581 introduced. @code{allocate_lisp_storage()} is called by
+ − 5582 @code{alloc_lcrecord()}, @code{ALLOCATE_FIXED_TYPE()}, and the vector
+ − 5583 and bit-vector creation routines. These routines also call
+ − 5584 @code{INCREMENT_CONS_COUNTER()} at the appropriate times; this keeps
+ − 5585 statistics on how much memory is allocated, so that garbage-collection
+ − 5586 can be invoked when the threshold is reached.
+ − 5587
+ − 5588 @node Cons, Vector, Low-level allocation, Allocation of Objects in XEmacs Lisp
428
+ − 5589 @section Cons
+ − 5590
+ − 5591 Conses are allocated in standard frob blocks. The only thing to
+ − 5592 note is that conses can be explicitly freed using @code{free_cons()}
+ − 5593 and associated functions @code{free_list()} and @code{free_alist()}. This
+ − 5594 immediately puts the conses onto the cons free list, and decrements
+ − 5595 the statistics on memory allocation appropriately. This is used
+ − 5596 to good effect by some extremely commonly-used code, to avoid
+ − 5597 generating extra objects and thereby triggering GC sooner.
+ − 5598 However, you have to be @emph{extremely} careful when doing this.
+ − 5599 If you mess this up, you will get BADLY BURNED, and it has happened
+ − 5600 before.
+ − 5601
442
+ − 5602 @node Vector, Bit Vector, Cons, Allocation of Objects in XEmacs Lisp
428
+ − 5603 @section Vector
+ − 5604
+ − 5605 As mentioned above, each vector is @code{malloc()}ed individually, and
+ − 5606 all are threaded through the variable @code{all_vectors}. Vectors are
+ − 5607 marked strangely during garbage collection, by kludging the size field.
+ − 5608 Note that the @code{struct Lisp_Vector} is declared with its
+ − 5609 @code{contents} field being a @emph{stretchy} array of one element. It
+ − 5610 is actually @code{malloc()}ed with the right size, however, and access
+ − 5611 to any element through the @code{contents} array works fine.
+ − 5612
442
+ − 5613 @node Bit Vector, Symbol, Vector, Allocation of Objects in XEmacs Lisp
428
+ − 5614 @section Bit Vector
+ − 5615
+ − 5616 Bit vectors work exactly like vectors, except for more complicated
+ − 5617 code to access an individual bit, and except for the fact that bit
+ − 5618 vectors are lrecords while vectors are not. (The only difference here is
+ − 5619 that there's an lrecord implementation pointer at the beginning and the
+ − 5620 tag field in bit vector Lisp words is ``lrecord'' rather than
+ − 5621 ``vector''.)
+ − 5622
442
+ − 5623 @node Symbol, Marker, Bit Vector, Allocation of Objects in XEmacs Lisp
428
+ − 5624 @section Symbol
+ − 5625
442
+ − 5626 Symbols are also allocated in frob blocks. Symbols in the awful
+ − 5627 horrible obarray structure are chained through their @code{next} field.
428
+ − 5628
+ − 5629 Remember that @code{intern} looks up a symbol in an obarray, creating
+ − 5630 one if necessary.
+ − 5631
442
+ − 5632 @node Marker, String, Symbol, Allocation of Objects in XEmacs Lisp
428
+ − 5633 @section Marker
+ − 5634
+ − 5635 Markers are allocated in frob blocks, as usual. They are kept
+ − 5636 in a buffer unordered, but in a doubly-linked list so that they
+ − 5637 can easily be removed. (Formerly this was a singly-linked list,
+ − 5638 but in some cases garbage collection took an extraordinarily
+ − 5639 long time due to the O(N^2) time required to remove lots of
+ − 5640 markers from a buffer.) Markers are removed from a buffer in
+ − 5641 the finalize stage, in @code{ADDITIONAL_FREE_marker()}.
+ − 5642
442
+ − 5643 @node String, Compiled Function, Marker, Allocation of Objects in XEmacs Lisp
428
+ − 5644 @section String
+ − 5645
+ − 5646 As mentioned above, strings are a special case. A string is logically
+ − 5647 two parts, a fixed-size object (containing the length, property list,
+ − 5648 and a pointer to the actual data), and the actual data in the string.
+ − 5649 The fixed-size object is a @code{struct Lisp_String} and is allocated in
+ − 5650 frob blocks, as usual. The actual data is stored in special
+ − 5651 @dfn{string-chars blocks}, which are 8K blocks of memory.
+ − 5652 Currently-allocated strings are simply laid end to end in these
+ − 5653 string-chars blocks, with a pointer back to the @code{struct Lisp_String}
+ − 5654 stored before each string in the string-chars block. When a new string
+ − 5655 needs to be allocated, the remaining space at the end of the last
+ − 5656 string-chars block is used if there's enough, and a new string-chars
+ − 5657 block is created otherwise.
+ − 5658
+ − 5659 There are never any holes in the string-chars blocks due to the string
+ − 5660 compaction and relocation that happens at the end of garbage collection.
+ − 5661 During the sweep stage of garbage collection, when objects are
+ − 5662 reclaimed, the garbage collector goes through all string-chars blocks,
+ − 5663 looking for unused strings. Each chunk of string data is preceded by a
+ − 5664 pointer to the corresponding @code{struct Lisp_String}, which indicates
+ − 5665 both whether the string is used and how big the string is, i.e. how to
+ − 5666 get to the next chunk of string data. Holes are compressed by
+ − 5667 block-copying the next string into the empty space and relocating the
+ − 5668 pointer stored in the corresponding @code{struct Lisp_String}.
+ − 5669 @strong{This means you have to be careful with strings in your code.}
+ − 5670 See the section above on @code{GCPRO}ing.
+ − 5671
+ − 5672 Note that there is one situation not handled: a string that is too big
+ − 5673 to fit into a string-chars block. Such strings, called @dfn{big
+ − 5674 strings}, are all @code{malloc()}ed as their own block. (#### Although it
+ − 5675 would make more sense for the threshold for big strings to be somewhat
+ − 5676 lower, e.g. 1/2 or 1/4 the size of a string-chars block. It seems that
440
+ − 5677 this was indeed the case formerly---indeed, the threshold was set at
+ − 5678 1/8---but Mly forgot about this when rewriting things for 19.8.)
428
+ − 5679
+ − 5680 Note also that the string data in string-chars blocks is padded as
+ − 5681 necessary so that proper alignment constraints on the @code{struct
+ − 5682 Lisp_String} back pointers are maintained.
+ − 5683
+ − 5684 Finally, strings can be resized. This happens in Mule when a
+ − 5685 character is substituted with a different-length character, or during
+ − 5686 modeline frobbing. (You could also export this to Lisp, but it's not
+ − 5687 done so currently.) Resizing a string is a potentially tricky process.
+ − 5688 If the change is small enough that the padding can absorb it, nothing
+ − 5689 other than a simple memory move needs to be done. Keep in mind,
+ − 5690 however, that the string can't shrink too much because the offset to the
+ − 5691 next string in the string-chars block is computed by looking at the
+ − 5692 length and rounding to the nearest multiple of four or eight. If the
+ − 5693 string would shrink or expand beyond the correct padding, new string
+ − 5694 data needs to be allocated at the end of the last string-chars block and
+ − 5695 the data moved appropriately. This leaves some dead string data, which
+ − 5696 is marked by putting a special marker of 0xFFFFFFFF in the @code{struct
+ − 5697 Lisp_String} pointer before the data (there's no real @code{struct
+ − 5698 Lisp_String} to point to and relocate), and storing the size of the dead
+ − 5699 string data (which would normally be obtained from the now-non-existent
+ − 5700 @code{struct Lisp_String}) at the beginning of the dead string data gap.
+ − 5701 The string compactor recognizes this special 0xFFFFFFFF marker and
+ − 5702 handles it correctly.
+ − 5703
442
+ − 5704 @node Compiled Function, , String, Allocation of Objects in XEmacs Lisp
428
+ − 5705 @section Compiled Function
+ − 5706
+ − 5707 Not yet documented.
+ − 5708
442
+ − 5709
+ − 5710 @node Dumping, Events and the Event Loop, Allocation of Objects in XEmacs Lisp, Top
+ − 5711 @chapter Dumping
+ − 5712
+ − 5713 @section What is dumping and its justification
+ − 5714
+ − 5715 The C code of XEmacs is just a Lisp engine with a lot of built-in
+ − 5716 primitives useful for writing an editor. The editor itself is written
+ − 5717 mostly in Lisp, and represents around 100K lines of code. Loading and
+ − 5718 executing the initialization of all this code takes a bit a time (five
+ − 5719 to ten times the usual startup time of current xemacs) and requires
+ − 5720 having all the lisp source files around. Having to reload them each
+ − 5721 time the editor is started would not be acceptable.
+ − 5722
+ − 5723 The traditional solution to this problem is called dumping: the build
+ − 5724 process first creates the lisp engine under the name @file{temacs}, then
+ − 5725 runs it until it has finished loading and initializing all the lisp
+ − 5726 code, and eventually creates a new executable called @file{xemacs}
+ − 5727 including both the object code in @file{temacs} and all the contents of
+ − 5728 the memory after the initialization.
+ − 5729
+ − 5730 This solution, while working, has a huge problem: the creation of the
+ − 5731 new executable from the actual contents of memory is an extremely
+ − 5732 system-specific process, quite error-prone, and which interferes with a
+ − 5733 lot of system libraries (like malloc). It is even getting worse
+ − 5734 nowadays with libraries using constructors which are automatically
+ − 5735 called when the program is started (even before main()) which tend to
+ − 5736 crash when they are called multiple times, once before dumping and once
+ − 5737 after (IRIX 6.x libz.so pulls in some C++ image libraries thru
+ − 5738 dependencies which have this problem). Writing the dumper is also one
+ − 5739 of the most difficult parts of porting XEmacs to a new operating system.
+ − 5740 Basically, `dumping' is an operation that is just not officially
+ − 5741 supported on many operating systems.
+ − 5742
+ − 5743 The aim of the portable dumper is to solve the same problem as the
+ − 5744 system-specific dumper, that is to be able to reload quickly, using only
+ − 5745 a small number of files, the fully initialized lisp part of the editor,
+ − 5746 without any system-specific hacks.
+ − 5747
+ − 5748 @menu
+ − 5749 * Overview::
+ − 5750 * Data descriptions::
+ − 5751 * Dumping phase::
+ − 5752 * Reloading phase::
+ − 5753 * Remaining issues::
+ − 5754 @end menu
+ − 5755
+ − 5756 @node Overview, Data descriptions, Dumping, Dumping
+ − 5757 @section Overview
+ − 5758
+ − 5759 The portable dumping system has to:
+ − 5760
+ − 5761 @enumerate
+ − 5762 @item
+ − 5763 At dump time, write all initialized, non-quickly-rebuildable data to a
+ − 5764 file [Note: currently named @file{xemacs.dmp}, but the name will
+ − 5765 change], along with all informations needed for the reloading.
+ − 5766
+ − 5767 @item
+ − 5768 When starting xemacs, reload the dump file, relocate it to its new
+ − 5769 starting address if needed, and reinitialize all pointers to this
+ − 5770 data. Also, rebuild all the quickly rebuildable data.
+ − 5771 @end enumerate
+ − 5772
+ − 5773 @node Data descriptions, Dumping phase, Overview, Dumping
+ − 5774 @section Data descriptions
+ − 5775
+ − 5776 The more complex task of the dumper is to be able to write lisp objects
+ − 5777 (lrecords) and C structs to disk and reload them at a different address,
+ − 5778 updating all the pointers they include in the process. This is done by
+ − 5779 using external data descriptions that give information about the layout
+ − 5780 of the structures in memory.
+ − 5781
+ − 5782 The specification of these descriptions is in lrecord.h. A description
+ − 5783 of an lrecord is an array of struct lrecord_description. Each of these
+ − 5784 structs include a type, an offset in the structure and some optional
+ − 5785 parameters depending on the type. For instance, here is the string
+ − 5786 description:
+ − 5787
+ − 5788 @example
+ − 5789 static const struct lrecord_description string_description[] = @{
+ − 5790 @{ XD_BYTECOUNT, offsetof (Lisp_String, size) @},
+ − 5791 @{ XD_OPAQUE_DATA_PTR, offsetof (Lisp_String, data), XD_INDIRECT(0, 1) @},
+ − 5792 @{ XD_LISP_OBJECT, offsetof (Lisp_String, plist) @},
+ − 5793 @{ XD_END @}
+ − 5794 @};
+ − 5795 @end example
+ − 5796
+ − 5797 The first line indicates a member of type Bytecount, which is used by
+ − 5798 the next, indirect directive. The second means "there is a pointer to
+ − 5799 some opaque data in the field @code{data}". The length of said data is
+ − 5800 given by the expression @code{XD_INDIRECT(0, 1)}, which means "the value
+ − 5801 in the 0th line of the description (welcome to C) plus one". The third
+ − 5802 line means "there is a Lisp_Object member @code{plist} in the Lisp_String
+ − 5803 structure". @code{XD_END} then ends the description.
+ − 5804
+ − 5805 This gives us all the information we need to move around what is pointed
+ − 5806 to by a structure (C or lrecord) and, by transitivity, everything that
+ − 5807 it points to. The only missing information for dumping is the size of
+ − 5808 the structure. For lrecords, this is part of the
+ − 5809 lrecord_implementation, so we don't need to duplicate it. For C
+ − 5810 structures we use a struct struct_description, which includes a size
+ − 5811 field and a pointer to an associated array of lrecord_description.
+ − 5812
+ − 5813 @node Dumping phase, Reloading phase, Data descriptions, Dumping
+ − 5814 @section Dumping phase
+ − 5815
+ − 5816 Dumping is done by calling the function pdump() (in dumper.c) which is
+ − 5817 invoked from Fdump_emacs (in emacs.c). This function performs a number
+ − 5818 of tasks.
+ − 5819
+ − 5820 @menu
+ − 5821 * Object inventory::
+ − 5822 * Address allocation::
+ − 5823 * The header::
+ − 5824 * Data dumping::
+ − 5825 * Pointers dumping::
+ − 5826 @end menu
+ − 5827
+ − 5828 @node Object inventory, Address allocation, Dumping phase, Dumping phase
+ − 5829 @subsection Object inventory
+ − 5830
+ − 5831 The first task is to build the list of the objects to dump. This
+ − 5832 includes:
+ − 5833
+ − 5834 @itemize @bullet
+ − 5835 @item lisp objects
+ − 5836 @item C structures
+ − 5837 @end itemize
+ − 5838
+ − 5839 We end up with one @code{pdump_entry_list_elmt} per object group (arrays
+ − 5840 of C structs are kept together) which includes a pointer to the first
+ − 5841 object of the group, the per-object size and the count of objects in the
+ − 5842 group, along with some other information which is initialized later.
+ − 5843
+ − 5844 These entries are linked together in @code{pdump_entry_list} structures
+ − 5845 and can be enumerated thru either:
+ − 5846
+ − 5847 @enumerate
+ − 5848 @item
+ − 5849 the @code{pdump_object_table}, an array of @code{pdump_entry_list}, one
+ − 5850 per lrecord type, indexed by type number.
+ − 5851
+ − 5852 @item
+ − 5853 the @code{pdump_opaque_data_list}, used for the opaque data which does
+ − 5854 not include pointers, and hence does not need descriptions.
+ − 5855
+ − 5856 @item
+ − 5857 the @code{pdump_struct_table}, which is a vector of
+ − 5858 @code{struct_description}/@code{pdump_entry_list} pairs, used for
+ − 5859 non-opaque C structures.
+ − 5860 @end enumerate
+ − 5861
+ − 5862 This uses a marking strategy similar to the garbage collector. Some
+ − 5863 differences though:
+ − 5864
+ − 5865 @enumerate
+ − 5866 @item
+ − 5867 We do not use the mark bit (which does not exist for C structures
+ − 5868 anyway), we use a big hash table instead.
+ − 5869
+ − 5870 @item
+ − 5871 We do not use the mark function of lrecords but instead rely on the
+ − 5872 external descriptions. This happens essentially because we need to
+ − 5873 follow pointers to C structures and opaque data in addition to
+ − 5874 Lisp_Object members.
+ − 5875 @end enumerate
+ − 5876
+ − 5877 This is done by @code{pdump_register_object}, which handles Lisp_Object
+ − 5878 variables, and pdump_register_struct which handles C structures, which
+ − 5879 both delegate the description management to pdump_register_sub.
+ − 5880
+ − 5881 The hash table doubles as a map object to pdump_entry_list_elmt (i.e.
+ − 5882 allows us to look up a pdump_entry_list_elmt with the object it points
+ − 5883 to). Entries are added with @code{pdump_add_entry()} and looked up with
+ − 5884 @code{pdump_get_entry()}. There is no need for entry removal. The hash
+ − 5885 value is computed quite basically from the object pointer by
+ − 5886 @code{pdump_make_hash()}.
+ − 5887
+ − 5888 The roots for the marking are:
+ − 5889
+ − 5890 @enumerate
+ − 5891 @item
+ − 5892 the @code{staticpro}'ed variables (there is a special @code{staticpro_nodump()}
+ − 5893 call for protected variables we do not want to dump).
+ − 5894
+ − 5895 @item
+ − 5896 the @code{pdump_wire}'d variables (@code{staticpro} is equivalent to
+ − 5897 @code{staticpro_nodump()} + @code{pdump_wire()}).
+ − 5898
+ − 5899 @item
+ − 5900 the @code{dumpstruct}'ed variables, which points to C structures.
+ − 5901 @end enumerate
+ − 5902
+ − 5903 This does not include the GCPRO'ed variables, the specbinds, the
+ − 5904 catchtags, the backlist, the redisplay or the profiling info, since we
+ − 5905 do not want to rebuild the actual chain of lisp calls which end up to
+ − 5906 the dump-emacs call, only the global variables.
+ − 5907
+ − 5908 Weak lists and weak hash tables are dumped as if they were their
+ − 5909 non-weak equivalent (without changing their type, of course). This has
+ − 5910 not yet been a problem.
+ − 5911
+ − 5912 @node Address allocation, The header, Object inventory, Dumping phase
+ − 5913 @subsection Address allocation
+ − 5914
+ − 5915
+ − 5916 The next step is to allocate the offsets of each of the objects in the
+ − 5917 final dump file. This is done by @code{pdump_allocate_offset()} which
+ − 5918 is called indirectly by @code{pdump_scan_by_alignment()}.
+ − 5919
+ − 5920 The strategy to deal with alignment problems uses these facts:
+ − 5921
+ − 5922 @enumerate
+ − 5923 @item
+ − 5924 real world alignment requirements are powers of two.
+ − 5925
+ − 5926 @item
+ − 5927 the C compiler is required to adjust the size of a struct so that you
+ − 5928 can have an array of them next to each other. This means you can have a
+ − 5929 upper bound of the alignment requirements of a given structure by
+ − 5930 looking at which power of two its size is a multiple.
+ − 5931
+ − 5932 @item
+ − 5933 the non-variant part of variable size lrecords has an alignment
+ − 5934 requirement of 4.
+ − 5935 @end enumerate
+ − 5936
+ − 5937 Hence, for each lrecord type, C struct type or opaque data block the
+ − 5938 alignment requirement is computed as a power of two, with a minimum of
+ − 5939 2^2 for lrecords. @code{pdump_scan_by_alignment()} then scans all the
+ − 5940 @code{pdump_entry_list_elmt}'s, the ones with the highest requirements
+ − 5941 first. This ensures the best packing.
+ − 5942
+ − 5943 The maximum alignment requirement we take into account is 2^8.
+ − 5944
+ − 5945 @code{pdump_allocate_offset()} only has to do a linear allocation,
+ − 5946 starting at offset 256 (this leaves room for the header and keep the
+ − 5947 alignments happy).
+ − 5948
+ − 5949 @node The header, Data dumping, Address allocation, Dumping phase
+ − 5950 @subsection The header
+ − 5951
+ − 5952 The next step creates the file and writes a header with a signature and
+ − 5953 some random informations in it (number of staticpro, number of assigned
+ − 5954 lrecord types, etc...). The reloc_address field, which indicates at
+ − 5955 which address the file should be loaded if we want to avoid post-reload
+ − 5956 relocation, is set to 0. It then seeks to offset 256 (base offset for
+ − 5957 the objects).
+ − 5958
+ − 5959 @node Data dumping, Pointers dumping, The header, Dumping phase
+ − 5960 @subsection Data dumping
+ − 5961
+ − 5962 The data is dumped in the same order as the addresses were allocated by
+ − 5963 @code{pdump_dump_data()}, called from @code{pdump_scan_by_alignment()}.
+ − 5964 This function copies the data to a temporary buffer, relocates all
+ − 5965 pointers in the object to the addresses allocated in step Address
+ − 5966 Allocation, and writes it to the file. Using the same order means that,
+ − 5967 if we are careful with lrecords whose size is not a multiple of 4, we
+ − 5968 are ensured that the object is always written at the offset in the file
+ − 5969 allocated in step Address Allocation.
+ − 5970
+ − 5971 @node Pointers dumping, , Data dumping, Dumping phase
+ − 5972 @subsection Pointers dumping
+ − 5973
+ − 5974 A bunch of tables needed to reassign properly the global pointers are
+ − 5975 then written. They are:
+ − 5976
+ − 5977 @enumerate
+ − 5978 @item
+ − 5979 the staticpro array
+ − 5980 @item
+ − 5981 the dumpstruct array
+ − 5982 @item
+ − 5983 the lrecord_implementation_table array
+ − 5984 @item
+ − 5985 a vector of all the offsets to the objects in the file that include a
+ − 5986 description (for faster relocation at reload time)
+ − 5987 @item
+ − 5988 the pdump_wired and pdump_wired_list arrays
+ − 5989 @end enumerate
+ − 5990
+ − 5991 For each of the arrays we write both the pointer to the variables and
+ − 5992 the relocated offset of the object they point to. Since these variables
+ − 5993 are global, the pointers are still valid when restarting the program and
+ − 5994 are used to regenerate the global pointers.
+ − 5995
+ − 5996 The @code{pdump_wired_list} array is a special case. The variables it
+ − 5997 points to are the head of weak linked lists of lisp objects of the same
+ − 5998 type. Not all objects of this list are dumped so the relocated pointer
+ − 5999 we associate with them points to the first dumped object of the list, or
+ − 6000 Qnil if none is available. This is also the reason why they are not
+ − 6001 used as roots for the purpose of object enumeration.
+ − 6002
+ − 6003 This is the end of the dumping part.
+ − 6004
+ − 6005 @node Reloading phase, Remaining issues, Dumping phase, Dumping
+ − 6006 @section Reloading phase
+ − 6007
+ − 6008 @subsection File loading
+ − 6009
+ − 6010 The file is mmap'ed in memory (which ensures a PAGESIZE alignment, at
+ − 6011 least 4096), or if mmap is unavailable or fails, a 256-bytes aligned
+ − 6012 malloc is done and the file is loaded.
+ − 6013
+ − 6014 Some variables are reinitialized from the values found in the header.
+ − 6015
+ − 6016 The difference between the actual loading address and the reloc_address
+ − 6017 is computed and will be used for all the relocations.
+ − 6018
+ − 6019
+ − 6020 @subsection Putting back the staticvec
+ − 6021
+ − 6022 The staticvec array is memcpy'd from the file and the variables it
+ − 6023 points to are reset to the relocated objects addresses.
+ − 6024
+ − 6025
+ − 6026 @subsection Putting back the dumpstructed variables
+ − 6027
+ − 6028 The variables pointed to by dumpstruct in the dump phase are reset to
+ − 6029 the right relocated object addresses.
+ − 6030
+ − 6031
+ − 6032 @subsection lrecord_implementations_table
+ − 6033
+ − 6034 The lrecord_implementations_table is reset to its dump time state and
+ − 6035 the right lrecord_type_index values are put in.
+ − 6036
+ − 6037
+ − 6038 @subsection Object relocation
+ − 6039
+ − 6040 All the objects are relocated using their description and their offset
+ − 6041 by @code{pdump_reloc_one}. This step is unnecessary if the
+ − 6042 reloc_address is equal to the file loading address.
+ − 6043
+ − 6044
+ − 6045 @subsection Putting back the pdump_wire and pdump_wire_list variables
+ − 6046
+ − 6047 Same as Putting back the dumpstructed variables.
+ − 6048
+ − 6049
+ − 6050 @subsection Reorganize the hash tables
+ − 6051
+ − 6052 Since some of the hash values in the lisp hash tables are
+ − 6053 address-dependent, their layout is now wrong. So we go through each of
+ − 6054 them and have them resorted by calling @code{pdump_reorganize_hash_table}.
+ − 6055
+ − 6056 @node Remaining issues, , Reloading phase, Dumping
+ − 6057 @section Remaining issues
+ − 6058
+ − 6059 The build process will have to start a post-dump xemacs, ask it the
+ − 6060 loading address (which will, hopefully, be always the same between
+ − 6061 different xemacs invocations) and relocate the file to the new address.
+ − 6062 This way the object relocation phase will not have to be done, which
+ − 6063 means no writes in the objects and that, because of the use of mmap, the
+ − 6064 dumped data will be shared between all the xemacs running on the
+ − 6065 computer.
+ − 6066
+ − 6067 Some executable signature will be necessary to ensure that a given dump
+ − 6068 file is really associated with a given executable, or random crashes
+ − 6069 will occur. Maybe a random number set at compile or configure time thru
+ − 6070 a define. This will also allow for having differently-compiled xemacsen
+ − 6071 on the same system (mule and no-mule comes to mind).
+ − 6072
+ − 6073 The DOC file contents should probably end up in the dump file.
+ − 6074
+ − 6075
+ − 6076 @node Events and the Event Loop, Evaluation; Stack Frames; Bindings, Dumping, Top
428
+ − 6077 @chapter Events and the Event Loop
+ − 6078
+ − 6079 @menu
+ − 6080 * Introduction to Events::
+ − 6081 * Main Loop::
+ − 6082 * Specifics of the Event Gathering Mechanism::
+ − 6083 * Specifics About the Emacs Event::
+ − 6084 * The Event Stream Callback Routines::
+ − 6085 * Other Event Loop Functions::
+ − 6086 * Converting Events::
+ − 6087 * Dispatching Events; The Command Builder::
+ − 6088 @end menu
+ − 6089
442
+ − 6090 @node Introduction to Events, Main Loop, Events and the Event Loop, Events and the Event Loop
428
+ − 6091 @section Introduction to Events
+ − 6092
+ − 6093 An event is an object that encapsulates information about an
+ − 6094 interesting occurrence in the operating system. Events are
+ − 6095 generated either by user action, direct (e.g. typing on the
+ − 6096 keyboard or moving the mouse) or indirect (moving another
+ − 6097 window, thereby generating an expose event on an Emacs frame),
+ − 6098 or as a result of some other typically asynchronous action happening,
+ − 6099 such as output from a subprocess being ready or a timer expiring.
+ − 6100 Events come into the system in an asynchronous fashion (typically
+ − 6101 through a callback being called) and are converted into a
+ − 6102 synchronous event queue (first-in, first-out) in a process that
+ − 6103 we will call @dfn{collection}.
+ − 6104
+ − 6105 Note that each application has its own event queue. (It is
+ − 6106 immaterial whether the collection process directly puts the
+ − 6107 events in the proper application's queue, or puts them into
+ − 6108 a single system queue, which is later split up.)
+ − 6109
+ − 6110 The most basic level of event collection is done by the
+ − 6111 operating system or window system. Typically, XEmacs does
+ − 6112 its own event collection as well. Often there are multiple
+ − 6113 layers of collection in XEmacs, with events from various
+ − 6114 sources being collected into a queue, which is then combined
+ − 6115 with other sources to go into another queue (i.e. a second
+ − 6116 level of collection), with perhaps another level on top of
+ − 6117 this, etc.
+ − 6118
+ − 6119 XEmacs has its own types of events (called @dfn{Emacs events}),
+ − 6120 which provides an abstract layer on top of the system-dependent
+ − 6121 nature of the most basic events that are received. Part of the
+ − 6122 complex nature of the XEmacs event collection process involves
+ − 6123 converting from the operating-system events into the proper
440
+ − 6124 Emacs events---there may not be a one-to-one correspondence.
428
+ − 6125
+ − 6126 Emacs events are documented in @file{events.h}; I'll discuss them
+ − 6127 later.
+ − 6128
442
+ − 6129 @node Main Loop, Specifics of the Event Gathering Mechanism, Introduction to Events, Events and the Event Loop
428
+ − 6130 @section Main Loop
+ − 6131
+ − 6132 The @dfn{command loop} is the top-level loop that the editor is always
+ − 6133 running. It loops endlessly, calling @code{next-event} to retrieve an
+ − 6134 event and @code{dispatch-event} to execute it. @code{dispatch-event} does
+ − 6135 the appropriate thing with non-user events (process, timeout,
+ − 6136 magic, eval, mouse motion); this involves calling a Lisp handler
+ − 6137 function, redrawing a newly-exposed part of a frame, reading
+ − 6138 subprocess output, etc. For user events, @code{dispatch-event}
+ − 6139 looks up the event in relevant keymaps or menubars; when a
+ − 6140 full key sequence or menubar selection is reached, the appropriate
+ − 6141 function is executed. @code{dispatch-event} may have to keep state
+ − 6142 across calls; this is done in the ``command-builder'' structure
+ − 6143 associated with each console (remember, there's usually only
+ − 6144 one console), and the engine that looks up keystrokes and
+ − 6145 constructs full key sequences is called the @dfn{command builder}.
+ − 6146 This is documented elsewhere.
+ − 6147
+ − 6148 The guts of the command loop are in @code{command_loop_1()}. This
440
+ − 6149 function doesn't catch errors, though---that's the job of
428
+ − 6150 @code{command_loop_2()}, which is a condition-case (i.e. error-trapping)
+ − 6151 wrapper around @code{command_loop_1()}. @code{command_loop_1()} never
+ − 6152 returns, but may get thrown out of.
+ − 6153
+ − 6154 When an error occurs, @code{cmd_error()} is called, which usually
+ − 6155 invokes the Lisp error handler in @code{command-error}; however, a
+ − 6156 default error handler is provided if @code{command-error} is @code{nil}
+ − 6157 (e.g. during startup). The purpose of the error handler is simply to
+ − 6158 display the error message and do associated cleanup; it does not need to
+ − 6159 throw anywhere. When the error handler finishes, the condition-case in
+ − 6160 @code{command_loop_2()} will finish and @code{command_loop_2()} will
+ − 6161 reinvoke @code{command_loop_1()}.
+ − 6162
+ − 6163 @code{command_loop_2()} is invoked from three places: from
+ − 6164 @code{initial_command_loop()} (called from @code{main()} at the end of
+ − 6165 internal initialization), from the Lisp function @code{recursive-edit},
+ − 6166 and from @code{call_command_loop()}.
+ − 6167
+ − 6168 @code{call_command_loop()} is called when a macro is started and when
+ − 6169 the minibuffer is entered; normal termination of the macro or minibuffer
+ − 6170 causes a throw out of the recursive command loop. (To
+ − 6171 @code{execute-kbd-macro} for macros and @code{exit} for minibuffers.
+ − 6172 Note also that the low-level minibuffer-entering function,
+ − 6173 @code{read-minibuffer-internal}, provides its own error handling and
+ − 6174 does not need @code{command_loop_2()}'s error encapsulation; so it tells
+ − 6175 @code{call_command_loop()} to invoke @code{command_loop_1()} directly.)
+ − 6176
+ − 6177 Note that both read-minibuffer-internal and recursive-edit set up a
+ − 6178 catch for @code{exit}; this is why @code{abort-recursive-edit}, which
+ − 6179 throws to this catch, exits out of either one.
+ − 6180
+ − 6181 @code{initial_command_loop()}, called from @code{main()}, sets up a
+ − 6182 catch for @code{top-level} when invoking @code{command_loop_2()},
+ − 6183 allowing functions to throw all the way to the top level if they really
+ − 6184 need to. Before invoking @code{command_loop_2()},
+ − 6185 @code{initial_command_loop()} calls @code{top_level_1()}, which handles
+ − 6186 all of the startup stuff (creating the initial frame, handling the
+ − 6187 command-line options, loading the user's @file{.emacs} file, etc.). The
+ − 6188 function that actually does this is in Lisp and is pointed to by the
+ − 6189 variable @code{top-level}; normally this function is
+ − 6190 @code{normal-top-level}. @code{top_level_1()} is just an error-handling
+ − 6191 wrapper similar to @code{command_loop_2()}. Note also that
+ − 6192 @code{initial_command_loop()} sets up a catch for @code{top-level} when
+ − 6193 invoking @code{top_level_1()}, just like when it invokes
+ − 6194 @code{command_loop_2()}.
+ − 6195
442
+ − 6196 @node Specifics of the Event Gathering Mechanism, Specifics About the Emacs Event, Main Loop, Events and the Event Loop
428
+ − 6197 @section Specifics of the Event Gathering Mechanism
+ − 6198
+ − 6199 Here is an approximate diagram of the collection processes
+ − 6200 at work in XEmacs, under TTY's (TTY's are simpler than X
+ − 6201 so we'll look at this first):
+ − 6202
+ − 6203 @noindent
+ − 6204 @example
+ − 6205 asynch. asynch. asynch. asynch. [Collectors in
+ − 6206 kbd events kbd events process process the OS]
+ − 6207 | | output output
+ − 6208 | | | |
+ − 6209 | | | | SIGINT, [signal handlers
+ − 6210 | | | | SIGQUIT, in XEmacs]
+ − 6211 V V V V SIGWINCH,
+ − 6212 file file file file SIGALRM
+ − 6213 desc. desc. desc. desc. |
+ − 6214 (TTY) (TTY) (pipe) (pipe) |
+ − 6215 | | | | fake timeouts
+ − 6216 | | | | file |
+ − 6217 | | | | desc. |
+ − 6218 | | | | (pipe) |
+ − 6219 | | | | | |
+ − 6220 | | | | | |
+ − 6221 | | | | | |
+ − 6222 V V V V V V
+ − 6223 ------>-----------<----------------<----------------
+ − 6224 |
+ − 6225 |
+ − 6226 | [collected using select() in emacs_tty_next_event()
+ − 6227 | and converted to the appropriate Emacs event]
+ − 6228 |
+ − 6229 |
+ − 6230 V (above this line is TTY-specific)
+ − 6231 Emacs -----------------------------------------------
+ − 6232 event (below this line is the generic event mechanism)
+ − 6233 |
+ − 6234 |
+ − 6235 was there if not, call
+ − 6236 a SIGINT? emacs_tty_next_event()
+ − 6237 | |
+ − 6238 | |
+ − 6239 | |
+ − 6240 V V
+ − 6241 --->------<----
+ − 6242 |
+ − 6243 | [collected in event_stream_next_event();
+ − 6244 | SIGINT is converted using maybe_read_quit_event()]
+ − 6245 V
+ − 6246 Emacs
+ − 6247 event
+ − 6248 |
+ − 6249 \---->------>----- maybe_kbd_translate() ---->---\
+ − 6250 |
+ − 6251 |
+ − 6252 |
+ − 6253 command event queue |
+ − 6254 if not from command
+ − 6255 (contains events that were event queue, call
+ − 6256 read earlier but not processed, event_stream_next_event()
+ − 6257 typically when waiting in a |
+ − 6258 sit-for, sleep-for, etc. for |
+ − 6259 a particular event to be received) |
+ − 6260 | |
+ − 6261 | |
+ − 6262 V V
+ − 6263 ---->------------------------------------<----
+ − 6264 |
+ − 6265 | [collected in
+ − 6266 | next_event_internal()]
+ − 6267 |
+ − 6268 unread- unread- event from |
+ − 6269 command- command- keyboard else, call
+ − 6270 events event macro next_event_internal()
+ − 6271 | | | |
+ − 6272 | | | |
+ − 6273 | | | |
+ − 6274 V V V V
+ − 6275 --------->----------------------<------------
+ − 6276 |
+ − 6277 | [collected in `next-event', which may loop
+ − 6278 | more than once if the event it gets is on
+ − 6279 | a dead frame, device, etc.]
+ − 6280 |
+ − 6281 |
+ − 6282 V
+ − 6283 feed into top-level event loop,
+ − 6284 which repeatedly calls `next-event'
+ − 6285 and then dispatches the event
+ − 6286 using `dispatch-event'
+ − 6287 @end example
+ − 6288
+ − 6289 Notice the separation between TTY-specific and generic event mechanism.
+ − 6290 When using the Xt-based event loop, the TTY-specific stuff is replaced
+ − 6291 but the rest stays the same.
+ − 6292
+ − 6293 It's also important to realize that only one different kind of
+ − 6294 system-specific event loop can be operating at a time, and must be able
+ − 6295 to receive all kinds of events simultaneously. For the two existing
+ − 6296 event loops (implemented in @file{event-tty.c} and @file{event-Xt.c},
+ − 6297 respectively), the TTY event loop @emph{only} handles TTY consoles,
+ − 6298 while the Xt event loop handles @emph{both} TTY and X consoles. This
+ − 6299 situation is different from all of the output handlers, where you simply
+ − 6300 have one per console type.
+ − 6301
+ − 6302 Here's the Xt Event Loop Diagram (notice that below a certain point,
+ − 6303 it's the same as the above diagram):
+ − 6304
+ − 6305 @example
+ − 6306 asynch. asynch. asynch. asynch. [Collectors in
+ − 6307 kbd kbd process process the OS]
+ − 6308 events events output output
+ − 6309 | | | |
+ − 6310 | | | | asynch. asynch. [Collectors in the
+ − 6311 | | | | X X OS and X Window System]
+ − 6312 | | | | events events
+ − 6313 | | | | | |
+ − 6314 | | | | | |
+ − 6315 | | | | | | SIGINT, [signal handlers
+ − 6316 | | | | | | SIGQUIT, in XEmacs]
+ − 6317 | | | | | | SIGWINCH,
+ − 6318 | | | | | | SIGALRM
+ − 6319 | | | | | | |
+ − 6320 | | | | | | |
+ − 6321 | | | | | | | timeouts
+ − 6322 | | | | | | | |
+ − 6323 | | | | | | | |
+ − 6324 | | | | | | V |
+ − 6325 V V V V V V fake |
+ − 6326 file file file file file file file |
+ − 6327 desc. desc. desc. desc. desc. desc. desc. |
+ − 6328 (TTY) (TTY) (pipe) (pipe) (socket) (socket) (pipe) |
+ − 6329 | | | | | | | |
+ − 6330 | | | | | | | |
+ − 6331 | | | | | | | |
+ − 6332 V V V V V V V V
+ − 6333 --->----------------------------------------<---------<------
+ − 6334 | | |
+ − 6335 | | |[collected using select() in
+ − 6336 | | | _XtWaitForSomething(), called
+ − 6337 | | | from XtAppProcessEvent(), called
+ − 6338 | | | in emacs_Xt_next_event();
+ − 6339 | | | dispatched to various callbacks]
+ − 6340 | | |
+ − 6341 | | |
+ − 6342 emacs_Xt_ p_s_callback(), | [popup_selection_callback]
+ − 6343 event_handler() x_u_v_s_callback(),| [x_update_vertical_scrollbar_
+ − 6344 | x_u_h_s_callback(),| callback]
+ − 6345 | search_callback() | [x_update_horizontal_scrollbar_
+ − 6346 | | | callback]
+ − 6347 | | |
+ − 6348 | | |
+ − 6349 enqueue_Xt_ signal_special_ |
+ − 6350 dispatch_event() Xt_user_event() |
+ − 6351 [maybe multiple | |
+ − 6352 times, maybe 0 | |
+ − 6353 times] | |
+ − 6354 | enqueue_Xt_ |
+ − 6355 | dispatch_event() |
+ − 6356 | | |
+ − 6357 | | |
+ − 6358 V V |
+ − 6359 -->----------<-- |
+ − 6360 | |
+ − 6361 | |
+ − 6362 dispatch Xt_what_callback()
+ − 6363 event sets flags
+ − 6364 queue |
+ − 6365 | |
+ − 6366 | |
+ − 6367 | |
+ − 6368 | |
+ − 6369 ---->-----------<--------
+ − 6370 |
+ − 6371 |
+ − 6372 | [collected and converted as appropriate in
+ − 6373 | emacs_Xt_next_event()]
+ − 6374 |
+ − 6375 |
+ − 6376 V (above this line is Xt-specific)
+ − 6377 Emacs ------------------------------------------------
+ − 6378 event (below this line is the generic event mechanism)
+ − 6379 |
+ − 6380 |
+ − 6381 was there if not, call
+ − 6382 a SIGINT? emacs_Xt_next_event()
+ − 6383 | |
+ − 6384 | |
+ − 6385 | |
+ − 6386 V V
+ − 6387 --->-------<----
+ − 6388 |
+ − 6389 | [collected in event_stream_next_event();
+ − 6390 | SIGINT is converted using maybe_read_quit_event()]
+ − 6391 V
+ − 6392 Emacs
+ − 6393 event
+ − 6394 |
+ − 6395 \---->------>----- maybe_kbd_translate() -->-----\
+ − 6396 |
+ − 6397 |
+ − 6398 |
+ − 6399 command event queue |
+ − 6400 if not from command
+ − 6401 (contains events that were event queue, call
+ − 6402 read earlier but not processed, event_stream_next_event()
+ − 6403 typically when waiting in a |
+ − 6404 sit-for, sleep-for, etc. for |
+ − 6405 a particular event to be received) |
+ − 6406 | |
+ − 6407 | |
+ − 6408 V V
+ − 6409 ---->----------------------------------<------
+ − 6410 |
+ − 6411 | [collected in
+ − 6412 | next_event_internal()]
+ − 6413 |
+ − 6414 unread- unread- event from |
+ − 6415 command- command- keyboard else, call
+ − 6416 events event macro next_event_internal()
+ − 6417 | | | |
+ − 6418 | | | |
+ − 6419 | | | |
+ − 6420 V V V V
+ − 6421 --------->----------------------<------------
+ − 6422 |
+ − 6423 | [collected in `next-event', which may loop
+ − 6424 | more than once if the event it gets is on
+ − 6425 | a dead frame, device, etc.]
+ − 6426 |
+ − 6427 |
+ − 6428 V
+ − 6429 feed into top-level event loop,
+ − 6430 which repeatedly calls `next-event'
+ − 6431 and then dispatches the event
+ − 6432 using `dispatch-event'
+ − 6433 @end example
+ − 6434
442
+ − 6435 @node Specifics About the Emacs Event, The Event Stream Callback Routines, Specifics of the Event Gathering Mechanism, Events and the Event Loop
428
+ − 6436 @section Specifics About the Emacs Event
+ − 6437
442
+ − 6438 @node The Event Stream Callback Routines, Other Event Loop Functions, Specifics About the Emacs Event, Events and the Event Loop
428
+ − 6439 @section The Event Stream Callback Routines
+ − 6440
442
+ − 6441 @node Other Event Loop Functions, Converting Events, The Event Stream Callback Routines, Events and the Event Loop
428
+ − 6442 @section Other Event Loop Functions
+ − 6443
+ − 6444 @code{detect_input_pending()} and @code{input-pending-p} look for
+ − 6445 input by calling @code{event_stream->event_pending_p} and looking in
+ − 6446 @code{[V]unread-command-event} and the @code{command_event_queue} (they
+ − 6447 do not check for an executing keyboard macro, though).
+ − 6448
+ − 6449 @code{discard-input} cancels any command events pending (and any
+ − 6450 keyboard macros currently executing), and puts the others onto the
+ − 6451 @code{command_event_queue}. There is a comment about a ``race
+ − 6452 condition'', which is not a good sign.
+ − 6453
+ − 6454 @code{next-command-event} and @code{read-char} are higher-level
+ − 6455 interfaces to @code{next-event}. @code{next-command-event} gets the
+ − 6456 next @dfn{command} event (i.e. keypress, mouse event, menu selection,
+ − 6457 or scrollbar action), calling @code{dispatch-event} on any others.
+ − 6458 @code{read-char} calls @code{next-command-event} and uses
+ − 6459 @code{event_to_character()} to return the character equivalent. With
+ − 6460 the right kind of input method support, it is possible for (read-char)
+ − 6461 to return a Kanji character.
+ − 6462
442
+ − 6463 @node Converting Events, Dispatching Events; The Command Builder, Other Event Loop Functions, Events and the Event Loop
428
+ − 6464 @section Converting Events
+ − 6465
+ − 6466 @code{character_to_event()}, @code{event_to_character()},
+ − 6467 @code{event-to-character}, and @code{character-to-event} convert between
+ − 6468 characters and keypress events corresponding to the characters. If the
+ − 6469 event was not a keypress, @code{event_to_character()} returns -1 and
+ − 6470 @code{event-to-character} returns @code{nil}. These functions convert
+ − 6471 between character representation and the split-up event representation
+ − 6472 (keysym plus mod keys).
+ − 6473
442
+ − 6474 @node Dispatching Events; The Command Builder, , Converting Events, Events and the Event Loop
428
+ − 6475 @section Dispatching Events; The Command Builder
+ − 6476
+ − 6477 Not yet documented.
+ − 6478
+ − 6479 @node Evaluation; Stack Frames; Bindings, Symbols and Variables, Events and the Event Loop, Top
+ − 6480 @chapter Evaluation; Stack Frames; Bindings
+ − 6481
+ − 6482 @menu
+ − 6483 * Evaluation::
+ − 6484 * Dynamic Binding; The specbinding Stack; Unwind-Protects::
+ − 6485 * Simple Special Forms::
+ − 6486 * Catch and Throw::
+ − 6487 @end menu
+ − 6488
442
+ − 6489 @node Evaluation, Dynamic Binding; The specbinding Stack; Unwind-Protects, Evaluation; Stack Frames; Bindings, Evaluation; Stack Frames; Bindings
428
+ − 6490 @section Evaluation
+ − 6491
+ − 6492 @code{Feval()} evaluates the form (a Lisp object) that is passed to
+ − 6493 it. Note that evaluation is only non-trivial for two types of objects:
+ − 6494 symbols and conses. A symbol is evaluated simply by calling
+ − 6495 @code{symbol-value} on it and returning the value.
+ − 6496
+ − 6497 Evaluating a cons means calling a function. First, @code{eval} checks
+ − 6498 to see if garbage-collection is necessary, and calls
+ − 6499 @code{garbage_collect_1()} if so. It then increases the evaluation
+ − 6500 depth by 1 (@code{lisp_eval_depth}, which is always less than
+ − 6501 @code{max_lisp_eval_depth}) and adds an element to the linked list of
+ − 6502 @code{struct backtrace}'s (@code{backtrace_list}). Each such structure
+ − 6503 contains a pointer to the function being called plus a list of the
+ − 6504 function's arguments. Originally these values are stored unevalled, and
+ − 6505 as they are evaluated, the backtrace structure is updated. Garbage
+ − 6506 collection pays attention to the objects pointed to in the backtrace
+ − 6507 structures (garbage collection might happen while a function is being
+ − 6508 called or while an argument is being evaluated, and there could easily
+ − 6509 be no other references to the arguments in the argument list; once an
+ − 6510 argument is evaluated, however, the unevalled version is not needed by
+ − 6511 eval, and so the backtrace structure is changed).
+ − 6512
+ − 6513 At this point, the function to be called is determined by looking at
+ − 6514 the car of the cons (if this is a symbol, its function definition is
+ − 6515 retrieved and the process repeated). The function should then consist
+ − 6516 of either a @code{Lisp_Subr} (built-in function written in C), a
+ − 6517 @code{Lisp_Compiled_Function} object, or a cons whose car is one of the
+ − 6518 symbols @code{autoload}, @code{macro} or @code{lambda}.
+ − 6519
+ − 6520 If the function is a @code{Lisp_Subr}, the lisp object points to a
+ − 6521 @code{struct Lisp_Subr} (created by @code{DEFUN()}), which contains a
+ − 6522 pointer to the C function, a minimum and maximum number of arguments
+ − 6523 (or possibly the special constants @code{MANY} or @code{UNEVALLED}), a
+ − 6524 pointer to the symbol referring to that subr, and a couple of other
+ − 6525 things. If the subr wants its arguments @code{UNEVALLED}, they are
+ − 6526 passed raw as a list. Otherwise, an array of evaluated arguments is
+ − 6527 created and put into the backtrace structure, and either passed whole
+ − 6528 (@code{MANY}) or each argument is passed as a C argument.
+ − 6529
+ − 6530 If the function is a @code{Lisp_Compiled_Function},
+ − 6531 @code{funcall_compiled_function()} is called. If the function is a
+ − 6532 lambda list, @code{funcall_lambda()} is called. If the function is a
+ − 6533 macro, [..... fill in] is done. If the function is an autoload,
+ − 6534 @code{do_autoload()} is called to load the definition and then eval
+ − 6535 starts over [explain this more].
+ − 6536
+ − 6537 When @code{Feval()} exits, the evaluation depth is reduced by one, the
+ − 6538 debugger is called if appropriate, and the current backtrace structure
+ − 6539 is removed from the list.
+ − 6540
+ − 6541 Both @code{funcall_compiled_function()} and @code{funcall_lambda()} need
+ − 6542 to go through the list of formal parameters to the function and bind
+ − 6543 them to the actual arguments, checking for @code{&rest} and
+ − 6544 @code{&optional} symbols in the formal parameters and making sure the
+ − 6545 number of actual arguments is correct.
+ − 6546 @code{funcall_compiled_function()} can do this a little more
+ − 6547 efficiently, since the formal parameter list can be checked for sanity
+ − 6548 when the compiled function object is created.
+ − 6549
+ − 6550 @code{funcall_lambda()} simply calls @code{Fprogn} to execute the code
+ − 6551 in the lambda list.
+ − 6552
+ − 6553 @code{funcall_compiled_function()} calls the real byte-code interpreter
+ − 6554 @code{execute_optimized_program()} on the byte-code instructions, which
+ − 6555 are converted into an internal form for faster execution.
+ − 6556
+ − 6557 When a compiled function is executed for the first time by
442
+ − 6558 @code{funcall_compiled_function()}, or during the dump phase of building
+ − 6559 XEmacs, the byte-code instructions are converted from a
+ − 6560 @code{Lisp_String} (which is inefficient to access, especially in the
+ − 6561 presence of MULE) into a @code{Lisp_Opaque} object containing an array
+ − 6562 of unsigned char, which can be directly executed by the byte-code
+ − 6563 interpreter. At this time the byte code is also analyzed for validity
+ − 6564 and transformed into a more optimized form, so that
428
+ − 6565 @code{execute_optimized_program()} can really fly.
+ − 6566
+ − 6567 Here are some of the optimizations performed by the internal byte-code
+ − 6568 transformer:
+ − 6569 @enumerate
+ − 6570 @item
+ − 6571 References to the @code{constants} array are checked for out-of-range
+ − 6572 indices, so that the byte interpreter doesn't have to.
+ − 6573 @item
+ − 6574 References to the @code{constants} array that will be used as a Lisp
+ − 6575 variable are checked for being correct non-constant (i.e. not @code{t},
+ − 6576 @code{nil}, or @code{keywordp}) symbols, so that the byte interpreter
+ − 6577 doesn't have to.
+ − 6578 @item
442
+ − 6579 The maximum number of variable bindings in the byte-code is
428
+ − 6580 pre-computed, so that space on the @code{specpdl} stack can be
+ − 6581 pre-reserved once for the whole function execution.
+ − 6582 @item
+ − 6583 All byte-code jumps are relative to the current program counter instead
+ − 6584 of the start of the program, thereby saving a register.
+ − 6585 @item
+ − 6586 One-byte relative jumps are converted from the byte-code form of unsigned
+ − 6587 chars offset by 127 to machine-friendly signed chars.
+ − 6588 @end enumerate
+ − 6589
+ − 6590 Of course, this transformation of the @code{instructions} should not be
+ − 6591 visible to the user, so @code{Fcompiled_function_instructions()} needs
+ − 6592 to know how to convert the optimized opaque object back into a Lisp
+ − 6593 string that is identical to the original string from the @file{.elc}
+ − 6594 file. (Actually, the resulting string may (rarely) contain slightly
+ − 6595 different, yet equivalent, byte code.)
+ − 6596
+ − 6597 @code{Ffuncall()} implements Lisp @code{funcall}. @code{(funcall fun
+ − 6598 x1 x2 x3 ...)} is equivalent to @code{(eval (list fun (quote x1) (quote
+ − 6599 x2) (quote x3) ...))}. @code{Ffuncall()} contains its own code to do
+ − 6600 the evaluation, however, and is very similar to @code{Feval()}.
+ − 6601
+ − 6602 From the performance point of view, it is worth knowing that most of the
+ − 6603 time in Lisp evaluation is spent executing @code{Lisp_Subr} and
+ − 6604 @code{Lisp_Compiled_Function} objects via @code{Ffuncall()} (not
+ − 6605 @code{Feval()}).
+ − 6606
+ − 6607 @code{Fapply()} implements Lisp @code{apply}, which is very similar to
+ − 6608 @code{funcall} except that if the last argument is a list, the result is the
+ − 6609 same as if each of the arguments in the list had been passed separately.
+ − 6610 @code{Fapply()} does some business to expand the last argument if it's a
+ − 6611 list, then calls @code{Ffuncall()} to do the work.
+ − 6612
+ − 6613 @code{apply1()}, @code{call0()}, @code{call1()}, @code{call2()}, and
+ − 6614 @code{call3()} call a function, passing it the argument(s) given (the
+ − 6615 arguments are given as separate C arguments rather than being passed as
+ − 6616 an array). @code{apply1()} uses @code{Fapply()} while the others use
+ − 6617 @code{Ffuncall()} to do the real work.
+ − 6618
442
+ − 6619 @node Dynamic Binding; The specbinding Stack; Unwind-Protects, Simple Special Forms, Evaluation, Evaluation; Stack Frames; Bindings
428
+ − 6620 @section Dynamic Binding; The specbinding Stack; Unwind-Protects
+ − 6621
+ − 6622 @example
+ − 6623 struct specbinding
+ − 6624 @{
+ − 6625 Lisp_Object symbol;
+ − 6626 Lisp_Object old_value;
+ − 6627 Lisp_Object (*func) (Lisp_Object); /* for unwind-protect */
+ − 6628 @};
+ − 6629 @end example
+ − 6630
+ − 6631 @code{struct specbinding} is used for local-variable bindings and
+ − 6632 unwind-protects. @code{specpdl} holds an array of @code{struct specbinding}'s,
+ − 6633 @code{specpdl_ptr} points to the beginning of the free bindings in the
+ − 6634 array, @code{specpdl_size} specifies the total number of binding slots
+ − 6635 in the array, and @code{max_specpdl_size} specifies the maximum number
+ − 6636 of bindings the array can be expanded to hold. @code{grow_specpdl()}
+ − 6637 increases the size of the @code{specpdl} array, multiplying its size by
+ − 6638 2 but never exceeding @code{max_specpdl_size} (except that if this
+ − 6639 number is less than 400, it is first set to 400).
+ − 6640
+ − 6641 @code{specbind()} binds a symbol to a value and is used for local
+ − 6642 variables and @code{let} forms. The symbol and its old value (which
+ − 6643 might be @code{Qunbound}, indicating no prior value) are recorded in the
+ − 6644 specpdl array, and @code{specpdl_size} is increased by 1.
+ − 6645
+ − 6646 @code{record_unwind_protect()} implements an @dfn{unwind-protect},
+ − 6647 which, when placed around a section of code, ensures that some specified
+ − 6648 cleanup routine will be executed even if the code exits abnormally
+ − 6649 (e.g. through a @code{throw} or quit). @code{record_unwind_protect()}
+ − 6650 simply adds a new specbinding to the @code{specpdl} array and stores the
+ − 6651 appropriate information in it. The cleanup routine can either be a C
+ − 6652 function, which is stored in the @code{func} field, or a @code{progn}
+ − 6653 form, which is stored in the @code{old_value} field.
+ − 6654
+ − 6655 @code{unbind_to()} removes specbindings from the @code{specpdl} array
+ − 6656 until the specified position is reached. Each specbinding can be one of
+ − 6657 three types:
+ − 6658
+ − 6659 @enumerate
+ − 6660 @item
+ − 6661 an unwind-protect with a C cleanup function (@code{func} is not 0, and
+ − 6662 @code{old_value} holds an argument to be passed to the function);
+ − 6663 @item
+ − 6664 an unwind-protect with a Lisp form (@code{func} is 0, @code{symbol}
+ − 6665 is @code{nil}, and @code{old_value} holds the form to be executed with
+ − 6666 @code{Fprogn()}); or
+ − 6667 @item
+ − 6668 a local-variable binding (@code{func} is 0, @code{symbol} is not
+ − 6669 @code{nil}, and @code{old_value} holds the old value, which is stored as
+ − 6670 the symbol's value).
+ − 6671 @end enumerate
+ − 6672
442
+ − 6673 @node Simple Special Forms, Catch and Throw, Dynamic Binding; The specbinding Stack; Unwind-Protects, Evaluation; Stack Frames; Bindings
428
+ − 6674 @section Simple Special Forms
+ − 6675
+ − 6676 @code{or}, @code{and}, @code{if}, @code{cond}, @code{progn},
+ − 6677 @code{prog1}, @code{prog2}, @code{setq}, @code{quote}, @code{function},
+ − 6678 @code{let*}, @code{let}, @code{while}
+ − 6679
+ − 6680 All of these are very simple and work as expected, calling
+ − 6681 @code{Feval()} or @code{Fprogn()} as necessary and (in the case of
+ − 6682 @code{let} and @code{let*}) using @code{specbind()} to create bindings
+ − 6683 and @code{unbind_to()} to undo the bindings when finished.
+ − 6684
442
+ − 6685 Note that, with the exception of @code{Fprogn}, these functions are
428
+ − 6686 typically called in real life only in interpreted code, since the byte
+ − 6687 compiler knows how to convert calls to these functions directly into
+ − 6688 byte code.
+ − 6689
442
+ − 6690 @node Catch and Throw, , Simple Special Forms, Evaluation; Stack Frames; Bindings
428
+ − 6691 @section Catch and Throw
+ − 6692
+ − 6693 @example
+ − 6694 struct catchtag
+ − 6695 @{
+ − 6696 Lisp_Object tag;
+ − 6697 Lisp_Object val;
+ − 6698 struct catchtag *next;
+ − 6699 struct gcpro *gcpro;
+ − 6700 jmp_buf jmp;
+ − 6701 struct backtrace *backlist;
+ − 6702 int lisp_eval_depth;
+ − 6703 int pdlcount;
+ − 6704 @};
+ − 6705 @end example
+ − 6706
+ − 6707 @code{catch} is a Lisp function that places a catch around a body of
+ − 6708 code. A catch is a means of non-local exit from the code. When a catch
+ − 6709 is created, a tag is specified, and executing a @code{throw} to this tag
+ − 6710 will exit from the body of code caught with this tag, and its value will
+ − 6711 be the value given in the call to @code{throw}. If there is no such
+ − 6712 call, the code will be executed normally.
+ − 6713
+ − 6714 Information pertaining to a catch is held in a @code{struct catchtag},
+ − 6715 which is placed at the head of a linked list pointed to by
+ − 6716 @code{catchlist}. @code{internal_catch()} is passed a C function to
+ − 6717 call (@code{Fprogn()} when Lisp @code{catch} is called) and arguments to
+ − 6718 give it, and places a catch around the function. Each @code{struct
+ − 6719 catchtag} is held in the stack frame of the @code{internal_catch()}
+ − 6720 instance that created the catch.
+ − 6721
+ − 6722 @code{internal_catch()} is fairly straightforward. It stores into the
+ − 6723 @code{struct catchtag} the tag name and the current values of
+ − 6724 @code{backtrace_list}, @code{lisp_eval_depth}, @code{gcprolist}, and the
+ − 6725 offset into the @code{specpdl} array, sets a jump point with @code{_setjmp()}
+ − 6726 (storing the jump point into the @code{struct catchtag}), and calls the
+ − 6727 function. Control will return to @code{internal_catch()} either when
+ − 6728 the function exits normally or through a @code{_longjmp()} to this jump
+ − 6729 point. In the latter case, @code{throw} will store the value to be
+ − 6730 returned into the @code{struct catchtag} before jumping. When it's
+ − 6731 done, @code{internal_catch()} removes the @code{struct catchtag} from
+ − 6732 the catchlist and returns the proper value.
+ − 6733
+ − 6734 @code{Fthrow()} goes up through the catchlist until it finds one with
+ − 6735 a matching tag. It then calls @code{unbind_catch()} to restore
+ − 6736 everything to what it was when the appropriate catch was set, stores the
+ − 6737 return value in the @code{struct catchtag}, and jumps (with
+ − 6738 @code{_longjmp()}) to its jump point.
+ − 6739
+ − 6740 @code{unbind_catch()} removes all catches from the catchlist until it
+ − 6741 finds the correct one. Some of the catches might have been placed for
+ − 6742 error-trapping, and if so, the appropriate entries on the handlerlist
+ − 6743 must be removed (see ``errors''). @code{unbind_catch()} also restores
+ − 6744 the values of @code{gcprolist}, @code{backtrace_list}, and
+ − 6745 @code{lisp_eval}, and calls @code{unbind_to()} to undo any specbindings
+ − 6746 created since the catch.
+ − 6747
+ − 6748
+ − 6749 @node Symbols and Variables, Buffers and Textual Representation, Evaluation; Stack Frames; Bindings, Top
+ − 6750 @chapter Symbols and Variables
+ − 6751
+ − 6752 @menu
+ − 6753 * Introduction to Symbols::
+ − 6754 * Obarrays::
+ − 6755 * Symbol Values::
+ − 6756 @end menu
+ − 6757
442
+ − 6758 @node Introduction to Symbols, Obarrays, Symbols and Variables, Symbols and Variables
428
+ − 6759 @section Introduction to Symbols
+ − 6760
+ − 6761 A symbol is basically just an object with four fields: a name (a
+ − 6762 string), a value (some Lisp object), a function (some Lisp object), and
+ − 6763 a property list (usually a list of alternating keyword/value pairs).
+ − 6764 What makes symbols special is that there is usually only one symbol with
+ − 6765 a given name, and the symbol is referred to by name. This makes a
+ − 6766 symbol a convenient way of calling up data by name, i.e. of implementing
+ − 6767 variables. (The variable's value is stored in the @dfn{value slot}.)
+ − 6768 Similarly, functions are referenced by name, and the definition of the
+ − 6769 function is stored in a symbol's @dfn{function slot}. This means that
+ − 6770 there can be a distinct function and variable with the same name. The
+ − 6771 property list is used as a more general mechanism of associating
+ − 6772 additional values with particular names, and once again the namespace is
+ − 6773 independent of the function and variable namespaces.
+ − 6774
442
+ − 6775 @node Obarrays, Symbol Values, Introduction to Symbols, Symbols and Variables
428
+ − 6776 @section Obarrays
+ − 6777
+ − 6778 The identity of symbols with their names is accomplished through a
+ − 6779 structure called an obarray, which is just a poorly-implemented hash
+ − 6780 table mapping from strings to symbols whose name is that string. (I say
+ − 6781 ``poorly implemented'' because an obarray appears in Lisp as a vector
+ − 6782 with some hidden fields rather than as its own opaque type. This is an
+ − 6783 Emacs Lisp artifact that should be fixed.)
+ − 6784
+ − 6785 Obarrays are implemented as a vector of some fixed size (which should
+ − 6786 be a prime for best results), where each ``bucket'' of the vector
+ − 6787 contains one or more symbols, threaded through a hidden @code{next}
+ − 6788 field in the symbol. Lookup of a symbol in an obarray, and adding a
+ − 6789 symbol to an obarray, is accomplished through standard hash-table
+ − 6790 techniques.
+ − 6791
+ − 6792 The standard Lisp function for working with symbols and obarrays is
+ − 6793 @code{intern}. This looks up a symbol in an obarray given its name; if
+ − 6794 it's not found, a new symbol is automatically created with the specified
+ − 6795 name, added to the obarray, and returned. This is what happens when the
+ − 6796 Lisp reader encounters a symbol (or more precisely, encounters the name
+ − 6797 of a symbol) in some text that it is reading. There is a standard
+ − 6798 obarray called @code{obarray} that is used for this purpose, although
+ − 6799 the Lisp programmer is free to create his own obarrays and @code{intern}
+ − 6800 symbols in them.
+ − 6801
+ − 6802 Note that, once a symbol is in an obarray, it stays there until
+ − 6803 something is done about it, and the standard obarray @code{obarray}
+ − 6804 always stays around, so once you use any particular variable name, a
+ − 6805 corresponding symbol will stay around in @code{obarray} until you exit
+ − 6806 XEmacs.
+ − 6807
+ − 6808 Note that @code{obarray} itself is a variable, and as such there is a
+ − 6809 symbol in @code{obarray} whose name is @code{"obarray"} and which
+ − 6810 contains @code{obarray} as its value.
+ − 6811
+ − 6812 Note also that this call to @code{intern} occurs only when in the Lisp
+ − 6813 reader, not when the code is executed (at which point the symbol is
+ − 6814 already around, stored as such in the definition of the function).
+ − 6815
+ − 6816 You can create your own obarray using @code{make-vector} (this is
+ − 6817 horrible but is an artifact) and intern symbols into that obarray.
+ − 6818 Doing that will result in two or more symbols with the same name.
+ − 6819 However, at most one of these symbols is in the standard @code{obarray}:
+ − 6820 You cannot have two symbols of the same name in any particular obarray.
+ − 6821 Note that you cannot add a symbol to an obarray in any fashion other
+ − 6822 than using @code{intern}: i.e. you can't take an existing symbol and put
+ − 6823 it in an existing obarray. Nor can you change the name of an existing
+ − 6824 symbol. (Since obarrays are vectors, you can violate the consistency of
+ − 6825 things by storing directly into the vector, but let's ignore that
+ − 6826 possibility.)
+ − 6827
+ − 6828 Usually symbols are created by @code{intern}, but if you really want,
+ − 6829 you can explicitly create a symbol using @code{make-symbol}, giving it
+ − 6830 some name. The resulting symbol is not in any obarray (i.e. it is
+ − 6831 @dfn{uninterned}), and you can't add it to any obarray. Therefore its
+ − 6832 primary purpose is as a symbol to use in macros to avoid namespace
+ − 6833 pollution. It can also be used as a carrier of information, but cons
+ − 6834 cells could probably be used just as well.
+ − 6835
+ − 6836 You can also use @code{intern-soft} to look up a symbol but not create
+ − 6837 a new one, and @code{unintern} to remove a symbol from an obarray. This
+ − 6838 returns the removed symbol. (Remember: You can't put the symbol back
+ − 6839 into any obarray.) Finally, @code{mapatoms} maps over all of the symbols
+ − 6840 in an obarray.
+ − 6841
442
+ − 6842 @node Symbol Values, , Obarrays, Symbols and Variables
428
+ − 6843 @section Symbol Values
+ − 6844
+ − 6845 The value field of a symbol normally contains a Lisp object. However,
+ − 6846 a symbol can be @dfn{unbound}, meaning that it logically has no value.
+ − 6847 This is internally indicated by storing a special Lisp object, called
+ − 6848 @dfn{the unbound marker} and stored in the global variable
+ − 6849 @code{Qunbound}. The unbound marker is of a special Lisp object type
+ − 6850 called @dfn{symbol-value-magic}. It is impossible for the Lisp
+ − 6851 programmer to directly create or access any object of this type.
+ − 6852
+ − 6853 @strong{You must not let any ``symbol-value-magic'' object escape to
+ − 6854 the Lisp level.} Printing any of these objects will cause the message
+ − 6855 @samp{INTERNAL EMACS BUG} to appear as part of the print representation.
+ − 6856 (You may see this normally when you call @code{debug_print()} from the
+ − 6857 debugger on a Lisp object.) If you let one of these objects escape to
+ − 6858 the Lisp level, you will violate a number of assumptions contained in
+ − 6859 the C code and make the unbound marker not function right.
+ − 6860
+ − 6861 When a symbol is created, its value field (and function field) are set
+ − 6862 to @code{Qunbound}. The Lisp programmer can restore these conditions
+ − 6863 later using @code{makunbound} or @code{fmakunbound}, and can query to
+ − 6864 see whether the value of function fields are @dfn{bound} (i.e. have a
+ − 6865 value other than @code{Qunbound}) using @code{boundp} and
+ − 6866 @code{fboundp}. The fields are set to a normal Lisp object using
+ − 6867 @code{set} (or @code{setq}) and @code{fset}.
+ − 6868
+ − 6869 Other symbol-value-magic objects are used as special markers to
+ − 6870 indicate variables that have non-normal properties. This includes any
+ − 6871 variables that are tied into C variables (setting the variable magically
+ − 6872 sets some global variable in the C code, and likewise for retrieving the
+ − 6873 variable's value), variables that magically tie into slots in the
+ − 6874 current buffer, variables that are buffer-local, etc. The
+ − 6875 symbol-value-magic object is stored in the value cell in place of
+ − 6876 a normal object, and the code to retrieve a symbol's value
+ − 6877 (i.e. @code{symbol-value}) knows how to do special things with them.
+ − 6878 This means that you should not just fetch the value cell directly if you
+ − 6879 want a symbol's value.
+ − 6880
+ − 6881 The exact workings of this are rather complex and involved and are
+ − 6882 well-documented in comments in @file{buffer.c}, @file{symbols.c}, and
+ − 6883 @file{lisp.h}.
+ − 6884
+ − 6885 @node Buffers and Textual Representation, MULE Character Sets and Encodings, Symbols and Variables, Top
+ − 6886 @chapter Buffers and Textual Representation
+ − 6887
+ − 6888 @menu
+ − 6889 * Introduction to Buffers:: A buffer holds a block of text such as a file.
+ − 6890 * The Text in a Buffer:: Representation of the text in a buffer.
+ − 6891 * Buffer Lists:: Keeping track of all buffers.
+ − 6892 * Markers and Extents:: Tagging locations within a buffer.
+ − 6893 * Bufbytes and Emchars:: Representation of individual characters.
+ − 6894 * The Buffer Object:: The Lisp object corresponding to a buffer.
+ − 6895 @end menu
+ − 6896
442
+ − 6897 @node Introduction to Buffers, The Text in a Buffer, Buffers and Textual Representation, Buffers and Textual Representation
428
+ − 6898 @section Introduction to Buffers
+ − 6899
+ − 6900 A buffer is logically just a Lisp object that holds some text.
+ − 6901 In this, it is like a string, but a buffer is optimized for
+ − 6902 frequent insertion and deletion, while a string is not. Furthermore:
+ − 6903
+ − 6904 @enumerate
+ − 6905 @item
+ − 6906 Buffers are @dfn{permanent} objects, i.e. once you create them, they
+ − 6907 remain around, and need to be explicitly deleted before they go away.
+ − 6908 @item
+ − 6909 Each buffer has a unique name, which is a string. Buffers are
+ − 6910 normally referred to by name. In this respect, they are like
+ − 6911 symbols.
+ − 6912 @item
+ − 6913 Buffers have a default insertion position, called @dfn{point}.
+ − 6914 Inserting text (unless you explicitly give a position) goes at point,
+ − 6915 and moves point forward past the text. This is what is going on when
+ − 6916 you type text into Emacs.
+ − 6917 @item
+ − 6918 Buffers have lots of extra properties associated with them.
+ − 6919 @item
+ − 6920 Buffers can be @dfn{displayed}. What this means is that there
+ − 6921 exist a number of @dfn{windows}, which are objects that correspond
+ − 6922 to some visible section of your display, and each window has
+ − 6923 an associated buffer, and the current contents of the buffer
+ − 6924 are shown in that section of the display. The redisplay mechanism
+ − 6925 (which takes care of doing this) knows how to look at the
+ − 6926 text of a buffer and come up with some reasonable way of displaying
+ − 6927 this. Many of the properties of a buffer control how the
+ − 6928 buffer's text is displayed.
+ − 6929 @item
+ − 6930 One buffer is distinguished and called the @dfn{current buffer}. It is
+ − 6931 stored in the variable @code{current_buffer}. Buffer operations operate
+ − 6932 on this buffer by default. When you are typing text into a buffer, the
+ − 6933 buffer you are typing into is always @code{current_buffer}. Switching
+ − 6934 to a different window changes the current buffer. Note that Lisp code
+ − 6935 can temporarily change the current buffer using @code{set-buffer} (often
+ − 6936 enclosed in a @code{save-excursion} so that the former current buffer
+ − 6937 gets restored when the code is finished). However, calling
+ − 6938 @code{set-buffer} will NOT cause a permanent change in the current
+ − 6939 buffer. The reason for this is that the top-level event loop sets
+ − 6940 @code{current_buffer} to the buffer of the selected window, each time
+ − 6941 it finishes executing a user command.
+ − 6942 @end enumerate
+ − 6943
+ − 6944 Make sure you understand the distinction between @dfn{current buffer}
+ − 6945 and @dfn{buffer of the selected window}, and the distinction between
+ − 6946 @dfn{point} of the current buffer and @dfn{window-point} of the selected
+ − 6947 window. (This latter distinction is explained in detail in the section
+ − 6948 on windows.)
+ − 6949
442
+ − 6950 @node The Text in a Buffer, Buffer Lists, Introduction to Buffers, Buffers and Textual Representation
428
+ − 6951 @section The Text in a Buffer
+ − 6952
+ − 6953 The text in a buffer consists of a sequence of zero or more
+ − 6954 characters. A @dfn{character} is an integer that logically represents
+ − 6955 a letter, number, space, or other unit of text. Most of the characters
+ − 6956 that you will typically encounter belong to the ASCII set of characters,
+ − 6957 but there are also characters for various sorts of accented letters,
+ − 6958 special symbols, Chinese and Japanese ideograms (i.e. Kanji, Katakana,
+ − 6959 etc.), Cyrillic and Greek letters, etc. The actual number of possible
+ − 6960 characters is quite large.
+ − 6961
+ − 6962 For now, we can view a character as some non-negative integer that
+ − 6963 has some shape that defines how it typically appears (e.g. as an
+ − 6964 uppercase A). (The exact way in which a character appears depends on the
+ − 6965 font used to display the character.) The internal type of characters in
+ − 6966 the C code is an @code{Emchar}; this is just an @code{int}, but using a
+ − 6967 symbolic type makes the code clearer.
+ − 6968
+ − 6969 Between every character in a buffer is a @dfn{buffer position} or
+ − 6970 @dfn{character position}. We can speak of the character before or after
+ − 6971 a particular buffer position, and when you insert a character at a
+ − 6972 particular position, all characters after that position end up at new
+ − 6973 positions. When we speak of the character @dfn{at} a position, we
+ − 6974 really mean the character after the position. (This schizophrenia
+ − 6975 between a buffer position being ``between'' a character and ``on'' a
+ − 6976 character is rampant in Emacs.)
+ − 6977
+ − 6978 Buffer positions are numbered starting at 1. This means that
+ − 6979 position 1 is before the first character, and position 0 is not
+ − 6980 valid. If there are N characters in a buffer, then buffer
+ − 6981 position N+1 is after the last one, and position N+2 is not valid.
+ − 6982
+ − 6983 The internal makeup of the Emchar integer varies depending on whether
+ − 6984 we have compiled with MULE support. If not, the Emchar integer is an
+ − 6985 8-bit integer with possible values from 0 - 255. 0 - 127 are the
+ − 6986 standard ASCII characters, while 128 - 255 are the characters from the
+ − 6987 ISO-8859-1 character set. If we have compiled with MULE support, an
+ − 6988 Emchar is a 19-bit integer, with the various bits having meanings
+ − 6989 according to a complex scheme that will be detailed later. The
+ − 6990 characters numbered 0 - 255 still have the same meanings as for the
+ − 6991 non-MULE case, though.
+ − 6992
+ − 6993 Internally, the text in a buffer is represented in a fairly simple
+ − 6994 fashion: as a contiguous array of bytes, with a @dfn{gap} of some size
+ − 6995 in the middle. Although the gap is of some substantial size in bytes,
+ − 6996 there is no text contained within it: From the perspective of the text
+ − 6997 in the buffer, it does not exist. The gap logically sits at some buffer
+ − 6998 position, between two characters (or possibly at the beginning or end of
+ − 6999 the buffer). Insertion of text in a buffer at a particular position is
+ − 7000 always accomplished by first moving the gap to that position
+ − 7001 (i.e. through some block moving of text), then writing the text into the
+ − 7002 beginning of the gap, thereby shrinking the gap. If the gap shrinks
+ − 7003 down to nothing, a new gap is created. (What actually happens is that a
+ − 7004 new gap is ``created'' at the end of the buffer's text, which requires
+ − 7005 nothing more than changing a couple of indices; then the gap is
+ − 7006 ``moved'' to the position where the insertion needs to take place by
+ − 7007 moving up in memory all the text after that position.) Similarly,
+ − 7008 deletion occurs by moving the gap to the place where the text is to be
+ − 7009 deleted, and then simply expanding the gap to include the deleted text.
+ − 7010 (@dfn{Expanding} and @dfn{shrinking} the gap as just described means
+ − 7011 just that the internal indices that keep track of where the gap is
+ − 7012 located are changed.)
+ − 7013
+ − 7014 Note that the total amount of memory allocated for a buffer text never
+ − 7015 decreases while the buffer is live. Therefore, if you load up a
+ − 7016 20-megabyte file and then delete all but one character, there will be a
+ − 7017 20-megabyte gap, which won't get any smaller (except by inserting
+ − 7018 characters back again). Once the buffer is killed, the memory allocated
+ − 7019 for the buffer text will be freed, but it will still be sitting on the
+ − 7020 heap, taking up virtual memory, and will not be released back to the
+ − 7021 operating system. (However, if you have compiled XEmacs with rel-alloc,
+ − 7022 the situation is different. In this case, the space @emph{will} be
+ − 7023 released back to the operating system. However, this tends to result in a
+ − 7024 noticeable speed penalty.)
+ − 7025
+ − 7026 Astute readers may notice that the text in a buffer is represented as
+ − 7027 an array of @emph{bytes}, while (at least in the MULE case) an Emchar is
+ − 7028 a 19-bit integer, which clearly cannot fit in a byte. This means (of
+ − 7029 course) that the text in a buffer uses a different representation from
+ − 7030 an Emchar: specifically, the 19-bit Emchar becomes a series of one to
+ − 7031 four bytes. The conversion between these two representations is complex
+ − 7032 and will be described later.
+ − 7033
+ − 7034 In the non-MULE case, everything is very simple: An Emchar
+ − 7035 is an 8-bit value, which fits neatly into one byte.
+ − 7036
+ − 7037 If we are given a buffer position and want to retrieve the
+ − 7038 character at that position, we need to follow these steps:
+ − 7039
+ − 7040 @enumerate
+ − 7041 @item
+ − 7042 Pretend there's no gap, and convert the buffer position into a @dfn{byte
+ − 7043 index} that indexes to the appropriate byte in the buffer's stream of
+ − 7044 textual bytes. By convention, byte indices begin at 1, just like buffer
+ − 7045 positions. In the non-MULE case, byte indices and buffer positions are
+ − 7046 identical, since one character equals one byte.
+ − 7047 @item
+ − 7048 Convert the byte index into a @dfn{memory index}, which takes the gap
+ − 7049 into account. The memory index is a direct index into the block of
+ − 7050 memory that stores the text of a buffer. This basically just involves
+ − 7051 checking to see if the byte index is past the gap, and if so, adding the
+ − 7052 size of the gap to it. By convention, memory indices begin at 1, just
+ − 7053 like buffer positions and byte indices, and when referring to the
+ − 7054 position that is @dfn{at} the gap, we always use the memory position at
+ − 7055 the @emph{beginning}, not at the end, of the gap.
+ − 7056 @item
+ − 7057 Fetch the appropriate bytes at the determined memory position.
+ − 7058 @item
+ − 7059 Convert these bytes into an Emchar.
+ − 7060 @end enumerate
+ − 7061
+ − 7062 In the non-Mule case, (3) and (4) boil down to a simple one-byte
+ − 7063 memory access.
+ − 7064
+ − 7065 Note that we have defined three types of positions in a buffer:
+ − 7066
+ − 7067 @enumerate
+ − 7068 @item
+ − 7069 @dfn{buffer positions} or @dfn{character positions}, typedef @code{Bufpos}
+ − 7070 @item
+ − 7071 @dfn{byte indices}, typedef @code{Bytind}
+ − 7072 @item
+ − 7073 @dfn{memory indices}, typedef @code{Memind}
+ − 7074 @end enumerate
+ − 7075
+ − 7076 All three typedefs are just @code{int}s, but defining them this way makes
+ − 7077 things a lot clearer.
+ − 7078
+ − 7079 Most code works with buffer positions. In particular, all Lisp code
+ − 7080 that refers to text in a buffer uses buffer positions. Lisp code does
+ − 7081 not know that byte indices or memory indices exist.
+ − 7082
+ − 7083 Finally, we have a typedef for the bytes in a buffer. This is a
+ − 7084 @code{Bufbyte}, which is an unsigned char. Referring to them as
+ − 7085 Bufbytes underscores the fact that we are working with a string of bytes
+ − 7086 in the internal Emacs buffer representation rather than in one of a
+ − 7087 number of possible alternative representations (e.g. EUC-encoded text,
+ − 7088 etc.).
+ − 7089
442
+ − 7090 @node Buffer Lists, Markers and Extents, The Text in a Buffer, Buffers and Textual Representation
428
+ − 7091 @section Buffer Lists
+ − 7092
+ − 7093 Recall earlier that buffers are @dfn{permanent} objects, i.e. that
+ − 7094 they remain around until explicitly deleted. This entails that there is
+ − 7095 a list of all the buffers in existence. This list is actually an
+ − 7096 assoc-list (mapping from the buffer's name to the buffer) and is stored
+ − 7097 in the global variable @code{Vbuffer_alist}.
+ − 7098
+ − 7099 The order of the buffers in the list is important: the buffers are
+ − 7100 ordered approximately from most-recently-used to least-recently-used.
+ − 7101 Switching to a buffer using @code{switch-to-buffer},
+ − 7102 @code{pop-to-buffer}, etc. and switching windows using
+ − 7103 @code{other-window}, etc. usually brings the new current buffer to the
+ − 7104 front of the list. @code{switch-to-buffer}, @code{other-buffer},
+ − 7105 etc. look at the beginning of the list to find an alternative buffer to
+ − 7106 suggest. You can also explicitly move a buffer to the end of the list
+ − 7107 using @code{bury-buffer}.
+ − 7108
+ − 7109 In addition to the global ordering in @code{Vbuffer_alist}, each frame
+ − 7110 has its own ordering of the list. These lists always contain the same
+ − 7111 elements as in @code{Vbuffer_alist} although possibly in a different
+ − 7112 order. @code{buffer-list} normally returns the list for the selected
+ − 7113 frame. This allows you to work in separate frames without things
+ − 7114 interfering with each other.
+ − 7115
+ − 7116 The standard way to look up a buffer given a name is
+ − 7117 @code{get-buffer}, and the standard way to create a new buffer is
+ − 7118 @code{get-buffer-create}, which looks up a buffer with a given name,
+ − 7119 creating a new one if necessary. These operations correspond exactly
+ − 7120 with the symbol operations @code{intern-soft} and @code{intern},
+ − 7121 respectively. You can also force a new buffer to be created using
+ − 7122 @code{generate-new-buffer}, which takes a name and (if necessary) makes
+ − 7123 a unique name from this by appending a number, and then creates the
+ − 7124 buffer. This is basically like the symbol operation @code{gensym}.
+ − 7125
442
+ − 7126 @node Markers and Extents, Bufbytes and Emchars, Buffer Lists, Buffers and Textual Representation
428
+ − 7127 @section Markers and Extents
+ − 7128
+ − 7129 Among the things associated with a buffer are things that are
+ − 7130 logically attached to certain buffer positions. This can be used to
+ − 7131 keep track of a buffer position when text is inserted and deleted, so
+ − 7132 that it remains at the same spot relative to the text around it; to
+ − 7133 assign properties to particular sections of text; etc. There are two
+ − 7134 such objects that are useful in this regard: they are @dfn{markers} and
+ − 7135 @dfn{extents}.
+ − 7136
+ − 7137 A @dfn{marker} is simply a flag placed at a particular buffer
+ − 7138 position, which is moved around as text is inserted and deleted.
+ − 7139 Markers are used for all sorts of purposes, such as the @code{mark} that
+ − 7140 is the other end of textual regions to be cut, copied, etc.
+ − 7141
+ − 7142 An @dfn{extent} is similar to two markers plus some associated
+ − 7143 properties, and is used to keep track of regions in a buffer as text is
+ − 7144 inserted and deleted, and to add properties (e.g. fonts) to particular
+ − 7145 regions of text. The external interface of extents is explained
+ − 7146 elsewhere.
+ − 7147
+ − 7148 The important thing here is that markers and extents simply contain
+ − 7149 buffer positions in them as integers, and every time text is inserted or
+ − 7150 deleted, these positions must be updated. In order to minimize the
+ − 7151 amount of shuffling that needs to be done, the positions in markers and
442
+ − 7152 extents (there's one per marker, two per extent) are stored in Meminds.
428
+ − 7153 This means that they only need to be moved when the text is physically
+ − 7154 moved in memory; since the gap structure tries to minimize this, it also
+ − 7155 minimizes the number of marker and extent indices that need to be
+ − 7156 adjusted. Look in @file{insdel.c} for the details of how this works.
+ − 7157
+ − 7158 One other important distinction is that markers are @dfn{temporary}
+ − 7159 while extents are @dfn{permanent}. This means that markers disappear as
+ − 7160 soon as there are no more pointers to them, and correspondingly, there
+ − 7161 is no way to determine what markers are in a buffer if you are just
+ − 7162 given the buffer. Extents remain in a buffer until they are detached
+ − 7163 (which could happen as a result of text being deleted) or the buffer is
+ − 7164 deleted, and primitives do exist to enumerate the extents in a buffer.
+ − 7165
442
+ − 7166 @node Bufbytes and Emchars, The Buffer Object, Markers and Extents, Buffers and Textual Representation
428
+ − 7167 @section Bufbytes and Emchars
+ − 7168
+ − 7169 Not yet documented.
+ − 7170
442
+ − 7171 @node The Buffer Object, , Bufbytes and Emchars, Buffers and Textual Representation
428
+ − 7172 @section The Buffer Object
+ − 7173
+ − 7174 Buffers contain fields not directly accessible by the Lisp programmer.
+ − 7175 We describe them here, naming them by the names used in the C code.
+ − 7176 Many are accessible indirectly in Lisp programs via Lisp primitives.
+ − 7177
+ − 7178 @table @code
+ − 7179 @item name
+ − 7180 The buffer name is a string that names the buffer. It is guaranteed to
446
+ − 7181 be unique. @xref{Buffer Names,,, lispref, XEmacs Lisp Reference
428
+ − 7182 Manual}.
+ − 7183
+ − 7184 @item save_modified
+ − 7185 This field contains the time when the buffer was last saved, as an
446
+ − 7186 integer. @xref{Buffer Modification,,, lispref, XEmacs Lisp Reference
428
+ − 7187 Manual}.
+ − 7188
+ − 7189 @item modtime
+ − 7190 This field contains the modification time of the visited file. It is
+ − 7191 set when the file is written or read. Every time the buffer is written
+ − 7192 to the file, this field is compared to the modification time of the
446
+ − 7193 file. @xref{Buffer Modification,,, lispref, XEmacs Lisp Reference
428
+ − 7194 Manual}.
+ − 7195
+ − 7196 @item auto_save_modified
+ − 7197 This field contains the time when the buffer was last auto-saved.
+ − 7198
+ − 7199 @item last_window_start
+ − 7200 This field contains the @code{window-start} position in the buffer as of
+ − 7201 the last time the buffer was displayed in a window.
+ − 7202
+ − 7203 @item undo_list
+ − 7204 This field points to the buffer's undo list. @xref{Undo,,, lispref,
446
+ − 7205 XEmacs Lisp Reference Manual}.
428
+ − 7206
+ − 7207 @item syntax_table_v
+ − 7208 This field contains the syntax table for the buffer. @xref{Syntax
446
+ − 7209 Tables,,, lispref, XEmacs Lisp Reference Manual}.
428
+ − 7210
+ − 7211 @item downcase_table
+ − 7212 This field contains the conversion table for converting text to lower
446
+ − 7213 case. @xref{Case Tables,,, lispref, XEmacs Lisp Reference Manual}.
428
+ − 7214
+ − 7215 @item upcase_table
+ − 7216 This field contains the conversion table for converting text to upper
446
+ − 7217 case. @xref{Case Tables,,, lispref, XEmacs Lisp Reference Manual}.
428
+ − 7218
+ − 7219 @item case_canon_table
+ − 7220 This field contains the conversion table for canonicalizing text for
+ − 7221 case-folding search. @xref{Case Tables,,, lispref, XEmacs Lisp
446
+ − 7222 Reference Manual}.
428
+ − 7223
+ − 7224 @item case_eqv_table
+ − 7225 This field contains the equivalence table for case-folding search.
446
+ − 7226 @xref{Case Tables,,, lispref, XEmacs Lisp Reference Manual}.
428
+ − 7227
+ − 7228 @item display_table
+ − 7229 This field contains the buffer's display table, or @code{nil} if it
+ − 7230 doesn't have one. @xref{Display Tables,,, lispref, XEmacs Lisp
446
+ − 7231 Reference Manual}.
428
+ − 7232
+ − 7233 @item markers
+ − 7234 This field contains the chain of all markers that currently point into
+ − 7235 the buffer. Deletion of text in the buffer, and motion of the buffer's
+ − 7236 gap, must check each of these markers and perhaps update it.
446
+ − 7237 @xref{Markers,,, lispref, XEmacs Lisp Reference Manual}.
428
+ − 7238
+ − 7239 @item backed_up
+ − 7240 This field is a flag that tells whether a backup file has been made for
+ − 7241 the visited file of this buffer.
+ − 7242
+ − 7243 @item mark
+ − 7244 This field contains the mark for the buffer. The mark is a marker,
+ − 7245 hence it is also included on the list @code{markers}. @xref{The Mark,,,
446
+ − 7246 lispref, XEmacs Lisp Reference Manual}.
428
+ − 7247
+ − 7248 @item mark_active
+ − 7249 This field is non-@code{nil} if the buffer's mark is active.
+ − 7250
+ − 7251 @item local_var_alist
+ − 7252 This field contains the association list describing the variables local
+ − 7253 in this buffer, and their values, with the exception of local variables
+ − 7254 that have special slots in the buffer object. (Those slots are omitted
+ − 7255 from this table.) @xref{Buffer-Local Variables,,, lispref, XEmacs Lisp
446
+ − 7256 Reference Manual}.
428
+ − 7257
+ − 7258 @item modeline_format
+ − 7259 This field contains a Lisp object which controls how to display the mode
+ − 7260 line for this buffer. @xref{Modeline Format,,, lispref, XEmacs Lisp
446
+ − 7261 Reference Manual}.
428
+ − 7262
+ − 7263 @item base_buffer
+ − 7264 This field holds the buffer's base buffer (if it is an indirect buffer),
+ − 7265 or @code{nil}.
+ − 7266 @end table
+ − 7267
+ − 7268 @node MULE Character Sets and Encodings, The Lisp Reader and Compiler, Buffers and Textual Representation, Top
+ − 7269 @chapter MULE Character Sets and Encodings
+ − 7270
+ − 7271 Recall that there are two primary ways that text is represented in
+ − 7272 XEmacs. The @dfn{buffer} representation sees the text as a series of
+ − 7273 bytes (Bufbytes), with a variable number of bytes used per character.
+ − 7274 The @dfn{character} representation sees the text as a series of integers
+ − 7275 (Emchars), one per character. The character representation is a cleaner
+ − 7276 representation from a theoretical standpoint, and is thus used in many
+ − 7277 cases when lots of manipulations on a string need to be done. However,
+ − 7278 the buffer representation is the standard representation used in both
+ − 7279 Lisp strings and buffers, and because of this, it is the ``default''
+ − 7280 representation that text comes in. The reason for using this
+ − 7281 representation is that it's compact and is compatible with ASCII.
+ − 7282
+ − 7283 @menu
+ − 7284 * Character Sets::
+ − 7285 * Encodings::
+ − 7286 * Internal Mule Encodings::
+ − 7287 * CCL::
+ − 7288 @end menu
+ − 7289
442
+ − 7290 @node Character Sets, Encodings, MULE Character Sets and Encodings, MULE Character Sets and Encodings
428
+ − 7291 @section Character Sets
+ − 7292
+ − 7293 A character set (or @dfn{charset}) is an ordered set of characters. A
+ − 7294 particular character in a charset is indexed using one or more
+ − 7295 @dfn{position codes}, which are non-negative integers. The number of
+ − 7296 position codes needed to identify a particular character in a charset is
+ − 7297 called the @dfn{dimension} of the charset. In XEmacs/Mule, all charsets
+ − 7298 have dimension 1 or 2, and the size of all charsets (except for a few
+ − 7299 special cases) is either 94, 96, 94 by 94, or 96 by 96. The range of
+ − 7300 position codes used to index characters from any of these types of
+ − 7301 character sets is as follows:
+ − 7302
+ − 7303 @example
+ − 7304 Charset type Position code 1 Position code 2
+ − 7305 ------------------------------------------------------------
+ − 7306 94 33 - 126 N/A
+ − 7307 96 32 - 127 N/A
+ − 7308 94x94 33 - 126 33 - 126
+ − 7309 96x96 32 - 127 32 - 127
+ − 7310 @end example
+ − 7311
+ − 7312 Note that in the above cases position codes do not start at an
+ − 7313 expected value such as 0 or 1. The reason for this will become clear
+ − 7314 later.
+ − 7315
+ − 7316 For example, Latin-1 is a 96-character charset, and JISX0208 (the
+ − 7317 Japanese national character set) is a 94x94-character charset.
+ − 7318
+ − 7319 [Note that, although the ranges above define the @emph{valid} position
+ − 7320 codes for a charset, some of the slots in a particular charset may in
+ − 7321 fact be empty. This is the case for JISX0208, for example, where (e.g.)
+ − 7322 all the slots whose first position code is in the range 118 - 127 are
+ − 7323 empty.]
+ − 7324
+ − 7325 There are three charsets that do not follow the above rules. All of
+ − 7326 them have one dimension, and have ranges of position codes as follows:
+ − 7327
+ − 7328 @example
+ − 7329 Charset name Position code 1
+ − 7330 ------------------------------------
+ − 7331 ASCII 0 - 127
+ − 7332 Control-1 0 - 31
+ − 7333 Composite 0 - some large number
+ − 7334 @end example
+ − 7335
+ − 7336 (The upper bound of the position code for composite characters has not
+ − 7337 yet been determined, but it will probably be at least 16,383).
+ − 7338
+ − 7339 ASCII is the union of two subsidiary character sets: Printing-ASCII
+ − 7340 (the printing ASCII character set, consisting of position codes 33 -
+ − 7341 126, like for a standard 94-character charset) and Control-ASCII (the
+ − 7342 non-printing characters that would appear in a binary file with codes 0
+ − 7343 - 32 and 127).
+ − 7344
+ − 7345 Control-1 contains the non-printing characters that would appear in a
+ − 7346 binary file with codes 128 - 159.
+ − 7347
+ − 7348 Composite contains characters that are generated by overstriking one
+ − 7349 or more characters from other charsets.
+ − 7350
+ − 7351 Note that some characters in ASCII, and all characters in Control-1,
+ − 7352 are @dfn{control} (non-printing) characters. These have no printed
+ − 7353 representation but instead control some other function of the printing
+ − 7354 (e.g. TAB or 8 moves the current character position to the next tab
+ − 7355 stop). All other characters in all charsets are @dfn{graphic}
+ − 7356 (printing) characters.
+ − 7357
+ − 7358 When a binary file is read in, the bytes in the file are assigned to
+ − 7359 character sets as follows:
+ − 7360
+ − 7361 @example
+ − 7362 Bytes Character set Range
+ − 7363 --------------------------------------------------
+ − 7364 0 - 127 ASCII 0 - 127
+ − 7365 128 - 159 Control-1 0 - 31
+ − 7366 160 - 255 Latin-1 32 - 127
+ − 7367 @end example
+ − 7368
+ − 7369 This is a bit ad-hoc but gets the job done.
+ − 7370
442
+ − 7371 @node Encodings, Internal Mule Encodings, Character Sets, MULE Character Sets and Encodings
428
+ − 7372 @section Encodings
+ − 7373
+ − 7374 An @dfn{encoding} is a way of numerically representing characters from
+ − 7375 one or more character sets. If an encoding only encompasses one
+ − 7376 character set, then the position codes for the characters in that
+ − 7377 character set could be used directly. This is not possible, however, if
+ − 7378 more than one character set is to be used in the encoding.
+ − 7379
+ − 7380 For example, the conversion detailed above between bytes in a binary
+ − 7381 file and characters is effectively an encoding that encompasses the
+ − 7382 three character sets ASCII, Control-1, and Latin-1 in a stream of 8-bit
+ − 7383 bytes.
+ − 7384
+ − 7385 Thus, an encoding can be viewed as a way of encoding characters from a
+ − 7386 specified group of character sets using a stream of bytes, each of which
+ − 7387 contains a fixed number of bits (but not necessarily 8, as in the common
+ − 7388 usage of ``byte'').
+ − 7389
+ − 7390 Here are descriptions of a couple of common
+ − 7391 encodings:
+ − 7392
+ − 7393 @menu
+ − 7394 * Japanese EUC (Extended Unix Code)::
+ − 7395 * JIS7::
+ − 7396 @end menu
+ − 7397
442
+ − 7398 @node Japanese EUC (Extended Unix Code), JIS7, Encodings, Encodings
428
+ − 7399 @subsection Japanese EUC (Extended Unix Code)
+ − 7400
+ − 7401 This encompasses the character sets Printing-ASCII, Japanese-JISX0201,
+ − 7402 and Japanese-JISX0208-Kana (half-width katakana, the right half of
+ − 7403 JISX0201). It uses 8-bit bytes.
+ − 7404
+ − 7405 Note that Printing-ASCII and Japanese-JISX0201-Kana are 94-character
+ − 7406 charsets, while Japanese-JISX0208 is a 94x94-character charset.
+ − 7407
+ − 7408 The encoding is as follows:
+ − 7409
+ − 7410 @example
+ − 7411 Character set Representation (PC=position-code)
+ − 7412 ------------- --------------
+ − 7413 Printing-ASCII PC1
+ − 7414 Japanese-JISX0201-Kana 0x8E | PC1 + 0x80
+ − 7415 Japanese-JISX0208 PC1 + 0x80 | PC2 + 0x80
+ − 7416 Japanese-JISX0212 PC1 + 0x80 | PC2 + 0x80
+ − 7417 @end example
+ − 7418
+ − 7419
442
+ − 7420 @node JIS7, , Japanese EUC (Extended Unix Code), Encodings
428
+ − 7421 @subsection JIS7
+ − 7422
+ − 7423 This encompasses the character sets Printing-ASCII,
+ − 7424 Japanese-JISX0201-Roman (the left half of JISX0201; this character set
+ − 7425 is very similar to Printing-ASCII and is a 94-character charset),
+ − 7426 Japanese-JISX0208, and Japanese-JISX0201-Kana. It uses 7-bit bytes.
+ − 7427
+ − 7428 Unlike Japanese EUC, this is a @dfn{modal} encoding, which
+ − 7429 means that there are multiple states that the encoding can
+ − 7430 be in, which affect how the bytes are to be interpreted.
+ − 7431 Special sequences of bytes (called @dfn{escape sequences})
+ − 7432 are used to change states.
+ − 7433
+ − 7434 The encoding is as follows:
+ − 7435
+ − 7436 @example
+ − 7437 Character set Representation (PC=position-code)
+ − 7438 ------------- --------------
+ − 7439 Printing-ASCII PC1
+ − 7440 Japanese-JISX0201-Roman PC1
+ − 7441 Japanese-JISX0201-Kana PC1
+ − 7442 Japanese-JISX0208 PC1 PC2
+ − 7443
+ − 7444
+ − 7445 Escape sequence ASCII equivalent Meaning
+ − 7446 --------------- ---------------- -------
+ − 7447 0x1B 0x28 0x4A ESC ( J invoke Japanese-JISX0201-Roman
+ − 7448 0x1B 0x28 0x49 ESC ( I invoke Japanese-JISX0201-Kana
+ − 7449 0x1B 0x24 0x42 ESC $ B invoke Japanese-JISX0208
+ − 7450 0x1B 0x28 0x42 ESC ( B invoke Printing-ASCII
+ − 7451 @end example
+ − 7452
+ − 7453 Initially, Printing-ASCII is invoked.
+ − 7454
442
+ − 7455 @node Internal Mule Encodings, CCL, Encodings, MULE Character Sets and Encodings
428
+ − 7456 @section Internal Mule Encodings
+ − 7457
+ − 7458 In XEmacs/Mule, each character set is assigned a unique number, called a
+ − 7459 @dfn{leading byte}. This is used in the encodings of a character.
+ − 7460 Leading bytes are in the range 0x80 - 0xFF (except for ASCII, which has
+ − 7461 a leading byte of 0), although some leading bytes are reserved.
+ − 7462
+ − 7463 Charsets whose leading byte is in the range 0x80 - 0x9F are called
+ − 7464 @dfn{official} and are used for built-in charsets. Other charsets are
+ − 7465 called @dfn{private} and have leading bytes in the range 0xA0 - 0xFF;
+ − 7466 these are user-defined charsets.
+ − 7467
+ − 7468 More specifically:
+ − 7469
+ − 7470 @example
+ − 7471 Character set Leading byte
+ − 7472 ------------- ------------
+ − 7473 ASCII 0
+ − 7474 Composite 0x80
+ − 7475 Dimension-1 Official 0x81 - 0x8D
+ − 7476 (0x8E is free)
+ − 7477 Control-1 0x8F
+ − 7478 Dimension-2 Official 0x90 - 0x99
+ − 7479 (0x9A - 0x9D are free;
+ − 7480 0x9E and 0x9F are reserved)
+ − 7481 Dimension-1 Private 0xA0 - 0xEF
+ − 7482 Dimension-2 Private 0xF0 - 0xFF
+ − 7483 @end example
+ − 7484
+ − 7485 There are two internal encodings for characters in XEmacs/Mule. One is
+ − 7486 called @dfn{string encoding} and is an 8-bit encoding that is used for
+ − 7487 representing characters in a buffer or string. It uses 1 to 4 bytes per
+ − 7488 character. The other is called @dfn{character encoding} and is a 19-bit
+ − 7489 encoding that is used for representing characters individually in a
+ − 7490 variable.
+ − 7491
+ − 7492 (In the following descriptions, we'll ignore composite characters for
+ − 7493 the moment. We also give a general (structural) overview first,
+ − 7494 followed later by the exact details.)
+ − 7495
+ − 7496 @menu
+ − 7497 * Internal String Encoding::
+ − 7498 * Internal Character Encoding::
+ − 7499 @end menu
+ − 7500
442
+ − 7501 @node Internal String Encoding, Internal Character Encoding, Internal Mule Encodings, Internal Mule Encodings
428
+ − 7502 @subsection Internal String Encoding
+ − 7503
+ − 7504 ASCII characters are encoded using their position code directly. Other
+ − 7505 characters are encoded using their leading byte followed by their
+ − 7506 position code(s) with the high bit set. Characters in private character
+ − 7507 sets have their leading byte prefixed with a @dfn{leading byte prefix},
+ − 7508 which is either 0x9E or 0x9F. (No character sets are ever assigned these
+ − 7509 leading bytes.) Specifically:
+ − 7510
+ − 7511 @example
+ − 7512 Character set Encoding (PC=position-code, LB=leading-byte)
+ − 7513 ------------- --------
+ − 7514 ASCII PC-1 |
+ − 7515 Control-1 LB | PC1 + 0xA0 |
+ − 7516 Dimension-1 official LB | PC1 + 0x80 |
+ − 7517 Dimension-1 private 0x9E | LB | PC1 + 0x80 |
+ − 7518 Dimension-2 official LB | PC1 + 0x80 | PC2 + 0x80 |
+ − 7519 Dimension-2 private 0x9F | LB | PC1 + 0x80 | PC2 + 0x80
+ − 7520 @end example
+ − 7521
+ − 7522 The basic characteristic of this encoding is that the first byte
+ − 7523 of all characters is in the range 0x00 - 0x9F, and the second and
+ − 7524 following bytes of all characters is in the range 0xA0 - 0xFF.
+ − 7525 This means that it is impossible to get out of sync, or more
+ − 7526 specifically:
+ − 7527
+ − 7528 @enumerate
+ − 7529 @item
+ − 7530 Given any byte position, the beginning of the character it is
+ − 7531 within can be determined in constant time.
+ − 7532 @item
+ − 7533 Given any byte position at the beginning of a character, the
+ − 7534 beginning of the next character can be determined in constant
+ − 7535 time.
+ − 7536 @item
+ − 7537 Given any byte position at the beginning of a character, the
+ − 7538 beginning of the previous character can be determined in constant
+ − 7539 time.
+ − 7540 @item
+ − 7541 Textual searches can simply treat encoded strings as if they
+ − 7542 were encoded in a one-byte-per-character fashion rather than
+ − 7543 the actual multi-byte encoding.
+ − 7544 @end enumerate
+ − 7545
+ − 7546 None of the standard non-modal encodings meet all of these
+ − 7547 conditions. For example, EUC satisfies only (2) and (3), while
+ − 7548 Shift-JIS and Big5 (not yet described) satisfy only (2). (All
+ − 7549 non-modal encodings must satisfy (2), in order to be unambiguous.)
+ − 7550
442
+ − 7551 @node Internal Character Encoding, , Internal String Encoding, Internal Mule Encodings
428
+ − 7552 @subsection Internal Character Encoding
+ − 7553
+ − 7554 One 19-bit word represents a single character. The word is
+ − 7555 separated into three fields:
+ − 7556
+ − 7557 @example
+ − 7558 Bit number: 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
+ − 7559 <------------> <------------------> <------------------>
+ − 7560 Field: 1 2 3
+ − 7561 @end example
+ − 7562
+ − 7563 Note that fields 2 and 3 hold 7 bits each, while field 1 holds 5 bits.
+ − 7564
+ − 7565 @example
+ − 7566 Character set Field 1 Field 2 Field 3
+ − 7567 ------------- ------- ------- -------
+ − 7568 ASCII 0 0 PC1
+ − 7569 range: (00 - 7F)
+ − 7570 Control-1 0 1 PC1
+ − 7571 range: (00 - 1F)
+ − 7572 Dimension-1 official 0 LB - 0x80 PC1
+ − 7573 range: (01 - 0D) (20 - 7F)
+ − 7574 Dimension-1 private 0 LB - 0x80 PC1
+ − 7575 range: (20 - 6F) (20 - 7F)
+ − 7576 Dimension-2 official LB - 0x8F PC1 PC2
+ − 7577 range: (01 - 0A) (20 - 7F) (20 - 7F)
+ − 7578 Dimension-2 private LB - 0xE1 PC1 PC2
+ − 7579 range: (0F - 1E) (20 - 7F) (20 - 7F)
+ − 7580 Composite 0x1F ? ?
+ − 7581 @end example
+ − 7582
+ − 7583 Note that character codes 0 - 255 are the same as the ``binary encoding''
+ − 7584 described above.
+ − 7585
442
+ − 7586 @node CCL, , Internal Mule Encodings, MULE Character Sets and Encodings
428
+ − 7587 @section CCL
+ − 7588
+ − 7589 @example
+ − 7590 CCL PROGRAM SYNTAX:
+ − 7591 CCL_PROGRAM := (CCL_MAIN_BLOCK
+ − 7592 [ CCL_EOF_BLOCK ])
+ − 7593
+ − 7594 CCL_MAIN_BLOCK := CCL_BLOCK
+ − 7595 CCL_EOF_BLOCK := CCL_BLOCK
+ − 7596
+ − 7597 CCL_BLOCK := STATEMENT | (STATEMENT [STATEMENT ...])
+ − 7598 STATEMENT :=
+ − 7599 SET | IF | BRANCH | LOOP | REPEAT | BREAK
+ − 7600 | READ | WRITE
+ − 7601
+ − 7602 SET := (REG = EXPRESSION) | (REG SELF_OP EXPRESSION)
+ − 7603 | INT-OR-CHAR
+ − 7604
+ − 7605 EXPRESSION := ARG | (EXPRESSION OP ARG)
+ − 7606
+ − 7607 IF := (if EXPRESSION CCL_BLOCK CCL_BLOCK)
+ − 7608 BRANCH := (branch EXPRESSION CCL_BLOCK [CCL_BLOCK ...])
+ − 7609 LOOP := (loop STATEMENT [STATEMENT ...])
+ − 7610 BREAK := (break)
+ − 7611 REPEAT := (repeat)
+ − 7612 | (write-repeat [REG | INT-OR-CHAR | string])
+ − 7613 | (write-read-repeat REG [INT-OR-CHAR | string | ARRAY]?)
+ − 7614 READ := (read REG) | (read REG REG)
+ − 7615 | (read-if REG ARITH_OP ARG CCL_BLOCK CCL_BLOCK)
+ − 7616 | (read-branch REG CCL_BLOCK [CCL_BLOCK ...])
+ − 7617 WRITE := (write REG) | (write REG REG)
+ − 7618 | (write INT-OR-CHAR) | (write STRING) | STRING
+ − 7619 | (write REG ARRAY)
+ − 7620 END := (end)
+ − 7621
+ − 7622 REG := r0 | r1 | r2 | r3 | r4 | r5 | r6 | r7
+ − 7623 ARG := REG | INT-OR-CHAR
+ − 7624 OP := + | - | * | / | % | & | '|' | ^ | << | >> | <8 | >8 | //
+ − 7625 | < | > | == | <= | >= | !=
+ − 7626 SELF_OP :=
+ − 7627 += | -= | *= | /= | %= | &= | '|=' | ^= | <<= | >>=
+ − 7628 ARRAY := '[' INT-OR-CHAR ... ']'
+ − 7629 INT-OR-CHAR := INT | CHAR
+ − 7630
+ − 7631 MACHINE CODE:
+ − 7632
+ − 7633 The machine code consists of a vector of 32-bit words.
+ − 7634 The first such word specifies the start of the EOF section of the code;
+ − 7635 this is the code executed to handle any stuff that needs to be done
+ − 7636 (e.g. designating back to ASCII and left-to-right mode) after all
+ − 7637 other encoded/decoded data has been written out. This is not used for
+ − 7638 charset CCL programs.
+ − 7639
442
+ − 7640 REGISTER: 0..7 -- referred by RRR or rrr
428
+ − 7641
+ − 7642 OPERATOR BIT FIELD (27-bit): XXXXXXXXXXXXXXX RRR TTTTT
+ − 7643 TTTTT (5-bit): operator type
+ − 7644 RRR (3-bit): register number
+ − 7645 XXXXXXXXXXXXXXXX (15-bit):
+ − 7646 CCCCCCCCCCCCCCC: constant or address
+ − 7647 000000000000rrr: register number
+ − 7648
+ − 7649 AAAA: 00000 +
+ − 7650 00001 -
+ − 7651 00010 *
+ − 7652 00011 /
+ − 7653 00100 %
+ − 7654 00101 &
+ − 7655 00110 |
+ − 7656 00111 ~
+ − 7657
+ − 7658 01000 <<
+ − 7659 01001 >>
+ − 7660 01010 <8
+ − 7661 01011 >8
+ − 7662 01100 //
+ − 7663 01101 not used
+ − 7664 01110 not used
+ − 7665 01111 not used
+ − 7666
+ − 7667 10000 <
+ − 7668 10001 >
+ − 7669 10010 ==
+ − 7670 10011 <=
+ − 7671 10100 >=
+ − 7672 10101 !=
+ − 7673
+ − 7674 OPERATORS: TTTTT RRR XX..
+ − 7675
+ − 7676 SetCS: 00000 RRR C...C RRR = C...C
+ − 7677 SetCL: 00001 RRR ..... RRR = c...c
+ − 7678 c.............c
+ − 7679 SetR: 00010 RRR ..rrr RRR = rrr
+ − 7680 SetA: 00011 RRR ..rrr RRR = array[rrr]
+ − 7681 C.............C size of array = C...C
+ − 7682 c.............c contents = c...c
+ − 7683
+ − 7684 Jump: 00100 000 c...c jump to c...c
+ − 7685 JumpCond: 00101 RRR c...c if (!RRR) jump to c...c
+ − 7686 WriteJump: 00110 RRR c...c Write1 RRR, jump to c...c
+ − 7687 WriteReadJump: 00111 RRR c...c Write1, Read1 RRR, jump to c...c
+ − 7688 WriteCJump: 01000 000 c...c Write1 C...C, jump to c...c
+ − 7689 C...C
+ − 7690 WriteCReadJump: 01001 RRR c...c Write1 C...C, Read1 RRR,
+ − 7691 C.............C and jump to c...c
+ − 7692 WriteSJump: 01010 000 c...c WriteS, jump to c...c
+ − 7693 C.............C
+ − 7694 S.............S
+ − 7695 ...
+ − 7696 WriteSReadJump: 01011 RRR c...c WriteS, Read1 RRR, jump to c...c
+ − 7697 C.............C
+ − 7698 S.............S
+ − 7699 ...
+ − 7700 WriteAReadJump: 01100 RRR c...c WriteA, Read1 RRR, jump to c...c
+ − 7701 C.............C size of array = C...C
+ − 7702 c.............c contents = c...c
+ − 7703 ...
+ − 7704 Branch: 01101 RRR C...C if (RRR >= 0 && RRR < C..)
+ − 7705 c.............c branch to (RRR+1)th address
+ − 7706 Read1: 01110 RRR ... read 1-byte to RRR
+ − 7707 Read2: 01111 RRR ..rrr read 2-byte to RRR and rrr
+ − 7708 ReadBranch: 10000 RRR C...C Read1 and Branch
+ − 7709 c.............c
+ − 7710 ...
+ − 7711 Write1: 10001 RRR ..... write 1-byte RRR
+ − 7712 Write2: 10010 RRR ..rrr write 2-byte RRR and rrr
+ − 7713 WriteC: 10011 000 ..... write 1-char C...CC
+ − 7714 C.............C
+ − 7715 WriteS: 10100 000 ..... write C..-byte of string
+ − 7716 C.............C
+ − 7717 S.............S
+ − 7718 ...
+ − 7719 WriteA: 10101 RRR ..... write array[RRR]
+ − 7720 C.............C size of array = C...C
+ − 7721 c.............c contents = c...c
+ − 7722 ...
+ − 7723 End: 10110 000 ..... terminate the execution
+ − 7724
+ − 7725 SetSelfCS: 10111 RRR C...C RRR AAAAA= C...C
+ − 7726 ..........AAAAA
+ − 7727 SetSelfCL: 11000 RRR ..... RRR AAAAA= c...c
+ − 7728 c.............c
+ − 7729 ..........AAAAA
+ − 7730 SetSelfR: 11001 RRR ..Rrr RRR AAAAA= rrr
+ − 7731 ..........AAAAA
+ − 7732 SetExprCL: 11010 RRR ..Rrr RRR = rrr AAAAA c...c
+ − 7733 c.............c
+ − 7734 ..........AAAAA
+ − 7735 SetExprR: 11011 RRR ..rrr RRR = rrr AAAAA Rrr
+ − 7736 ............Rrr
+ − 7737 ..........AAAAA
+ − 7738 JumpCondC: 11100 RRR c...c if !(RRR AAAAA C..) jump to c...c
+ − 7739 C.............C
+ − 7740 ..........AAAAA
+ − 7741 JumpCondR: 11101 RRR c...c if !(RRR AAAAA rrr) jump to c...c
+ − 7742 ............rrr
+ − 7743 ..........AAAAA
+ − 7744 ReadJumpCondC: 11110 RRR c...c Read1 and JumpCondC
+ − 7745 C.............C
+ − 7746 ..........AAAAA
+ − 7747 ReadJumpCondR: 11111 RRR c...c Read1 and JumpCondR
+ − 7748 ............rrr
+ − 7749 ..........AAAAA
+ − 7750 @end example
+ − 7751
+ − 7752 @node The Lisp Reader and Compiler, Lstreams, MULE Character Sets and Encodings, Top
+ − 7753 @chapter The Lisp Reader and Compiler
+ − 7754
+ − 7755 Not yet documented.
+ − 7756
+ − 7757 @node Lstreams, Consoles; Devices; Frames; Windows, The Lisp Reader and Compiler, Top
+ − 7758 @chapter Lstreams
+ − 7759
+ − 7760 An @dfn{lstream} is an internal Lisp object that provides a generic
+ − 7761 buffering stream implementation. Conceptually, you send data to the
+ − 7762 stream or read data from the stream, not caring what's on the other end
+ − 7763 of the stream. The other end could be another stream, a file
+ − 7764 descriptor, a stdio stream, a fixed block of memory, a reallocating
+ − 7765 block of memory, etc. The main purpose of the stream is to provide a
+ − 7766 standard interface and to do buffering. Macros are defined to read or
+ − 7767 write characters, so the calling functions do not have to worry about
+ − 7768 blocking data together in order to achieve efficiency.
+ − 7769
+ − 7770 @menu
+ − 7771 * Creating an Lstream:: Creating an lstream object.
+ − 7772 * Lstream Types:: Different sorts of things that are streamed.
+ − 7773 * Lstream Functions:: Functions for working with lstreams.
+ − 7774 * Lstream Methods:: Creating new lstream types.
+ − 7775 @end menu
+ − 7776
442
+ − 7777 @node Creating an Lstream, Lstream Types, Lstreams, Lstreams
428
+ − 7778 @section Creating an Lstream
+ − 7779
+ − 7780 Lstreams come in different types, depending on what is being interfaced
+ − 7781 to. Although the primitive for creating new lstreams is
+ − 7782 @code{Lstream_new()}, generally you do not call this directly. Instead,
+ − 7783 you call some type-specific creation function, which creates the lstream
+ − 7784 and initializes it as appropriate for the particular type.
+ − 7785
+ − 7786 All lstream creation functions take a @var{mode} argument, specifying
+ − 7787 what mode the lstream should be opened as. This controls whether the
+ − 7788 lstream is for input and output, and optionally whether data should be
+ − 7789 blocked up in units of MULE characters. Note that some types of
+ − 7790 lstreams can only be opened for input; others only for output; and
+ − 7791 others can be opened either way. #### Richard Mlynarik thinks that
+ − 7792 there should be a strict separation between input and output streams,
+ − 7793 and he's probably right.
+ − 7794
+ − 7795 @var{mode} is a string, one of
+ − 7796
+ − 7797 @table @code
+ − 7798 @item "r"
+ − 7799 Open for reading.
+ − 7800 @item "w"
+ − 7801 Open for writing.
+ − 7802 @item "rc"
+ − 7803 Open for reading, but ``read'' never returns partial MULE characters.
+ − 7804 @item "wc"
+ − 7805 Open for writing, but never writes partial MULE characters.
+ − 7806 @end table
+ − 7807
442
+ − 7808 @node Lstream Types, Lstream Functions, Creating an Lstream, Lstreams
428
+ − 7809 @section Lstream Types
+ − 7810
+ − 7811 @table @asis
+ − 7812 @item stdio
+ − 7813
+ − 7814 @item filedesc
+ − 7815
+ − 7816 @item lisp-string
+ − 7817
+ − 7818 @item fixed-buffer
+ − 7819
+ − 7820 @item resizing-buffer
+ − 7821
+ − 7822 @item dynarr
+ − 7823
+ − 7824 @item lisp-buffer
+ − 7825
+ − 7826 @item print
+ − 7827
+ − 7828 @item decoding
+ − 7829
+ − 7830 @item encoding
+ − 7831 @end table
+ − 7832
442
+ − 7833 @node Lstream Functions, Lstream Methods, Lstream Types, Lstreams
428
+ − 7834 @section Lstream Functions
+ − 7835
442
+ − 7836 @deftypefun {Lstream *} Lstream_new (Lstream_implementation *@var{imp}, const char *@var{mode})
428
+ − 7837 Allocate and return a new Lstream. This function is not really meant to
+ − 7838 be called directly; rather, each stream type should provide its own
+ − 7839 stream creation function, which creates the stream and does any other
+ − 7840 necessary creation stuff (e.g. opening a file).
+ − 7841 @end deftypefun
+ − 7842
+ − 7843 @deftypefun void Lstream_set_buffering (Lstream *@var{lstr}, Lstream_buffering @var{buffering}, int @var{buffering_size})
+ − 7844 Change the buffering of a stream. See @file{lstream.h}. By default the
+ − 7845 buffering is @code{STREAM_BLOCK_BUFFERED}.
+ − 7846 @end deftypefun
+ − 7847
+ − 7848 @deftypefun int Lstream_flush (Lstream *@var{lstr})
+ − 7849 Flush out any pending unwritten data in the stream. Clear any buffered
+ − 7850 input data. Returns 0 on success, -1 on error.
+ − 7851 @end deftypefun
+ − 7852
+ − 7853 @deftypefn Macro int Lstream_putc (Lstream *@var{stream}, int @var{c})
+ − 7854 Write out one byte to the stream. This is a macro and so it is very
+ − 7855 efficient. The @var{c} argument is only evaluated once but the @var{stream}
+ − 7856 argument is evaluated more than once. Returns 0 on success, -1 on
+ − 7857 error.
+ − 7858 @end deftypefn
+ − 7859
+ − 7860 @deftypefn Macro int Lstream_getc (Lstream *@var{stream})
+ − 7861 Read one byte from the stream. This is a macro and so it is very
+ − 7862 efficient. The @var{stream} argument is evaluated more than once. Return
+ − 7863 value is -1 for EOF or error.
+ − 7864 @end deftypefn
+ − 7865
+ − 7866 @deftypefn Macro void Lstream_ungetc (Lstream *@var{stream}, int @var{c})
+ − 7867 Push one byte back onto the input queue. This will be the next byte
+ − 7868 read from the stream. Any number of bytes can be pushed back and will
440
+ − 7869 be read in the reverse order they were pushed back---most recent
+ − 7870 first. (This is necessary for consistency---if there are a number of
428
+ − 7871 bytes that have been unread and I read and unread a byte, it needs to be
+ − 7872 the first to be read again.) This is a macro and so it is very
+ − 7873 efficient. The @var{c} argument is only evaluated once but the @var{stream}
+ − 7874 argument is evaluated more than once.
+ − 7875 @end deftypefn
+ − 7876
+ − 7877 @deftypefun int Lstream_fputc (Lstream *@var{stream}, int @var{c})
+ − 7878 @deftypefunx int Lstream_fgetc (Lstream *@var{stream})
+ − 7879 @deftypefunx void Lstream_fungetc (Lstream *@var{stream}, int @var{c})
+ − 7880 Function equivalents of the above macros.
+ − 7881 @end deftypefun
+ − 7882
+ − 7883 @deftypefun ssize_t Lstream_read (Lstream *@var{stream}, void *@var{data}, size_t @var{size})
+ − 7884 Read @var{size} bytes of @var{data} from the stream. Return the number
+ − 7885 of bytes read. 0 means EOF. -1 means an error occurred and no bytes
+ − 7886 were read.
+ − 7887 @end deftypefun
+ − 7888
+ − 7889 @deftypefun ssize_t Lstream_write (Lstream *@var{stream}, void *@var{data}, size_t @var{size})
+ − 7890 Write @var{size} bytes of @var{data} to the stream. Return the number
+ − 7891 of bytes written. -1 means an error occurred and no bytes were written.
+ − 7892 @end deftypefun
+ − 7893
+ − 7894 @deftypefun void Lstream_unread (Lstream *@var{stream}, void *@var{data}, size_t @var{size})
+ − 7895 Push back @var{size} bytes of @var{data} onto the input queue. The next
+ − 7896 call to @code{Lstream_read()} with the same size will read the same
+ − 7897 bytes back. Note that this will be the case even if there is other
+ − 7898 pending unread data.
+ − 7899 @end deftypefun
+ − 7900
+ − 7901 @deftypefun int Lstream_close (Lstream *@var{stream})
+ − 7902 Close the stream. All data will be flushed out.
+ − 7903 @end deftypefun
+ − 7904
+ − 7905 @deftypefun void Lstream_reopen (Lstream *@var{stream})
+ − 7906 Reopen a closed stream. This enables I/O on it again. This is not
+ − 7907 meant to be called except from a wrapper routine that reinitializes
440
+ − 7908 variables and such---the close routine may well have freed some
428
+ − 7909 necessary storage structures, for example.
+ − 7910 @end deftypefun
+ − 7911
+ − 7912 @deftypefun void Lstream_rewind (Lstream *@var{stream})
+ − 7913 Rewind the stream to the beginning.
+ − 7914 @end deftypefun
+ − 7915
442
+ − 7916 @node Lstream Methods, , Lstream Functions, Lstreams
428
+ − 7917 @section Lstream Methods
+ − 7918
+ − 7919 @deftypefn {Lstream Method} ssize_t reader (Lstream *@var{stream}, unsigned char *@var{data}, size_t @var{size})
+ − 7920 Read some data from the stream's end and store it into @var{data}, which
+ − 7921 can hold @var{size} bytes. Return the number of bytes read. A return
+ − 7922 value of 0 means no bytes can be read at this time. This may be because
+ − 7923 of an EOF, or because there is a granularity greater than one byte that
+ − 7924 the stream imposes on the returned data, and @var{size} is less than
+ − 7925 this granularity. (This will happen frequently for streams that need to
+ − 7926 return whole characters, because @code{Lstream_read()} calls the reader
+ − 7927 function repeatedly until it has the number of bytes it wants or until 0
+ − 7928 is returned.) The lstream functions do not treat a 0 return as EOF or
+ − 7929 do anything special; however, the calling function will interpret any 0
+ − 7930 it gets back as EOF. This will normally not happen unless the caller
+ − 7931 calls @code{Lstream_read()} with a very small size.
+ − 7932
+ − 7933 This function can be @code{NULL} if the stream is output-only.
+ − 7934 @end deftypefn
+ − 7935
442
+ − 7936 @deftypefn {Lstream Method} ssize_t writer (Lstream *@var{stream}, const unsigned char *@var{data}, size_t @var{size})
428
+ − 7937 Send some data to the stream's end. Data to be sent is in @var{data}
+ − 7938 and is @var{size} bytes. Return the number of bytes sent. This
+ − 7939 function can send and return fewer bytes than is passed in; in that
+ − 7940 case, the function will just be called again until there is no data left
+ − 7941 or 0 is returned. A return value of 0 means that no more data can be
+ − 7942 currently stored, but there is no error; the data will be squirreled
+ − 7943 away until the writer can accept data. (This is useful, e.g., if you're
+ − 7944 dealing with a non-blocking file descriptor and are getting
+ − 7945 @code{EWOULDBLOCK} errors.) This function can be @code{NULL} if the
+ − 7946 stream is input-only.
+ − 7947 @end deftypefn
+ − 7948
+ − 7949 @deftypefn {Lstream Method} int rewinder (Lstream *@var{stream})
+ − 7950 Rewind the stream. If this is @code{NULL}, the stream is not seekable.
+ − 7951 @end deftypefn
+ − 7952
+ − 7953 @deftypefn {Lstream Method} int seekable_p (Lstream *@var{stream})
440
+ − 7954 Indicate whether this stream is seekable---i.e. it can be rewound.
428
+ − 7955 This method is ignored if the stream does not have a rewind method. If
+ − 7956 this method is not present, the result is determined by whether a rewind
+ − 7957 method is present.
+ − 7958 @end deftypefn
+ − 7959
+ − 7960 @deftypefn {Lstream Method} int flusher (Lstream *@var{stream})
+ − 7961 Perform any additional operations necessary to flush the data in this
+ − 7962 stream.
+ − 7963 @end deftypefn
+ − 7964
+ − 7965 @deftypefn {Lstream Method} int pseudo_closer (Lstream *@var{stream})
+ − 7966 @end deftypefn
+ − 7967
+ − 7968 @deftypefn {Lstream Method} int closer (Lstream *@var{stream})
+ − 7969 Perform any additional operations necessary to close this stream down.
+ − 7970 May be @code{NULL}. This function is called when @code{Lstream_close()}
+ − 7971 is called or when the stream is garbage-collected. When this function
+ − 7972 is called, all pending data in the stream will already have been written
+ − 7973 out.
+ − 7974 @end deftypefn
+ − 7975
+ − 7976 @deftypefn {Lstream Method} Lisp_Object marker (Lisp_Object @var{lstream}, void (*@var{markfun}) (Lisp_Object))
+ − 7977 Mark this object for garbage collection. Same semantics as a standard
+ − 7978 @code{Lisp_Object} marker. This function can be @code{NULL}.
+ − 7979 @end deftypefn
+ − 7980
+ − 7981 @node Consoles; Devices; Frames; Windows, The Redisplay Mechanism, Lstreams, Top
+ − 7982 @chapter Consoles; Devices; Frames; Windows
+ − 7983
+ − 7984 @menu
+ − 7985 * Introduction to Consoles; Devices; Frames; Windows::
+ − 7986 * Point::
+ − 7987 * Window Hierarchy::
+ − 7988 * The Window Object::
+ − 7989 @end menu
+ − 7990
442
+ − 7991 @node Introduction to Consoles; Devices; Frames; Windows, Point, Consoles; Devices; Frames; Windows, Consoles; Devices; Frames; Windows
428
+ − 7992 @section Introduction to Consoles; Devices; Frames; Windows
+ − 7993
+ − 7994 A window-system window that you see on the screen is called a
+ − 7995 @dfn{frame} in Emacs terminology. Each frame is subdivided into one or
+ − 7996 more non-overlapping panes, called (confusingly) @dfn{windows}. Each
+ − 7997 window displays the text of a buffer in it. (See above on Buffers.) Note
+ − 7998 that buffers and windows are independent entities: Two or more windows
+ − 7999 can be displaying the same buffer (potentially in different locations),
+ − 8000 and a buffer can be displayed in no windows.
+ − 8001
+ − 8002 A single display screen that contains one or more frames is called
+ − 8003 a @dfn{display}. Under most circumstances, there is only one display.
+ − 8004 However, more than one display can exist, for example if you have
+ − 8005 a @dfn{multi-headed} console, i.e. one with a single keyboard but
+ − 8006 multiple displays. (Typically in such a situation, the various
+ − 8007 displays act like one large display, in that the mouse is only
+ − 8008 in one of them at a time, and moving the mouse off of one moves
+ − 8009 it into another.) In some cases, the different displays will
+ − 8010 have different characteristics, e.g. one color and one mono.
+ − 8011
+ − 8012 XEmacs can display frames on multiple displays. It can even deal
+ − 8013 simultaneously with frames on multiple keyboards (called @dfn{consoles} in
+ − 8014 XEmacs terminology). Here is one case where this might be useful: You
+ − 8015 are using XEmacs on your workstation at work, and leave it running.
+ − 8016 Then you go home and dial in on a TTY line, and you can use the
+ − 8017 already-running XEmacs process to display another frame on your local
+ − 8018 TTY.
+ − 8019
+ − 8020 Thus, there is a hierarchy console -> display -> frame -> window.
+ − 8021 There is a separate Lisp object type for each of these four concepts.
+ − 8022 Furthermore, there is logically a @dfn{selected console},
+ − 8023 @dfn{selected display}, @dfn{selected frame}, and @dfn{selected window}.
+ − 8024 Each of these objects is distinguished in various ways, such as being the
+ − 8025 default object for various functions that act on objects of that type.
442
+ − 8026 Note that every containing object remembers the ``selected'' object
428
+ − 8027 among the objects that it contains: e.g. not only is there a selected
+ − 8028 window, but every frame remembers the last window in it that was
+ − 8029 selected, and changing the selected frame causes the remembered window
+ − 8030 within it to become the selected window. Similar relationships apply
+ − 8031 for consoles to devices and devices to frames.
+ − 8032
442
+ − 8033 @node Point, Window Hierarchy, Introduction to Consoles; Devices; Frames; Windows, Consoles; Devices; Frames; Windows
428
+ − 8034 @section Point
+ − 8035
+ − 8036 Recall that every buffer has a current insertion position, called
+ − 8037 @dfn{point}. Now, two or more windows may be displaying the same buffer,
+ − 8038 and the text cursor in the two windows (i.e. @code{point}) can be in
+ − 8039 two different places. You may ask, how can that be, since each
+ − 8040 buffer has only one value of @code{point}? The answer is that each window
+ − 8041 also has a value of @code{point} that is squirreled away in it. There
+ − 8042 is only one selected window, and the value of ``point'' in that buffer
+ − 8043 corresponds to that window. When the selected window is changed
+ − 8044 from one window to another displaying the same buffer, the old
+ − 8045 value of @code{point} is stored into the old window's ``point'' and the
+ − 8046 value of @code{point} from the new window is retrieved and made the
+ − 8047 value of @code{point} in the buffer. This means that @code{window-point}
+ − 8048 for the selected window is potentially inaccurate, and if you
+ − 8049 want to retrieve the correct value of @code{point} for a window,
+ − 8050 you must special-case on the selected window and retrieve the
+ − 8051 buffer's point instead. This is related to why @code{save-window-excursion}
+ − 8052 does not save the selected window's value of @code{point}.
+ − 8053
442
+ − 8054 @node Window Hierarchy, The Window Object, Point, Consoles; Devices; Frames; Windows
428
+ − 8055 @section Window Hierarchy
+ − 8056 @cindex window hierarchy
+ − 8057 @cindex hierarchy of windows
+ − 8058
+ − 8059 If a frame contains multiple windows (panes), they are always created
+ − 8060 by splitting an existing window along the horizontal or vertical axis.
+ − 8061 Terminology is a bit confusing here: to @dfn{split a window
+ − 8062 horizontally} means to create two side-by-side windows, i.e. to make a
+ − 8063 @emph{vertical} cut in a window. Likewise, to @dfn{split a window
+ − 8064 vertically} means to create two windows, one above the other, by making
+ − 8065 a @emph{horizontal} cut.
+ − 8066
+ − 8067 If you split a window and then split again along the same axis, you
+ − 8068 will end up with a number of panes all arranged along the same axis.
+ − 8069 The precise way in which the splits were made should not be important,
+ − 8070 and this is reflected internally. Internally, all windows are arranged
+ − 8071 in a tree, consisting of two types of windows, @dfn{combination} windows
+ − 8072 (which have children, and are covered completely by those children) and
+ − 8073 @dfn{leaf} windows, which have no children and are visible. Every
+ − 8074 combination window has two or more children, all arranged along the same
+ − 8075 axis. There are (logically) two subtypes of windows, depending on
+ − 8076 whether their children are horizontally or vertically arrayed. There is
+ − 8077 always one root window, which is either a leaf window (if the frame
+ − 8078 contains only one window) or a combination window (if the frame contains
+ − 8079 more than one window). In the latter case, the root window will have
+ − 8080 two or more children, either horizontally or vertically arrayed, and
+ − 8081 each of those children will be either a leaf window or another
+ − 8082 combination window.
+ − 8083
+ − 8084 Here are some rules:
+ − 8085
+ − 8086 @enumerate
+ − 8087 @item
+ − 8088 Horizontal combination windows can never have children that are
+ − 8089 horizontal combination windows; same for vertical.
+ − 8090
+ − 8091 @item
+ − 8092 Only leaf windows can be split (obviously) and this splitting does one
+ − 8093 of two things: (a) turns the leaf window into a combination window and
+ − 8094 creates two new leaf children, or (b) turns the leaf window into one of
+ − 8095 the two new leaves and creates the other leaf. Rule (1) dictates which
+ − 8096 of these two outcomes happens.
+ − 8097
+ − 8098 @item
+ − 8099 Every combination window must have at least two children.
+ − 8100
+ − 8101 @item
+ − 8102 Leaf windows can never become combination windows. They can be deleted,
+ − 8103 however. If this results in a violation of (3), the parent combination
+ − 8104 window also gets deleted.
+ − 8105
+ − 8106 @item
+ − 8107 All functions that accept windows must be prepared to accept combination
+ − 8108 windows, and do something sane (e.g. signal an error if so).
+ − 8109 Combination windows @emph{do} escape to the Lisp level.
+ − 8110
+ − 8111 @item
+ − 8112 All windows have three fields governing their contents:
+ − 8113 these are @dfn{hchild} (a list of horizontally-arrayed children),
+ − 8114 @dfn{vchild} (a list of vertically-arrayed children), and @dfn{buffer}
+ − 8115 (the buffer contained in a leaf window). Exactly one of
444
+ − 8116 these will be non-@code{nil}. Remember that @dfn{horizontally-arrayed}
428
+ − 8117 means ``side-by-side'' and @dfn{vertically-arrayed} means
+ − 8118 @dfn{one above the other}.
+ − 8119
+ − 8120 @item
+ − 8121 Leaf windows also have markers in their @code{start} (the
+ − 8122 first buffer position displayed in the window) and @code{pointm}
440
+ − 8123 (the window's stashed value of @code{point}---see above) fields,
444
+ − 8124 while combination windows have @code{nil} in these fields.
428
+ − 8125
+ − 8126 @item
+ − 8127 The list of children for a window is threaded through the
+ − 8128 @code{next} and @code{prev} fields of each child window.
+ − 8129
+ − 8130 @item
+ − 8131 @strong{Deleted windows can be undeleted}. This happens as a result of
+ − 8132 restoring a window configuration, and is unlike frames, displays, and
+ − 8133 consoles, which, once deleted, can never be restored. Deleting a window
+ − 8134 does nothing except set a special @code{dead} bit to 1 and clear out the
+ − 8135 @code{next}, @code{prev}, @code{hchild}, and @code{vchild} fields, for
+ − 8136 GC purposes.
+ − 8137
+ − 8138 @item
440
+ − 8139 Most frames actually have two top-level windows---one for the
428
+ − 8140 minibuffer and one (the @dfn{root}) for everything else. The modeline
+ − 8141 (if present) separates these two. The @code{next} field of the root
+ − 8142 points to the minibuffer, and the @code{prev} field of the minibuffer
+ − 8143 points to the root. The other @code{next} and @code{prev} fields are
+ − 8144 @code{nil}, and the frame points to both of these windows.
+ − 8145 Minibuffer-less frames have no minibuffer window, and the @code{next}
+ − 8146 and @code{prev} of the root window are @code{nil}. Minibuffer-only
+ − 8147 frames have no root window, and the @code{next} of the minibuffer window
+ − 8148 is @code{nil} but the @code{prev} points to itself. (#### This is an
+ − 8149 artifact that should be fixed.)
+ − 8150 @end enumerate
+ − 8151
442
+ − 8152 @node The Window Object, , Window Hierarchy, Consoles; Devices; Frames; Windows
428
+ − 8153 @section The Window Object
+ − 8154
+ − 8155 Windows have the following accessible fields:
+ − 8156
+ − 8157 @table @code
+ − 8158 @item frame
+ − 8159 The frame that this window is on.
+ − 8160
+ − 8161 @item mini_p
+ − 8162 Non-@code{nil} if this window is a minibuffer window.
+ − 8163
+ − 8164 @item buffer
+ − 8165 The buffer that the window is displaying. This may change often during
+ − 8166 the life of the window.
+ − 8167
+ − 8168 @item dedicated
+ − 8169 Non-@code{nil} if this window is dedicated to its buffer.
+ − 8170
+ − 8171 @item pointm
+ − 8172 @cindex window point internals
+ − 8173 This is the value of point in the current buffer when this window is
+ − 8174 selected; when it is not selected, it retains its previous value.
+ − 8175
+ − 8176 @item start
+ − 8177 The position in the buffer that is the first character to be displayed
+ − 8178 in the window.
+ − 8179
+ − 8180 @item force_start
+ − 8181 If this flag is non-@code{nil}, it says that the window has been
+ − 8182 scrolled explicitly by the Lisp program. This affects what the next
+ − 8183 redisplay does if point is off the screen: instead of scrolling the
+ − 8184 window to show the text around point, it moves point to a location that
+ − 8185 is on the screen.
+ − 8186
+ − 8187 @item last_modified
+ − 8188 The @code{modified} field of the window's buffer, as of the last time
+ − 8189 a redisplay completed in this window.
+ − 8190
+ − 8191 @item last_point
+ − 8192 The buffer's value of point, as of the last time
+ − 8193 a redisplay completed in this window.
+ − 8194
+ − 8195 @item left
+ − 8196 This is the left-hand edge of the window, measured in columns. (The
+ − 8197 leftmost column on the screen is @w{column 0}.)
+ − 8198
+ − 8199 @item top
+ − 8200 This is the top edge of the window, measured in lines. (The top line on
+ − 8201 the screen is @w{line 0}.)
+ − 8202
+ − 8203 @item height
+ − 8204 The height of the window, measured in lines.
+ − 8205
+ − 8206 @item width
+ − 8207 The width of the window, measured in columns.
+ − 8208
+ − 8209 @item next
+ − 8210 This is the window that is the next in the chain of siblings. It is
+ − 8211 @code{nil} in a window that is the rightmost or bottommost of a group of
+ − 8212 siblings.
+ − 8213
+ − 8214 @item prev
+ − 8215 This is the window that is the previous in the chain of siblings. It is
+ − 8216 @code{nil} in a window that is the leftmost or topmost of a group of
+ − 8217 siblings.
+ − 8218
+ − 8219 @item parent
+ − 8220 Internally, XEmacs arranges windows in a tree; each group of siblings has
+ − 8221 a parent window whose area includes all the siblings. This field points
+ − 8222 to a window's parent.
+ − 8223
+ − 8224 Parent windows do not display buffers, and play little role in display
+ − 8225 except to shape their child windows. Emacs Lisp programs usually have
+ − 8226 no access to the parent windows; they operate on the windows at the
+ − 8227 leaves of the tree, which actually display buffers.
+ − 8228
+ − 8229 @item hscroll
+ − 8230 This is the number of columns that the display in the window is scrolled
+ − 8231 horizontally to the left. Normally, this is 0.
+ − 8232
+ − 8233 @item use_time
+ − 8234 This is the last time that the window was selected. The function
+ − 8235 @code{get-lru-window} uses this field.
+ − 8236
+ − 8237 @item display_table
+ − 8238 The window's display table, or @code{nil} if none is specified for it.
+ − 8239
+ − 8240 @item update_mode_line
+ − 8241 Non-@code{nil} means this window's mode line needs to be updated.
+ − 8242
+ − 8243 @item base_line_number
+ − 8244 The line number of a certain position in the buffer, or @code{nil}.
+ − 8245 This is used for displaying the line number of point in the mode line.
+ − 8246
+ − 8247 @item base_line_pos
+ − 8248 The position in the buffer for which the line number is known, or
+ − 8249 @code{nil} meaning none is known.
+ − 8250
+ − 8251 @item region_showing
+ − 8252 If the region (or part of it) is highlighted in this window, this field
+ − 8253 holds the mark position that made one end of that region. Otherwise,
+ − 8254 this field is @code{nil}.
+ − 8255 @end table
+ − 8256
+ − 8257 @node The Redisplay Mechanism, Extents, Consoles; Devices; Frames; Windows, Top
+ − 8258 @chapter The Redisplay Mechanism
+ − 8259
+ − 8260 The redisplay mechanism is one of the most complicated sections of
+ − 8261 XEmacs, especially from a conceptual standpoint. This is doubly so
+ − 8262 because, unlike for the basic aspects of the Lisp interpreter, the
+ − 8263 computer science theories of how to efficiently handle redisplay are not
+ − 8264 well-developed.
+ − 8265
+ − 8266 When working with the redisplay mechanism, remember the Golden Rules
+ − 8267 of Redisplay:
+ − 8268
+ − 8269 @enumerate
+ − 8270 @item
+ − 8271 It Is Better To Be Correct Than Fast.
+ − 8272 @item
+ − 8273 Thou Shalt Not Run Elisp From Within Redisplay.
+ − 8274 @item
+ − 8275 It Is Better To Be Fast Than Not To Be.
+ − 8276 @end enumerate
+ − 8277
+ − 8278 @menu
+ − 8279 * Critical Redisplay Sections::
+ − 8280 * Line Start Cache::
+ − 8281 * Redisplay Piece by Piece::
+ − 8282 @end menu
+ − 8283
442
+ − 8284 @node Critical Redisplay Sections, Line Start Cache, The Redisplay Mechanism, The Redisplay Mechanism
428
+ − 8285 @section Critical Redisplay Sections
+ − 8286 @cindex critical redisplay sections
+ − 8287
+ − 8288 Within this section, we are defenseless and assume that the
+ − 8289 following cannot happen:
+ − 8290
+ − 8291 @enumerate
+ − 8292 @item
+ − 8293 garbage collection
+ − 8294 @item
+ − 8295 Lisp code evaluation
+ − 8296 @item
+ − 8297 frame size changes
+ − 8298 @end enumerate
+ − 8299
+ − 8300 We ensure (3) by calling @code{hold_frame_size_changes()}, which
+ − 8301 will cause any pending frame size changes to get put on hold
+ − 8302 till after the end of the critical section. (1) follows
+ − 8303 automatically if (2) is met. #### Unfortunately, there are
+ − 8304 some places where Lisp code can be called within this section.
+ − 8305 We need to remove them.
+ − 8306
+ − 8307 If @code{Fsignal()} is called during this critical section, we
+ − 8308 will @code{abort()}.
+ − 8309
+ − 8310 If garbage collection is called during this critical section,
+ − 8311 we simply return. #### We should abort instead.
+ − 8312
+ − 8313 #### If a frame-size change does occur we should probably
+ − 8314 actually be preempting redisplay.
+ − 8315
442
+ − 8316 @node Line Start Cache, Redisplay Piece by Piece, Critical Redisplay Sections, The Redisplay Mechanism
428
+ − 8317 @section Line Start Cache
+ − 8318 @cindex line start cache
+ − 8319
+ − 8320 The traditional scrolling code in Emacs breaks in a variable height
+ − 8321 world. It depends on the key assumption that the number of lines that
+ − 8322 can be displayed at any given time is fixed. This led to a complete
+ − 8323 separation of the scrolling code from the redisplay code. In order to
+ − 8324 fully support variable height lines, the scrolling code must actually be
+ − 8325 tightly integrated with redisplay. Only redisplay can determine how
+ − 8326 many lines will be displayed on a screen for any given starting point.
+ − 8327
+ − 8328 What is ideally wanted is a complete list of the starting buffer
+ − 8329 position for every possible display line of a buffer along with the
+ − 8330 height of that display line. Maintaining such a full list would be very
+ − 8331 expensive. We settle for having it include information for all areas
+ − 8332 which we happen to generate anyhow (i.e. the region currently being
+ − 8333 displayed) and for those areas we need to work with.
+ − 8334
+ − 8335 In order to ensure that the cache accurately represents what redisplay
+ − 8336 would actually show, it is necessary to invalidate it in many
+ − 8337 situations. If the buffer changes, the starting positions may no longer
+ − 8338 be correct. If a face or an extent has changed then the line heights
+ − 8339 may have altered. These events happen frequently enough that the cache
+ − 8340 can end up being constantly disabled. With this potentially constant
+ − 8341 invalidation when is the cache ever useful?
+ − 8342
+ − 8343 Even if the cache is invalidated before every single usage, it is
+ − 8344 necessary. Scrolling often requires knowledge about display lines which
+ − 8345 are actually above or below the visible region. The cache provides a
+ − 8346 convenient light-weight method of storing this information for multiple
+ − 8347 display regions. This knowledge is necessary for the scrolling code to
+ − 8348 always obey the First Golden Rule of Redisplay.
+ − 8349
+ − 8350 If the cache already contains all of the information that the scrolling
+ − 8351 routines happen to need so that it doesn't have to go generate it, then
+ − 8352 we are able to obey the Third Golden Rule of Redisplay. The first thing
+ − 8353 we do to help out the cache is to always add the displayed region. This
+ − 8354 region had to be generated anyway, so the cache ends up getting the
+ − 8355 information basically for free. In those cases where a user is simply
+ − 8356 scrolling around viewing a buffer there is a high probability that this
+ − 8357 is sufficient to always provide the needed information. The second
+ − 8358 thing we can do is be smart about invalidating the cache.
+ − 8359
440
+ − 8360 TODO---Be smart about invalidating the cache. Potential places:
428
+ − 8361
+ − 8362 @itemize @bullet
+ − 8363 @item
+ − 8364 Insertions at end-of-line which don't cause line-wraps do not alter the
+ − 8365 starting positions of any display lines. These types of buffer
+ − 8366 modifications should not invalidate the cache. This is actually a large
+ − 8367 optimization for redisplay speed as well.
+ − 8368 @item
+ − 8369 Buffer modifications frequently only affect the display of lines at and
+ − 8370 below where they occur. In these situations we should only invalidate
+ − 8371 the part of the cache starting at where the modification occurs.
+ − 8372 @end itemize
+ − 8373
+ − 8374 In case you're wondering, the Second Golden Rule of Redisplay is not
+ − 8375 applicable.
+ − 8376
442
+ − 8377 @node Redisplay Piece by Piece, , Line Start Cache, The Redisplay Mechanism
428
+ − 8378 @section Redisplay Piece by Piece
+ − 8379 @cindex Redisplay Piece by Piece
+ − 8380
+ − 8381 As you can begin to see redisplay is complex and also not well
+ − 8382 documented. Chuck no longer works on XEmacs so this section is my take
+ − 8383 on the workings of redisplay.
+ − 8384
+ − 8385 Redisplay happens in three phases:
+ − 8386
+ − 8387 @enumerate
+ − 8388 @item
+ − 8389 Determine desired display in area that needs redisplay.
+ − 8390 Implemented by @code{redisplay.c}
+ − 8391 @item
+ − 8392 Compare desired display with current display
+ − 8393 Implemented by @code{redisplay-output.c}
+ − 8394 @item
+ − 8395 Output changes Implemented by @code{redisplay-output.c},
+ − 8396 @code{redisplay-x.c}, @code{redisplay-msw.c} and @code{redisplay-tty.c}
+ − 8397 @end enumerate
+ − 8398
442
+ − 8399 Steps 1 and 2 are device-independent and relatively complex. Step 3 is
428
+ − 8400 mostly device-dependent.
+ − 8401
+ − 8402 Determining the desired display
+ − 8403
+ − 8404 Display attributes are stored in @code{display_line} structures. Each
+ − 8405 @code{display_line} consists of a set of @code{display_block}'s and each
+ − 8406 @code{display_block} contains a number of @code{rune}'s. Generally
+ − 8407 dynarr's of @code{display_line}'s are held by each window representing
+ − 8408 the current display and the desired display.
+ − 8409
442
+ − 8410 The @code{display_line} structures are tightly tied to buffers which
428
+ − 8411 presents a problem for redisplay as this connection is bogus for the
+ − 8412 modeline. Hence the @code{display_line} generation routines are
+ − 8413 duplicated for generating the modeline. This means that the modeline
+ − 8414 display code has many bugs that the standard redisplay code does not.
+ − 8415
+ − 8416 The guts of @code{display_line} generation are in
+ − 8417 @code{create_text_block}, which creates a single display line for the
+ − 8418 desired locale. This incrementally parses the characters on the current
442
+ − 8419 line and generates redisplay structures for each.
428
+ − 8420
+ − 8421 Gutter redisplay is different. Because the data to display is stored in
+ − 8422 a string we cannot use @code{create_text_block}. Instead we use
+ − 8423 @code{create_text_string_block} which performs the same function as
+ − 8424 @code{create_text_block} but for strings. Many of the complexities of
+ − 8425 @code{create_text_block} to do with cursor handling and selective
+ − 8426 display have been removed.
+ − 8427
+ − 8428 @node Extents, Faces, The Redisplay Mechanism, Top
+ − 8429 @chapter Extents
+ − 8430
+ − 8431 @menu
+ − 8432 * Introduction to Extents:: Extents are ranges over text, with properties.
+ − 8433 * Extent Ordering:: How extents are ordered internally.
+ − 8434 * Format of the Extent Info:: The extent information in a buffer or string.
+ − 8435 * Zero-Length Extents:: A weird special case.
442
+ − 8436 * Mathematics of Extent Ordering:: A rigorous foundation.
428
+ − 8437 * Extent Fragments:: Cached information useful for redisplay.
+ − 8438 @end menu
+ − 8439
442
+ − 8440 @node Introduction to Extents, Extent Ordering, Extents, Extents
428
+ − 8441 @section Introduction to Extents
+ − 8442
+ − 8443 Extents are regions over a buffer, with a start and an end position
+ − 8444 denoting the region of the buffer included in the extent. In
+ − 8445 addition, either end can be closed or open, meaning that the endpoint
+ − 8446 is or is not logically included in the extent. Insertion of a character
+ − 8447 at a closed endpoint causes the character to go inside the extent;
+ − 8448 insertion at an open endpoint causes the character to go outside.
+ − 8449
+ − 8450 Extent endpoints are stored using memory indices (see @file{insdel.c}),
+ − 8451 to minimize the amount of adjusting that needs to be done when
+ − 8452 characters are inserted or deleted.
+ − 8453
+ − 8454 (Formerly, extent endpoints at the gap could be either before or
+ − 8455 after the gap, depending on the open/closedness of the endpoint.
+ − 8456 The intent of this was to make it so that insertions would
+ − 8457 automatically go inside or out of extents as necessary with no
+ − 8458 further work needing to be done. It didn't work out that way,
+ − 8459 however, and just ended up complexifying and buggifying all the
+ − 8460 rest of the code.)
+ − 8461
442
+ − 8462 @node Extent Ordering, Format of the Extent Info, Introduction to Extents, Extents
428
+ − 8463 @section Extent Ordering
+ − 8464
+ − 8465 Extents are compared using memory indices. There are two orderings
+ − 8466 for extents and both orders are kept current at all times. The normal
+ − 8467 or @dfn{display} order is as follows:
+ − 8468
+ − 8469 @example
+ − 8470 Extent A is ``less than'' extent B,
+ − 8471 that is, earlier in the display order,
+ − 8472 if: A-start < B-start,
+ − 8473 or if: A-start = B-start, and A-end > B-end
+ − 8474 @end example
+ − 8475
+ − 8476 So if two extents begin at the same position, the larger of them is the
+ − 8477 earlier one in the display order (@code{EXTENT_LESS} is true).
+ − 8478
+ − 8479 For the e-order, the same thing holds:
+ − 8480
+ − 8481 @example
+ − 8482 Extent A is ``less than'' extent B in e-order,
+ − 8483 that is, later in the buffer,
+ − 8484 if: A-end < B-end,
+ − 8485 or if: A-end = B-end, and A-start > B-start
+ − 8486 @end example
+ − 8487
+ − 8488 So if two extents end at the same position, the smaller of them is the
+ − 8489 earlier one in the e-order (@code{EXTENT_E_LESS} is true).
+ − 8490
+ − 8491 The display order and the e-order are complementary orders: any
+ − 8492 theorem about the display order also applies to the e-order if you swap
+ − 8493 all occurrences of ``display order'' and ``e-order'', ``less than'' and
+ − 8494 ``greater than'', and ``extent start'' and ``extent end''.
+ − 8495
442
+ − 8496 @node Format of the Extent Info, Zero-Length Extents, Extent Ordering, Extents
428
+ − 8497 @section Format of the Extent Info
+ − 8498
+ − 8499 An extent-info structure consists of a list of the buffer or string's
+ − 8500 extents and a @dfn{stack of extents} that lists all of the extents over
+ − 8501 a particular position. The stack-of-extents info is used for
440
+ − 8502 optimization purposes---it basically caches some info that might
428
+ − 8503 be expensive to compute. Certain otherwise hard computations are easy
+ − 8504 given the stack of extents over a particular position, and if the
+ − 8505 stack of extents over a nearby position is known (because it was
+ − 8506 calculated at some prior point in time), it's easy to move the stack
+ − 8507 of extents to the proper position.
+ − 8508
+ − 8509 Given that the stack of extents is an optimization, and given that
+ − 8510 it requires memory, a string's stack of extents is wiped out each
+ − 8511 time a garbage collection occurs. Therefore, any time you retrieve
+ − 8512 the stack of extents, it might not be there. If you need it to
+ − 8513 be there, use the @code{_force} version.
+ − 8514
+ − 8515 Similarly, a string may or may not have an extent_info structure.
+ − 8516 (Generally it won't if there haven't been any extents added to the
+ − 8517 string.) So use the @code{_force} version if you need the extent_info
+ − 8518 structure to be there.
+ − 8519
+ − 8520 A list of extents is maintained as a double gap array: one gap array
+ − 8521 is ordered by start index (the @dfn{display order}) and the other is
+ − 8522 ordered by end index (the @dfn{e-order}). Note that positions in an
+ − 8523 extent list should logically be conceived of as referring @emph{to} a
+ − 8524 particular extent (as is the norm in programs) rather than sitting
+ − 8525 between two extents. Note also that callers of these functions should
+ − 8526 not be aware of the fact that the extent list is implemented as an
+ − 8527 array, except for the fact that positions are integers (this should be
+ − 8528 generalized to handle integers and linked list equally well).
+ − 8529
442
+ − 8530 @node Zero-Length Extents, Mathematics of Extent Ordering, Format of the Extent Info, Extents
428
+ − 8531 @section Zero-Length Extents
+ − 8532
+ − 8533 Extents can be zero-length, and will end up that way if their endpoints
444
+ − 8534 are explicitly set that way or if their detachable property is @code{nil}
428
+ − 8535 and all the text in the extent is deleted. (The exception is open-open
+ − 8536 zero-length extents, which are barred from existing because there is
+ − 8537 no sensible way to define their properties. Deletion of the text in
+ − 8538 an open-open extent causes it to be converted into a closed-open
+ − 8539 extent.) Zero-length extents are primarily used to represent
+ − 8540 annotations, and behave as follows:
+ − 8541
+ − 8542 @enumerate
+ − 8543 @item
+ − 8544 Insertion at the position of a zero-length extent expands the extent
+ − 8545 if both endpoints are closed; goes after the extent if it is closed-open;
+ − 8546 and goes before the extent if it is open-closed.
+ − 8547
+ − 8548 @item
+ − 8549 Deletion of a character on a side of a zero-length extent whose
+ − 8550 corresponding endpoint is closed causes the extent to be detached if
+ − 8551 it is detachable; if the extent is not detachable or the corresponding
+ − 8552 endpoint is open, the extent remains in the buffer, moving as necessary.
+ − 8553 @end enumerate
+ − 8554
+ − 8555 Note that closed-open, non-detachable zero-length extents behave
+ − 8556 exactly like markers and that open-closed, non-detachable zero-length
+ − 8557 extents behave like the ``point-type'' marker in Mule.
+ − 8558
442
+ − 8559 @node Mathematics of Extent Ordering, Extent Fragments, Zero-Length Extents, Extents
428
+ − 8560 @section Mathematics of Extent Ordering
+ − 8561 @cindex extent mathematics
+ − 8562 @cindex mathematics of extents
+ − 8563 @cindex extent ordering
+ − 8564
+ − 8565 @cindex display order of extents
+ − 8566 @cindex extents, display order
+ − 8567 The extents in a buffer are ordered by ``display order'' because that
+ − 8568 is that order that the redisplay mechanism needs to process them in.
+ − 8569 The e-order is an auxiliary ordering used to facilitate operations
+ − 8570 over extents. The operations that can be performed on the ordered
+ − 8571 list of extents in a buffer are
+ − 8572
+ − 8573 @enumerate
+ − 8574 @item
+ − 8575 Locate where an extent would go if inserted into the list.
+ − 8576 @item
+ − 8577 Insert an extent into the list.
+ − 8578 @item
+ − 8579 Remove an extent from the list.
+ − 8580 @item
+ − 8581 Map over all the extents that overlap a range.
+ − 8582 @end enumerate
+ − 8583
+ − 8584 (4) requires being able to determine the first and last extents
+ − 8585 that overlap a range.
+ − 8586
+ − 8587 NOTE: @dfn{overlap} is used as follows:
+ − 8588
+ − 8589 @itemize @bullet
+ − 8590 @item
+ − 8591 two ranges overlap if they have at least one point in common.
+ − 8592 Whether the endpoints are open or closed makes a difference here.
+ − 8593 @item
+ − 8594 a point overlaps a range if the point is contained within the
+ − 8595 range; this is equivalent to treating a point @math{P} as the range
+ − 8596 @math{[P, P]}.
+ − 8597 @item
+ − 8598 In the case of an @emph{extent} overlapping a point or range, the extent
+ − 8599 is normally treated as having closed endpoints. This applies
+ − 8600 consistently in the discussion of stacks of extents and such below.
+ − 8601 Note that this definition of overlap is not necessarily consistent with
+ − 8602 the extents that @code{map-extents} maps over, since @code{map-extents}
+ − 8603 sometimes pays attention to whether the endpoints of an extents are open
+ − 8604 or closed. But for our purposes, it greatly simplifies things to treat
+ − 8605 all extents as having closed endpoints.
+ − 8606 @end itemize
+ − 8607
+ − 8608 First, define @math{>}, @math{<}, @math{<=}, etc. as applied to extents
+ − 8609 to mean comparison according to the display order. Comparison between
+ − 8610 an extent @math{E} and an index @math{I} means comparison between
+ − 8611 @math{E} and the range @math{[I, I]}.
+ − 8612
+ − 8613 Also define @math{e>}, @math{e<}, @math{e<=}, etc. to mean comparison
+ − 8614 according to the e-order.
+ − 8615
+ − 8616 For any range @math{R}, define @math{R(0)} to be the starting index of
+ − 8617 the range and @math{R(1)} to be the ending index of the range.
+ − 8618
+ − 8619 For any extent @math{E}, define @math{E(next)} to be the extent directly
+ − 8620 following @math{E}, and @math{E(prev)} to be the extent directly
+ − 8621 preceding @math{E}. Assume @math{E(next)} and @math{E(prev)} can be
+ − 8622 determined from @math{E} in constant time. (This is because we store
+ − 8623 the extent list as a doubly linked list.)
+ − 8624
+ − 8625 Similarly, define @math{E(e-next)} and @math{E(e-prev)} to be the
+ − 8626 extents directly following and preceding @math{E} in the e-order.
+ − 8627
+ − 8628 Now:
+ − 8629
+ − 8630 Let @math{R} be a range.
+ − 8631 Let @math{F} be the first extent overlapping @math{R}.
+ − 8632 Let @math{L} be the last extent overlapping @math{R}.
+ − 8633
+ − 8634 Theorem 1: @math{R(1)} lies between @math{L} and @math{L(next)},
+ − 8635 i.e. @math{L <= R(1) < L(next)}.
+ − 8636
+ − 8637 This follows easily from the definition of display order. The
+ − 8638 basic reason that this theorem applies is that the display order
+ − 8639 sorts by increasing starting index.
+ − 8640
+ − 8641 Therefore, we can determine @math{L} just by looking at where we would
+ − 8642 insert @math{R(1)} into the list, and if we know @math{F} and are moving
+ − 8643 forward over extents, we can easily determine when we've hit @math{L} by
+ − 8644 comparing the extent we're at to @math{R(1)}.
+ − 8645
+ − 8646 @example
+ − 8647 Theorem 2: @math{F(e-prev) e< [1, R(0)] e<= F}.
+ − 8648 @end example
+ − 8649
+ − 8650 This is the analog of Theorem 1, and applies because the e-order
+ − 8651 sorts by increasing ending index.
+ − 8652
+ − 8653 Therefore, @math{F} can be found in the same amount of time as
+ − 8654 operation (1), i.e. the time that it takes to locate where an extent
+ − 8655 would go if inserted into the e-order list.
+ − 8656
+ − 8657 If the lists were stored as balanced binary trees, then operation (1)
+ − 8658 would take logarithmic time, which is usually quite fast. However,
+ − 8659 currently they're stored as simple doubly-linked lists, and instead we
+ − 8660 do some caching to try to speed things up.
+ − 8661
+ − 8662 Define a @dfn{stack of extents} (or @dfn{SOE}) as the set of extents
+ − 8663 (ordered in the display order) that overlap an index @math{I}, together
+ − 8664 with the SOE's @dfn{previous} extent, which is an extent that precedes
+ − 8665 @math{I} in the e-order. (Hopefully there will not be very many extents
+ − 8666 between @math{I} and the previous extent.)
+ − 8667
+ − 8668 Now:
+ − 8669
+ − 8670 Let @math{I} be an index, let @math{S} be the stack of extents on
+ − 8671 @math{I}, let @math{F} be the first extent in @math{S}, and let @math{P}
+ − 8672 be @math{S}'s previous extent.
+ − 8673
+ − 8674 Theorem 3: The first extent in @math{S} is the first extent that overlaps
+ − 8675 any range @math{[I, J]}.
+ − 8676
+ − 8677 Proof: Any extent that overlaps @math{[I, J]} but does not include
+ − 8678 @math{I} must have a start index @math{> I}, and thus be greater than
+ − 8679 any extent in @math{S}.
+ − 8680
+ − 8681 Therefore, finding the first extent that overlaps a range @math{R} is
+ − 8682 the same as finding the first extent that overlaps @math{R(0)}.
+ − 8683
+ − 8684 Theorem 4: Let @math{I2} be an index such that @math{I2 > I}, and let
+ − 8685 @math{F2} be the first extent that overlaps @math{I2}. Then, either
+ − 8686 @math{F2} is in @math{S} or @math{F2} is greater than any extent in
+ − 8687 @math{S}.
+ − 8688
+ − 8689 Proof: If @math{F2} does not include @math{I} then its start index is
+ − 8690 greater than @math{I} and thus it is greater than any extent in
+ − 8691 @math{S}, including @math{F}. Otherwise, @math{F2} includes @math{I}
+ − 8692 and thus is in @math{S}, and thus @math{F2 >= F}.
+ − 8693
442
+ − 8694 @node Extent Fragments, , Mathematics of Extent Ordering, Extents
428
+ − 8695 @section Extent Fragments
+ − 8696 @cindex extent fragment
+ − 8697
+ − 8698 Imagine that the buffer is divided up into contiguous, non-overlapping
+ − 8699 @dfn{runs} of text such that no extent starts or ends within a run
+ − 8700 (extents that abut the run don't count).
+ − 8701
+ − 8702 An extent fragment is a structure that holds data about the run that
+ − 8703 contains a particular buffer position (if the buffer position is at the
440
+ − 8704 junction of two runs, the run after the position is used)---the
428
+ − 8705 beginning and end of the run, a list of all of the extents in that run,
+ − 8706 the @dfn{merged face} that results from merging all of the faces
+ − 8707 corresponding to those extents, the begin and end glyphs at the
+ − 8708 beginning of the run, etc. This is the information that redisplay needs
+ − 8709 in order to display this run.
+ − 8710
+ − 8711 Extent fragments have to be very quick to update to a new buffer
+ − 8712 position when moving linearly through the buffer. They rely on the
+ − 8713 stack-of-extents code, which does the heavy-duty algorithmic work of
+ − 8714 determining which extents overly a particular position.
+ − 8715
+ − 8716 @node Faces, Glyphs, Extents, Top
+ − 8717 @chapter Faces
+ − 8718
+ − 8719 Not yet documented.
+ − 8720
+ − 8721 @node Glyphs, Specifiers, Faces, Top
+ − 8722 @chapter Glyphs
+ − 8723
+ − 8724 Glyphs are graphical elements that can be displayed in XEmacs buffers or
+ − 8725 gutters. We use the term graphical element here in the broadest possible
446
+ − 8726 sense since glyphs can be as mundane as text or as arcane as a native
428
+ − 8727 tab widget.
+ − 8728
+ − 8729 In XEmacs, glyphs represent the uninstantiated state of graphical
+ − 8730 elements, i.e. they hold all the information necessary to produce an
446
+ − 8731 image on-screen but the image need not exist at this stage, and multiple
+ − 8732 screen images can be instantiated from a single glyph.
428
+ − 8733
+ − 8734 Glyphs are lazily instantiated by calling one of the glyph
+ − 8735 functions. This usually occurs within redisplay when
+ − 8736 @code{Fglyph_height} is called. Instantiation causes an image-instance
+ − 8737 to be created and cached. This cache is on a device basis for all glyphs
+ − 8738 except glyph-widgets, and on a window basis for glyph widgets. The
+ − 8739 caching is done by @code{image_instantiate} and is necessary because it
+ − 8740 is generally possible to display an image-instance in multiple
+ − 8741 domains. For instance if we create a Pixmap, we can actually display
+ − 8742 this on multiple windows - even though we only need a single Pixmap
+ − 8743 instance to do this. If caching wasn't done then it would be necessary
442
+ − 8744 to create image-instances for every displayable occurrence of a glyph -
428
+ − 8745 and every usage - and this would be extremely memory and cpu intensive.
+ − 8746
+ − 8747 Widget-glyphs (a.k.a native widgets) are not cached in this way. This is
+ − 8748 because widget-glyph image-instances on screen are toolkit windows, and
+ − 8749 thus cannot be reused in multiple XEmacs domains. Thus widget-glyphs are
446
+ − 8750 cached on an XEmacs window basis.
428
+ − 8751
+ − 8752 Any action on a glyph first consults the cache before actually
+ − 8753 instantiating a widget.
+ − 8754
440
+ − 8755 @section Widget-Glyphs in the MS-Windows Environment
428
+ − 8756
+ − 8757 To Do
+ − 8758
+ − 8759 @section Widget-Glyphs in the X Environment
+ − 8760
446
+ − 8761 Widget-glyphs under X make heavy use of lwlib (@pxref{Lucid Widget
+ − 8762 Library}) for manipulating the native toolkit objects. This is primarily
+ − 8763 so that different toolkits can be supported for widget-glyphs, just as
+ − 8764 they are supported for features such as menubars etc.
428
+ − 8765
+ − 8766 @node Specifiers, Menus, Glyphs, Top
+ − 8767 @chapter Specifiers
+ − 8768
+ − 8769 Not yet documented.
+ − 8770
+ − 8771 @node Menus, Subprocesses, Specifiers, Top
+ − 8772 @chapter Menus
+ − 8773
+ − 8774 A menu is set by setting the value of the variable
+ − 8775 @code{current-menubar} (which may be buffer-local) and then calling
+ − 8776 @code{set-menubar-dirty-flag} to signal a change. This will cause the
+ − 8777 menu to be redrawn at the next redisplay. The format of the data in
+ − 8778 @code{current-menubar} is described in @file{menubar.c}.
+ − 8779
+ − 8780 Internally the data in current-menubar is parsed into a tree of
+ − 8781 @code{widget_value's} (defined in @file{lwlib.h}); this is accomplished
+ − 8782 by the recursive function @code{menu_item_descriptor_to_widget_value()},
+ − 8783 called by @code{compute_menubar_data()}. Such a tree is deallocated
+ − 8784 using @code{free_widget_value()}.
+ − 8785
+ − 8786 @code{update_screen_menubars()} is one of the external entry points.
+ − 8787 This checks to see, for each screen, if that screen's menubar needs to
+ − 8788 be updated. This is the case if
+ − 8789
+ − 8790 @enumerate
+ − 8791 @item
+ − 8792 @code{set-menubar-dirty-flag} was called since the last redisplay. (This
+ − 8793 function sets the C variable menubar_has_changed.)
+ − 8794 @item
+ − 8795 The buffer displayed in the screen has changed.
+ − 8796 @item
+ − 8797 The screen has no menubar currently displayed.
+ − 8798 @end enumerate
+ − 8799
+ − 8800 @code{set_screen_menubar()} is called for each such screen. This
+ − 8801 function calls @code{compute_menubar_data()} to create the tree of
+ − 8802 widget_value's, then calls @code{lw_create_widget()},
+ − 8803 @code{lw_modify_all_widgets()}, and/or @code{lw_destroy_all_widgets()}
+ − 8804 to create the X-Toolkit widget associated with the menu.
+ − 8805
+ − 8806 @code{update_psheets()}, the other external entry point, actually
+ − 8807 changes the menus being displayed. It uses the widgets fixed by
+ − 8808 @code{update_screen_menubars()} and calls various X functions to ensure
+ − 8809 that the menus are displayed properly.
+ − 8810
+ − 8811 The menubar widget is set up so that @code{pre_activate_callback()} is
+ − 8812 called when the menu is first selected (i.e. mouse button goes down),
+ − 8813 and @code{menubar_selection_callback()} is called when an item is
+ − 8814 selected. @code{pre_activate_callback()} calls the function in
+ − 8815 activate-menubar-hook, which can change the menubar (this is described
+ − 8816 in @file{menubar.c}). If the menubar is changed,
+ − 8817 @code{set_screen_menubars()} is called.
+ − 8818 @code{menubar_selection_callback()} enqueues a menu event, putting in it
+ − 8819 a function to call (either @code{eval} or @code{call-interactively}) and
+ − 8820 its argument, which is the callback function or form given in the menu's
+ − 8821 description.
+ − 8822
446
+ − 8823 @node Subprocesses, Interface to the X Window System, Menus, Top
428
+ − 8824 @chapter Subprocesses
+ − 8825
+ − 8826 The fields of a process are:
+ − 8827
+ − 8828 @table @code
+ − 8829 @item name
+ − 8830 A string, the name of the process.
+ − 8831
+ − 8832 @item command
+ − 8833 A list containing the command arguments that were used to start this
+ − 8834 process.
+ − 8835
+ − 8836 @item filter
+ − 8837 A function used to accept output from the process instead of a buffer,
+ − 8838 or @code{nil}.
+ − 8839
+ − 8840 @item sentinel
+ − 8841 A function called whenever the process receives a signal, or @code{nil}.
+ − 8842
+ − 8843 @item buffer
+ − 8844 The associated buffer of the process.
+ − 8845
+ − 8846 @item pid
+ − 8847 An integer, the Unix process @sc{id}.
+ − 8848
+ − 8849 @item childp
+ − 8850 A flag, non-@code{nil} if this is really a child process.
+ − 8851 It is @code{nil} for a network connection.
+ − 8852
+ − 8853 @item mark
+ − 8854 A marker indicating the position of the end of the last output from this
+ − 8855 process inserted into the buffer. This is often but not always the end
+ − 8856 of the buffer.
+ − 8857
+ − 8858 @item kill_without_query
+ − 8859 If this is non-@code{nil}, killing XEmacs while this process is still
+ − 8860 running does not ask for confirmation about killing the process.
+ − 8861
+ − 8862 @item raw_status_low
+ − 8863 @itemx raw_status_high
+ − 8864 These two fields record 16 bits each of the process status returned by
+ − 8865 the @code{wait} system call.
+ − 8866
+ − 8867 @item status
+ − 8868 The process status, as @code{process-status} should return it.
+ − 8869
+ − 8870 @item tick
+ − 8871 @itemx update_tick
+ − 8872 If these two fields are not equal, a change in the status of the process
+ − 8873 needs to be reported, either by running the sentinel or by inserting a
+ − 8874 message in the process buffer.
+ − 8875
+ − 8876 @item pty_flag
+ − 8877 Non-@code{nil} if communication with the subprocess uses a @sc{pty};
+ − 8878 @code{nil} if it uses a pipe.
+ − 8879
+ − 8880 @item infd
+ − 8881 The file descriptor for input from the process.
+ − 8882
+ − 8883 @item outfd
+ − 8884 The file descriptor for output to the process.
+ − 8885
+ − 8886 @item subtty
+ − 8887 The file descriptor for the terminal that the subprocess is using. (On
+ − 8888 some systems, there is no need to record this, so the value is
+ − 8889 @code{-1}.)
+ − 8890
+ − 8891 @item tty_name
+ − 8892 The name of the terminal that the subprocess is using,
+ − 8893 or @code{nil} if it is using pipes.
+ − 8894 @end table
+ − 8895
446
+ − 8896 @node Interface to the X Window System, Index, Subprocesses, Top
+ − 8897 @chapter Interface to the X Window System
+ − 8898
+ − 8899 Mostly undocumented.
+ − 8900
+ − 8901 @menu
+ − 8902 * Lucid Widget Library:: An interface to various widget sets.
+ − 8903 @end menu
+ − 8904
+ − 8905 @node Lucid Widget Library, , , Interface to the X Window System
+ − 8906 @section Lucid Widget Library
+ − 8907
+ − 8908 Lwlib is extremely poorly documented and quite hairy. The author(s)
+ − 8909 blame that on X, Xt, and Motif, with some justice, but also sufficient
+ − 8910 hypocrisy to avoid drawing the obvious conclusion about their own work.
+ − 8911
+ − 8912 The Lucid Widget Library is composed of two more or less independent
+ − 8913 pieces. The first, as the name suggests, is a set of widgets. These
+ − 8914 widgets are intended to resemble and improve on widgets provided in the
+ − 8915 Motif toolkit but not in the Athena widgets, including menubars and
+ − 8916 scrollbars. Recent additions by Andy Piper integrate some ``modern''
+ − 8917 widgets by Edward Falk, including checkboxes, radio buttons, progress
+ − 8918 gauges, and index tab controls (aka notebooks).
+ − 8919
+ − 8920 The second piece of the Lucid widget library is a generic interface to
+ − 8921 several toolkits for X (including Xt, the Athena widget set, and Motif,
+ − 8922 as well as the Lucid widgets themselves) so that core XEmacs code need
+ − 8923 not know which widget set has been used to build the graphical user
+ − 8924 interface.
+ − 8925
+ − 8926 @menu
+ − 8927 * Generic Widget Interface:: The lwlib generic widget interface.
+ − 8928 * Scrollbars::
+ − 8929 * Menubars::
+ − 8930 * Checkboxes and Radio Buttons::
+ − 8931 * Progress Bars::
+ − 8932 * Tab Controls::
+ − 8933 @end menu
+ − 8934
+ − 8935 @node Generic Widget Interface, Scrollbars, , Lucid Widget Library
+ − 8936 @subsection Generic Widget Interface
+ − 8937
+ − 8938 In general in any toolkit a widget may be a composite object. In Xt,
+ − 8939 all widgets have an X window that they manage, but typically a complex
+ − 8940 widget will have widget children, each of which manages a subwindow of
+ − 8941 the parent widget's X window. These children may themselves be
+ − 8942 composite widgets. Thus a widget is actually a tree or hierarchy of
+ − 8943 widgets.
+ − 8944
+ − 8945 For each toolkit widget, lwlib maintains a tree of @code{widget_values}
+ − 8946 which mirror the hierarchical state of Xt widgets (including Motif,
+ − 8947 Athena, 3D Athena, and Falk's widget sets). Each @code{widget_value}
+ − 8948 has @code{contents} member, which points to the head of a linked list of
+ − 8949 its children. The linked list of siblings is chained through the
+ − 8950 @code{next} member of @code{widget_value}.
+ − 8951
+ − 8952 @example
+ − 8953 +-----------+
+ − 8954 | composite |
+ − 8955 +-----------+
+ − 8956 |
+ − 8957 | contents
+ − 8958 V
+ − 8959 +-------+ next +-------+ next +-------+
+ − 8960 | child |----->| child |----->| child |
+ − 8961 +-------+ +-------+ +-------+
+ − 8962 |
+ − 8963 | contents
+ − 8964 V
+ − 8965 +-------------+ next +-------------+
+ − 8966 | grand child |----->| grand child |
+ − 8967 +-------------+ +-------------+
+ − 8968
+ − 8969 The @code{widget_value} hierarchy of a composite widget with two simple
+ − 8970 children and one composite child.
+ − 8971 @end example
+ − 8972
+ − 8973 The @code{widget_instance} structure maintains the inverse view of the
+ − 8974 tree. As for the @code{widget_value}, siblings are chained through the
+ − 8975 @code{next} member. However, rather than naming children, the
+ − 8976 @code{widget_instance} tree links to parents.
+ − 8977
+ − 8978 @example
+ − 8979 +-----------+
+ − 8980 | composite |
+ − 8981 +-----------+
+ − 8982 A
+ − 8983 | parent
+ − 8984 |
+ − 8985 +-------+ next +-------+ next +-------+
+ − 8986 | child |----->| child |----->| child |
+ − 8987 +-------+ +-------+ +-------+
+ − 8988 A
+ − 8989 | parent
+ − 8990 |
+ − 8991 +-------------+ next +-------------+
+ − 8992 | grand child |----->| grand child |
+ − 8993 +-------------+ +-------------+
+ − 8994
+ − 8995 The @code{widget_value} hierarchy of a composite widget with two simple
+ − 8996 children and one composite child.
+ − 8997 @end example
+ − 8998
+ − 8999 This permits widgets derived from different toolkits to be updated and
+ − 9000 manipulated generically by the lwlib library. For instance
+ − 9001 @code{update_one_widget_instance} can cope with multiple types of widget
+ − 9002 and multiple types of toolkit. Each element in the widget hierarchy is
+ − 9003 updated from its corresponding @code{widget_value} by walking the
+ − 9004 @code{widget_value} tree. This has desirable properties. For example,
+ − 9005 @code{lw_modify_all_widgets} is called from @file{glyphs-x.c} and
+ − 9006 updates all the properties of a widget without having to know what the
+ − 9007 widget is or what toolkit it is from. Unfortunately this also has its
+ − 9008 hairy properties; the lwlib code quite complex. And of course lwlib has
+ − 9009 to know at some level what the widget is and how to set its properties.
+ − 9010
+ − 9011 The @code{widget_instance} structure also contains a pointer to the root
+ − 9012 of its tree. Widget instances are further confi
+ − 9013
+ − 9014
+ − 9015 @node Scrollbars, Menubars, Generic Widget Interface, Lucid Widget Library
+ − 9016 @subsection Scrollbars
+ − 9017
+ − 9018 @node Menubars, Checkboxes and Radio Buttons, Scrollbars, Lucid Widget Library
+ − 9019 @subsection Menubars
+ − 9020
+ − 9021 @node Checkboxes and Radio Buttons, Progress Bars, Menubars, Lucid Widget Library
+ − 9022 @subsection Checkboxes and Radio Buttons
+ − 9023
+ − 9024 @node Progress Bars, Tab Controls, Checkboxes and Radio Buttons, Lucid Widget Library
+ − 9025 @subsection Progress Bars
+ − 9026
+ − 9027 @node Tab Controls, , Progress Bars, Lucid Widget Library
+ − 9028 @subsection Tab Controls
428
+ − 9029
+ − 9030 @include index.texi
+ − 9031
+ − 9032 @c Print the tables of contents
+ − 9033 @summarycontents
+ − 9034 @contents
+ − 9035 @c That's all
+ − 9036
+ − 9037 @bye