\input texinfo @c -*-texinfo-*-
@c %**start of header
@setfilename ../../info/internals.info
@settitle XEmacs Internals Manual
@c %**end of header

@ifinfo
@dircategory XEmacs Editor
@direntry
* Internals: (internals). XEmacs Internals Manual.
@end direntry

Copyright @copyright{} 1992 - 1996 Ben Wing.
Copyright @copyright{} 1996, 1997 Sun Microsystems.
Copyright @copyright{} 1994 - 1998 Free Software Foundation.
Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois.


Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.

@ignore
Permission is granted to process this file through TeX and print the
results, provided the printed document carries copying permission notice
identical to this one except for the removal of this paragraph (this
paragraph not being relevant to the printed manual).

@end ignore
Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided that the
entire resulting derived work is distributed under the terms of a
permission notice identical to this one.

Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that this permission notice may be stated in a translation
approved by the Foundation.

Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided also that the
section entitled ``GNU General Public License'' is included exactly as
in the original, and provided that the entire resulting derived work is
distributed under the terms of a permission notice identical to this
one.

Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that the section entitled ``GNU General Public License'' may be
included in a translation approved by the Free Software Foundation
instead of in the original English.
@end ifinfo

@c Combine indices.
@synindex cp fn
@syncodeindex vr fn
@syncodeindex ky fn
@syncodeindex pg fn
@syncodeindex tp fn

@setchapternewpage odd
@finalout

@titlepage
@title XEmacs Internals Manual
@subtitle Version 1.4, March 2001

@author Ben Wing
@author Martin Buchholz
@author Hrvoje Niksic
@author Matthias Neubauer
@author Olivier Galibert
@page
@vskip 0pt plus 1fill

@noindent
Copyright @copyright{} 1992 - 1996, 2001 Ben Wing. @*
Copyright @copyright{} 1996, 1997 Sun Microsystems, Inc. @*
Copyright @copyright{} 1994 - 1998 Free Software Foundation. @*
Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois.

@sp 2
Version 1.4 @*
March 2001.@*

Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.

Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided also that the
section entitled ``GNU General Public License'' is included
exactly as in the original, and provided that the entire resulting
derived work is distributed under the terms of a permission notice
identical to this one.

Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that the section entitled ``GNU General Public License'' may be
included in a translation approved by the Free Software Foundation
instead of in the original English.
@end titlepage
@page

@node Top, A History of Emacs, (dir), (dir)

@ifinfo
This Info file contains v1.4 of the XEmacs Internals Manual, March 2001.
@end ifinfo

@menu
* A History of Emacs:: Times, dates, important events.
* XEmacs From the Outside:: A broad conceptual overview.
* The Lisp Language:: An overview.
* XEmacs From the Perspective of Building::
* XEmacs From the Inside::
* The XEmacs Object System (Abstractly Speaking)::
* How Lisp Objects Are Represented in C::
* Major Textual Changes::
* Rules When Writing New C Code::
* Regression Testing XEmacs::
* CVS Techniques::
* A Summary of the Various XEmacs Modules::
* Allocation of Objects in XEmacs Lisp::
* Dumping::
* Events and the Event Loop::
* Evaluation; Stack Frames; Bindings::
* Symbols and Variables::
* Buffers and Textual Representation::
* MULE Character Sets and Encodings::
* The Lisp Reader and Compiler::
* Lstreams::
* Consoles; Devices; Frames; Windows::
* The Redisplay Mechanism::
* Extents::
* Faces::
* Glyphs::
* Specifiers::
* Menus::
* Subprocesses::
* Interface to the X Window System::
* Index::

@detailmenu

--- The Detailed Node Listing ---

A History of Emacs

* Through Version 18:: Unification prevails.
* Lucid Emacs:: One version 19 Emacs.
* GNU Emacs 19:: The other version 19 Emacs.
* GNU Emacs 20:: The other version 20 Emacs.
* XEmacs:: The continuation of Lucid Emacs.

Rules When Writing New C Code

* General Coding Rules::
* Writing Lisp Primitives::
* Writing Good Comments::
* Adding Global Lisp Variables::
* Proper Use of Unsigned Types::
* Coding for Mule::
* Techniques for XEmacs Developers::

Coding for Mule

* Character-Related Data Types::
* Working With Character and Byte Positions::
* Conversion to and from External Data::
* General Guidelines for Writing Mule-Aware Code::
* An Example of Mule-Aware Code::

CVS Techniques

* Merging a Branch into the Trunk::

Regression Testing XEmacs

A Summary of the Various XEmacs Modules

* Low-Level Modules::
* Basic Lisp Modules::
* Modules for Standard Editing Operations::
* Editor-Level Control Flow Modules::
* Modules for the Basic Displayable Lisp Objects::
* Modules for other Display-Related Lisp Objects::
* Modules for the Redisplay Mechanism::
* Modules for Interfacing with the File System::
* Modules for Other Aspects of the Lisp Interpreter and Object System::
* Modules for Interfacing with the Operating System::
* Modules for Interfacing with X Windows::
* Modules for Internationalization::
* Modules for Regression Testing::

Allocation of Objects in XEmacs Lisp

* Introduction to Allocation::
* Garbage Collection::
* GCPROing::
* Garbage Collection - Step by Step::
* Integers and Characters::
* Allocation from Frob Blocks::
* lrecords::
* Low-level allocation::
* Cons::
* Vector::
* Bit Vector::
* Symbol::
* Marker::
* String::
* Compiled Function::

Garbage Collection - Step by Step

* Invocation::
* garbage_collect_1::
* mark_object::
* gc_sweep::
* sweep_lcrecords_1::
* compact_string_chars::
* sweep_strings::
* sweep_bit_vectors_1::

Dumping

* Overview::
* Data descriptions::
* Dumping phase::
* Reloading phase::

Dumping phase

* Object inventory::
* Address allocation::
* The header::
* Data dumping::
* Pointers dumping::

Events and the Event Loop

* Introduction to Events::
* Main Loop::
* Specifics of the Event Gathering Mechanism::
* Specifics About the Emacs Event::
* The Event Stream Callback Routines::
* Other Event Loop Functions::
* Converting Events::
* Dispatching Events; The Command Builder::

Evaluation; Stack Frames; Bindings

* Evaluation::
* Dynamic Binding; The specbinding Stack; Unwind-Protects::
* Simple Special Forms::
* Catch and Throw::

Symbols and Variables

* Introduction to Symbols::
* Obarrays::
* Symbol Values::

Buffers and Textual Representation

* Introduction to Buffers:: A buffer holds a block of text such as a file.
* The Text in a Buffer:: Representation of the text in a buffer.
* Buffer Lists:: Keeping track of all buffers.
* Markers and Extents:: Tagging locations within a buffer.
* Bufbytes and Emchars:: Representation of individual characters.
* The Buffer Object:: The Lisp object corresponding to a buffer.

MULE Character Sets and Encodings

* Character Sets::
* Encodings::
* Internal Mule Encodings::
* CCL::

Encodings

* Japanese EUC (Extended Unix Code)::
* JIS7::

Internal Mule Encodings

* Internal String Encoding::
* Internal Character Encoding::

Lstreams

* Creating an Lstream:: Creating an lstream object.
* Lstream Types:: Different sorts of things that are streamed.
* Lstream Functions:: Functions for working with lstreams.
* Lstream Methods:: Creating new lstream types.

Consoles; Devices; Frames; Windows

* Introduction to Consoles; Devices; Frames; Windows::
* Point::
* Window Hierarchy::
* The Window Object::

The Redisplay Mechanism

* Critical Redisplay Sections::
* Line Start Cache::
* Redisplay Piece by Piece::

Extents

* Introduction to Extents:: Extents are ranges over text, with properties.
* Extent Ordering:: How extents are ordered internally.
* Format of the Extent Info:: The extent information in a buffer or string.
* Zero-Length Extents:: A weird special case.
* Mathematics of Extent Ordering:: A rigorous foundation.
* Extent Fragments:: Cached information useful for redisplay.

@end detailmenu
@end menu

@node A History of Emacs, XEmacs From the Outside, Top, Top
@chapter A History of Emacs
@cindex history of Emacs, a
@cindex Emacs, a history of
@cindex Hackers (Steven Levy)
@cindex Levy, Steven
@cindex ITS (Incompatible Timesharing System)
@cindex Stallman, Richard
@cindex RMS
@cindex MIT
@cindex TECO
@cindex FSF
@cindex Free Software Foundation

XEmacs is a powerful, customizable text editor and development
environment. It began as Lucid Emacs, which was in turn derived from
GNU Emacs, a program written by Richard Stallman of the Free Software
Foundation. GNU Emacs dates back to the 1970's, and was modelled
after a package called ``Emacs'', written in 1976, that was a set of
macros on top of TECO, an old, old text editor written at MIT on the
DEC PDP-10 under one of the earliest time-sharing operating systems,
ITS (Incompatible Timesharing System). (ITS dates back well before
Unix.) ITS, TECO, and Emacs were products of a group of people at MIT
who called themselves ``hackers'', who shared an idealistic belief
system about the free exchange of information and were fanatical in
their devotion to and time spent with computers. (The hacker
subculture dates back to the late 1950's at MIT and is described in
detail in Steven Levy's book @cite{Hackers}. This book also includes
a lot of information about Stallman himself and the development of
Lisp, a programming language developed at MIT that underlies Emacs.)

@menu
* Through Version 18:: Unification prevails.
* Lucid Emacs:: One version 19 Emacs.
* GNU Emacs 19:: The other version 19 Emacs.
* GNU Emacs 20:: The other version 20 Emacs.
* XEmacs:: The continuation of Lucid Emacs.
@end menu

@node Through Version 18
@section Through Version 18
@cindex version 18, through
@cindex Gosling, James
@cindex Great Usenet Renaming

Although the history of the early versions of GNU Emacs is unclear,
the history is well known from the middle of 1985. A time line is:

@itemize @bullet
@item
GNU Emacs version 15 (15.34) was released sometime in 1984 or 1985 and
shared some code with a version of Emacs written by James Gosling (the
same James Gosling who later created the Java language).
@item
GNU Emacs version 16 (first released version was 16.56) was released on
July 15, 1985. All Gosling code was removed due to potential copyright
problems with the code.
@item
version 16.57: released on September 16, 1985.
@item
versions 16.58, 16.59: released on September 17, 1985.
@item
version 16.60: released on September 19, 1985. These later version 16's
incorporated patches from the net, especially for getting Emacs to work
under System V.
@item
version 17.36 (first official v17 release) released on December 20,
1985. Included a TeX-able user manual. First official unpatched
version that worked on vanilla System V machines.
@item
version 17.43 (second official v17 release) released on January 25,
1986.
@item
version 17.45 released on January 30, 1986.
@item
version 17.46 released on February 4, 1986.
@item
version 17.48 released on February 10, 1986.
@item
version 17.49 released on February 12, 1986.
@item
version 17.55 released on March 18, 1986.
@item
version 17.57 released on March 27, 1986.
@item
version 17.58 released on April 4, 1986.
@item
version 17.61 released on April 12, 1986.
@item
version 17.63 released on May 7, 1986.
@item
version 17.64 released on May 12, 1986.
@item
version 18.24 (a beta version) released on October 2, 1986.
@item
version 18.30 (a beta version) released on November 15, 1986.
@item
version 18.31 (a beta version) released on November 23, 1986.
@item
version 18.32 (a beta version) released on December 7, 1986.
@item
version 18.33 (a beta version) released on December 12, 1986.
@item
version 18.35 (a beta version) released on January 5, 1987.
@item
version 18.36 (a beta version) released on January 21, 1987.
@item
January 27, 1987: The Great Usenet Renaming. net.emacs is now
comp.emacs.
@item
version 18.37 (a beta version) released on February 12, 1987.
@item
version 18.38 (a beta version) released on March 3, 1987.
@item
version 18.39 (a beta version) released on March 14, 1987.
@item
version 18.40 (a beta version) released on March 18, 1987.
@item
version 18.41 (the first ``official'' release) released on March 22,
1987.
@item
version 18.45 released on June 2, 1987.
@item
version 18.46 released on June 9, 1987.
@item
version 18.47 released on June 18, 1987.
@item
version 18.48 released on September 3, 1987.
@item
version 18.49 released on September 18, 1987.
@item
version 18.50 released on February 13, 1988.
@item
version 18.51 released on May 7, 1988.
@item
version 18.52 released on September 1, 1988.
@item
version 18.53 released on February 24, 1989.
@item
version 18.54 released on April 26, 1989.
@item
version 18.55 released on August 23, 1989. This is the earliest version
that is still available by FTP.
@item
version 18.56 released on January 17, 1991.
@item
version 18.57 released late January, 1991.
@item
version 18.58 released ?????.
@item
version 18.59 released October 31, 1992.
@end itemize

@node Lucid Emacs
@section Lucid Emacs
@cindex Lucid Emacs
@cindex Lucid Inc.
@cindex Energize
@cindex Epoch

Lucid Emacs was developed by the (now-defunct) Lucid Inc., a maker of
C++ and Lisp development environments. It began when Lucid decided they
wanted to use Emacs as the editor and cornerstone of their C++
development environment (called ``Energize''). They needed many features
that were not available in the existing version of GNU Emacs (version
18.5something), in particular good and integrated support for GUI
elements such as mouse support, multiple fonts, multiple window-system
windows, etc. A branch of GNU Emacs called Epoch, written at the
University of Illinois, existed that supplied many of these features;
however, Lucid needed more than what existed in Epoch. At the time, the
Free Software Foundation was working on version 19 of Emacs (this was
sometime around 1991), which was planned to have similar features, and
so Lucid decided to work with the Free Software Foundation. Their plan
was to add features that they needed, and coordinate with the FSF so
that the features would get included back into Emacs version 19.

However, the release of version 19 was delayed (it finally appeared
more than a year later than initially planned), and Lucid encountered
unexpected technical resistance in getting their changes merged back
into version 19, so they decided to release their own version of Emacs,
which became Lucid Emacs 19.0.

@cindex Zawinski, Jamie
@cindex Sexton, Harlan
@cindex Benson, Eric
@cindex Devin, Matthieu
The initial authors of Lucid Emacs were Matthieu Devin, Harlan Sexton,
and Eric Benson, and the work was later taken over by Jamie Zawinski,
who became ``Mr. Lucid Emacs'' for many releases.

A time line for Lucid Emacs is:

@itemize @bullet
@item
version 19.0 shipped with Energize 1.0, April 1992.
@item
version 19.1 released June 4, 1992.
@item
version 19.2 released June 19, 1992.
@item
version 19.3 released September 9, 1992.
@item
version 19.4 released January 21, 1993.
@item
version 19.5 was a repackaging of 19.4 with a few bug fixes and
shipped with Energize 2.0. Never released to the net.
@item
version 19.6 released April 9, 1993.
@item
version 19.7 was a repackaging of 19.6 with a few bug fixes and
shipped with Energize 2.1. Never released to the net.
@item
version 19.8 released September 6, 1993.
@item
version 19.9 released January 12, 1994.
@item
version 19.10 released May 27, 1994.
@end itemize

@node GNU Emacs 19
@section GNU Emacs 19
@cindex GNU Emacs 19
@cindex Emacs 19, GNU
@cindex version 19, GNU Emacs
@cindex FSF Emacs

About a year after the initial release of Lucid Emacs, the FSF
released a beta of their version of Emacs 19 (referred to here as ``GNU
Emacs''). By this time, the current version of Lucid Emacs was
19.6. (Strangely, the first released beta from the FSF was GNU Emacs
19.7.) A time line for GNU Emacs version 19 is:

@itemize @bullet
@item
version 19.8 (beta) released May 27, 1993.
@item
version 19.9 (beta) released May 27, 1993.
@item
version 19.10 (beta) released May 30, 1993.
@item
version 19.11 (beta) released June 1, 1993.
@item
version 19.12 (beta) released June 2, 1993.
@item
version 19.13 (beta) released June 8, 1993.
@item
version 19.14 (beta) released June 17, 1993.
@item
version 19.15 (beta) released June 19, 1993.
@item
version 19.16 (beta) released July 6, 1993.
@item
version 19.17 (beta) released late July, 1993.
@item
version 19.18 (beta) released August 9, 1993.
@item
version 19.19 (beta) released August 15, 1993.
@item
version 19.20 (beta) released November 17, 1993.
@item
version 19.21 (beta) released November 17, 1993.
@item
version 19.22 (beta) released November 28, 1993.
@item
version 19.23 (beta) released May 17, 1994.
@item
version 19.24 (beta) released May 16, 1994.
@item
version 19.25 (beta) released June 3, 1994.
@item
version 19.26 (beta) released September 11, 1994.
@item
version 19.27 (beta) released September 14, 1994.
@item
version 19.28 (first ``official'' release) released November 1, 1994.
@item
version 19.29 released June 21, 1995.
@item
version 19.30 released November 24, 1995.
@item
version 19.31 released May 25, 1996.
@item
version 19.32 released July 31, 1996.
@item
version 19.33 released August 11, 1996.
@item
version 19.34 released August 21, 1996.
@item
version 19.34b released September 6, 1996.
@end itemize

@cindex Mlynarik, Richard
In some ways, GNU Emacs 19 was better than Lucid Emacs; in some ways,
worse. Lucid soon began incorporating features from GNU Emacs 19 into
Lucid Emacs; the work was mostly done by Richard Mlynarik, who had been
working on and using GNU Emacs for a long time (back as far as version
16 or 17).

@node GNU Emacs 20
@section GNU Emacs 20
@cindex GNU Emacs 20
@cindex Emacs 20, GNU
@cindex version 20, GNU Emacs
@cindex FSF Emacs

On February 2, 1997, work began on GNU Emacs to integrate Mule. The first
release was made in September of that year.

A time line for Emacs 20 is:

@itemize @bullet
@item
version 20.1 released September 17, 1997.
@item
version 20.2 released September 20, 1997.
@item
version 20.3 released August 19, 1998.
@end itemize

@node XEmacs
@section XEmacs
@cindex XEmacs

@cindex Sun Microsystems
@cindex University of Illinois
@cindex Illinois, University of
@cindex SPARCWorks
@cindex Andreessen, Marc
@cindex Baur, Steve
@cindex Buchholz, Martin
@cindex Kaplan, Simon
@cindex Wing, Ben
@cindex Thompson, Chuck
@cindex Win-Emacs
@cindex Epoch
@cindex Amdahl Corporation
Around the time that Lucid was developing Energize, Sun Microsystems
was developing their own development environment (called ``SPARCWorks'')
and also decided to use Emacs. They joined forces with the Epoch team
at the University of Illinois and later with Lucid. The maintainer of
the last-released version of Epoch was Marc Andreessen, but he dropped
out, and the Epoch project, headed by Simon Kaplan, lured Chuck Thompson
away from a system administration job to become the primary Lucid Emacs
author for Epoch and Sun. Chuck's area of specialty became the
redisplay engine (he replaced the old Lucid Emacs redisplay engine with
a ported version from Epoch and then later rewrote it from scratch).
Sun also hired Ben Wing (the author of Win-Emacs, a port of Lucid Emacs
to Microsoft Windows 3.1) in 1993, for what was initially a one-month
contract to fix some event problems but later became a many-year
involvement, punctuated by a six-month contract with Amdahl Corporation.

@cindex rename to XEmacs
In 1994, Sun and Lucid agreed to rename Lucid Emacs to XEmacs (a name
not favorable to either company); the first release called XEmacs was
version 19.11. In June 1994, Lucid folded and Jamie quit to work for
the newly formed Mosaic Communications Corp., later Netscape
Communications Corp. (co-founded by the same Marc Andreessen, who had
quit his Epoch job to work on a graphical browser for the World Wide
Web). Chuck then became the primary maintainer of XEmacs, and put out
versions 19.11 through 19.14 in conjunction with Ben. For 19.12 and
19.13, Chuck added the new redisplay and many other display improvements
and Ben added MULE support (support for Asian and other languages) and
redesigned most of the internal Lisp subsystems to better support the
MULE work and the various other features being added to XEmacs. After
19.14 Chuck retired as primary maintainer and Steve Baur stepped in.

@cindex MULE merged XEmacs appears
Soon after 19.13 was released, work began in earnest on the MULE
internationalization code and the source tree was divided into two
development paths. The MULE version was initially called 19.20, but was
soon renamed to 20.0. In 1996 Martin Buchholz of Sun Microsystems took
over the care and feeding of it and worked on it in parallel with the
19.14 development that was occurring at the same time. After much work
by Martin, it was decided to release 20.0 ahead of 19.15 in February
1997. The source tree remained divided until 20.2 when the version 19
source was finally retired at version 19.16.

@cindex Baur, Steve
@cindex Buchholz, Martin
@cindex Jones, Kyle
@cindex Niksic, Hrvoje
@cindex XEmacs goes it alone
In 1997, Sun finally dropped all pretense of support for XEmacs and
Martin Buchholz left the company in November. Since then (and for most
of the previous year, because Steve Baur was never paid to work on
XEmacs), XEmacs has existed solely on the contributions of volunteers
from the free software community. Starting from 1997, Hrvoje Niksic and
Kyle Jones have figured prominently in XEmacs development.

@cindex merging attempts
Many attempts have been made to merge XEmacs and GNU Emacs, but they
have consistently failed.

A more detailed history is contained in the XEmacs About page.

A time line for XEmacs is:

@itemize @bullet
@item
version 19.11 (first XEmacs) released September 13, 1994.
@item
version 19.12 released June 23, 1995.
@item
version 19.13 released September 1, 1995.
@item
version 19.14 released June 23, 1996.
@item
version 20.0 released February 9, 1997.
@item
version 19.15 released March 28, 1997.
@item
version 20.1 (not released to the net) April 15, 1997.
@item
version 20.2 released May 16, 1997.
@item
version 19.16 released October 31, 1997.
@item
version 20.3 (the first stable version of XEmacs 20.x) released November 30,
1997.
@item
version 20.4 released February 28, 1998.
@item
version 21.0.60 released December 10, 1998. (The version naming scheme was
changed at this point: [a] the second version number is odd for stable
versions, even for beta versions; [b] a third version number is added,
replacing the ``beta xxx'' ending for beta versions and allowing for
periodic maintenance releases for stable versions. Therefore, 21.0 was
never ``officially'' released; similarly for 21.2, etc.)
@item
version 21.0.61 released January 4, 1999.
@item
version 21.0.63 released February 3, 1999.
@item
version 21.0.64 released March 1, 1999.
@item
version 21.0.65 released March 5, 1999.
@item
version 21.0.66 released March 12, 1999.
@item
version 21.0.67 released March 25, 1999.
@item
version 21.1.2 released May 14, 1999. (This is the followup to 21.0.67.
The second version number was bumped to indicate the beginning of the
``stable'' series.)
@item
version 21.1.3 released June 26, 1999.
@item
version 21.1.4 released July 8, 1999.
@item
version 21.1.6 released August 14, 1999. (There was no 21.1.5.)
@item
version 21.1.7 released September 26, 1999.
@item
version 21.1.8 released November 2, 1999.
@item
version 21.1.9 released February 13, 2000.
@item
version 21.1.10 released May 7, 2000.
@item
version 21.1.10a released June 24, 2000.
@item
version 21.1.11 released July 18, 2000.
@item
version 21.1.12 released August 5, 2000.
@item
version 21.1.13 released January 7, 2001.
@item
version 21.1.14 released January 27, 2001.
@item
version 21.2.9 released February 3, 1999.
@item
version 21.2.10 released February 5, 1999.
@item
version 21.2.11 released March 1, 1999.
@item
version 21.2.12 released March 5, 1999.
@item
version 21.2.13 released March 12, 1999.
@item
version 21.2.14 released May 14, 1999.
@item
version 21.2.15 released June 4, 1999.
@item
version 21.2.16 released June 11, 1999.
@item
version 21.2.17 released June 22, 1999.
@item
version 21.2.18 released July 14, 1999.
@item
version 21.2.19 released July 30, 1999.
@item
version 21.2.20 released November 10, 1999.
@item
version 21.2.21 released November 28, 1999.
@item
version 21.2.22 released November 29, 1999.
@item
version 21.2.23 released December 7, 1999.
@item
version 21.2.24 released December 14, 1999.
@item
version 21.2.25 released December 24, 1999.
@item
version 21.2.26 released December 31, 1999.
@item
version 21.2.27 released January 18, 2000.
@item
version 21.2.28 released February 7, 2000.
@item
version 21.2.29 released February 16, 2000.
@item
version 21.2.30 released February 21, 2000.
@item
version 21.2.31 released February 23, 2000.
@item
version 21.2.32 released March 20, 2000.
@item
version 21.2.33 released May 1, 2000.
@item
version 21.2.34 released May 28, 2000.
@item
version 21.2.35 released July 19, 2000.
@item
version 21.2.36 released October 4, 2000.
@item
version 21.2.37 released November 14, 2000.
@item
version 21.2.38 released December 5, 2000.
@item
version 21.2.39 released December 31, 2000.
@item
version 21.2.40 released January 8, 2001.
@item
version 21.2.41 released January 17, 2001.
@item
version 21.2.42 released January 20, 2001.
@item
version 21.2.43 released January 26, 2001.
@item
version 21.2.44 released February 8, 2001.
@item
version 21.2.45 released February 23, 2001.
@item
version 21.2.46 released March 21, 2001.
@end itemize

|
428
|
868 @node XEmacs From the Outside, The Lisp Language, A History of Emacs, Top
|
|
869 @chapter XEmacs From the Outside
|
462
|
870 @cindex XEmacs from the outside
|
|
871 @cindex outside, XEmacs from the
|
428
|
872 @cindex read-eval-print
|
|
873
|
|
874 XEmacs appears to the outside world as an editor, but it is really a
|
|
875 Lisp environment. At its heart is a Lisp interpreter; it also
|
|
876 ``happens'' to contain many specialized object types (e.g. buffers,
|
|
877 windows, frames, events) that are useful for implementing an editor.
|
|
878 Some of these objects (in particular windows and frames) have
|
|
879 displayable representations, and XEmacs provides a function
|
|
880 @code{redisplay()} that ensures that the display of all such objects
|
|
881 matches their internal state. Most of the time, a standard Lisp
|
440
|
882 environment is in a @dfn{read-eval-print} loop---i.e. ``read some Lisp
|
428
|
883 code, execute it, and print the results''. XEmacs has a similar loop:
|
|
884
|
|
885 @itemize @bullet
|
|
886 @item
|
|
887 read an event
|
|
888 @item
|
|
889 dispatch the event (i.e. ``do it'')
|
|
890 @item
|
|
891 redisplay
|
|
892 @end itemize
|
|
893
|
|
894 Reading an event is done using the Lisp function @code{next-event},
|
|
895 which waits for something to happen (typically, the user presses a key
|
|
896 or moves the mouse) and returns an event object describing this.
|
|
897 Dispatching an event is done using the Lisp function
|
|
898 @code{dispatch-event}, which looks up the event in a keymap object (a
|
|
899 particular kind of object that associates an event with a Lisp function)
|
|
900 and calls that function. The function ``does'' what the user has
|
|
901 requested by changing the state of particular frame objects, buffer
|
|
902 objects, etc. Finally, @code{redisplay()} is called, which updates the
|
|
903 display to reflect those changes just made. Thus is an ``editor'' born.
|
|
904
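The three-step loop described above can be sketched in C roughly as
follows.  This is an illustration only, not the actual XEmacs code: the
names @code{toy_editor}, @code{toy_keymap_lookup}, @code{toy_redisplay}
and @code{toy_command_loop} are invented for the example, and ``events''
are simply characters taken from a string rather than input from a
window system.

```c
#include <stddef.h>

/* Hypothetical editor state: a tiny text buffer plus what is "on screen".  */
struct toy_editor {
  char text[64];      /* internal state changed by commands */
  size_t len;
  char screen[64];    /* what redisplay last drew */
  int redisplays;     /* how many times redisplay has run */
};

typedef void (*toy_command) (struct toy_editor *ed, char key);

/* Command: append the typed character to the text.  */
static void cmd_self_insert (struct toy_editor *ed, char key)
{
  if (ed->len + 1 < sizeof ed->text)
    {
      ed->text[ed->len++] = key;
      ed->text[ed->len] = '\0';
    }
}

/* Command: remove the last character, if there is one.  */
static void cmd_delete_backward (struct toy_editor *ed, char key)
{
  (void) key;
  if (ed->len > 0)
    ed->text[--ed->len] = '\0';
}

/* Stand-in for a keymap: associates an event with a command.  */
static toy_command toy_keymap_lookup (char key)
{
  return key == '\b' ? cmd_delete_backward : cmd_self_insert;
}

/* Stand-in for redisplay: make the "screen" match the internal state.  */
static void toy_redisplay (struct toy_editor *ed)
{
  size_t i;
  for (i = 0; i <= ed->len; i++)   /* includes the terminating NUL */
    ed->screen[i] = ed->text[i];
  ed->redisplays++;
}

/* Read an event, dispatch it, redisplay; repeat until input runs out.  */
static void toy_command_loop (struct toy_editor *ed, const char *input)
{
  while (*input != '\0')
    {
      char event = *input++;                   /* read an event */
      toy_keymap_lookup (event) (ed, event);   /* dispatch the event */
      toy_redisplay (ed);                      /* redisplay */
    }
}
```

Starting from a zero-initialized @code{struct toy_editor}, feeding the
loop a few characters and a backspace leaves the ``screen'' matching the
edited text, with one redisplay per event.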
|
|
905 @cindex bridge, playing
|
|
906 @cindex taxes, doing
|
|
907 @cindex pi, calculating
|
|
908 Note that you do not have to use XEmacs as an editor; you could just
|
|
909 as well make it do your taxes, compute pi, play bridge, etc. You'd just
|
|
910 have to write functions to do those operations in Lisp.
|
|
911
|
|
912 @node The Lisp Language, XEmacs From the Perspective of Building, XEmacs From the Outside, Top
|
|
913 @chapter The Lisp Language
|
462
|
914 @cindex Lisp language, the
|
428
|
915 @cindex Lisp vs. C
|
|
916 @cindex C vs. Lisp
|
|
917 @cindex Lisp vs. Java
|
|
918 @cindex Java vs. Lisp
|
|
919 @cindex dynamic scoping
|
|
920 @cindex scoping, dynamic
|
|
921 @cindex dynamic types
|
|
922 @cindex types, dynamic
|
|
923 @cindex Java
|
|
924 @cindex Common Lisp
|
|
925 @cindex Gosling, James
|
|
926
|
|
927 Lisp is a general-purpose language that is higher-level than C and in
|
|
928 many ways more powerful than C. Powerful dialects of Lisp such as
|
|
929 Common Lisp are probably much better languages for writing very large
|
|
930 applications than is C. (Unfortunately, for many non-technical
|
|
931 reasons C and its successor C++ have become the dominant languages for
|
|
932 application development. These languages are both inadequate for
|
|
933 extremely large applications, which is evidenced by the fact that newer,
|
|
934 larger programs are becoming ever harder to write and are requiring ever
|
|
935 more programmers despite great improvements in C development environments;
|
|
936 and by the fact that, although hardware speeds and reliability have been
|
|
937 growing at an exponential rate, most software is still generally
|
|
938 considered to be slow and buggy.)
|
|
939
|
|
940 The new Java language holds promise as a better general-purpose
|
|
941 development language than C. Java has many features in common with
|
|
942 Lisp that are not shared by C (this is not a coincidence, since
|
|
943 Java was designed by James Gosling, a former Lisp hacker). This
|
|
944 will be discussed more later.
|
|
945
|
|
946 For those used to C, here is a summary of the basic differences between
|
|
947 C and Lisp:
|
|
948
|
|
949 @enumerate
|
|
950 @item
|
|
951 Lisp has an extremely regular syntax. Every function, expression,
|
|
952 and control statement is written in the form
|
|
953
|
|
954 @example
|
|
955 (@var{func} @var{arg1} @var{arg2} ...)
|
|
956 @end example
|
|
957
|
|
958 This is as opposed to C, which writes functions as
|
|
959
|
|
960 @example
|
|
961 func(@var{arg1}, @var{arg2}, ...)
|
|
962 @end example
|
|
963
|
|
964 but writes expressions involving operators as (e.g.)
|
|
965
|
|
966 @example
|
|
967 @var{arg1} + @var{arg2}
|
|
968 @end example
|
|
969
|
|
970 and writes control statements as (e.g.)
|
|
971
|
|
972 @example
|
|
973 while (@var{expr}) @{ @var{statement1}; @var{statement2}; ... @}
|
|
974 @end example
|
|
975
|
|
976 Lisp equivalents of the latter two would be
|
|
977
|
|
978 @example
|
|
979 (+ @var{arg1} @var{arg2} ...)
|
|
980 @end example
|
|
981
|
|
982 and
|
|
983
|
|
984 @example
|
|
985 (while @var{expr} @var{statement1} @var{statement2} ...)
|
|
986 @end example
|
|
987
|
|
988 @item
|
|
989 Lisp is a safe language. Assuming there are no bugs in the Lisp
|
|
990 interpreter/compiler, it is impossible to write a program that ``core
|
|
991 dumps'' or otherwise causes the machine to execute an illegal
|
|
992 instruction. This is very different from C, where perhaps the most
|
|
993 common outcome of a bug is exactly such a crash. A corollary of this is that
|
|
994 the C operation of casting a pointer is impossible (and unnecessary) in
|
|
995 Lisp, and that it is impossible to access memory outside the bounds of
|
|
996 an array.
|
|
997
|
|
998 @item
|
|
999 Programs and data are written in the same form. The
|
|
1000 parenthesis-enclosing form described above for statements is the same
|
|
1001 form used for the most common data type in Lisp, the list. Thus, it is
|
|
1002 possible to represent any Lisp program using Lisp data types, and for
|
|
1003 one program to construct Lisp statements and then dynamically
|
|
1004 @dfn{evaluate} them, or cause them to execute.
|
|
1005
|
|
1006 @item
|
|
1007 All objects are @dfn{dynamically typed}. This means that part of every
|
|
1008 object is an indication of what type it is. A Lisp program can
|
|
1009 manipulate an object without knowing what type it is, and can query an
|
|
1010 object to determine its type. This means that, correspondingly,
|
|
1011 variables and function parameters can hold objects of any type and are
|
|
1012 not normally declared as being of any particular type. This is opposed
|
|
1013 to the @dfn{static typing} of C, where variables can hold exactly one
|
|
1014 type of object and must be declared as such, and objects do not contain
|
|
1015 an indication of their type because it's implicit in the variables they
|
|
1016 are stored in. It is possible in C to have a variable hold different
|
|
1017 types of objects (e.g. through the use of @code{void *} pointers or
|
|
1018 variable-argument functions), but the type information must then be
|
|
1019 passed explicitly in some other fashion, leading to additional program
|
|
1020 complexity.
|
|
1021
|
|
1022 @item
|
|
1023 Allocated memory is automatically reclaimed when it is no longer in use.
|
|
1024 This operation is called @dfn{garbage collection} and involves looking
|
|
1025 through all variables to see what memory is being pointed to, and
|
|
1026 reclaiming any memory that is not pointed to and is thus
|
|
1027 ``inaccessible'' and out of use. This is as opposed to C, in which
|
|
1028 allocated memory must be explicitly reclaimed using @code{free()}. If
|
|
1029 you simply drop all pointers to memory without freeing it, it becomes
|
|
1030 ``leaked'' memory that still takes up space. Over a long period of
|
|
1031 time, this can cause your program to grow and grow until it runs out of
|
|
1032 memory.
|
|
1033
|
|
1034 @item
|
|
1035 Lisp has built-in facilities for handling errors and exceptions. In C,
|
|
1036 when an error occurs, usually either the program exits entirely or the
|
|
1037 routine in which the error occurs returns a value indicating this. If
|
|
1038 an error occurs in a deeply-nested routine, then every routine currently
|
|
1039 called must unwind itself normally and return an error value back up to
|
|
1040 the next routine. This means that every routine must explicitly check
|
|
1041 for an error in all the routines it calls; if it does not do so,
|
|
1042 unexpected and often random behavior results. This is an extremely
|
|
1043 common source of bugs in C programs. An alternative would be to do a
|
|
1044 non-local exit using @code{longjmp()}, but that is often very dangerous
|
|
1045 because the routines that were exited past had no opportunity to clean
|
|
1046 up after themselves and may leave things in an inconsistent state,
|
|
1047 causing a crash shortly afterwards.
|
|
1048
|
|
1049 Lisp provides mechanisms to make such non-local exits safe. When an
|
|
1050 error occurs, a routine simply signals that an error of a particular
|
|
1051 class has occurred, and a non-local exit takes place. Any routine can
|
|
1052 trap errors occurring in routines it calls by registering an error
|
|
1053 handler for some or all classes of errors. (If no handler is registered,
|
|
1054 a default handler, generally installed by the top-level event loop, is
|
|
1055 executed; this prints out the error and continues.) Routines can also
|
|
1056 specify cleanup code (called an @dfn{unwind-protect}) that will be
|
|
1057 called when control exits from a block of code, no matter how that exit
|
440
|
1058 occurs---i.e. even if a function deeply nested below it causes a
|
428
|
1059 non-local exit back to the top level.
|
|
1060
|
|
1061 Note that this facility has appeared in some recent vintages of C, in
|
|
1062 particular Visual C++ and other PC compilers written for the Microsoft
|
|
1063 Win32 API.
|
|
1064
|
|
1065 @item
|
|
1066 In Emacs Lisp, local variables are @dfn{dynamically scoped}. This means
|
|
1067 that if you declare a local variable in a particular function, and then
|
|
1068 call another function, that subfunction can ``see'' the local variable
|
|
1069 you declared. This is actually considered a bug in Emacs Lisp and in
|
|
1070 all other early dialects of Lisp, and was corrected in Common Lisp. (In
|
|
1071 Common Lisp, you can still declare dynamically scoped variables if you
|
440
|
1072 want to---they are sometimes useful---but variables by default are
|
428
|
1073 @dfn{lexically scoped} as in C.)
|
|
1074 @end enumerate
|
|
1075
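To make the dynamic-typing item in the list above more concrete, here is
a rough C sketch of an object that carries its own type indication.  The
names (@code{toy_object}, @code{toy_type_name}, @code{toy_add1}) are
invented for illustration and are not meant to show how XEmacs actually
represents Lisp objects; that representation is described later in this
manual.

```c
#include <stddef.h>

/* Hypothetical type tags; every object records which one applies.  */
enum toy_type { TOY_INTEGER, TOY_FLOAT, TOY_STRING };

struct toy_object {
  enum toy_type type;        /* the type indication stored in the object */
  union {
    long integer;
    double flonum;
    const char *string;
  } value;
};

static struct toy_object toy_make_integer (long n)
{
  struct toy_object obj;
  obj.type = TOY_INTEGER;
  obj.value.integer = n;
  return obj;
}

static struct toy_object toy_make_string (const char *s)
{
  struct toy_object obj;
  obj.type = TOY_STRING;
  obj.value.string = s;
  return obj;
}

/* Code can query an object for its type without being told in advance.  */
static const char *toy_type_name (struct toy_object obj)
{
  switch (obj.type)
    {
    case TOY_INTEGER: return "integer";
    case TOY_FLOAT:   return "float";
    case TOY_STRING:  return "string";
    }
  return "unknown";
}

/* A parameter of type toy_object can hold any kind of object, so the
   operation checks the tag at run time.  Returns 1 on success, or 0 if
   the argument has the wrong type.  */
static int toy_add1 (struct toy_object obj, struct toy_object *result)
{
  if (obj.type != TOY_INTEGER)
    return 0;
  *result = toy_make_integer (obj.value.integer + 1);
  return 1;
}
```

The same variable can hold an integer at one moment and a string at the
next; the check inside @code{toy_add1} is the kind of bookkeeping that a
Lisp system does automatically and that a C programmer must write by
hand.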
|
|
1076 For those familiar with Lisp, Emacs Lisp is modelled after MacLisp, an
|
|
1077 early dialect of Lisp developed at MIT (no relation to the Macintosh
|
|
1078 computer). There is a Common Lisp compatibility package available for
|
|
1079 Emacs that provides many of the features of Common Lisp.
|
|
1080
|
|
1081 The Java language is derived in many ways from C, and shares a similar
|
|
1082 syntax, but has the following features in common with Lisp (and different
|
|
1083 from C):
|
|
1084
|
|
1085 @enumerate
|
|
1086 @item
|
|
1087 Java is a safe language, like Lisp.
|
|
1088 @item
|
|
1089 Java provides garbage collection, like Lisp.
|
|
1090 @item
|
|
1091 Java has built-in facilities for handling errors and exceptions, like
|
|
1092 Lisp.
|
|
1093 @item
|
|
1094 Java has a type system that combines the best advantages of both static
|
|
1095 and dynamic typing. Objects (except very simple types) are explicitly
|
|
1096 marked with their type, as in dynamic typing; but there is a hierarchy
|
|
1097 of types and functions are declared to accept only certain types, thus
|
|
1098 providing the increased compile-time error-checking of static typing.
|
|
1099 @end enumerate
|
|
1100
|
|
1101 The Java language also has some negative attributes:
|
|
1102
|
|
1103 @enumerate
|
|
1104 @item
|
|
1105 Java uses the edit/compile/run model of software development. This
|
|
1106 makes it hard to use interactively. For example, to use Java like
|
|
1107 @code{bc} it is necessary to write a special purpose, albeit tiny,
|
|
1108 application. In Emacs Lisp, a calculator comes built-in without any
|
|
1109 effort: one can always just type an expression in the @code{*scratch*}
|
|
1110 buffer.
|
|
1111 @item
|
|
1112 Java tries too hard to enforce, not merely enable, portability, making
|
|
1113 ordinary access to standard OS facilities painful. Java has an
|
|
1114 @dfn{agenda}. I think this is why @code{chdir} is not part of standard
|
|
1115 Java, which is inexcusable.
|
|
1116 @end enumerate
|
|
1117
|
|
1118 Unfortunately, there is no perfect language. Static typing allows a
|
|
1119 compiler to catch programmer errors and produce more efficient code, but
|
442
|
1120 makes programming more tedious and less fun. For the foreseeable future,
|
428
|
1121 an Ideal Editing and Programming Environment (and that is what XEmacs
|
|
1122 aspires to) will be programmable in multiple languages: high level ones
|
|
1123 like Lisp for user customization and prototyping, and lower level ones
|
|
1124 for infrastructure and industrial strength applications. If I had my
|
|
1125 way, XEmacs would be friendly towards the Python, Scheme, C++, ML,
|
|
1126 and other communities.  But there are serious technical difficulties to
|
|
1127 achieving that goal.
|
|
1128
|
|
1129 The word @dfn{application} in the previous paragraph was used
|
|
1130 intentionally. XEmacs implements an API for programs written in Lisp
|
|
1131 that makes it a full-fledged application platform, very much like an OS
|
|
1132 inside the real OS.
|
|
1133
|
|
1134 @node XEmacs From the Perspective of Building, XEmacs From the Inside, The Lisp Language, Top
|
|
1135 @chapter XEmacs From the Perspective of Building
|
462
|
1136 @cindex XEmacs from the perspective of building
|
|
1137 @cindex building, XEmacs from the perspective of
|
428
|
1138
|
|
1139 The heart of XEmacs is the Lisp environment, which is written in C.
|
|
1140 This is contained in the @file{src/} subdirectory. Underneath
|
|
1141 @file{src/} are two subdirectories of header files: @file{s/} (header
|
|
1142 files for particular operating systems) and @file{m/} (header files for
|
|
1143 particular machine types). In practice the distinction between the two
|
|
1144 types of header files is blurred. These header files define or undefine
|
|
1145 certain preprocessor constants and macros to indicate particular
|
|
1146 characteristics of the associated machine or operating system. As part
|
|
1147 of the configure process, one @file{s/} file and one @file{m/} file are
|
|
1148 identified for the particular environment in which XEmacs is being
|
|
1149 built.
|
|
1150
|
|
1151 XEmacs also contains a great deal of Lisp code. This implements the
|
|
1152 operations that make XEmacs useful as an editor as well as just a Lisp
|
|
1153 environment, and also contains many add-on packages that allow XEmacs to
|
|
1154 browse directories, act as a mail and Usenet news reader, compile Lisp
|
|
1155 code, etc. There is actually more Lisp code than C code associated with
|
|
1156 XEmacs, but much of the Lisp code is peripheral to the actual operation
|
|
1157 of the editor. The Lisp code all lies in subdirectories underneath the
|
|
1158 @file{lisp/} directory.
|
|
1159
|
|
1160 The @file{lwlib/} directory contains C code that implements a
|
|
1161 generalized interface onto different X widget toolkits and also
|
|
1162 implements some widgets of its own that behave like Motif widgets but
|
|
1163 are faster, free, and in some cases more powerful. The code in this
|
|
1164 directory compiles into a library and is mostly independent from XEmacs.
|
|
1165
|
|
1166 The @file{etc/} directory contains various data files associated with
|
|
1167 XEmacs. Some of them are actually read by XEmacs at startup; others
|
|
1168 merely contain useful information of various sorts.
|
|
1169
|
|
1170 The @file{lib-src/} directory contains C code for various auxiliary
|
|
1171 programs that are used in connection with XEmacs. Some of them are used
|
|
1172 during the build process; others are used to perform certain functions
|
|
1173 that cannot conveniently be placed in the XEmacs executable (e.g. the
|
|
1174 @file{movemail} program for fetching mail out of @file{/var/spool/mail},
|
|
1175 which must be setgid to @file{mail} on many systems; and the
|
|
1176 @file{gnuclient} program, which allows an external script to communicate
|
|
1177 with a running XEmacs process).
|
|
1178
|
|
1179 The @file{man/} directory contains the sources for the XEmacs
|
|
1180 documentation. It is mostly in a form called Texinfo, which can be
|
|
1181 converted into either a printed document (by passing it through @TeX{})
|
|
1182 or into on-line documentation called @dfn{info files}.
|
|
1183
|
|
1184 The @file{info/} directory contains the results of formatting the XEmacs
|
|
1185 documentation as @dfn{info files}, for on-line use. These files are
|
|
1186 used when you enter the Info system using @kbd{C-h i} or through the
|
|
1187 Help menu.
|
|
1188
|
|
1189 The @file{dynodump/} directory contains auxiliary code used to build
|
|
1190 XEmacs on Solaris platforms.
|
|
1191
|
|
1192 The other directories contain various miscellaneous code and information
|
|
1193 that is not normally used or needed.
|
|
1194
|
|
1195 The first step of building involves running the @file{configure} program
|
|
1196 and passing it various parameters to specify any optional features you
|
|
1197 want and compiler arguments and such, as described in the @file{INSTALL}
|
|
1198 file. This determines what the build environment is, chooses the
|
|
1199 appropriate @file{s/} and @file{m/} files, and runs a series of tests to
|
|
1200 determine many details about your environment, such as which library
|
|
1201 functions are available and exactly how they work. The reason for
|
|
1202 running these tests is that it allows XEmacs to be compiled on a much
|
|
1203 wider variety of platforms than those that the XEmacs developers happen
|
|
1204 to be familiar with, including various sorts of hybrid platforms. This
|
|
1205 is especially important now that many operating systems give you a great
|
|
1206 deal of control over exactly what features you want installed, and allow
|
|
1207 for easy upgrading of parts of a system without upgrading the rest. It
|
|
1208 would be impossible to pre-determine and pre-specify the information for
|
|
1209 all possible configurations.
|
|
1210
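The effect of these tests on the C sources can be sketched as follows.
The macro names @code{HAVE_TOY_FAST_COPY} and @code{TOY_SYSTEM_NAME} are
hypothetical and chosen only for illustration; the real names are
whatever @file{configure} and the @file{s/} and @file{m/} files happen
to define.  The point is that code is written against preprocessor
symbols describing features, not against a list of known platforms.

```c
#include <stddef.h>
#include <string.h>

/* In a real build these symbols would come from a generated header or
   from an s/ or m/ file.  They are handled inline here so that the
   sketch stands alone.  Both names are hypothetical.  */
#ifndef TOY_SYSTEM_NAME
#define TOY_SYSTEM_NAME "generic"
#endif

/* Uncomment to simulate a system where the fast routine was detected:  */
/* #define HAVE_TOY_FAST_COPY 1 */

/* Copy N bytes, using the library routine only if it was detected.  */
static void toy_copy (char *dst, const char *src, size_t n)
{
#ifdef HAVE_TOY_FAST_COPY
  memcpy (dst, src, n);
#else
  size_t i;
  for (i = 0; i < n; i++)   /* portable fallback */
    dst[i] = src[i];
#endif
}

/* Report which system description was selected at build time.  */
static const char *toy_system_name (void)
{
  return TOY_SYSTEM_NAME;
}
```

Either branch of @code{toy_copy} produces the same result; only the
build-time test decides which one is compiled in.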
|
|
1211 In fact, the @file{s/} and @file{m/} files are basically @emph{evil},
|
|
1212 since they contain unmaintainable platform-specific hard-coded
|
|
1213 information. XEmacs has been moving in the direction of having all
|
|
1214 system-specific information be determined dynamically by
|
|
1215 @file{configure}. Perhaps someday we can @code{rm -rf src/s src/m}.
|
|
1216
|
|
1217 When @file{configure} is done running, it generates @file{Makefile}s and
|
|
1218 @file{GNUmakefile}s and the file @file{src/config.h} (which describes
|
|
1219 the features of your system) from template files. You then run
|
|
1220 @file{make}, which compiles the auxiliary code and programs in
|
|
1221 @file{lib-src/} and @file{lwlib/} and the main XEmacs executable in
|
|
1222 @file{src/}. The result of compiling and linking is an executable
|
|
1223 called @file{temacs}, which is @emph{not} the final XEmacs executable.
|
|
1224 @file{temacs} by itself is not intended to function as an editor or even
|
|
1225 display any windows on the screen, and if you simply run it, it will
|
|
1226 exit immediately. The @file{Makefile} runs @file{temacs} with certain
|
|
1227 options that cause it to initialize itself, read in a number of basic
|
|
1228 Lisp files, and then dump itself out into a new executable called
|
|
1229 @file{xemacs}. This new executable has been pre-initialized and
|
|
1230 contains pre-digested Lisp code that is necessary for the editor to
|
|
1231 function (this includes most basic editing functions,
|
|
1232 e.g. @code{kill-line}, that can be defined in terms of other Lisp
|
|
1233 primitives; some initialization code that is called when certain
|
|
1234 objects, such as frames, are created; and all of the standard
|
|
1235 keybindings and code for the actions they result in). This executable,
|
|
1236 @file{xemacs}, is the executable that you run to use the XEmacs editor.
|
|
1237
|
|
1238 Although @file{temacs} is not intended to be run as an editor, it can be,
|
|
1239 by using the incantation @code{temacs -batch -l loadup.el run-temacs}.
|
|
1240 This is useful when the dumping procedure described above is broken, or
|
|
1241 when using certain program debugging tools such as Purify. These tools
|
|
1242 get mighty confused by the tricks played by the XEmacs build process,
|
|
1243 such as allocating memory in one process, and freeing it in the next.
|
|
1244
|
|
1245 @node XEmacs From the Inside, The XEmacs Object System (Abstractly Speaking), XEmacs From the Perspective of Building, Top
|
|
1246 @chapter XEmacs From the Inside
|
462
|
1247 @cindex XEmacs from the inside
|
|
1248 @cindex inside, XEmacs from the
|
428
|
1249
|
|
1250 Internally, XEmacs is quite complex, and can be very confusing. To
|
|
1251 simplify things, it can be useful to think of XEmacs as containing an
|
|
1252 event loop that ``drives'' everything, and a number of other subsystems,
|
|
1253 such as a Lisp engine and a redisplay mechanism. Each of these other
|
|
1254 subsystems exists simultaneously in XEmacs, and each has a certain
|
|
1255 state. The flow of control continually passes in and out of these
|
|
1256 different subsystems in the course of normal operation of the editor.
|
|
1257
|
|
1258 It is important to keep in mind that, most of the time, the editor is
|
|
1259 ``driven'' by the event loop. Except during initialization and batch
|
|
1260 mode, all subsystems are entered directly or indirectly through the
|
|
1261 event loop, and ultimately, control exits out of all subsystems back up
|
|
1262 to the event loop. This cycle of entering a subsystem, exiting back out
|
|
1263 to the event loop, and starting another iteration of the event loop
|
|
1264 occurs once each keystroke, mouse motion, etc.
|
|
1265
|
|
1266 If you're trying to understand a particular subsystem (other than the
|
|
1267 event loop), think of it as a ``daemon'' process or ``servant'' that is
|
|
1268 responsible for one particular aspect of a larger system, and
|
|
1269 periodically receives commands or environment changes that cause it to
|
|
1270 do something. Ultimately, these commands and environment changes are
|
|
1271 always triggered by the event loop. For example:
|
|
1272
|
|
1273 @itemize @bullet
|
|
1274 @item
|
|
1275 The window and frame mechanism is responsible for keeping track of what
|
|
1276 windows and frames exist, what buffers are in them, etc. It is
|
|
1277 periodically given commands (usually from the user) to make a change to
|
|
1278 the current window/frame state: i.e. create a new frame, delete a
|
|
1279 window, etc.
|
|
1280
|
|
1281 @item
|
|
1282 The buffer mechanism is responsible for keeping track of what buffers
|
|
1283 exist and what text is in them. It is periodically given commands
|
|
1284 (usually from the user) to insert or delete text, create a buffer, etc.
|
|
1285 When it receives a text-change command, it notifies the redisplay
|
|
1286 mechanism.
|
|
1287
|
|
1288 @item
|
|
1289 The redisplay mechanism is responsible for making sure that windows and
|
|
1290 frames are displayed correctly. It is periodically told (by the event
|
|
1291 loop) to actually ``do its job'', i.e. snoop around and see what the
|
|
1292 current state of the environment (mostly of the currently-existing
|
625
|
1293 windows, frames, and buffers) is, and make sure that state matches
|
428
|
1294 what's actually displayed. It keeps lots and lots of information around
|
|
1295 (such as what is actually being displayed currently, and what the
|
|
1296 environment was last time it checked) so that it can minimize the work
|
|
1297 it has to do. It is also helped along in that whenever a relevant
|
|
1298 change to the environment occurs, the redisplay mechanism is told about
|
|
1299 this, so it has a pretty good idea of where it has to look to find
|
|
1300 possible changes and doesn't have to look everywhere.
|
|
1301
|
|
1302 @item
|
|
1303 The Lisp engine is responsible for executing the Lisp code in which most
|
|
1304 user commands are written. It is entered through a call to @code{eval}
|
|
1305 or @code{funcall}, which occurs as a result of dispatching an event from
|
|
1306 the event loop. The functions it calls issue commands to the buffer
|
|
1307 mechanism, the window/frame subsystem, etc.
|
|
1308
|
|
1309 @item
|
|
1310 The Lisp allocation subsystem is responsible for keeping track of Lisp
|
|
1311 objects. It is given commands from the Lisp engine to allocate objects,
|
|
1312 garbage collect, etc.
|
|
1313 @end itemize
|
|
1314
|
|
1315 etc.
|
|
1316
|
|
1317 The important idea here is that there are a number of independent
|
|
1318 subsystems each with its own responsibility and persistent state, just
|
|
1319 like different employees in a company, and each subsystem is
|
|
1320 periodically given commands from other subsystems. Commands can flow
|
|
1321 from any one subsystem to any other, but there is usually some sort of
|
|
1322 hierarchy, with all commands originating from the event subsystem.
|
|
1323
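The relationship between two of these subsystems, the buffer mechanism
and the redisplay mechanism, can be sketched in C as follows.  All names
here are hypothetical, and the real redisplay code is far more
elaborate; the sketch only shows the pattern in which a text change is
recorded when it happens, so that redisplay later knows where to look
and can skip the work entirely when nothing has changed.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX 64

struct toy_buffer {
  char text[TOY_MAX];
  size_t len;
  int changed;           /* set by the buffer code, cleared by redisplay */
  size_t first_change;   /* lowest position touched since last redisplay */
};

struct toy_display {
  char shown[TOY_MAX];   /* what was drawn last time */
  size_t cells_written;  /* total amount of drawing work done so far */
};

/* Buffer mechanism: append text and note that redisplay has work to do.  */
static void toy_insert (struct toy_buffer *buf, const char *s)
{
  size_t n = strlen (s);
  if (buf->len + n >= TOY_MAX)
    return;                          /* toy buffer is full; ignore */
  if (!buf->changed)
    buf->first_change = buf->len;    /* remember where changes start */
  memcpy (buf->text + buf->len, s, n + 1);
  buf->len += n;
  buf->changed = 1;
}

/* Redisplay mechanism: called from the event loop.  It redraws only
   from the first changed position (through the terminating NUL) and
   does nothing at all if no change was recorded.  */
static void toy_update_display (struct toy_buffer *buf,
                                struct toy_display *dpy)
{
  size_t i;
  if (!buf->changed)
    return;
  for (i = buf->first_change; i <= buf->len; i++)
    {
      dpy->shown[i] = buf->text[i];
      dpy->cells_written++;
    }
  buf->changed = 0;
}
```

Calling @code{toy_update_display} twice in a row does no extra work the
second time, and after a further insertion only the newly added text is
redrawn.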
|
|
1324 XEmacs is entered in @code{main()}, which is in @file{emacs.c}. When
|
|
1325 this is called the first time (in a properly-invoked @file{temacs}), it
|
|
1326 does the following:
|
|
1327
|
|
1328 @enumerate
|
|
1329 @item
|
|
1330 It does some very basic environment initializations, such as determining
|
|
1331 where it and its directories (e.g. @file{lisp/} and @file{etc/}) reside
|
|
1332 and setting up signal handlers.
|
|
1333 @item
|
|
1334 It initializes the entire Lisp interpreter.
|
|
1335 @item
|
|
1336 It sets the initial values of many built-in variables (including many
|
|
1337 variables that are visible to Lisp programs), such as the global keymap
|
|
1338 object and the built-in faces (a face is an object that describes the
|
|
1339 display characteristics of text). This involves creating Lisp objects
|
|
1340 and thus is dependent on step (2).
|
|
1341 @item
|
|
1342 It performs various other initializations that are relevant to the
|
|
1343 particular environment it is running in, such as retrieving environment
|
|
1344 variables, determining the current date and the user who is running the
|
|
1345 program, examining its standard input, creating any necessary file
|
|
1346 descriptors, etc.
|
|
1347 @item
|
|
1348 At this point, the C initialization is complete. A Lisp program that
|
|
1349 was specified on the command line (usually @file{loadup.el}) is called
|
|
1350 (@file{temacs} is normally invoked as @code{temacs -batch -l loadup.el dump}).
|
|
1351 @file{loadup.el} loads all of the other Lisp files that are needed for
|
|
1352 the operation of the editor, calls the @code{dump-emacs} function to
|
|
1353 write out @file{xemacs}, and then kills the @file{temacs} process.
|
|
1354 @end enumerate
|
|
1355
|
|
1356 When @file{xemacs} is then run, it only redoes steps (1) and (4)
|
|
1357 above; all variables already contain the values they were set to when
|
|
1358 the executable was dumped, and all memory that was allocated with
|
|
1359 @code{malloc()} is still around. (XEmacs knows whether it is being run
|
|
1360 as @file{xemacs} or @file{temacs} because it sets the global variable
|
|
1361 @code{initialized} to 1 after step (4) above.) At this point,
|
|
1362 @file{xemacs} calls a Lisp function to do any further initialization,
|
|
1363 which includes parsing the command-line (the C code can only do limited
|
|
1364 command-line parsing, which includes looking for the @samp{-batch} and
|
|
1365 @samp{-l} flags and a few other flags that it needs to know about before
|
|
1366 initialization is complete), creating the first frame (or @dfn{window}
|
|
1367 in standard window-system parlance), running the user's init file
|
|
1368 (usually the file @file{.emacs} in the user's home directory), etc. The
|
|
1369 function to do this is usually called @code{normal-top-level};
|
|
1370 @file{loadup.el} tells the C code about this function by setting its
|
|
1371 name as the value of the Lisp variable @code{top-level}.
|
|
1372
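The way a dumped @file{xemacs} skips work that @file{temacs} already did
can be pictured with the following C sketch.  It is an illustration
only: the @code{toy_} names and the step flags are hypothetical, and a
real dump preserves the whole initialized process image rather than a
single flag.  The only detail taken from the description above is that a
global variable named @code{initialized} is set to 1 after step (4).

```c
/* Bit flags recording which of the numbered steps ran in one startup.  */
enum {
  TOY_STEP_BASIC_ENV   = 1 << 0,   /* step (1) */
  TOY_STEP_LISP_INIT   = 1 << 1,   /* step (2) */
  TOY_STEP_BUILTINS    = 1 << 2,   /* step (3) */
  TOY_STEP_RUNTIME_ENV = 1 << 3    /* step (4) */
};

/* Nonzero once the one-time steps have run.  In a dumped executable
   this value would already be 1 when the process starts.  */
static int initialized;

/* Perform startup and return the set of steps that actually ran.  */
static unsigned toy_startup (void)
{
  unsigned ran = 0;

  ran |= TOY_STEP_BASIC_ENV;         /* always redone */
  if (!initialized)
    {
      ran |= TOY_STEP_LISP_INIT;     /* first run only */
      ran |= TOY_STEP_BUILTINS;      /* first run only */
    }
  ran |= TOY_STEP_RUNTIME_ENV;       /* always redone */

  initialized = 1;                   /* set after step (4) */
  return ran;
}
```

The first call stands for running @file{temacs} and performs all four
steps; a second call, with @code{initialized} already set, stands for
running the dumped @file{xemacs} and performs only steps (1) and (4).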
|
|
1373 When the Lisp initialization code is done, the C code enters the event
|
|
1374 loop, and stays there for the duration of the XEmacs process. The code
|
442
|
1375 for the event loop is contained in @file{cmdloop.c}, and is called
|
428
|
1376 @code{Fcommand_loop_1()}. Note that this event loop could very well be
|
|
1377 written in Lisp, and in fact a Lisp version exists; but apparently,
|
|
1378 doing this makes XEmacs run noticeably slower.
|
|
1379
|
|
1380 Notice how much of the initialization is done in Lisp, not in C.
|
|
1381 In general, XEmacs tries to move as much code as is possible
|
|
1382 into Lisp. Code that remains in C is code that implements the
|
|
1383 Lisp interpreter itself, or code that needs to be very fast, or
|
|
1384 code that needs to do system calls or other such stuff that
|
|
1385 needs to be done in C, or code that needs to have access to
|
|
1386 ``forbidden'' structures. (One conscious aspect of the design of
|
|
1387 Lisp under XEmacs is a clean separation between the external
|
|
1388 interface to a Lisp object's functionality and its internal
|
|
1389 implementation. Part of this design is that Lisp programs
|
|
1390 are forbidden from accessing the contents of the object other
|
|
1391 than through using a standard API. In this respect, XEmacs Lisp
|
|
1392 is similar to modern Lisp dialects but differs from GNU Emacs,
|
|
1393 which tends to expose the implementation and allow Lisp
|
|
1394 programs to look at it directly. The major advantage of
|
|
1395 hiding the implementation is that it allows the implementation
|
|
1396 to be redesigned without affecting any Lisp programs, including
|
|
1397 those that might want to be ``clever'' by looking directly at
|
|
1398 the object's contents and possibly manipulating them.)
|
|
1399
|
|
1400 Moving code into Lisp makes the code easier to debug and maintain and
|
|
1401 makes it much easier for people who are not XEmacs developers to
|
|
1402 customize XEmacs, because they can make a change with much less chance
|
|
1403 of obscure and unwanted interactions occurring than if they were to
|
|
1404 change the C code.
|
|
1405
|
|
1406 @node The XEmacs Object System (Abstractly Speaking), How Lisp Objects Are Represented in C, XEmacs From the Inside, Top
|
|
1407 @chapter The XEmacs Object System (Abstractly Speaking)
|
462
|
1408 @cindex XEmacs object system (abstractly speaking), the
|
|
1409 @cindex object system (abstractly speaking), the XEmacs
|
428
|
1410
|
|
1411 At the heart of the Lisp interpreter is its management of objects.
|
|
1412 XEmacs Lisp contains many built-in objects, some of which are
|
|
1413 simple and others of which can be very complex; and some of which
|
|
1414 are very common, and others of which are rarely used or are only
|
|
1415 used internally. (Since the Lisp allocation system, with its
|
|
1416 automatic reclamation of unused storage, is so much more convenient
|
|
1417 than @code{malloc()} and @code{free()}, the C code makes extensive use of it
|
|
1418 in its internal operations.)
|
|
1419
|
|
1420 The basic Lisp objects are
|
|
1421
|
|
1422 @table @code
|
|
1423 @item integer
|
|
1424 28 or 31 bits of precision, or 60 or 63 bits on 64-bit machines; the
|
|
1425 reason for this is described below when the internal Lisp object
|
|
1426 representation is described.
|
|
1427 @item float
|
|
1428 Same precision as a double in C.
|
|
1429 @item cons
|
|
1430 A simple container for two Lisp objects, used to implement lists and
|
|
1431 most other data structures in Lisp.
|
|
1432 @item char
|
|
1433 An object representing a single character of text; chars behave like
|
|
1434 integers in many ways but are logically considered text rather than
|
|
1435 numbers and have a different read syntax. (The read syntax for a char
|
440
|
1436 contains the char itself or some textual encoding of it---for example,
|
428
|
1437 a Japanese Kanji character might be encoded as @samp{^[$(B#&^[(B} using the
|
440
|
1438 ISO-2022 encoding standard---rather than the numerical representation
|
428
|
1439 of the char; this way, if the mapping between chars and integers
|
|
1440 changes, which is quite possible for Kanji characters and other extended
|
|
1441 characters, the same character will still be created. Note that some
|
|
1442 primitives confuse chars and integers. The worst culprit is @code{eq},
|
|
1443 which makes a special exception and considers a char to be @code{eq} to
|
|
1444 its integer equivalent, even though in no other case are objects of two
|
|
1445 different types @code{eq}. The reason for this monstrosity is
|
|
1446 compatibility with existing code; the separation of char from integer
|
|
1447 came fairly recently.)
|
|
1448 @item symbol
|
|
1449 An object that contains Lisp objects and is referred to by name;
|
|
1450 symbols are used to implement variables and named functions
|
|
1451 and to provide the equivalent of preprocessor constants in C.
|
|
1452 @item vector
|
|
1453 A one-dimensional array of Lisp objects providing constant-time access
|
|
1454 to any of the objects; access to an arbitrary object in a vector is
|
|
1455 faster than for lists, but the operations that can be done on a vector
|
|
1456 are more limited.
|
|
1457 @item string
|
|
1458 Self-explanatory; behaves much like a vector of chars
|
|
1459 but has a different read syntax and is stored and manipulated
|
|
1460 more compactly.
|
|
1461 @item bit-vector
|
|
1462 A vector of bits; similar to a string in spirit.
|
|
1463 @item compiled-function
|
|
1464 An object containing compiled Lisp code, known as @dfn{byte code}.
|
|
1465 @item subr
|
|
1466 A Lisp primitive, i.e. a Lisp-callable function implemented in C.
|
|
1467 @end table
|
|
1468
|
|
1469 @cindex closure
|
|
1470 Note that there is no basic ``function'' type, as in more powerful
|
|
1471 versions of Lisp (where it's called a @dfn{closure}). XEmacs Lisp does
|
|
1472 not provide the closure semantics implemented by Common Lisp and Scheme.
|
|
1473 The guts of a function in XEmacs Lisp are represented in one of four
|
|
1474 ways: a symbol specifying another function (when one function is an
|
|
1475 alias for another), a list (whose first element must be the symbol
|
|
1476 @code{lambda}) containing the function's source code, a
|
|
1477 compiled-function object, or a subr object. (In other words, given a
|
|
1478 symbol specifying the name of a function, calling @code{symbol-function}
|
|
1479 to retrieve the contents of the symbol's function cell will return one
|
|
1480 of these types of objects.)
|
|
1481
|
|
1482 XEmacs Lisp also contains numerous specialized objects used to implement
|
|
1483 the editor:
|
|
1484
|
|
1485 @table @code
|
|
1486 @item buffer
|
|
1487 Stores text like a string, but is optimized for insertion and deletion
|
|
1488 and has certain other properties that can be set.
|
|
1489 @item frame
|
|
1490 An object with various properties whose displayable representation is a
|
|
1491 @dfn{window} in window-system parlance.
|
|
1492 @item window
|
|
1493 A section of a frame that displays the contents of a buffer;
|
|
1494 often called a @dfn{pane} in window-system parlance.
|
|
1495 @item window-configuration
|
|
1496 An object that represents a saved configuration of windows in a frame.
|
|
1497 @item device
|
|
1498 An object representing a screen on which frames can be displayed;
|
|
1499 equivalent to a @dfn{display} in the X Window System and a @dfn{TTY} in
|
|
1500 character mode.
|
|
1501 @item face
|
|
1502 An object specifying the appearance of text or graphics; it has
|
|
1503 properties such as font, foreground color, and background color.
|
|
1504 @item marker
|
|
1505 An object that refers to a particular position in a buffer and moves
|
|
1506 around as text is inserted and deleted to stay in the same relative
|
|
1507 position to the text around it.
|
|
1508 @item extent
|
|
1509 Similar to a marker but covers a range of text in a buffer; can also
|
|
1510 specify properties of the text, such as a face in which the text is to
|
|
1511 be displayed, whether the text is invisible or unmodifiable, etc.
|
|
1512 @item event
|
|
1513 Generated by calling @code{next-event} and contains information
|
|
1514 describing a particular event happening in the system, such as the user
|
|
1515 pressing a key or a process terminating.
|
|
1516 @item keymap
|
|
1517 An object that maps from events (described using lists, vectors, and
|
|
1518 symbols rather than with an event object because the mapping is for
|
|
1519 classes of events, rather than individual events) to functions to
|
|
1520 execute or other events to recursively look up; the functions are
|
|
1521 described by name, using a symbol, or using lists to specify the
|
|
1522 function's code.
|
|
1523 @item glyph
|
|
1524 An object that describes the appearance of an image (e.g. pixmap) on
|
|
1525 the screen; glyphs can be attached to the beginning or end of extents
|
|
1526 and in some future version of XEmacs will be able to be inserted
|
|
1527 directly into a buffer.
|
|
1528 @item process
|
|
1529 An object that describes a connection to an externally-running process.
|
|
1530 @end table
|
|
1531
|
|
1532 There are some other, less-commonly-encountered general objects:
|
|
1533
|
|
1534 @table @code
|
|
1535 @item hash-table
|
|
1536 An object that maps from an arbitrary Lisp object to another arbitrary
|
|
1537 Lisp object, using hashing for fast lookup.
|
|
1538 @item obarray
|
|
1539 A limited form of hash-table that maps from strings to symbols; obarrays
|
|
1540 are used to look up a symbol given its name and are not actually their
|
|
1541 own object type but are kludgily represented using vectors with hidden
|
|
1542 fields (this representation derives from GNU Emacs).
|
|
1543 @item specifier
|
|
1544 A complex object used to specify the value of a display property; a
|
|
1545 default value is given and different values can be specified for
|
|
1546 particular frames, buffers, windows, devices, or classes of device.
|
|
1547 @item char-table
|
|
1548 An object that maps from chars or classes of chars to arbitrary Lisp
|
|
1549 objects; internally char tables use a complex nested-vector
|
|
1550 representation that is optimized to the way characters are represented
|
|
1551 as integers.
|
|
1552 @item range-table
|
|
1553 An object that maps from ranges of integers to arbitrary Lisp objects.
|
|
1554 @end table
|
|
1555
|
|
1556 And some strange special-purpose objects:
|
|
1557
|
|
1558 @table @code
|
|
1559 @item charset
|
|
1560 @itemx coding-system
|
|
1561 Objects used when MULE, or multi-lingual/Asian-language, support is
|
|
1562 enabled.
|
|
1563 @item color-instance
|
|
1564 @itemx font-instance
|
|
1565 @itemx image-instance
|
|
1566 An object that encapsulates a window-system resource; instances are
|
|
1567 mostly used internally but are exposed on the Lisp level for cleanness
|
|
1568 of the specifier model and because it's occasionally useful for Lisp
|
|
1569 programs to create or query the properties of instances.
|
|
1570 @item subwindow
|
|
1571 An object that encapsulates a @dfn{subwindow} resource, i.e. a
|
|
1572 window-system child window that is drawn into by an external process;
|
|
1573 this object should be integrated into the glyph system but isn't yet,
|
|
1574 and may change form when this is done.
|
|
1575 @item tooltalk-message
|
|
1576 @itemx tooltalk-pattern
|
|
1577 Objects that represent resources used in the ToolTalk interprocess
|
|
1578 communication protocol.
|
|
1579 @item toolbar-button
|
|
1580 An object used in conjunction with the toolbar.
|
|
1581 @end table
|
|
1582
|
|
1583 And objects that are only used internally:
|
|
1584
|
|
1585 @table @code
|
|
1586 @item opaque
|
|
1587 A generic object for encapsulating arbitrary memory; this allows you the
|
|
1588 generality of @code{malloc()} and the convenience of the Lisp object
|
|
1589 system.
|
|
1590 @item lstream
|
|
1591 A buffering I/O stream, used to provide a unified interface to anything
|
|
1592 that can accept output or provide input, such as a file descriptor, a
|
|
1593 stdio stream, a chunk of memory, a Lisp buffer, a Lisp string, etc.;
|
|
1594 it's a Lisp object to make its memory management more convenient.
|
|
1595 @item char-table-entry
|
|
1596 Subsidiary objects in the internal char-table representation.
|
|
1597 @item extent-auxiliary
|
|
1598 @itemx menubar-data
|
|
1599 @itemx toolbar-data
|
|
1600 Various special-purpose objects that are basically just used to
|
|
1601 encapsulate memory for particular subsystems, similar to the more
|
|
1602 general ``opaque'' object.
|
|
1603 @item symbol-value-forward
|
|
1604 @itemx symbol-value-buffer-local
|
|
1605 @itemx symbol-value-varalias
|
|
1606 @itemx symbol-value-lisp-magic
|
|
1607 Special internal-only objects that are placed in the value cell of a
|
|
1608 symbol to indicate that there is something special with this variable --
|
|
1609 e.g. it has no value, it mirrors another variable, or it mirrors some C
|
|
1610 variable; there is really only one kind of object, called a
|
|
1611 @dfn{symbol-value-magic}, but it is sort-of halfway kludged into
|
|
1612 semi-different object types.
|
|
1613 @end table
|
|
1614
|
|
1615 @cindex permanent objects
|
|
1616 @cindex temporary objects
|
|
1617 Some types of objects are @dfn{permanent}, meaning that once created,
|
|
1618 they do not disappear until explicitly destroyed, using a function such
|
|
1619 as @code{delete-buffer}, @code{delete-window}, @code{delete-frame}, etc.
|
|
1620 Others will disappear once they are no longer used, through the garbage
|
|
1621 collection mechanism. Buffers, frames, windows, devices, and processes
|
|
1622 are among the objects that are permanent. Note that some objects can go
|
|
1623 both ways: Faces can be created either way; extents are normally
|
|
1624 permanent, but detached extents (extents not referring to any text, as
|
|
1625 happens to some extents when the text they are referring to is deleted)
|
|
1626 are temporary. Note that some permanent objects, such as faces and
|
|
1627 coding systems, cannot be deleted. Note also that windows are unique in
|
|
1628 that they can be @emph{undeleted} after having previously been
|
|
1629 deleted. (This happens as a result of restoring a window configuration.)
|
|
1630
|
|
1631 @cindex read syntax
|
|
1632 Note that many types of objects have a @dfn{read syntax}, i.e. a way of
|
|
1633 specifying an object of that type in Lisp code. When you load a Lisp
|
|
1634 file, or type in code to be evaluated, what really happens is that the
|
|
1635 function @code{read} is called, which reads some text and creates an object
|
|
1636 based on the syntax of that text; then @code{eval} is called, which
|
|
1637 possibly does something special; then this loop repeats until there's
|
|
1638 no more text to read. (@code{eval} only actually does something special
|
|
1639 with symbols, which causes the symbol's value to be returned,
|
|
1640 similar to referencing a variable; and with conses [i.e. lists],
|
|
1641 which cause a function invocation. All other values are returned
|
|
1642 unchanged.)
|
|
1643
|
|
1644 The read syntax
|
|
1645
|
|
1646 @example
|
|
1647 17297
|
|
1648 @end example
|
|
1649
|
|
1650 converts to an integer whose value is 17297.
|
|
1651
|
|
1652 @example
|
|
1653 1.983e-4
|
|
1654 @end example
|
|
1655
|
|
1656 converts to a float whose value is 1.983e-4, or .0001983.
|
|
1657
|
|
1658 @example
|
|
1659 ?b
|
|
1660 @end example
|
|
1661
|
|
1662 converts to a char that represents the lowercase letter b.
|
|
1663
|
|
1664 @example
|
|
1665 ?^[$(B#&^[(B
|
|
1666 @end example
|
|
1667
|
|
1668 (where @samp{^[} actually is an @samp{ESC} character) converts to a
|
|
1669 particular Kanji character when using an ISO-2022-based coding system for
|
|
1670 input. (To decode this goo: @samp{ESC} begins an escape sequence;
|
|
1671 @samp{ESC $ (} is a class of escape sequences meaning ``switch to a
|
|
1672 94x94 character set''; @samp{ESC $ ( B} means ``switch to Japanese
|
|
1673 Kanji''; @samp{#} and @samp{&} collectively index into a 94-by-94 array
|
|
1674 of characters [subtract 33 from the ASCII value of each character to get
|
|
1675 the corresponding index]; @samp{ESC (} is a class of escape sequences
|
|
1676 meaning ``switch to a 94 character set''; @samp{ESC (B} means ``switch
|
|
1677 to US ASCII''. It is a coincidence that the letter @samp{B} is used to
|
|
1678 denote both Japanese Kanji and US ASCII. If the first @samp{B} were
|
|
1679 replaced with an @samp{A}, you'd be requesting a Chinese Hanzi character
|
|
1680 from the GB2312 character set.)
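The index arithmetic mentioned in the parenthetical above can be sketched in C. This is only an illustration: the function name @code{iso2022_94_index} is hypothetical and is not taken from the XEmacs sources, and an ASCII execution character set is assumed.

```c
#include <assert.h>

/* Hypothetical helper (not an XEmacs function): map one byte of a
   94x94 character-set sequence to its zero-based row or column
   index.  The 94 graphic bytes run from '!' (33) through '~' (126).
   Returns -1 for a byte outside that range. */
static int
iso2022_94_index (unsigned char byte)
{
  if (byte < 33 || byte > 126)
    return -1;
  return byte - 33;
}
```

For the sequence shown above, @samp{#} (ASCII 35) and @samp{&} (ASCII 38) give the indices 2 and 5.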
|
|
1681
|
|
1682 @example
|
|
1683 "foobar"
|
|
1684 @end example
|
|
1685
|
|
1686 converts to a string.
|
|
1687
|
|
1688 @example
|
|
1689 foobar
|
|
1690 @end example
|
|
1691
|
|
1692 converts to a symbol whose name is @code{"foobar"}. This is done by
|
|
1693 looking up the string equivalent in the global variable
|
|
1694 @code{obarray}, whose contents should be an obarray. If no symbol
|
|
1695 is found, a new symbol with the name @code{"foobar"} is automatically
|
|
1696 created and added to @code{obarray}; this process is called
|
|
1697 @dfn{interning} the symbol.
|
|
1698 @cindex interning
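Interning can be pictured with a toy lookup-or-create routine. The sketch below is not the XEmacs implementation: the structure layouts, the hash function, and every @code{toy_} name are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy symbol and obarray; both layouts are assumptions made for this
   sketch and do not mirror the real XEmacs structures. */
struct toy_symbol
{
  char *name;
  struct toy_symbol *next;	/* next symbol in the same bucket */
};

#define TOY_OBARRAY_SIZE 64
static struct toy_symbol *toy_obarray[TOY_OBARRAY_SIZE];

static unsigned long
toy_hash (const char *s)
{
  unsigned long h = 5381;
  while (*s)
    h = h * 33 + (unsigned char) *s++;
  return h;
}

/* Return the symbol named NAME, creating and registering it first if
   it does not exist yet.  Returns NULL only if allocation fails. */
static struct toy_symbol *
toy_intern (const char *name)
{
  struct toy_symbol **bucket
    = &toy_obarray[toy_hash (name) % TOY_OBARRAY_SIZE];
  struct toy_symbol *sym;

  for (sym = *bucket; sym; sym = sym->next)
    if (strcmp (sym->name, name) == 0)
      return sym;		/* already interned */

  sym = malloc (sizeof *sym);
  if (!sym)
    return NULL;
  sym->name = malloc (strlen (name) + 1);
  if (!sym->name)
    {
      free (sym);
      return NULL;
    }
  strcpy (sym->name, name);
  sym->next = *bucket;
  *bucket = sym;
  return sym;
}
```

Because a second call with the same name finds the existing entry, two reads of @code{foobar} yield the very same object, which is what lets symbols be compared by identity.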
|
|
1699
|
|
1700 @example
|
|
1701 (foo . bar)
|
|
1702 @end example
|
|
1703
|
|
1704 converts to a cons cell containing the symbols @code{foo} and @code{bar}.
|
|
1705
|
|
1706 @example
|
|
1707 (1 a 2.5)
|
|
1708 @end example
|
|
1709
|
|
1710 converts to a three-element list containing the specified objects
|
|
1711 (note that a list is actually a set of nested conses; see the
|
|
1712 XEmacs Lisp Reference).
|
|
1713
|
|
1714 @example
|
|
1715 [1 a 2.5]
|
|
1716 @end example
|
|
1717
|
|
1718 converts to a three-element vector containing the specified objects.
|
|
1719
|
|
1720 @example
|
|
1721 #[... ... ... ...]
|
|
1722 @end example
|
|
1723
|
|
1724 converts to a compiled-function object (the actual contents are not
|
|
1725 shown since they are not relevant here; look at a file that ends with
|
|
1726 @file{.elc} for examples).
|
|
1727
|
|
1728 @example
|
|
1729 #*01110110
|
|
1730 @end example
|
|
1731
|
|
1732 converts to a bit-vector.
|
|
1733
|
|
1734 @example
|
|
1735 #s(hash-table ... ...)
|
|
1736 @end example
|
|
1737
|
|
1738 converts to a hash table (the actual contents are not shown).
|
|
1739
|
|
1740 @example
|
|
1741 #s(range-table ... ...)
|
|
1742 @end example
|
|
1743
|
|
1744 converts to a range table (the actual contents are not shown).
|
|
1745
|
|
1746 @example
|
|
1747 #s(char-table ... ...)
|
|
1748 @end example
|
|
1749
|
|
1750 converts to a char table (the actual contents are not shown).
|
|
1751
|
|
1752 Note that the @code{#s()} syntax is the general syntax for structures,
|
|
1753 which are not really implemented in XEmacs Lisp but should be.
|
|
1754
|
|
1755 When an object is printed out (using @code{print} or a related
|
|
1756 function), the read syntax is used, so that the same object can be read
|
|
1757 in again.
|
|
1758
|
|
1759 The other objects do not have read syntaxes, usually because it does not
|
|
1760 really make sense to create them in this fashion (e.g. processes, where
|
|
1761 it doesn't make sense to have a subprocess created as a side effect of
|
|
1762 reading some Lisp code), or because they can't be created at all
|
|
1763 (e.g. subrs). Permanent objects, as a rule, do not have a read syntax;
|
|
1764 nor do most complex objects, which contain too much state to be easily
|
|
1765 initialized through a read syntax.
|
|
1766
|
868
|
1767 @node How Lisp Objects Are Represented in C, Major Textual Changes, The XEmacs Object System (Abstractly Speaking), Top
|
428
|
1768 @chapter How Lisp Objects Are Represented in C
|
462
|
1769 @cindex Lisp objects are represented in C, how
|
|
1770 @cindex objects are represented in C, how Lisp
|
|
1771 @cindex represented in C, how Lisp objects are
|
428
|
1772
|
|
1773 Lisp objects are represented in C using a 32-bit or 64-bit machine word
|
|
1774 (depending on the processor; e.g. DEC Alphas use 64-bit Lisp objects and
|
|
1775 most other processors use 32-bit Lisp objects). The representation
|
|
1776 stuffs a pointer together with a tag, as follows:
|
|
1777
|
|
1778 @example
|
|
1779 [ 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 ]
|
|
1780 [ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 ]
|
|
1781
|
442
|
1782 <---------------------------------------------------------> <->
|
|
1783           a pointer to a structure, or an integer            tag
|
|
1784 @end example
|
|
1785
|
|
1786 A tag of 00 is used for all pointer object types, a tag of 10 is used
|
|
1787 for characters, and the other two tags 01 and 11 are joined together to
|
|
1788 form the integer object type. This representation gives us 31-bit
|
|
1789 integers and 30-bit characters, while pointers are represented directly
|
|
1790 without any bit masking or shifting. This representation, though,
|
|
1791 assumes that pointers to structs are always aligned to multiples of 4,
|
|
1792 so the lower 2 bits are always zero.
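A minimal sketch of this tagging scheme for pointers and characters follows. All of the @code{toy_} names are hypothetical stand-ins; the real XEmacs macros are named and written differently.

```c
#include <assert.h>
#include <stdint.h>

/* Toy tagged word following the layout described above.  All names
   here are hypothetical; the real XEmacs macros differ. */
typedef intptr_t toy_object;

enum { TOY_TAG_POINTER = 0, TOY_TAG_CHAR = 2 };	/* binary 00 and 10 */

/* Integers own both tags whose low bit is set (01 and 11). */
static int toy_is_int (toy_object o)     { return (o & 1) == 1; }
static int toy_is_char (toy_object o)    { return (o & 3) == TOY_TAG_CHAR; }
static int toy_is_pointer (toy_object o) { return (o & 3) == TOY_TAG_POINTER; }

/* A pointer is stored unchanged; this relies on 4-byte alignment,
   which leaves the two low bits zero. */
static toy_object
toy_from_pointer (void *p)
{
  return (toy_object) p;
}

static void *
toy_to_pointer (toy_object o)
{
  return (void *) o;
}

/* A character code is shifted past the two tag bits. */
static toy_object
toy_from_char (uint32_t c)
{
  return (toy_object) (((uintptr_t) c << 2) | TOY_TAG_CHAR);
}

static uint32_t
toy_to_char (toy_object o)
{
  return (uint32_t) ((uintptr_t) o >> 2);
}
```

Note that only the character path needs a shift; the pointer path is a plain cast, which is the point of giving pointers the 00 tag.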
|
428
|
1793
|
|
1794 Lisp objects use the typedef @code{Lisp_Object}, but the actual C type
|
|
1795 used for the Lisp object can vary. It can be either a simple type
|
|
1796 (@code{long} on the DEC Alpha, @code{int} on other machines) or a
|
|
1797 structure whose fields are bit fields that line up properly (actually, a
|
|
1798 union of structures is used). Generally the simple integral type is
|
|
1799 preferable because it ensures that the compiler will actually use a
|
|
1800 machine word to represent the object (some compilers will use more
|
|
1801 general and less efficient code for unions and structs even if they can
|
|
1802 fit in a machine word). The union type, however, has the advantage of
|
442
|
1803 stricter type checking. If you accidentally pass an integer where a Lisp
|
|
1804 object is desired, you get a compile error. The choice of which type
|
|
1805 to use is determined by the preprocessor constant @code{USE_UNION_TYPE}
|
|
1806 which is defined via the @code{--use-union-type} option to
|
|
1807 @code{configure}.
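The trade-off between the two representations can be illustrated with a small sketch. The macro @code{TOY_USE_UNION_TYPE} and the @code{toy_} names are stand-ins invented for this example; they are not the definitions used in the XEmacs headers.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch only: TOY_USE_UNION_TYPE and the toy_* names are stand-ins
   chosen for this example, not the real XEmacs definitions. */
#ifdef TOY_USE_UNION_TYPE
typedef union { intptr_t bits; } toy_object;
#define TOY_BITS(o) ((o).bits)
static toy_object
toy_wrap (intptr_t bits)
{
  toy_object o;
  o.bits = bits;
  return o;
}
#else
typedef intptr_t toy_object;
#define TOY_BITS(o) (o)
static toy_object
toy_wrap (intptr_t bits)
{
  return bits;
}
#endif

/* Callers go through TOY_BITS and toy_wrap, so the same code
   compiles with either representation. */
static int
toy_same_object (toy_object a, toy_object b)
{
  return TOY_BITS (a) == TOY_BITS (b);
}
```

With the union variant, a call such as @code{toy_same_object (3, 4)} is rejected by the compiler, because a plain integer is not a union; with the integer variant the same mistake compiles silently.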
|
|
1808
|
|
1809 Various macros are used to convert between Lisp_Objects and the
|
|
1810 corresponding C type. Macros of the form @code{XINT()}, @code{XCHAR()},
|
|
1811 @code{XSTRING()}, @code{XSYMBOL()}, do any required bit shifting and/or
|
|
1812 masking and cast the result to the appropriate type. @code{XINT()} needs to be
|
|
1813 a bit tricky so that negative numbers are properly sign-extended. Since
|
|
1814 integers are stored left-shifted, if the right-shift operator does an
|
|
1815 arithmetic shift (i.e. it leaves the most-significant bit as-is rather
|
|
1816 than shifting in a zero, so that it mimics a divide-by-two even for
|
|
1817 negative numbers) the shift to remove the tag bit is enough. This is
|
|
1818 the case on all the systems we support.
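The sign-extension point can be checked with a short sketch. The names @code{toy_make_int} and @code{toy_xint} are hypothetical, and the decoding step deliberately relies on the arithmetic-shift assumption described above.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t toy_object;	/* hypothetical stand-in for Lisp_Object */

/* Store N shifted left by one with the low tag bit set, as in the
   integer layout described earlier.  The shift is done on the
   unsigned type because left-shifting a negative value is undefined
   in C; N must fit in one bit less than the word. */
static toy_object
toy_make_int (intptr_t n)
{
  return (toy_object) (((uintptr_t) n << 1) | 1);
}

/* Recover the integer.  This assumes that >> on a negative signed
   value is an arithmetic shift, which the C standard leaves
   implementation-defined. */
static intptr_t
toy_xint (toy_object o)
{
  return o >> 1;
}
```

On a compiler whose right shift is logical rather than arithmetic, @code{toy_xint} would return a large positive number for negative inputs, and an explicit sign-extension step would be needed.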
|
|
1819
|
|
1820 Note that when @code{ERROR_CHECK_TYPECHECK} is defined, the converter
|
440
|
1821 macros become more complicated---they check the tag bits and/or the
|
428
|
1822 type field in the first four bytes of a record type to ensure that the
|
|
1823 object is really of the correct type. This is great for catching places
|
440
|
1824 where an incorrect type is being dereferenced---this typically results
|
428
|
1825 in a pointer being dereferenced as the wrong type of structure, with
|
|
1826 unpredictable (and sometimes not easily traceable) results.
|
|
1827
|
|
1828 There are similar @code{XSET@var{TYPE}()} macros that construct a Lisp
|
|
1829 object. These macros are of the form @code{XSET@var{TYPE}
|
442
|
1830 (@var{lvalue}, @var{result})}, i.e. they have to be a statement rather
|
|
1831 than just used in an expression. The reason for this is that standard C
|
|
1832 doesn't let you ``construct'' a structure (but GCC does). Granted, this
|
|
1833 sometimes isn't too convenient; for the case of integers, at least, you
|
|
1834 can use the function @code{make_int()}, which constructs and
|
|
1835 @emph{returns} an integer Lisp object. Note that the
|
|
1836 @code{XSET@var{TYPE}()} macros are also affected by
|
|
1837 @code{ERROR_CHECK_TYPECHECK} and make sure that the structure is of the
|
|
1838 right type in the case of record types, where the type is contained in
|
|
1839 the structure.
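The difference between the statement form and the function form can be sketched as follows. The union layout and the names @code{TOY_XSETINT}, @code{toy_make_int}, and @code{toy_xint} are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical union-style object; not the real Lisp_Object. */
typedef union { intptr_t bits; } toy_object;

/* Statement form: fills in an existing lvalue, so it cannot appear
   inside a larger expression. */
#define TOY_XSETINT(lvalue, n) \
  do { (lvalue).bits = (intptr_t) (((uintptr_t) (n) << 1) | 1); } while (0)

/* Function form: builds the object in a local and returns it, so
   the result can be used directly as an argument or operand. */
static toy_object
toy_make_int (intptr_t n)
{
  toy_object result;
  TOY_XSETINT (result, n);
  return result;
}

static intptr_t
toy_xint (toy_object o)
{
  return o.bits >> 1;	/* assumes an arithmetic right shift */
}
```

A caller that needs the object inside an expression, for example as a function argument, uses the function form; a caller that already has a variable to fill in can use the statement form.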
|
428
|
1840
|
|
1841 The C programmer is responsible for @strong{guaranteeing} that a
|
442
|
1842 Lisp_Object is the correct type before using the @code{X@var{TYPE}}
|
428
|
1843 macros. This is especially important in the case of lists. Use
|
|
1844 @code{XCAR} and @code{XCDR} if a Lisp_Object is certainly a cons cell,
|
|
1845 else use @code{Fcar()} and @code{Fcdr()}. Trust other C code, but not
|
|
1846 Lisp code. On the other hand, if XEmacs has an internal logic error,
|
442
|
1847 it's better to crash immediately, so sprinkle @code{assert()}s and
|
|
1848 ``unreachable'' @code{abort()}s liberally about the source code. Where
|
|
1849 performance is an issue, use @code{type_checking_assert},
|
|
1850 @code{bufpos_checking_assert}, and @code{gc_checking_assert}, which do
|
|
1851 nothing unless the corresponding configure error checking flag was
|
|
1852 specified.
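The division of labour between unchecked and checked accessors can be sketched with a toy object model. Everything below is hypothetical: the record layout and the @code{toy_} names are invented for the example, and the checked accessor reports failure through its return value, which is a simplification chosen for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy record-style objects; every name here is hypothetical. */
enum toy_type { TOY_CONS, TOY_STRING };

struct toy_header { enum toy_type type; };

struct toy_cons
{
  struct toy_header header;
  void *car;
  void *cdr;
};

/* Unchecked accessor in the spirit of XCAR: the caller promises
   that OBJ is a cons.  The assert disappears when NDEBUG is defined,
   much as the *_checking_assert macros do nothing unless enabled. */
static void *
toy_xcar (void *obj)
{
  assert (((struct toy_header *) obj)->type == TOY_CONS);
  return ((struct toy_cons *) obj)->car;
}

/* Checked accessor in the spirit of Fcar: safe on untrusted input.
   Stores the car in *OUT and returns 1, or returns 0 if OBJ is not
   a cons. */
static int
toy_car_checked (void *obj, void **out)
{
  if (obj == NULL || ((struct toy_header *) obj)->type != TOY_CONS)
    return 0;
  *out = ((struct toy_cons *) obj)->car;
  return 1;
}
```

Code that has just built or verified a cons uses the unchecked form; code handling a value that arrived from Lisp uses the checked form.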
|
428
|
1853
|
868
|
1854 @node Major Textual Changes, Rules When Writing New C Code, How Lisp Objects Are Represented in C, Top
|
|
1855 @chapter Major Textual Changes
|
|
1856 @cindex textual changes, major
|
|
1857 @cindex major textual changes
|
|
1858
|
|
1859 Sometimes major textual changes are made to the source. This means that
|
|
1860 a search-and-replace is done to change type names and such. Some people
|
|
1861 disagree with such changes, and certainly if done without good reason
|
|
1862 will just lead to headaches. But it's important to keep the code clean
|
|
1863 and understandable, and consistent naming goes a long way towards this.
|
|
1864
|
|
1865 An example of the right way to do this was the so-called "great integral
|
|
1866 type renaming".
|
|
1867
|
|
1868 @menu
|
|
1869 * Great Integral Type Renaming::
|
|
1870 * Text/Char Type Renaming::
|
|
1871 @end menu
|
|
1872
|
|
1873 @node Great Integral Type Renaming
|
|
1874 @section Great Integral Type Renaming
|
|
1875 @cindex Great Integral Type Renaming
|
|
1876 @cindex integral type renaming, great
|
|
1877 @cindex type renaming, integral
|
|
1878 @cindex renaming, integral types
|
|
1879
|
|
1880 The purpose of this is to rationalize the names used for various
|
|
1881 integral types, so that they match their intended uses and follow
|
|
1882 consistent conventions, and eliminate types that were not semantically
|
|
1883 different from each other.
|
|
1884
|
|
1885 The conventions are:
|
|
1886
|
|
1887 @itemize @bullet
|
|
1888 @item
|
|
1889 All integral types that measure quantities of anything are signed. Some
|
|
1890 people disagree vociferously with this, but their arguments are mostly
|
|
1891 theoretical, and are vastly outweighed by the practical headaches of
|
|
1892 mixing signed and unsigned values, and more importantly by the greatly
|
|
1893 increased likelihood of inadvertent bugs: Because of the broken "viral"
|
|
1894 nature of unsigned quantities in C (operations involving mixed
|
|
1895 signed/unsigned are done unsigned, when exactly the opposite is nearly
|
|
1896 always wanted), even a single error in declaring a quantity unsigned
|
|
1897 that should be signed, or even the even more subtle error of comparing
|
|
1898 signed and unsigned values and forgetting the necessary cast, can be
|
|
1899 catastrophic, as comparisons will yield wrong results. -Wsign-compare
|
|
1900 is turned on specifically to catch this, but this tends to result in a
|
|
1901 great number of warnings when mixing signed and unsigned, and the casts
|
|
1902 are annoying. More has been written on this elsewhere.
|
|
1903
|
|
1904 @item
|
|
1905 All such quantity types just mentioned boil down to EMACS_INT, which is
|
|
1906 32 bits on 32-bit machines and 64 bits on 64-bit machines. This is
|
|
1907 guaranteed to be the same size as Lisp objects of type `int', and (as
|
|
1908 far as I can tell) of size_t (unsigned!) and ssize_t. The only type
|
|
1909 below that is not an EMACS_INT is Hashcode, which is an unsigned value
|
|
1910 of the same size as EMACS_INT.
|
|
1911
|
|
1912 @item
|
|
1913 Type names should be relatively short (no more than 10 characters or
|
|
1914 so), with the first letter capitalized and no underscores if they can at
|
|
1915 all be avoided.
|
|
1916
|
|
1917 @item
|
|
1918 "count" == a zero-based measurement of some quantity. Includes sizes,
|
|
1919 offsets, and indexes.
|
|
1920
|
|
1921 @item
|
|
1922 "bpos" == a one-based measurement of a position in a buffer. "Charbpos"
|
|
1923 and "Bytebpos" count text in the buffer, rather than bytes in memory;
|
|
1924 thus Bytebpos does not directly correspond to the memory representation.
|
|
1925 Use "Membpos" for this.
|
|
1926
|
|
1927 @item
|
|
1928 "Char" refers to internal-format characters, not to the C type "char",
|
|
1929 which is really a byte.
|
|
1930 @end itemize
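The comparison hazard described in the first convention above can be reproduced in a few lines of C. The function names are made up for the example.

```c
#include <assert.h>

/* Mixed comparison: N is converted to unsigned before comparing,
   so a negative N (for example an error return of -1) looks like a
   huge positive number. */
static int
below_limit_mixed (int n, unsigned int limit)
{
  return n < limit;		/* -Wsign-compare warns here */
}

/* All-signed comparison: behaves as the arithmetic suggests. */
static int
below_limit_signed (int n, int limit)
{
  return n < limit;
}
```

With these definitions, @code{below_limit_mixed (-1, 10)} is false while @code{below_limit_signed (-1, 10)} is true, even though the two function bodies are textually identical.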
|
|
1931
|
|
1932 For the actual name changes, see the script below.
|
|
1933
|
|
1934 I ran the following script to do the conversion. (NOTE: This script is
|
|
1935 idempotent. You can safely run it multiple times and it will not screw
|
|
1936 up previous results -- in fact, it will do nothing if nothing has
|
|
1937 changed. Thus, it can be run repeatedly as necessary to handle patches
|
|
1938 coming in from old workspaces, or old branches.) There are two tags,
|
|
1939 just before and just after the change: @samp{pre-integral-type-rename}
|
|
1940 and @samp{post-integral-type-rename}. When merging code from the main
|
|
1941 trunk into a branch, the best thing to do is first merge up to
|
|
1942 @samp{pre-integral-type-rename}, then apply the script and associated
|
|
1943 changes, then merge from @samp{post-integral-type-change} to the
|
|
1944 present. (Alternatively, just do the merging in one operation; but you
|
|
1945 may then have a lot of conflicts needing to be resolved by hand.)
|
|
1946
|
|
1947 Script @samp{fixtypes.sh} follows:
|
|
1948
|
|
1949 @example
|
|
1950 ----------------------------------- cut ------------------------------------
|
|
1951 files="*.[ch] s/*.h m/*.h config.h.in ../configure.in Makefile.in.in ../lib-src/*.[ch] ../lwlib/*.[ch]"
|
|
1952 gr Memory_Count Bytecount $files
|
|
1953 gr Lstream_Data_Count Bytecount $files
|
|
1954 gr Element_Count Elemcount $files
|
|
1955 gr Hash_Code Hashcode $files
|
|
1956 gr extcount bytecount $files
|
|
1957 gr bufpos charbpos $files
|
|
1958 gr bytind bytebpos $files
|
|
1959 gr memind membpos $files
|
|
1960 gr bufbyte intbyte $files
|
|
1961 gr Extcount Bytecount $files
|
|
1962 gr Bufpos Charbpos $files
|
|
1963 gr Bytind Bytebpos $files
|
|
1964 gr Memind Membpos $files
|
|
1965 gr Bufbyte Intbyte $files
|
|
1966 gr EXTCOUNT BYTECOUNT $files
|
|
1967 gr BUFPOS CHARBPOS $files
|
|
1968 gr BYTIND BYTEBPOS $files
|
|
1969 gr MEMIND MEMBPOS $files
|
|
1970 gr BUFBYTE INTBYTE $files
|
|
1971 gr MEMORY_COUNT BYTECOUNT $files
|
|
1972 gr LSTREAM_DATA_COUNT BYTECOUNT $files
|
|
1973 gr ELEMENT_COUNT ELEMCOUNT $files
|
|
1974 gr HASH_CODE HASHCODE $files
|
|
1975 ----------------------------------- cut ------------------------------------
|
|
1976 @end example
|
|
1977
|
|
1978 The @samp{gr} script, and the scripts it uses, are documented in
|
|
1979 @file{README.global-renaming}, because if placed in this file they would
|
|
1980 need to have their @@ characters doubled, meaning you couldn't easily
|
|
1981 cut and paste from the source.
|
|
1982
|
|
1983 In addition to those programs, I needed to fix up a few other
|
|
1984 things, particularly relating to the duplicate definitions of
|
|
1985 types, now that some types merged with others. Specifically:
|
|
1986
|
|
1987 @enumerate
|
|
1988 @item
|
|
1989 in lisp.h, removed duplicate declarations of Bytecount. The changed
|
|
1990 code should now look like this: (In each code snippet below, the first
|
|
1991 and last lines are the same as the original, as are all lines outside of
|
|
1992 those lines. That allows you to locate the section to be replaced, and
|
|
1993 replace the stuff in that section, verifying that there isn't anything
|
|
1994 new added that would need to be kept.)
|
|
1995
|
|
1996 @example
|
|
1997 --------------------------------- snip -------------------------------------
|
|
1998 /* Counts of bytes or chars */
|
|
1999 typedef EMACS_INT Bytecount;
|
|
2000 typedef EMACS_INT Charcount;
|
|
2001
|
|
2002 /* Counts of elements */
|
|
2003 typedef EMACS_INT Elemcount;
|
|
2004
|
|
2005 /* Hash codes */
|
|
2006 typedef unsigned long Hashcode;
|
|
2007
|
|
2008 /* ------------------------ dynamic arrays ------------------- */
|
|
2009 --------------------------------- snip -------------------------------------
|
|
2010 @end example
|
|
2011
|
|
2012 @item
|
|
2013 in lstream.h, removed duplicate declaration of Bytecount. Rewrote the
|
|
2014 comment about this type. The changed code should now look like this:
|
|
2015
|
|
2016 @example
|
|
2017 --------------------------------- snip -------------------------------------
|
|
2018 #endif
|
|
2019
|
|
2020 /* There have been some arguments over what the type should be that
|
|
2021 specifies a count of bytes in a data block to be written out or read in,
|
|
2022 using Lstream_read(), Lstream_write(), and related functions.
|
|
2023 Originally it was long, which worked fine; Martin "corrected" these to
|
|
2024 size_t and ssize_t on the grounds that this is theoretically cleaner and
|
|
2025 is in keeping with the C standards. Unfortunately, this practice is
|
|
2026 horribly error-prone due to design flaws in the way that mixed
|
|
2027 signed/unsigned arithmetic happens. In fact, by doing this change,
|
|
2028 Martin introduced a subtle but fatal error that caused the operation of
|
|
2029 sending large mail messages to the SMTP server under Windows to fail.
|
|
2030 By putting all values back to be signed, avoiding any signed/unsigned
|
|
2031 mixing, the bug immediately went away. The type then in use was
|
|
2032 Lstream_Data_Count, so that it could be reverted cleanly if a vote came to
|
|
2033 that. Now it is Bytecount.
|
|
2034
|
|
2035 Some earlier comments about why the type must be signed: This MUST BE
|
|
2036 SIGNED, since it also is used in functions that return the number of
|
|
2037 bytes actually read or written in an operation, and these
|
|
2038 functions can return -1 to signal error.
|
|
2039
|
|
2040 Note that the standard Unix read() and write() functions define the
|
|
2041 count going in as a size_t, which is UNSIGNED, and the count going
|
|
2042 out as an ssize_t, which is SIGNED. This is a horrible design
|
|
2043 flaw. Not only is it highly likely to lead to logic errors when a
|
|
2044 -1 gets interpreted as a large positive number, but operations are
|
|
2045 bound to fail in all sorts of horrible ways when a number in the
|
|
2046 upper-half of the size_t range is passed in -- this number is
|
|
2047 unrepresentable as an ssize_t, so code that checks to see how many
|
|
2048 bytes are actually written (which is mandatory if you are dealing
|
|
2049 with certain types of devices) will get completely screwed up.
|
|
2050
|
|
2051 --ben
|
|
2052 */
|
|
2053
|
|
2054 typedef enum lstream_buffering
|
|
2055 --------------------------------- snip -------------------------------------
|
|
2056 @end example
|
|
2057
|
|
2058 @item
|
|
2059 in dumper.c, there are four places, all inside of switch() statements,
|
|
2060 where XD_BYTECOUNT appears twice as a case tag. In each case, the two
|
|
2061 case blocks contain identical code, and you should *REMOVE THE SECOND*
|
|
2062 and leave the first.
|
|
2063 @end enumerate
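
The signed/unsigned hazard described in the @file{lstream.h} comment
above can be reproduced in a few lines.  The following is only an
illustrative sketch, not XEmacs code: @code{fake_write},
@code{short_write_mixed} and @code{short_write_signed} are hypothetical
names, and the result assumes a platform where @code{size_t} is at least
as wide as @code{int}.

@example
#include <stddef.h>

/* Hypothetical stand-in for a write routine that fails with -1.  */
static int
fake_write (void)
@{
  return -1;
@}

/* Mixed arithmetic: the -1 is converted to size_t before the
   comparison, so it turns into a huge positive value and the
   failure goes unnoticed.  */
int
short_write_mixed (size_t wanted)
@{
  int written = fake_write ();
  return written < wanted;
@}

/* All-signed arithmetic: -1 compares below any nonnegative count,
   so the failure is detected.  */
int
short_write_signed (long wanted)
@{
  long written = fake_write ();
  return written < wanted;
@}
@end example

With a request of 100 bytes, the mixed version reports that nothing is
wrong while the signed version notices the failed write.  Most compilers
can warn about the mixed comparison, but the warning is easy to miss.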
|
|
2064
|
|
2065 @node Text/Char Type Renaming
|
|
2066 @section Text/Char Type Renaming
|
|
2067 @cindex Text/Char Type Renaming
|
|
2068 @cindex type renaming, text/char
|
|
2069 @cindex renaming, text/char types
|
|
2070
|
|
2071 The purpose of this renaming was twofold:
|
|
2072
|
|
2073 @enumerate
|
|
2074 @item
|
|
2075 To distinguish between ``charptr'' when it refers to operations on
|
|
2076 the pointer itself and when it refers to operations on text
|
|
2077 @item
|
|
2078 To use consistent naming for everything referring to internal format, i.e.
|
|
2079 @end enumerate
|
|
2080
|
|
2081 @example
|
|
2082 Itext == text in internal format
|
|
2083 Ibyte == a byte in such text
|
|
2084 Ichar == a char as represented in internal character format
|
|
2085 @end example
|
|
2086
|
|
2087 Thus e.g.
|
|
2088
|
|
2089 @example
|
|
2090 set_charptr_emchar -> set_itext_ichar
|
|
2091 @end example
|
|
2092
|
|
2093 This was done using a script like this:
|
|
2094
|
|
2095 @example
|
|
2096 files="*.[ch] s/*.h m/*.h config.h.in ../configure.in Makefile.in.in ../lib-src/*.[ch] ../lwlib/*.[ch]"
|
|
2097 gr Intbyte Ibyte $files
|
|
2098 gr INTBYTE IBYTE $files
|
|
2099 gr intbyte ibyte $files
|
|
2100 gr EMCHAR ICHAR $files
|
|
2101 gr emchar ichar $files
|
|
2102 gr Emchar Ichar $files
|
|
2103 gr INC_CHARPTR INC_IBYTEPTR $files
|
|
2104 gr DEC_CHARPTR DEC_IBYTEPTR $files
|
|
2105 gr VALIDATE_CHARPTR VALIDATE_IBYTEPTR $files
|
|
2106 gr valid_charptr valid_ibyteptr $files
|
|
2107 gr CHARPTR ITEXT $files
|
|
2108 gr charptr itext $files
|
|
2109 gr Charptr Itext $files
|
|
2110 @end example
|
|
2111
|
|
2112 See above for the source to @samp{gr}.
|
|
2113
|
|
2114 As in the integral-types change, there are pre and post tags before and
|
|
2115 after the change:
|
|
2116
|
|
2117 @example
|
|
2118 pre-internal-format-textual-renaming
|
|
2119 post-internal-format-textual-renaming
|
|
2120 @end example
|
|
2121
|
|
2122 When merging a large branch, follow the same sort of procedure
|
|
2123 documented above, using these tags -- essentially sync up to the pre
|
|
2124 tag, then apply the script yourself, then sync from the post tag to the
|
|
2125 present. You can probably do the same if you don't have a separate
|
|
2126 workspace, but do have lots of outstanding changes and you'd rather not
|
|
2127 just merge all the textual changes directly. Use something like this:
|
|
2128
|
|
2129 (WARNING: I'm not a CVS guru; before trying this, or any large operation
|
|
2130 that might potentially mess things up, *DEFINITELY* make a backup of
|
|
2131 your existing workspace.)
|
|
2132
|
|
2133 @example
|
|
2134 cup -r pre-internal-format-textual-renaming
|
|
2135 <apply script>
|
|
2136 cup -A -j post-internal-format-textual-renaming -j HEAD
|
|
2137 @end example
|
|
2138
|
|
2139 This might also work:
|
|
2140
|
|
2141 @example
|
|
2142 cup -j pre-internal-format-textual-renaming
|
|
2143 <apply script>
|
|
2144 cup -j post-internal-format-textual-renaming -j HEAD
|
|
2145 @end example
|
|
2146
|
|
2147 ben
|
|
2148
|
|
2149 The following is a script to go in the opposite direction:
|
|
2150
|
|
2151 @example
|
|
2152 files="*.[ch] s/*.h m/*.h config.h.in ../configure.in Makefile.in.in ../lib-src/*.[ch] ../lwlib/*.[ch]"
|
|
2153
|
|
2154 # Evidently Perl considers _ to be a word char ala \b, even though XEmacs
|
|
2155 # doesn't. We need to be careful here with ibyte/ichar because of words
|
|
2156 # like Richard, eicharlen(), multibyte, HIBYTE, etc.
|
|
2157
|
|
2158 gr Ibyte Intbyte $files
|
|
2159 gr '\bIBYTE' INTBYTE $files
|
|
2160 gr '\bibyte' intbyte $files
|
|
2161 gr '\bICHAR' EMCHAR $files
|
|
2162 gr '\bichar' emchar $files
|
|
2163 gr '\bIchar' Emchar $files
|
|
2164 gr '\bIBYTEPTR' CHARPTR $files
|
|
2165 gr '\bibyteptr' charptr $files
|
|
2166 gr '\bITEXT' CHARPTR $files
|
|
2167 gr '\bitext' charptr $files
|
|
2168 gr '\bItext' CHARPTR $files
|
|
2169
|
|
2170 gr '_IBYTE' _INTBYTE $files
|
|
2171 gr '_ibyte' _intbyte $files
|
|
2172 gr '_ICHAR' _EMCHAR $files
|
|
2173 gr '_ichar' _emchar $files
|
|
2174 gr '_Ichar' _Emchar $files
|
|
2175 gr '_IBYTEPTR' _CHARPTR $files
|
|
2176 gr '_ibyteptr' _charptr $files
|
|
2177 gr '_ITEXT' _CHARPTR $files
|
|
2178 gr '_itext' _charptr $files
|
|
2179 gr '_Itext' _CHARPTR $files
|
|
2180 @end example
|
|
2181
|
965
|
2182 @node Rules When Writing New C Code, Regression Testing XEmacs, Major Textual Changes, Top
|
428
|
2183 @chapter Rules When Writing New C Code
|
462
|
2184 @cindex writing new C code, rules when
|
|
2185 @cindex C code, rules when writing new
|
|
2186 @cindex code, rules when writing new C
|
428
|
2187
|
|
2188 The XEmacs C code is extremely complex and intricate, and there are many
|
|
2189 rules that are more or less consistently followed throughout the code.
|
|
2190 Many of these rules are not obvious, so they are explained here. It is
|
|
2191 of the utmost importance that you follow them. If you don't, you may
|
|
2192 get something that appears to work, but which will crash in odd
|
|
2193 situations, often in code far away from where the actual breakage is.
|
|
2194
|
|
2195 @menu
|
|
2196 * General Coding Rules::
|
|
2197 * Writing Lisp Primitives::
|
462
|
2198 * Writing Good Comments::
|
428
|
2199 * Adding Global Lisp Variables::
|
462
|
2200 * Proper Use of Unsigned Types::
|
428
|
2201 * Coding for Mule::
|
|
2202 * Techniques for XEmacs Developers::
|
|
2203 @end menu
|
|
2204
|
462
|
2205 @node General Coding Rules
|
428
|
2206 @section General Coding Rules
|
462
|
2207 @cindex coding rules, general
|
428
|
2208
|
|
2209 The C code is actually written in a dialect of C called @dfn{Clean C},
|
|
2210 meaning that it can be compiled, mostly warning-free, with either a C or
|
|
2211 C++ compiler. Coding in Clean C has several advantages over plain C.
|
|
2212 C++ compilers are more nit-picking, and a number of coding errors have
|
|
2213 been found by compiling with C++. The ability to use both C and C++
|
|
2214 tools means that a greater variety of development tools is available to
|
|
2215 the developer.
|
|
2216
|
|
2217 Every module includes @file{<config.h>} (angle brackets so that
|
|
2218 @samp{--srcdir} works correctly; @file{config.h} may or may not be in
|
|
2219 the same directory as the C sources) and @file{lisp.h}. @file{config.h}
|
|
2220 must always be included before any other header files (including
|
|
2221 system header files) to ensure that certain tricks played by various
|
|
2222 @file{s/} and @file{m/} files work out correctly.
|
|
2223
|
440
|
2224 When including header files, always use angle brackets, not double
|
442
|
2225 quotes, except when the file to be included is always in the same
|
|
2226 directory as the including file. If either file is a generated file,
|
|
2227 then that is not likely to be the case. In order to understand why we
|
|
2228 have this rule, imagine what happens when you do a build in the source
|
|
2229 directory using @samp{./configure} and another build in another
|
|
2230 directory using @samp{../work/configure}. There will be two different
|
|
2231 @file{config.h} files. Which one will be used if you @samp{#include
|
|
2232 "config.h"}?
|
440
|
2233
|
448
|
2234 Almost every module contains a @code{syms_of_*()} function and a
|
|
2235 @code{vars_of_*()} function. The former declares any Lisp primitives
|
|
2236 you have defined and defines any symbols you will be using. The latter
|
|
2237 declares any global Lisp variables you have added and initializes global
|
|
2238 C variables in the module. @strong{Important}: There are stringent
|
|
2239 requirements on exactly what can go into these functions. See the
|
|
2240 comment in @file{emacs.c}. The reason for this is to avoid obscure
|
|
2241 unwanted interactions during initialization. If you don't follow these
|
|
2242 rules, you'll be sorry! If you want to do anything that isn't allowed,
|
|
2243 create a @code{complex_vars_of_*()} function for it. Doing this is
|
|
2244 tricky, though: you have to make sure your function is called at the
|
|
2245 right time so that all the initialization dependencies work out.
|
|
2246
|
|
2247 Declare each function of these kinds in @file{symsinit.h}. Make sure
|
|
2248 it's called in the appropriate place in @file{emacs.c}. You never need
|
|
2249 to include @file{symsinit.h} directly, because it is included by
|
|
2250 @file{lisp.h}.
|
|
2251
|
428
|
2252 @strong{All global and static variables that are to be modifiable must
|
|
2253 be declared uninitialized.} This means that you may not use the
|
|
2254 ``declare with initializer'' form for these variables, such as @code{int
|
|
2255 some_variable = 0;}. The reason for this has to do with some kludges
|
|
2256 done during the dumping process: If possible, the initialized data
|
|
2257 segment is re-mapped so that it becomes part of the (unmodifiable) code
|
|
2258 segment in the dumped executable. This allows this memory to be shared
|
|
2259 among multiple running XEmacs processes. XEmacs is careful to place as
|
442
|
2260 much constant data as possible into initialized variables during the
|
|
2261 @file{temacs} phase.
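
A minimal sketch of this rule follows.  The names
@code{example_counter}, @code{example_table} and @code{vars_of_example}
are hypothetical and only illustrate the pattern; see the comment in
@file{emacs.c} for what a real @code{vars_of_*()} function may contain.

@example
/* Modifiable global: declared without an initializer.  */
int example_counter;

/* Constant data may keep its initializer, since it is never
   written to after the program starts.  */
static const int example_table[] = @{ 1, 2, 3 @};

/* Hypothetical vars_of_*() function: modifiable C globals get
   their starting values here instead of in the declaration.  */
void
vars_of_example (void)
@{
  example_counter = example_table[0];
@}
@end example

The point of the split is that @code{example_counter} stays out of the
initialized data segment, while @code{example_table} can be shared.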
|
428
|
2262
|
|
2263 @cindex copy-on-write
|
|
2264 @strong{Please note:} This kludge only works on a few systems nowadays,
|
|
2265 and is rapidly becoming irrelevant because most modern operating systems
|
|
2266 provide @dfn{copy-on-write} semantics. All data is initially shared
|
|
2267 between processes, and a private copy is automatically made (on a
|
|
2268 page-by-page basis) when a process first attempts to write to a page of
|
|
2269 memory.
|
|
2270
|
|
2271 Formerly, there was a requirement that static variables not be declared
|
|
2272 inside of functions. This had to do with another hack in the same
|
|
2273 vein as what was just described: old USG systems put statically-declared
|
|
2274 variables in the initialized data space, so the header files for those systems had a
|
|
2275 @code{#define static} declaration. (That way, the data-segment remapping
|
|
2276 described above could still work.) This fails badly on static variables
|
|
2277 inside of functions, which suddenly become automatic variables;
|
|
2278 therefore, you weren't supposed to have any of them. This awful kludge
|
|
2279 has been removed in XEmacs because
|
|
2280
|
|
2281 @enumerate
|
|
2282 @item
|
|
2283 almost all of the systems that used this kludge ended up having
|
|
2284 to disable the data-segment remapping anyway;
|
|
2285 @item
|
|
2286 the only systems that didn't were extremely outdated ones;
|
|
2287 @item
|
|
2288 this hack completely messed up inline functions.
|
|
2289 @end enumerate
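
The failure mode behind the old rule can be sketched in a few lines.
This is a demonstration under stated assumptions, not code from XEmacs:
the function names are hypothetical, and the @code{#define static} is
placed after all @code{#include} lines because redefining a keyword
before including a standard header is not permitted.

@example
/* Normal C: the counter persists across calls.  */
int
count_calls_normal (void)
@{
  static int calls;
  return ++calls;
@}

/* Simulate the old hack for the rest of this demonstration.  */
#define static

int
count_calls_hacked (void)
@{
  /* After preprocessing this is a plain "int calls = 0;", an
     automatic variable that is reset on every call.  The
     initializer is here only to keep the demonstration well
     defined.  */
  static int calls = 0;
  return ++calls;
@}

#undef static
@end example

Calling @code{count_calls_normal} twice yields 1 and then 2, while
@code{count_calls_hacked} yields 1 both times, which is the silent
change of meaning described above.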
|
|
2290
|
|
2291 The C source code makes heavy use of C preprocessor macros. One popular
|
|
2292 macro style is:
|
|
2293
|
|
2294 @example
|
442
|
2295 #define FOO(var, value) do @{ \
|
440
|
2296 Lisp_Object FOO_value = (value); \
|
|
2297 ... /* compute using FOO_value */ \
|
|
2298 (var) = bar; \
|
428
|
2299 @} while (0)
|
|
2300 @end example
|
|
2301
|
|
2302 The @code{do @{...@} while (0)} is a standard trick to allow @code{FOO} to have
|
|
2303 statement semantics, so that it can safely be used within an @code{if}
|
|
2304 statement in C, for example. Multiple evaluation is prevented by
|
|
2305 copying a supplied argument into a local variable, so that
|
|
2306 @code{FOO(var,fun(1))} only calls @code{fun} once.
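
Both properties can be checked with a small self-contained sketch.  The
macro @code{STORE_SQUARE} and the helper functions are hypothetical
names chosen for this illustration; they follow the macro style shown
above but do not appear in the XEmacs sources.

@example
static int next_value_calls;

/* Hypothetical function with a visible side effect.  */
static int
next_value (void)
@{
  return ++next_value_calls;
@}

#define STORE_SQUARE(var, value) do @{                   \
  int STORE_SQUARE_value = (value);                     \
  (var) = STORE_SQUARE_value * STORE_SQUARE_value;      \
@} while (0)

int
square_if (int flag)
@{
  int result = 0;

  /* The macro acts as a single statement, so the else below
     still pairs with this if.  */
  if (flag)
    STORE_SQUARE (result, next_value ());
  else
    result = -1;
  return result;
@}
@end example

Although the macro body mentions its value twice, each call of
@code{square_if (1)} increments @code{next_value_calls} only once,
because the argument is evaluated once into the local variable.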
|
|
2307
|
|
2308 Lisp lists are popular data structures in the C code as well as in
|
|
2309 Elisp. There are two sets of macros that iterate over lists.
|
|
2310 @code{EXTERNAL_LIST_LOOP_@var{n}} should be used when the list has been
|
|
2311 supplied by the user, and cannot be trusted to be acyclic and
|
444
|
2312 @code{nil}-terminated. A @code{malformed-list} or @code{circular-list} error
|
428
|
2313 will be generated if the list being iterated over is not entirely
|
|
2314 kosher. @code{LIST_LOOP_@var{n}}, on the other hand, is faster and less
|
|
2315 safe, and can be used only on trusted lists.
|
|
2316
|
|
2317 Related macros are @code{GET_EXTERNAL_LIST_LENGTH} and
|
|
2318 @code{GET_LIST_LENGTH}, which calculate the length of a list, and in the
|
|
2319 case of @code{GET_EXTERNAL_LIST_LENGTH}, also validate the properness of
|
|
2320 the list. The macros @code{EXTERNAL_LIST_LOOP_DELETE_IF} and
|
|
2321 @code{LIST_LOOP_DELETE_IF} delete elements from a Lisp list satisfying some
|
|
2322 predicate.
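
To see why an untrusted list needs the extra checking, consider the
following sketch of a length function that refuses to loop forever on a
circular list.  It uses a plain C linked list and the well-known
tortoise-and-hare technique; @code{struct cell} and
@code{checked_length} are hypothetical, and this is not a description
of how the XEmacs macros are actually implemented.

@example
#include <stddef.h>

struct cell
@{
  int car;
  struct cell *cdr;
@};

/* Return the number of cells in LIST, or -1 if it is circular.  */
long
checked_length (const struct cell *list)
@{
  const struct cell *hare = list;
  const struct cell *tortoise = list;
  long len = 0;

  while (hare != NULL)
    @{
      hare = hare->cdr;
      len++;
      /* The tortoise advances once for every two hare steps; if
         the hare ever catches up with it, there is a cycle.  */
      if ((len & 1) == 0)
        @{
          tortoise = tortoise->cdr;
          if (hare == tortoise)
            return -1;
        @}
    @}
  return len;
@}
@end example

A loop without the tortoise would be shorter and faster, which is the
trade-off between the trusted and the external variants of the macros.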
|
|
2323
|
462
|
2324 @node Writing Lisp Primitives
|
428
|
2325 @section Writing Lisp Primitives
|
462
|
2326 @cindex writing Lisp primitives
|
|
2327 @cindex Lisp primitives, writing
|
|
2328 @cindex primitives, writing Lisp
|
428
|
2329
|
|
2330 Lisp primitives are Lisp functions implemented in C. The details of
|
|
2331 interfacing the C function so that Lisp can call it are handled by a few
|
|
2332 C macros. The only way to really understand how to write new C code is
|
|
2333 to read the source, but we can explain some things here.
|
|
2334
|
|
2335 An example of a special form is the definition of @code{prog1}, from
|
|
2336 @file{eval.c}. (An ordinary function would have the same general
|
|
2337 appearance.)
|
|
2338
|
|
2339 @cindex garbage collection protection
|
|
2340 @smallexample
|
|
2341 @group
|
|
2342 DEFUN ("prog1", Fprog1, 1, UNEVALLED, 0, /*
|
|
2343 Similar to `progn', but the value of the first form is returned.
|
|
2344 \(prog1 FIRST BODY...): All the arguments are evaluated sequentially.
|
|
2345 The value of FIRST is saved during evaluation of the remaining args,
|
|
2346 whose values are discarded.
|
|
2347 */
|
|
2348 (args))
|
|
2349 @{
|
|
2350 /* This function can GC */
|
|
2351 REGISTER Lisp_Object val, form, tail;
|
|
2352 struct gcpro gcpro1;
|
|
2353
|
|
2354 val = Feval (XCAR (args));
|
|
2355
|
|
2356 GCPRO1 (val);
|
|
2357
|
|
2358 LIST_LOOP_3 (form, XCDR (args), tail)
|
|
2359 Feval (form);
|
|
2360
|
|
2361 UNGCPRO;
|
|
2362 return val;
|
|
2363 @}
|
|
2364 @end group
|
|
2365 @end smallexample
|
|
2366
|
|
2367 Let's start with a precise explanation of the arguments to the
|
|
2368 @code{DEFUN} macro. Here is a template for them:
|
|
2369
|
|
2370 @example
|
|
2371 @group
|
|
2372 DEFUN (@var{lname}, @var{fname}, @var{min_args}, @var{max_args}, @var{interactive}, /*
|
|
2373 @var{docstring}
|
|
2374 */
|
|
2375 (@var{arglist}))
|
|
2376 @end group
|
|
2377 @end example
|
|
2378
|
|
2379 @table @var
|
|
2380 @item lname
|
|
2381 This string is the name of the Lisp symbol to define as the function
|
|
2382 name; in the example above, it is @code{"prog1"}.
|
|
2383
|
|
2384 @item fname
|
|
2385 This is the C function name for this function. This is the name that is
|
|
2386 used in C code for calling the function. The name is, by convention,
|
|
2387 @samp{F} prepended to the Lisp name, with all dashes (@samp{-}) in the
|
|
2388 Lisp name changed to underscores. Thus, to call this function from C
|
|
2389 code, call @code{Fprog1}. Remember that the arguments are of type
|
|
2390 @code{Lisp_Object}; various macros and functions for creating values of
|
|
2391 type @code{Lisp_Object} are declared in the file @file{lisp.h}.
|
|
2392
|
|
2393 Primitives whose names are special characters (e.g. @code{+} or
|
|
2394 @code{<}) are named by spelling out, in some fashion, the special
|
|
2395 character: e.g. @code{Fplus()} or @code{Flss()}. Primitives whose names
|
|
2396 begin with normal alphanumeric characters but also contain special
|
|
2397 characters are spelled out in some creative way, e.g. @code{let*}
|
|
2398 becomes @code{FletX()}.
|
|
2399
|
|
2400 Each function also has an associated structure that holds the data for
|
|
2401 the subr object that represents the function in Lisp. This structure
|
|
2402 conveys the Lisp symbol name to the initialization routine that will
|
|
2403 create the symbol and store the subr object as its definition. The C
|
|
2404 variable name of this structure is always @samp{S} prepended to the
|
|
2405 @var{fname}. You hardly ever need to be aware of the existence of this
|
|
2406 structure, since @code{DEFUN} plus @code{DEFSUBR} takes care of all the
|
|
2407 details.
|
|
2408
|
|
2409 @item min_args
|
|
2410 This is the minimum number of arguments that the function requires. The
|
|
2411 function @code{prog1} allows a minimum of one argument.
|
|
2412
|
|
2413 @item max_args
|
|
2414 This is the maximum number of arguments that the function accepts, if
|
|
2415 there is a fixed maximum. Alternatively, it can be @code{UNEVALLED},
|
|
2416 indicating a special form that receives unevaluated arguments, or
|
|
2417 @code{MANY}, indicating an unlimited number of evaluated arguments (the
|
|
2418 C equivalent of @code{&rest}). Both @code{UNEVALLED} and @code{MANY}
|
|
2419 are macros. If @var{max_args} is a number, it may not be less than
|
|
2420 @var{min_args} and it may not be greater than 8. (If you need to add a
|
|
2421 function with more than 8 arguments, use the @code{MANY} form. Resist
|
|
2422 the urge to edit the definition of @code{DEFUN} in @file{lisp.h}. If
|
|
2423 you do it anyway, make sure to also add another clause to the switch
|
|
2424 statement in @code{primitive_funcall().})
|
|
2425
|
|
2426 @item interactive
|
|
2427 This is an interactive specification, a string such as might be used as
|
|
2428 the argument of @code{interactive} in a Lisp function. In the case of
|
|
2429 @code{prog1}, it is 0 (a null pointer), indicating that @code{prog1}
|
|
2430 cannot be called interactively. A value of @code{""} indicates a
|
|
2431 function that should receive no arguments when called interactively.
|
|
2432
|
|
2433 @item docstring
|
|
2434 This is the documentation string. It is written just like a
|
|
2435 documentation string for a function defined in Lisp; in particular, the
|
|
2436 first line should be a single sentence. Note how the documentation
|
|
2437 string is enclosed in a comment, none of the documentation is placed on
|
|
2438 the same lines as the comment-start and comment-end characters, and the
|
|
2439 comment-start characters are on the same line as the interactive
|
|
2440 specification. @file{make-docfile}, which scans the C files for
|
|
2441 documentation strings, is very particular about what it looks for, and
|
|
2442 will not properly extract the doc string if it's not in this exact format.
|
|
2443
|
|
2444 In order to make both @file{etags} and @file{make-docfile} happy, make
|
|
2445 sure that the @code{DEFUN} line contains the @var{lname} and
|
|
2446 @var{fname}, and that the comment-start characters for the doc string
|
|
2447 are on the same line as the interactive specification, and put a newline
|
|
2448 directly after them (and before the comment-end characters).
|
|
2449
|
|
2450 @item arglist
|
|
2451 This is the comma-separated list of arguments to the C function. For a
|
|
2452 function with a fixed maximum number of arguments, provide a C argument
|
|
2453 for each Lisp argument. In this case, unlike regular C functions, the
|
|
2454 types of the arguments are not declared; they are simply always of type
|
|
2455 @code{Lisp_Object}.
|
|
2456
|
|
2457 The names of the C arguments will be used as the names of the arguments
|
|
2458 to the Lisp primitive as displayed in its documentation, modulo the same
|
|
2459 concerns described above for @code{F...} names (in particular,
|
|
2460 underscores in the C arguments become dashes in the Lisp arguments).
|
|
2461
|
|
2462 There is one additional kludge: A trailing `_' on the C argument is
|
|
2463 discarded when forming the Lisp argument. This allows C language
|
|
2464 reserved words (like @code{default}) or global symbols (like
|
|
2465 @code{dirname}) to be used as argument names without compiler warnings
|
|
2466 or errors.
|
|
2467
|
|
2468 A Lisp function with @w{@var{max_args} = @code{UNEVALLED}} is a
|
|
2469 @w{@dfn{special form}}; its arguments are not evaluated. Instead it
|
|
2470 receives one argument of type @code{Lisp_Object}, a (Lisp) list of the
|
|
2471 unevaluated arguments, conventionally named @code{(args)}.
|
|
2472
|
|
2473 When a Lisp function has no upper limit on the number of arguments,
|
|
2474 specify @w{@var{max_args} = @code{MANY}}. In this case its implementation in
|
|
2475 C actually receives exactly two arguments: the number of Lisp arguments
|
|
2476 (an @code{int}) and the address of a block containing their values (a
|
|
2477 @w{@code{Lisp_Object *}}). In this case only are the C types specified
|
|
2478 in the @var{arglist}: @w{@code{(int nargs, Lisp_Object *args)}}.
|
|
2479
|
|
2480 @end table
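
The argument-name conventions described under @var{arglist} can be
summarized in code.  The function below is a hypothetical helper written
for this illustration only; it is not part of @file{make-docfile}, and
it ignores any other transformation (such as case conversion) that the
real tools may apply.

@example
#include <string.h>

/* Convert the C argument name C_NAME into the corresponding Lisp
   argument name, storing it in OUT (of size OUT_SIZE).
   Return 0 on success, -1 if OUT is too small.  */
int
c_arg_to_lisp_arg (const char *c_name, char *out, size_t out_size)
@{
  size_t len = strlen (c_name);
  size_t i;

  if (len > 0 && c_name[len - 1] == '_')
    len--;                      /* drop one trailing underscore */
  if (len + 1 > out_size)
    return -1;
  for (i = 0; i < len; i++)
    out[i] = (c_name[i] == '_') ? '-' : c_name[i];
  out[len] = '\0';
  return 0;
@}
@end example

For example, the C argument @code{default_} maps to the Lisp argument
@code{default}, and @code{buffer_or_name} maps to
@code{buffer-or-name}.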
|
|
2481
|
|
2482 Within the function @code{Fprog1} itself, note the use of the macros
|
|
2483 @code{GCPRO1} and @code{UNGCPRO}. @code{GCPRO1} is used to ``protect''
|
|
2484 a variable from garbage collection---to inform the garbage collector
|
|
2485 that it must look in that variable and regard the object pointed at by
|
|
2486 its contents as an accessible object. This is necessary whenever you
|
|
2487 call @code{Feval} or anything that can directly or indirectly call
|
|
2488 @code{Feval} (this includes the @code{QUIT} macro!). At such a time,
|
|
2489 any Lisp object that you intend to refer to again must be protected
|
|
2490 somehow. @code{UNGCPRO} cancels the protection of the variables that
|
|
2491 are protected in the current function. It is necessary to do this
|
|
2492 explicitly.
|
|
2493
|
|
2494 The macro @code{GCPRO1} protects just one local variable. If you want
|
|
2495 to protect two, use @code{GCPRO2} instead; repeating @code{GCPRO1} will
|
|
2496 not work. Macros @code{GCPRO3} and @code{GCPRO4} also exist.
|
|
2497
|
|
2498 These macros implicitly use local variables such as @code{gcpro1}; you
|
|
2499 must declare these explicitly, with type @code{struct gcpro}. Thus, if
|
|
2500 you use @code{GCPRO2}, you must declare @code{gcpro1} and @code{gcpro2}.
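
The following toy model may help to picture why the local
@code{struct gcpro} variables must be declared and why the protection
must be cancelled explicitly.  Everything prefixed with @code{toy_} or
@code{TOY_} is hypothetical and greatly simplified; the real
definitions live in @file{lisp.h} and may differ in detail.

@example
#include <stddef.h>

typedef int toy_object;         /* stand-in for Lisp_Object */

struct toy_gcpro
@{
  struct toy_gcpro *next;       /* previously protected variable */
  toy_object *var;              /* address of the protected variable */
@};

/* Head of the chain a collector would walk.  */
static struct toy_gcpro *toy_gcprolist;

#define TOY_GCPRO1(v) do @{                      \
  toy_gcpro1.next = toy_gcprolist;              \
  toy_gcpro1.var = &(v);                        \
  toy_gcprolist = &toy_gcpro1;                  \
@} while (0)

#define TOY_UNGCPRO (toy_gcprolist = toy_gcpro1.next)

/* Count the variables a collector would find right now.  */
int
toy_count_protected (void)
@{
  int n = 0;
  struct toy_gcpro *p;

  for (p = toy_gcprolist; p != NULL; p = p->next)
    n++;
  return n;
@}

int
toy_primitive (void)
@{
  toy_object val = 42;
  struct toy_gcpro toy_gcpro1;  /* used implicitly by the macros */
  int seen;

  TOY_GCPRO1 (val);
  seen = toy_count_protected ();
  TOY_UNGCPRO;                  /* unlink before the frame goes away */
  return seen;
@}
@end example

In this model, forgetting @code{TOY_UNGCPRO} would leave the global
chain pointing into a stack frame that no longer exists, and a second
@code{TOY_GCPRO1} in the same function would overwrite
@code{toy_gcpro1} rather than add a second entry.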
|
|
2501
|
|
2502 @cindex caller-protects (@code{GCPRO} rule)
|
|
2503 Note also that the general rule is @dfn{caller-protects}; i.e. you are
|
|
2504 only responsible for protecting those Lisp objects that you create. Any
|
|
2505 objects passed to you as arguments should have been protected by whoever
|
|
2506 created them, so you don't in general have to protect them.
|
|
2507
|
|
2508 In particular, the arguments to any Lisp primitive are always
|
|
2509 automatically @code{GCPRO}ed, when called ``normally'' from Lisp code or
|
|
2510 bytecode. So only a few Lisp primitives that are called frequently from
|
|
2511 C code, such as @code{Fprogn}, protect their arguments as a service to
|
|
2512 their caller. You don't need to protect your arguments when writing a
|
|
2513 new @code{DEFUN}.
|
|
2514
|
|
2515 @code{GCPRO}ing is perhaps the trickiest and most error-prone part of
|
|
2516 XEmacs coding. It is @strong{extremely} important that you get this
|
|
2517 right and use a great deal of discipline when writing this code.
|
|
2518 @xref{GCPROing, ,@code{GCPRO}ing}, for full details on how to do this.
|
|
2519
|
|
2520 What @code{DEFUN} actually does is declare a global structure of type
|
|
2521 @code{Lisp_Subr} whose name begins with capital @samp{SF} and which
|
|
2522 contains information about the primitive (e.g. a pointer to the
|
|
2523 function, its minimum and maximum allowed arguments, a string describing
|
|
2524 its Lisp name); @code{DEFUN} then begins a normal C function declaration
|
|
2525 using the @code{F...} name. The Lisp subr object that is the function
|
|
2526 definition of a primitive (i.e. the object in the function slot of the
|
|
2527 symbol that names the primitive) actually points to this @samp{SF}
|
|
2528 structure; when @code{Feval} encounters a subr, it looks in the
|
|
2529 structure to find out how to call the C function.
|
|
2530
|
|
2531 Defining the C function is not enough to make a Lisp primitive
|
|
2532 available; you must also create the Lisp symbol for the primitive (the
|
|
2533 symbol is @dfn{interned}; @pxref{Obarrays}) and store a suitable subr
|
|
2534 object in its function cell. (If you don't do this, the primitive won't
|
|
2535 be seen by Lisp code.) The code looks like this:
|
|
2536
|
|
2537 @example
|
|
2538 DEFSUBR (@var{fname});
|
|
2539 @end example
|
|
2540
|
|
2541 @noindent
|
|
2542 Here @var{fname} is the same name you used as the second argument to
|
|
2543 @code{DEFUN}.
|
|
2544
|
|
2545 This call to @code{DEFSUBR} should go in the @code{syms_of_*()} function
|
|
2546 at the end of the module. If no such function exists, create it and
|
|
2547 make sure to also declare it in @file{symsinit.h} and call it from the
|
|
2548 appropriate spot in @code{main()}. @xref{General Coding Rules}.
|
|
2549
|
|
2550 Note that C code cannot call functions by name unless they are defined
|
|
2551 in C. The way to call a function written in Lisp from C is to use
|
|
2552 @code{Ffuncall}, which embodies the Lisp function @code{funcall}. Since
|
|
2553 the Lisp function @code{funcall} accepts an unlimited number of
|
|
2554 arguments, in C it takes two: the number of Lisp-level arguments, and a
|
|
2555 one-dimensional array containing their values. The first Lisp-level
|
|
2556 argument is the Lisp function to call, and the rest are the arguments to
|
|
2557 pass to it. Since @code{Ffuncall} can call the evaluator, you must
|
|
2558 protect pointers from garbage collection around the call to
|
|
2559 @code{Ffuncall}. (However, @code{Ffuncall} explicitly protects all of
|
|
2560 its parameters, so you don't have to protect any pointers passed as
|
|
2561 parameters to it.)
|
|
2562
|
|
2563 The C functions @code{call0}, @code{call1}, @code{call2}, and so on,
|
|
2564 provide handy ways to call a Lisp function conveniently with a fixed
|
|
2565 number of arguments. They work by calling @code{Ffuncall}.
|
|
2566
|
|
2567 @file{eval.c} is a very good file to look through for examples;
|
|
2568 @file{lisp.h} contains the definitions for important macros and
|
|
2569 functions.
|
|
2570
|
462
|
2571 @node Writing Good Comments
|
|
2572 @section Writing Good Comments
|
|
2573 @cindex writing good comments
|
|
2574 @cindex comments, writing good
|
|
2575
|
|
2576 Comments are a lifeline for programmers trying to understand tricky
|
|
2577 code. In general, the less obvious it is what you are doing, the more
|
|
2578 you need a comment, and the more detailed it needs to be. You should
|
|
2579 always be on guard when you're writing code for stuff that's tricky, and
|
|
2580 should constantly be putting yourself in someone else's shoes and asking
|
|
2581 if that person could figure out without much difficulty what's going
|
|
2582 on. (Assume they are a competent programmer who understands the
|
|
2583 essentials of how the XEmacs code is structured but doesn't know much
|
|
2584 about the module you're working on or any algorithms you're using.) If
|
|
2585 you're not sure whether they would be able to, add a comment. Always
|
|
2586 err on the side of more comments, rather than fewer.
|
|
2587
|
|
2588 Generally, when making comments, there is no need to attribute them with
|
|
2589 your name or initials. This especially goes for small,
|
|
2590 easy-to-understand, non-opinionated ones. Also, comments indicating
|
|
2591 where, when, and by whom a file was changed are @emph{strongly}
|
|
2592 discouraged, and in general will be removed as they are discovered.
|
|
2593 This is exactly what @file{ChangeLogs} are there for. However, it can
|
|
2594 occasionally be useful to mark exactly where (but not when or by whom)
|
|
2595 changes are made, particularly when making small changes to a file
|
|
2596 imported from elsewhere. These marks help when later on a newer version
|
|
2597 of the file is imported and the changes need to be merged. (If
|
|
2598 everything were always kept in CVS, there would be no need for this.
|
|
2599 But in practice, this often doesn't happen, or the CVS repository is
|
|
2600 later on lost or unavailable to the person doing the update.)
|
|
2601
|
|
2602 When putting in an explicit opinion in a comment, you should
|
|
2603 @emph{always} attribute it with your name, and optionally the date.
|
|
2604 This also goes for long, complex comments explaining in detail the
|
|
2605 workings of something -- by putting your name there, you make it
|
|
2606 possible for someone who has questions about how that thing works to
|
|
2607 determine who wrote the comment so they can write to them. Preferably,
|
|
2608 use your actual name and not your initials, unless your initials are
|
|
2609 generally recognized (e.g. @samp{jwz}). You can use only your first
|
|
2610 name if it's obvious who you are; otherwise, give first and last name.
|
|
2611 If you're not a regular contributor, you might consider putting your
|
|
2612 email address in -- it may be in the ChangeLog, but after a while
|
|
2613 ChangeLogs have a tendency to disappear or get
|
|
2614 muddled. (E.g. your comment may get copied somewhere else or even into
|
|
2615 another program, and tracking down the proper ChangeLog may be very
|
|
2616 difficult.)
|
|
2617
|
|
2618 If you come across an opinion that is not or no longer valid, or you
|
|
2619 come across any comment that no longer applies but you want to keep it
|
|
2620 around, enclose it in @samp{[[ } and @samp{ ]]} marks and add a comment
|
|
2621 afterwards explaining why the preceding comment is no longer valid. Put
|
|
2622 your name on this comment, as explained above.
|
|
2623
|
|
2624 Just as comments are a lifeline to programmers, incorrect comments are
|
|
2625 death. If you come across an incorrect comment, @strong{immediately}
|
|
2626 correct it or flag it as incorrect, as described in the previous
|
|
2627 paragraph. Whenever you work on a section of code, @emph{always} make
|
|
2628 sure to update any comments to be correct -- or, at the very least, flag
|
|
2629 them as incorrect.
|
|
2630
|
|
2631 To indicate a ``todo'' or other problem, use four pound signs --
|
|
2632 i.e. @samp{####}.
|
|
2633
|
|
2634 @node Adding Global Lisp Variables
|
428
|
2635 @section Adding Global Lisp Variables
|
462
|
2636 @cindex global Lisp variables, adding
|
|
2637 @cindex variables, adding global Lisp
|
428
|
2638
|
|
2639 Global variables whose names begin with @samp{Q} are constants whose
|
|
2640 value is a symbol of a particular name. The name of the variable should
|
|
2641 be derived from the name of the symbol using the same rules as for Lisp
|
|
2642 primitives. These variables are initialized using a call to
|
|
2643 @code{defsymbol()} in the @code{syms_of_*()} function. (This call
|
|
2644 interns a symbol, sets the C variable to the resulting Lisp object, and
|
|
2645 calls @code{staticpro()} on the C variable to tell the
|
|
2646 garbage-collection mechanism about this variable. What
|
|
2647 @code{staticpro()} does is add a pointer to the variable to a large
|
|
2648 global array; when garbage-collection happens, all pointers listed in
|
|
2649 the array are used as starting points for marking Lisp objects. This is
|
|
2650 important because it's quite possible that the only current reference to
|
|
2651 the object is the C variable. In the case of symbols, the
|
|
2652 @code{staticpro()} doesn't matter all that much because the symbol is
|
|
2653 contained in @code{obarray}, which is itself @code{staticpro()}ed.
|
|
2654 However, it's possible that a naughty user could do something like
|
|
2655 uninterning the symbol out of @code{obarray} or even setting
|
|
2656 @code{obarray} to a different value [although this is likely to make
|
|
2657 XEmacs crash!].)
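
The root-registration idea behind @code{staticpro()} can be sketched in a few lines of standalone C. This is only an illustration under stated assumptions, not the XEmacs implementation: every name beginning with @code{toy_}, @code{Toy_} or @code{TOY_} is hypothetical, and an ``object'' is reduced to a single mark bit. What it tries to show is that the array stores the @emph{address} of the C variable, so whatever object the variable holds at collection time is the one that gets marked.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for Lisp_Object: a pointer to a cell with a mark bit.
   (Hypothetical; the real Lisp_Object is quite different.)  */
typedef struct toy_object { int marked; } *Toy_Lisp_Object;

#define TOY_MAX_ROOTS 64

/* The "large global array": each slot holds the ADDRESS of a C variable.  */
static Toy_Lisp_Object *toy_roots[TOY_MAX_ROOTS];
static size_t toy_nroots;

/* Register a C variable as a starting point for marking.  */
static void
toy_staticpro (Toy_Lisp_Object *varaddr)
{
  assert (toy_nroots < TOY_MAX_ROOTS);
  toy_roots[toy_nroots++] = varaddr;
}

/* Mark phase: the object each registered variable holds NOW is live.  */
static void
toy_mark_roots (void)
{
  size_t i;
  for (i = 0; i < toy_nroots; i++)
    if (*toy_roots[i] != NULL)
      (*toy_roots[i])->marked = 1;
}
```

Because the variable's address is registered rather than its value, assigning a new object to the variable after the call to @code{toy_staticpro()} needs no further bookkeeping.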
|
|
2658
|
|
2659 @strong{Please note:} It is potentially deadly if you declare a
|
|
2660 @samp{Q...} variable in two different modules. The two calls to
|
|
2661 @code{defsymbol()} are no problem, but some linkers will complain about
|
|
2662 multiply-defined symbols. The most insidious aspect of this is that
|
|
2663 often the link will succeed anyway, but then the resulting executable
|
|
2664 will sometimes crash in obscure ways during certain operations! To
|
|
2665 avoid this problem, declare any symbols with common names (such as
|
|
2666 @code{text}) that are not obviously associated with this particular
|
|
2667 module in the module @file{general.c}.
|
|
2668
|
|
2669 Global variables whose names begin with @samp{V} are variables that
|
|
2670 contain Lisp objects. The convention here is that all global variables
|
|
2671 of type @code{Lisp_Object} begin with @samp{V}, and all others don't
|
|
2672 (including integer and boolean variables that have Lisp
|
|
2673 equivalents). Most of the time, these variables have equivalents in
|
|
2674 Lisp, but some don't. Those that do are declared this way by a call to
|
|
2675 @code{DEFVAR_LISP()} in the @code{vars_of_*()} initializer for the
|
|
2676 module. What this does is create a special @dfn{symbol-value-forward}
|
|
2677 Lisp object that contains a pointer to the C variable, intern a symbol
|
|
2678 whose name is as specified in the call to @code{DEFVAR_LISP()}, and set
|
|
2679 its value to the symbol-value-forward Lisp object; it also calls
|
|
2680 @code{staticpro()} on the C variable to tell the garbage-collection
|
|
2681 mechanism about the variable. When @code{eval} (or actually
|
|
2682 @code{symbol-value}) encounters this special object in the process of
|
|
2683 retrieving a variable's value, it follows the indirection to the C
|
|
2684 variable and gets its value. @code{setq} does similar things so that
|
|
2685 the C variable gets changed.
|
|
2686
|
|
2687 Whether or not you @code{DEFVAR_LISP()} a variable, you need to
|
|
2688 initialize it in the @code{vars_of_*()} function; otherwise it will end
|
|
2689 up as all zeroes, which is the integer 0 (@emph{not} @code{nil}), and
|
|
2690 this is probably not what you want. Also, if the variable is not
|
|
2691 @code{DEFVAR_LISP()}ed, @strong{you must call} @code{staticpro()} on the
|
|
2692 C variable in the @code{vars_of_*()} function. Otherwise, the
|
|
2693 garbage-collection mechanism won't know that the object in this variable
|
|
2694 is in use, and will happily collect it and reuse its storage for another
|
|
2695 Lisp object, and you will be the one who's unhappy when you can't figure
|
|
2696 out how your variable got overwritten.
|
|
2697
|
462
|
2698 @node Proper Use of Unsigned Types
|
|
2699 @section Proper Use of Unsigned Types
|
|
2700 @cindex unsigned types, proper use of
|
|
2701 @cindex types, proper use of unsigned
|
|
2702
|
|
2703 Avoid using @code{unsigned int} and @code{unsigned long} whenever
|
|
2704 possible. Unsigned types are viral -- any arithmetic or comparisons
|
|
2705 involving mixed signed and unsigned types are automatically converted to
|
|
2706 unsigned, which is almost certainly not what you want. Many subtle and
|
|
2707 hard-to-find bugs are created by careless use of unsigned types. In
|
|
2708 general, you should almost @emph{never} use an unsigned type to hold a
|
|
2709 regular quantity of any sort. The only exceptions are
|
|
2710
|
|
2711 @enumerate
|
|
2712 @item
|
|
2713 When there's a reasonable possibility you will actually need all 32 or
|
|
2714 64 bits to store the quantity.
|
|
2715 @item
|
|
2716 When calling existing APIs that require unsigned types. In this case,
|
|
2717 you should still do all manipulation using signed types, and do the
|
|
2718 conversion at the very threshold of the API call.
|
|
2719 @item
|
|
2720 In existing code that you don't want to modify because you don't
|
|
2721 maintain it.
|
|
2722 @item
|
|
2723 In bit-field structures.
|
|
2724 @end enumerate
|
|
2725
|
|
2726 Other reasonable uses of @code{unsigned int} and @code{unsigned long}
|
|
2727 are representing non-quantities -- e.g. bit-oriented flags and such.
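
The following standalone C sketch illustrates the ``viral'' conversion and the advice about converting only at the API boundary. All function names are hypothetical and chosen for this example; @code{toy_api_fill} merely stands in for some existing API that insists on an unsigned argument.

```c
#include <assert.h>

/* Mixed comparison: A is converted to unsigned int before comparing,
   so a negative A turns into a huge positive value.  */
static int
toy_less_mixed (int a, unsigned int b)
{
  return a < b;
}

/* All-signed comparison behaves the way the arithmetic reads.  */
static int
toy_less_signed (int a, int b)
{
  return a < b;
}

/* Hypothetical existing API that requires an unsigned count.  */
static unsigned int
toy_api_fill (unsigned int count)
{
  return count;
}

/* Do all manipulation with signed types; convert at the threshold.  */
static unsigned int
toy_fill_remaining (int total, int used)
{
  int remaining = total - used;   /* may legitimately go negative */
  if (remaining < 0)
    remaining = 0;
  return toy_api_fill ((unsigned int) remaining);
}
```

With these definitions, @code{toy_less_mixed (-1, 1u)} yields 0 even though -1 is less than 1, while @code{toy_less_signed (-1, 1)} yields 1.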
|
|
2728
|
|
2729 @node Coding for Mule
|
428
|
2730 @section Coding for Mule
|
462
|
2731 @cindex coding for Mule
|
|
2732 @cindex Mule, coding for
|
428
|
2733
|
|
2734 Although Mule support is not compiled by default in XEmacs, many people
|
|
2735 are using it, and we consider it crucial that new code works correctly
|
|
2736 with multibyte characters. This is not hard; it is only a matter of
|
|
2737 following several simple user-interface guidelines. Even if you never
|
|
2738 compile with Mule, with a little practice you will find it quite easy
|
|
2739 to code Mule-correctly.
|
|
2740
|
|
2741 Note that these guidelines are not necessarily tied to the current Mule
|
|
2742 implementation; they are also a good idea to follow on the grounds of
|
|
2743 code generalization for future I18N work.
|
|
2744
|
|
2745 @menu
|
|
2746 * Character-Related Data Types::
|
|
2747 * Working With Character and Byte Positions::
|
|
2748 * Conversion to and from External Data::
|
|
2749 * General Guidelines for Writing Mule-Aware Code::
|
|
2750 * An Example of Mule-Aware Code::
|
|
2751 @end menu
|
|
2752
|
462
|
2753 @node Character-Related Data Types
|
428
|
2754 @subsection Character-Related Data Types
|
462
|
2755 @cindex character-related data types
|
|
2756 @cindex data types, character-related
|
428
|
2757
|
|
2758 First, let's review the basic character-related datatypes used by
|
|
2759 XEmacs. Note that the separate @code{typedef}s are not mandatory in the
|
|
2760 current implementation (all of them boil down to @code{unsigned char} or
|
|
2761 @code{int}), but they improve clarity of code a great deal, because one
|
|
2762 glance at the declaration can tell the intended use of the variable.
|
|
2763
|
|
2764 @table @code
|
|
2765 @item Emchar
|
|
2766 @cindex Emchar
|
|
2767 An @code{Emchar} holds a single Emacs character.
|
|
2768
|
|
2769 Obviously, the equality between characters and bytes is lost in the Mule
|
|
2770 world. Characters can be represented by one or more bytes in the
|
|
2771 buffer, and @code{Emchar} is the C type large enough to hold any
|
|
2772 character.
|
|
2773
|
|
2774 Without Mule support, an @code{Emchar} is equivalent to an
|
|
2775 @code{unsigned char}.
|
|
2776
|
|
2777 @item Bufbyte
|
|
2778 @cindex Bufbyte
|
|
2779 The data representing the text in a buffer or string is logically a set
|
|
2780 of @code{Bufbyte}s.
|
|
2781
|
442
|
2782 XEmacs does not work with the same character formats all the time; when
|
|
2783 reading characters from the outside, it decodes them to an internal
|
|
2784 format, and likewise encodes them when writing. @code{Bufbyte} (in fact
|
428
|
2785 @code{unsigned char}) is the basic unit of the XEmacs internal buffer and
|
442
|
2786 string format. A @code{Bufbyte *} is the type that points at text
|
|
2787 encoded in the variable-width internal encoding.
|
428
|
2788
|
|
2789 One character can correspond to one or more @code{Bufbyte}s. In the
|
442
|
2790 current Mule implementation, an ASCII character is represented by the
|
|
2791 same @code{Bufbyte}, and other characters are represented by a sequence
|
|
2792 of two or more @code{Bufbyte}s.
|
|
2793
|
|
2794 Without Mule support, there are exactly 256 characters, implicitly
|
|
2795 Latin-1, and each character is represented using one @code{Bufbyte}, and
|
|
2796 there is a one-to-one correspondence between @code{Bufbyte}s and
|
|
2797 @code{Emchar}s.
|
428
|
2798
|
|
2799 @item Bufpos
|
|
2800 @itemx Charcount
|
|
2801 @cindex Bufpos
|
|
2802 @cindex Charcount
|
|
2803 A @code{Bufpos} represents a character position in a buffer or string.
|
|
2804 A @code{Charcount} represents a number (count) of characters.
|
|
2805 Logically, subtracting two @code{Bufpos} values yields a
|
|
2806 @code{Charcount} value. Although all of these are @code{typedef}ed to
|
442
|
2807 @code{EMACS_INT}, we use them in preference to @code{EMACS_INT} to make
|
|
2808 it clear what sort of position is being used.
|
428
|
2809
|
|
2810 @code{Bufpos} and @code{Charcount} values are the only ones that are
|
|
2811 ever visible to Lisp.
|
|
2812
|
|
2813 @item Bytind
|
|
2814 @itemx Bytecount
|
|
2815 @cindex Bytind
|
|
2816 @cindex Bytecount
|
|
2817 A @code{Bytind} represents a byte position in a buffer or string. A
|
442
|
2818 @code{Bytecount} represents the distance between two positions, in bytes.
|
428
|
2819 The relationship between @code{Bytind} and @code{Bytecount} is the same
|
|
2820 as the relationship between @code{Bufpos} and @code{Charcount}.
|
|
2821
|
|
2822 @item Extbyte
|
|
2823 @itemx Extcount
|
|
2824 @cindex Extbyte
|
|
2825 @cindex Extcount
|
|
2826 When dealing with the outside world, XEmacs works with @code{Extbyte}s,
|
|
2827 which are equivalent to @code{unsigned char}. Obviously, an
|
|
2828 @code{Extcount} is the distance between two @code{Extbyte}s. Extbytes
|
|
2829 and Extcounts are not all that frequent in XEmacs code.
|
|
2830 @end table
|
|
2831
|
462
|
2832 @node Working With Character and Byte Positions
|
428
|
2833 @subsection Working With Character and Byte Positions
|
462
|
2834 @cindex character and byte positions, working with
|
|
2835 @cindex byte positions, working with character and
|
|
2836 @cindex positions, working with character and byte
|
428
|
2837
|
|
2838 Now that we have defined the basic character-related types, we can look
|
|
2839 at the macros and functions designed for work with them and for
|
|
2840 conversion between them. Most of these macros are defined in
|
|
2841 @file{buffer.h}, and we don't discuss all of them here, but only the
|
|
2842 most important ones. Examining the existing code is the best way to
|
|
2843 learn about them.
|
|
2844
|
|
2845 @table @code
|
|
2846 @item MAX_EMCHAR_LEN
|
|
2847 @cindex MAX_EMCHAR_LEN
|
442
|
2848 This preprocessor constant is the maximum number of buffer bytes needed to
|
|
2849 represent an Emacs character in the variable width internal encoding.
|
|
2850 It is useful when allocating temporary strings to hold a known number of
|
|
2851 characters. For instance:
|
428
|
2852
|
|
2853 @example
|
|
2854 @group
|
|
2855 @{
|
|
2856 Charcount cclen;
|
|
2857 ...
|
|
2858 @{
|
|
2859 /* Allocate place for @var{cclen} characters. */
|
|
2860 Bufbyte *buf = (Bufbyte *)alloca (cclen * MAX_EMCHAR_LEN);
|
|
2861 ...
|
|
2862 @end group
|
|
2863 @end example
|
|
2864
|
|
2865 If you followed the previous section, you can guess that, logically,
|
|
2866 multiplying a @code{Charcount} value by @code{MAX_EMCHAR_LEN} produces
|
|
2867 a @code{Bytecount} value.
|
|
2868
|
|
2869 In the current Mule implementation, @code{MAX_EMCHAR_LEN} equals 4.
|
|
2870 Without Mule, it is 1.
|
|
2871
|
|
2872 @item charptr_emchar
|
|
2873 @itemx set_charptr_emchar
|
|
2874 @cindex charptr_emchar
|
|
2875 @cindex set_charptr_emchar
|
|
2876 The @code{charptr_emchar} macro takes a @code{Bufbyte} pointer and
|
|
2877 returns the @code{Emchar} stored at that position. If it were a
|
|
2878 function, its prototype would be:
|
|
2879
|
|
2880 @example
|
|
2881 Emchar charptr_emchar (Bufbyte *p);
|
|
2882 @end example
|
|
2883
|
|
2884 @code{set_charptr_emchar} stores an @code{Emchar} to the specified byte
|
|
2885 position. It returns the number of bytes stored:
|
|
2886
|
|
2887 @example
|
|
2888 Bytecount set_charptr_emchar (Bufbyte *p, Emchar c);
|
|
2889 @end example
|
|
2890
|
|
2891 It is important to note that @code{set_charptr_emchar} is safe only for
|
|
2892 appending a character at the end of a buffer, not for overwriting a
|
|
2893 character in the middle. This is because the width of characters
|
|
2894 varies, and @code{set_charptr_emchar} cannot resize the string if it
|
|
2895 writes, say, a two-byte character where a single-byte character used to
|
|
2896 reside.
|
|
2897
|
|
2898 A typical use of @code{set_charptr_emchar} can be demonstrated by this
|
|
2899 example, which copies characters from buffer @var{buf} to a temporary
|
|
2900 string of Bufbytes.
|
|
2901
|
|
2902 @example
|
|
2903 @group
|
|
2904 @{
|
|
2905 Bufpos pos;
|
|
2906 for (pos = beg; pos < end; pos++)
|
|
2907 @{
|
|
2908 Emchar c = BUF_FETCH_CHAR (buf, pos);
|
|
2909 p += set_charptr_emchar (p, c);
|
|
2910 @}
|
|
2911 @}
|
|
2912 @end group
|
|
2913 @end example
|
|
2914
|
|
2915 Note how @code{set_charptr_emchar} is used to store the @code{Emchar}
|
|
2916 and advance the pointer at the same time.
|
|
2917
|
|
2918 @item INC_CHARPTR
|
|
2919 @itemx DEC_CHARPTR
|
|
2920 @cindex INC_CHARPTR
|
|
2921 @cindex DEC_CHARPTR
|
|
2922 These two macros increment and decrement a @code{Bufbyte} pointer,
|
|
2923 respectively. They will adjust the pointer by the appropriate number of
|
|
2924 bytes according to the byte length of the character stored there. Both
|
|
2925 macros assume that the memory address is located at the beginning of a
|
|
2926 valid character.
|
|
2927
|
|
2928 Without Mule support, @code{INC_CHARPTR (p)} and @code{DEC_CHARPTR (p)}
|
|
2929 simply expand to @code{p++} and @code{p--}, respectively.
|
|
2930
|
|
2931 @item bytecount_to_charcount
|
|
2932 @cindex bytecount_to_charcount
|
|
2933 Given a pointer to a text string and a length in bytes, return the
|
|
2934 equivalent length in characters.
|
|
2935
|
|
2936 @example
|
|
2937 Charcount bytecount_to_charcount (Bufbyte *p, Bytecount bc);
|
|
2938 @end example
|
|
2939
|
|
2940 @item charcount_to_bytecount
|
|
2941 @cindex charcount_to_bytecount
|
|
2942 Given a pointer to a text string and a length in characters, return the
|
|
2943 equivalent length in bytes.
|
|
2944
|
|
2945 @example
|
|
2946 Bytecount charcount_to_bytecount (Bufbyte *p, Charcount cc);
|
|
2947 @end example
|
|
2948
|
|
2949 @item charptr_n_addr
|
|
2950 @cindex charptr_n_addr
|
|
2951 Return a pointer to the beginning of the character offset @var{cc} (in
|
|
2952 characters) from @var{p}.
|
|
2953
|
|
2954 @example
|
|
2955 Bufbyte *charptr_n_addr (Bufbyte *p, Charcount cc);
|
|
2956 @end example
|
|
2957 @end table
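
To make the distinction between byte counts and character counts concrete, here is a standalone C sketch over a made-up variable-width encoding. It is an assumption-laden toy: in it, any byte of value 0xA0 or above continues the current character and any other byte starts a new one. The real Mule encoding and the real macros in @file{buffer.h} differ, and all @code{toy_}/@code{Toy_} names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char Toy_Bufbyte;
typedef long Toy_Bytecount;
typedef long Toy_Charcount;

/* Toy rule: bytes >= 0xA0 continue a character, all others start one.  */
static int
toy_starts_char (Toy_Bufbyte b)
{
  return b < 0xA0;
}

/* Advance past the character beginning at P (compare INC_CHARPTR).
   P must point at the start of a character and be before END.  */
static const Toy_Bufbyte *
toy_inc_charptr (const Toy_Bufbyte *p, const Toy_Bufbyte *end)
{
  p++;
  while (p < end && !toy_starts_char (*p))
    p++;
  return p;
}

/* Length in characters of the BC bytes starting at P.  */
static Toy_Charcount
toy_bytecount_to_charcount (const Toy_Bufbyte *p, Toy_Bytecount bc)
{
  const Toy_Bufbyte *end = p + bc;
  Toy_Charcount cc = 0;
  while (p < end)
    {
      p = toy_inc_charptr (p, end);
      cc++;
    }
  return cc;
}

/* Length in bytes of the first CC characters in [P, END).  */
static Toy_Bytecount
toy_charcount_to_bytecount (const Toy_Bufbyte *p, const Toy_Bufbyte *end,
                            Toy_Charcount cc)
{
  const Toy_Bufbyte *q = p;
  while (cc-- > 0 && q < end)
    q = toy_inc_charptr (q, end);
  return (Toy_Bytecount) (q - p);
}
```

Note that both conversions have to walk the text, which is why code should avoid converting back and forth more often than necessary.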
|
|
2958
|
462
|
2959 @node Conversion to and from External Data
|
428
|
2960 @subsection Conversion to and from External Data
|
462
|
2961 @cindex conversion to and from external data
|
|
2962 @cindex external data, conversion to and from
|
428
|
2963
|
|
2964 When an external function, such as a C library function, returns a
|
|
2965 @code{char} pointer, you should almost never treat it as @code{Bufbyte}.
|
|
2966 This is because these returned strings may contain 8-bit characters which
|
|
2967 can be misinterpreted by XEmacs, and cause a crash. Likewise, when
|
|
2968 exporting a piece of internal text to the outside world, you should
|
|
2969 always convert it to an appropriate external encoding, lest the internal
|
|
2970 stuff (such as the infamous \201 characters) leak out.
|
|
2971
|
|
2972 The interface to conversion between the internal and external
|
|
2973 representations of text consists of the numerous conversion macros defined in
|
442
|
2974 @file{buffer.h}. There used to be a fixed set of external formats
|
|
2975 supported by these macros, but now any coding system can be used with
|
|
2976 these macros. The coding system alias mechanism is used to create the
|
|
2977 following logical coding systems, which replace the fixed external
|
|
2978 formats. The @code{dontusethis-set-symbol-value-handler} mechanism was
|
|
2979 enhanced to make this possible (more work on that is needed, such as
|
|
2980 removing the @code{dontusethis-} prefix).
|
428
|
2981
|
|
2982 @table @code
|
442
|
2983 @item Qbinary
|
|
2984 This is the simplest format and is what we use in the absence of a more
|
|
2985 appropriate format. This converts according to the @code{binary} coding
|
|
2986 system:
|
|
2987
|
|
2988 @enumerate a
|
|
2989 @item
|
|
2990 On input, bytes 0--255 are converted into (implicitly Latin-1)
|
|
2991 characters 0--255. A non-Mule XEmacs doesn't really know about
|
|
2992 different character sets and the fonts to display them, so the bytes can
|
|
2993 be treated as text in different 1-byte encodings by simply setting the
|
|
2994 appropriate fonts. So in a sense, non-Mule XEmacs is a multi-lingual
|
|
2995 editor if, for example, different fonts are used to display text in
|
|
2996 different buffers, faces, or windows. The specifier mechanism gives the
|
|
2997 user complete control over this kind of behavior.
|
|
2998 @item
|
|
2999 On output, characters 0--255 are converted into bytes 0--255 and other
|
|
3000 characters are converted into @samp{~}.
|
|
3001 @end enumerate
|
|
3002
|
|
3003 @item Qfile_name
|
|
3004 Format used for filenames. This is user-definable via either the
|
|
3005 @code{file-name-coding-system} or @code{pathname-coding-system} (now
|
|
3006 obsolete) variables.
|
|
3007
|
|
3008 @item Qnative
|
|
3009 Format used for the external Unix environment---@code{argv[]}, stuff
|
|
3010 from @code{getenv()}, stuff from the @file{/etc/passwd} file, etc.
|
|
3011 Currently this is the same as Qfile_name. The two should be
|
|
3012 distinguished for clarity and possible future separation.
|
|
3013
|
|
3014 @item Qctext
|
|
3015 Compound-text format. This is the standard X11 format used for data
|
|
3016 stored in properties, selections, and the like. This is an 8-bit
|
|
3017 no-lock-shift ISO2022 coding system. This is a real coding system,
|
|
3018 unlike Qfile_name, which is user-definable.
|
428
|
3019 @end table
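
The two conversion rules given above for the @code{binary} coding system are simple enough to sketch in standalone C. This is only an illustration of the stated rules, not the XEmacs conversion code; @code{Toy_Emchar} and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int Toy_Emchar;   /* hypothetical stand-in for Emchar */

/* Input direction: bytes 0-255 become characters 0-255.  */
static void
toy_binary_decode (const unsigned char *in, size_t len, Toy_Emchar *out)
{
  size_t i;
  for (i = 0; i < len; i++)
    out[i] = in[i];
}

/* Output direction: characters 0-255 become bytes 0-255;
   every other character is replaced by '~'.  */
static void
toy_binary_encode (const Toy_Emchar *in, size_t len, unsigned char *out)
{
  size_t i;
  for (i = 0; i < len; i++)
    out[i] = (in[i] >= 0 && in[i] <= 255)
      ? (unsigned char) in[i]
      : (unsigned char) '~';
}
```

The round trip is lossless only for characters in the range 0--255; anything else comes back as @samp{~}.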
|
|
3020
|
442
|
3021 There are two fundamental macros to convert between external and
|
|
3022 internal format.
|
|
3023
|
|
3024 @code{TO_INTERNAL_FORMAT} converts external data to internal format, and
|
|
3025 @code{TO_EXTERNAL_FORMAT} converts the other way around. The arguments
|
|
3026 each of these receives are a source type, a source, a sink type, a sink,
|
|
3027 and a coding system (or a symbol naming a coding system).
|
|
3028
|
|
3029 A typical call looks like
|
|
3030 @example
|
|
3031 TO_EXTERNAL_FORMAT (LISP_STRING, str, C_STRING_MALLOC, ptr, Qfile_name);
|
|
3032 @end example
|
|
3033
|
|
3034 which means that the contents of the lisp string @code{str} are written
|
|
3035 to a malloc'ed memory area which will be pointed to by @code{ptr}, after
|
|
3036 the function returns. The conversion will be done using the
|
|
3037 @code{file-name} coding system, which will be controlled by the user
|
|
3038 indirectly by setting or binding the variable
|
|
3039 @code{file-name-coding-system}.
|
|
3040
|
|
3041 Some sources and sinks require two C variables to specify. We use some
|
|
3042 preprocessor magic to allow different source and sink types, and even
|
|
3043 different numbers of arguments to specify different types of sources and
|
|
3044 sinks.
|
|
3045
|
|
3046 So we can have a call that looks like
|
|
3047 @example
|
|
3048 TO_INTERNAL_FORMAT (DATA, (ptr, len),
|
|
3049 MALLOC, (ptr, len),
|
|
3050 coding_system);
|
|
3051 @end example
|
|
3052
|
|
3053 The parenthesized argument pairs are required to make the preprocessor
|
|
3054 magic work.
|
|
3055
|
|
3056 Here are the different source and sink types:
|
|
3057
|
|
3058 @table @code
|
|
3059 @item @code{DATA, (ptr, len),}
|
|
3060 input data is a fixed buffer of size @var{len} at address @var{ptr}
|
|
3061 @item @code{ALLOCA, (ptr, len),}
|
|
3062 output data is placed in an alloca()ed buffer of size @var{len} pointed to by @var{ptr}
|
|
3063 @item @code{MALLOC, (ptr, len),}
|
|
3064 output data is in a malloc()ed buffer of size @var{len} pointed to by @var{ptr}
|
|
3065 @item @code{C_STRING_ALLOCA, ptr,}
|
|
3066 equivalent to @code{ALLOCA (ptr, len_ignored)} on output.
|
|
3067 @item @code{C_STRING_MALLOC, ptr,}
|
|
3068 equivalent to @code{MALLOC (ptr, len_ignored)} on output
|
|
3069 @item @code{C_STRING, ptr,}
|
|
3070 equivalent to @code{DATA, (ptr, strlen (ptr) + 1)} on input
|
|
3071 @item @code{LISP_STRING, string,}
|
|
3072 input or output is a Lisp_Object of type string
|
|
3073 @item @code{LISP_BUFFER, buffer,}
|
|
3074 output is written to @code{(point)} in lisp buffer @var{buffer}
|
|
3075 @item @code{LISP_LSTREAM, lstream,}
|
|
3076 input or output is a Lisp_Object of type lstream
|
|
3077 @item @code{LISP_OPAQUE, object,}
|
|
3078 input or output is a Lisp_Object of type opaque
|
|
3079 @end table
|
|
3080
|
|
3081 Often, the data is being converted to a '\0'-byte-terminated string,
|
|
3082 which is the format required by many external system C APIs. For these
|
|
3083 purposes, a source type of @code{C_STRING} or a sink type of
|
|
3084 @code{C_STRING_ALLOCA} or @code{C_STRING_MALLOC} is appropriate.
|
|
3085 Otherwise, we should try to keep XEmacs '\0'-byte-clean, which means
|
|
3086 using (ptr, len) pairs.
|
|
3087
|
|
3088 The sinks to be specified must be lvalues, unless they are the lisp
|
|
3089 object types @code{LISP_LSTREAM} or @code{LISP_BUFFER}.
|
|
3090
|
|
3091 For the sink types @code{ALLOCA} and @code{C_STRING_ALLOCA}, the
|
|
3092 resulting text is stored in a stack-allocated buffer, which is
|
|
3093 automatically freed on returning from the function. However, the sink
|
|
3094 types @code{MALLOC} and @code{C_STRING_MALLOC} return @code{xmalloc()}ed
|
|
3095 memory. The caller is responsible for freeing this memory using
|
|
3096 @code{xfree()}.
|
|
3097
|
|
3098 Note that it doesn't make sense for @code{LISP_STRING} to be a source
|
|
3099 for @code{TO_INTERNAL_FORMAT} or a sink for @code{TO_EXTERNAL_FORMAT}.
|
|
3100 You'll get an assertion failure if you try.
|
|
3101
|
|
3102
|
462
|
3103 @node General Guidelines for Writing Mule-Aware Code
|
428
|
3104 @subsection General Guidelines for Writing Mule-Aware Code
|
462
|
3105 @cindex writing Mule-aware code, general guidelines for
|
|
3106 @cindex Mule-aware code, general guidelines for writing
|
|
3107 @cindex code, general guidelines for writing Mule-aware
|
428
|
3108
|
|
3109 This section contains some general guidance on how to write Mule-aware
|
|
3110 code, as well as some pitfalls you should avoid.
|
|
3111
|
|
3112 @table @emph
|
|
3113 @item Never use @code{char} and @code{char *}.
|
|
3114 In XEmacs, the use of @code{char} and @code{char *} is almost always a
|
|
3115 mistake. If you want to manipulate an Emacs character from ``C'', use
|
|
3116 @code{Emchar}. If you want to examine a specific octet in the internal
|
|
3117 format, use @code{Bufbyte}. If you want a Lisp-visible character, use a
|
|
3118 @code{Lisp_Object} and @code{make_char}. If you want a pointer to move
|
|
3119 through the internal text, use @code{Bufbyte *}. Also note that you
|
|
3120 almost certainly do not need @code{Emchar *}.
|
|
3121
|
|
3122 @item Be careful not to confuse @code{Charcount}, @code{Bytecount}, and @code{Bufpos}.
|
|
3123 The whole point of using different types is to avoid confusion about the
|
|
3124 use of certain variables. Lest this effect be nullified, you need to be
|
|
3125 careful about using the right types.
|
|
3126
|
|
3127 @item Always convert external data
|
|
3128 It is extremely important to always convert external data, because
|
|
3129 XEmacs can crash if unexpected 8-bit sequences are copied to its internal
|
|
3130 buffers literally.
|
|
3131
|
|
3132 This means that when a system function, such as @code{readdir}, returns
|
442
|
3133 a string, you may need to convert it using one of the conversion macros
|
428
|
3134 described in the previous section, before passing it further to Lisp.
|
442
|
3135
|
|
3136 Actually, most of the basic system functions that accept '\0'-terminated
|
|
3137 string arguments, like @code{stat()} and @code{open()}, have been
|
|
3138 @strong{encapsulated} so that they @emph{always} do internal to
|
|
3139 external conversion themselves. This means you must pass internally
|
|
3140 encoded data, typically the @code{XSTRING_DATA} of a Lisp_String, to
|
|
3141 these functions. This is actually a design bug, since it unexpectedly
|
|
3142 changes the semantics of the system functions. A better design would be
|
|
3143 to provide separate versions of these system functions that accepted
|
|
3144 Lisp_Objects which were lisp strings in place of their current
|
|
3145 @code{char *} arguments.
|
|
3146
|
|
3147 @example
|
|
3148 int stat_lisp (Lisp_Object path, struct stat *buf); /* Implement me */
|
|
3149 @end example
|
428
|
3150
|
|
3151 Also note that many internal functions, such as @code{make_string},
|
|
3152 accept Bufbytes, which removes the need for them to convert the data
|
|
3153 they receive. This increases efficiency because that way external data
|
|
3154 needs to be decoded only once, when it is read. After that, it is
|
|
3155 passed around in internal format.
|
|
3156 @end table
|
|
3157
|
462
|
3158 @node An Example of Mule-Aware Code
|
428
|
3159 @subsection An Example of Mule-Aware Code
|
462
|
3160 @cindex code, an example of Mule-aware
|
|
3161 @cindex Mule-aware code, an example of
|
428
|
3162
|
442
|
3163 As an example of Mule-aware code, we will analyze the @code{string}
|
|
3164 function, which conses up a Lisp string from the character arguments it
|
|
3165 receives. Here is the definition, pasted from @code{alloc.c}:
|
428
|
3166
|
|
3167 @example
|
|
3168 @group
|
|
3169 DEFUN ("string", Fstring, 0, MANY, 0, /*
|
|
3170 Concatenate all the argument characters and make the result a string.
|
|
3171 */
|
|
3172 (int nargs, Lisp_Object *args))
|
|
3173 @{
|
|
3174 Bufbyte *storage = alloca_array (Bufbyte, nargs * MAX_EMCHAR_LEN);
|
|
3175 Bufbyte *p = storage;
|
|
3176
|
|
3177 for (; nargs; nargs--, args++)
|
|
3178 @{
|
|
3179 Lisp_Object lisp_char = *args;
|
|
3180 CHECK_CHAR_COERCE_INT (lisp_char);
|
|
3181 p += set_charptr_emchar (p, XCHAR (lisp_char));
|
|
3182 @}
|
|
3183 return make_string (storage, p - storage);
|
|
3184 @}
|
|
3185 @end group
|
|
3186 @end example
|
|
3187
|
|
3188 Now we can analyze the source line by line.
|
|
3189
|
|
3190 Obviously, the string will contain as many characters as there are arguments to the
|
|
3191 function. This is why we allocate @code{MAX_EMCHAR_LEN} * @var{nargs}
|
|
3192 bytes on the stack, i.e. the worst-case number of bytes for @var{nargs}
|
|
3193 @code{Emchar}s to fit in the string.
|
|
3194
|
|
3195 Then, the loop checks that each element is a character, converting
|
|
3196 integers in the process. Like many other functions in XEmacs, this
|
|
3197 function silently accepts integers where characters are expected, for
|
|
3198 historical and compatibility reasons. Unless you know what you are
|
|
3199 doing, @code{CHECK_CHAR} will also suffice. @code{XCHAR (lisp_char)}
|
|
3200 extracts the @code{Emchar} from the @code{Lisp_Object}, and
|
|
3201 @code{set_charptr_emchar} stores it to storage, increasing @code{p} in
|
|
3202 the process.
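
The same ``allocate for the worst case, then store and advance'' idiom can be tried outside XEmacs with a toy encoder. The sketch below is a hypothetical stand-in, not the real @code{set_charptr_emchar}: it assumes a made-up encoding in which characters below 0x80 take one byte and characters 0xA0--0xFF take two bytes (a lead byte 0x81 followed by the character code), so its worst case is 2 bytes per character.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char Toy_Bufbyte;
typedef int Toy_Emchar;

/* Worst-case bytes per character in this toy encoding.  */
#define TOY_MAX_EMCHAR_LEN 2

/* Store C at P and return the number of bytes written.  */
static int
toy_set_charptr_emchar (Toy_Bufbyte *p, Toy_Emchar c)
{
  if (c >= 0 && c < 0x80)
    {
      p[0] = (Toy_Bufbyte) c;
      return 1;
    }
  assert (c >= 0xA0 && c <= 0xFF);   /* the toy supports nothing else */
  p[0] = 0x81;
  p[1] = (Toy_Bufbyte) c;
  return 2;
}

/* Encode NCHARS characters into OUT, which must have room for
   NCHARS * TOY_MAX_EMCHAR_LEN bytes.  Returns the bytes used.  */
static size_t
toy_encode (const Toy_Emchar *chars, size_t nchars, Toy_Bufbyte *out)
{
  Toy_Bufbyte *p = out;
  size_t i;
  for (i = 0; i < nchars; i++)
    p += toy_set_charptr_emchar (p, chars[i]);   /* store and advance */
  return (size_t) (p - out);
}
```

As in @code{Fstring}, the final length is computed as the difference between the advanced pointer and the start of the storage, and is usually smaller than the worst-case allocation.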
|
|
3203
|
|
3204 Other instructive examples of correct coding under Mule can be found all
|
|
3205 over the XEmacs code. For starters, I recommend
|
|
3206 @code{Fnormalize_menu_item_name} in @file{menubar.c}. After you have
|
|
3207 understood this section of the manual and studied the examples, you can
|
|
3208 proceed to write new Mule-aware code.
|
|
3209
|
462
|
3210 @node Techniques for XEmacs Developers
|
428
|
3211 @section Techniques for XEmacs Developers
|
462
|
3212 @cindex techniques for XEmacs developers
|
|
3213 @cindex developers, techniques for XEmacs
|
|
3214
|
|
3215 @cindex Purify
|
|
3216 @cindex Quantify
|
442
|
3217 To make a purified XEmacs, do: @code{make puremacs}.
|
428
|
3218 To make a quantified XEmacs, do: @code{make quantmacs}.
|
|
3219
|
442
|
3220 You simply can't dump Quantified and Purified images (unless using the
|
|
3221 portable dumper). Purify gets confused when XEmacs frees memory in one
|
|
3222 process that was allocated in a @emph{different} process on a different
|
|
3223 machine! Run it like so:
|
|
3224 @example
|
|
3225 temacs -batch -l loadup.el run-temacs @var{xemacs-args...}
|
|
3226 @end example
|
428
|
3227
|
462
|
3228 @cindex error checking
|
428
|
3229 Before you go through the trouble, are you compiling with all
|
442
|
3230 debugging and error-checking off? If not, try that first. Be warned
|
428
|
3231 that while Quantify is directly responsible for quite a few
|
|
3232 optimizations which have been made to XEmacs, doing a run which
|
|
3233 generates results which can be acted upon is not necessarily a trivial
|
|
3234 task.
|
|
3235
|
|
3236 Also, if you're still willing to do some runs, make sure you configure
|
|
3237 with the @samp{--quantify} flag. That will keep Quantify from starting
|
|
3238 to record data until after the loadup is completed and will shut off
|
|
3239 recording right before it shuts down (which generates enough bogus data
|
|
3240 to throw most results off). It also enables three additional elisp
|
|
3241 commands: @code{quantify-start-recording-data},
|
|
3242 @code{quantify-stop-recording-data} and @code{quantify-clear-data}.
|
|
3243
|
|
3244 If you want to make XEmacs faster, target your favorite slow benchmark,
|
|
3245 run a profiler like Quantify, @code{gprof}, or @code{tcov}, and figure
|
|
3246 out where the cycles are going. Specific projects:
|
|
3247
|
|
3248 @itemize @bullet
|
|
3249 @item
|
|
3250 Make the garbage collector faster. Figure out how to write an
|
|
3251 incremental garbage collector.
|
|
3252 @item
|
|
3253 Write a compiler that takes bytecode and spits out C code.
|
|
3254 Unfortunately, you will then need a C compiler and a more fully
|
|
3255 developed module system.
|
|
3256 @item
|
|
3257 Speed up redisplay.
|
|
3258 @item
|
|
3259 Speed up syntax highlighting. Maybe moving some of the syntax
|
|
3260 highlighting capabilities into C would make a difference.
|
|
3261 @item
|
|
3262 Implement tail recursion in Emacs Lisp (hard!).
|
|
3263 @end itemize
|
|
3264
|
|
3265 Unfortunately, Emacs Lisp is slow, and is going to stay slow. Function
|
|
3266 calls in elisp are especially expensive. Iterating over a long list is
|
|
3267 going to be 30 times faster implemented in C than in Elisp.
|
|
3268
|
442
|
3269 Heavily used small code fragments need to be fast. The traditional way
|
|
3270 to implement such code fragments in C is with macros. But macros in C
|
|
3271 are known to be broken.
|
|
3272
|
462
|
3273 @cindex macro hygiene
|
442
|
3274 Macro arguments that are repeatedly evaluated may suffer from repeated
|
|
3275 side effects or suboptimal performance.
|
|
3276
|
|
3277 Variable names used in macros may collide with caller's variables,
|
|
3278 causing (at least) unwanted compiler warnings.
|
|
3279
|
|
3280 In order to solve these problems, and maintain statement semantics, one
|
|
3281 should use the @code{do @{ ... @} while (0)} trick while trying to
|
|
3282 reference macro arguments exactly once using local variables.
|
|
3283
|
|
3284 Let's take a look at this poor macro definition:
|
|
3285
|
|
3286 @example
|
|
3287 #define MARK_OBJECT(obj) \
|
|
3288 if (!marked_p (obj)) mark_object (obj), did_mark = 1
|
|
3289 @end example
|
|
3290
|
|
3291 This macro evaluates its argument twice, and also fails if used like this, because the @code{else} pairs with the @code{if} hidden inside the macro:
|
|
3292 @example
|
|
3293 if (flag) MARK_OBJECT (obj); else do_something();
|
|
3294 @end example
|
|
3295
|
|
3296 A much better definition is
|
|
3297
|
|
3298 @example
|
|
3299 #define MARK_OBJECT(obj) do @{ \
|
|
3300 Lisp_Object mo_obj = (obj); \
|
|
3301 if (!marked_p (mo_obj)) \
|
|
3302 @{ \
|
|
3303 mark_object (mo_obj); \
|
|
3304 did_mark = 1; \
|
|
3305 @} \
|
|
3306 @} while (0)
|
|
3307 @end example
|
|
3308
|
|
3309 Notice the elimination of double evaluation by using the local variable
|
|
3310 with the obscure name. Writing safe and efficient macros requires great
|
|
3311 care. The one problem with macros that cannot be portably worked around
|
|
3312 is that, since a C block has no value, a macro used as an expression rather
|
|
3313 than a statement cannot use the techniques just described to avoid
|
|
3314 multiple evaluation.
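
The difference between the two @code{MARK_OBJECT} definitions can be
observed in a small stand-alone program.  The following is only a
sketch: @code{toy_marked}, @code{toy_mark_object}, @code{TOY_MARK_POOR}
and @code{TOY_MARK_SAFE} are hypothetical stand-ins invented for the
illustration, not XEmacs code.

```c
#include <assert.h>

/* Hypothetical stand-ins for the collector state; not XEmacs code. */
static int toy_marked[8];
static int did_mark;

static int toy_marked_p (int obj) { return toy_marked[obj]; }
static void toy_mark_object (int obj) { toy_marked[obj] = 1; }

/* Poor version: OBJ appears twice in the expansion, and the expansion
   is a bare `if' rather than a single statement. */
#define TOY_MARK_POOR(obj) \
  if (!toy_marked_p (obj)) toy_mark_object (obj), did_mark = 1

/* Better version: OBJ is evaluated once into a local variable, and the
   do ... while (0) wrapper makes the expansion act as one statement. */
#define TOY_MARK_SAFE(obj) do {                 \
  int tm_obj = (obj);                           \
  if (!toy_marked_p (tm_obj))                   \
    {                                           \
      toy_mark_object (tm_obj);                 \
      did_mark = 1;                             \
    }                                           \
} while (0)

/* Value of I after passing `i++' to the poor macro. */
static int
index_after_poor (void)
{
  int i = 0;
  TOY_MARK_POOR (i++);   /* i++ may run twice */
  return i;
}

/* Value of I after passing `i++' to the safe macro. */
static int
index_after_safe (void)
{
  int i = 0;
  TOY_MARK_SAFE (i++);   /* i++ runs exactly once */
  return i;
}
```

With a freshly zeroed @code{toy_marked} array, the poor macro advances
@code{i} twice and marks the wrong slot, while the safe macro advances
it once.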
|
|
3315
|
462
|
3316 @cindex inline functions
|
442
|
3317 In most cases where a macro has function semantics, an inline function
|
|
3318 is a better implementation technique. Modern compiler optimizers tend
|
|
3319 to inline functions even if they have no @code{inline} keyword, and
|
|
3320 configure magic ensures that the @code{inline} keyword can be safely
|
|
3321 used as an additional compiler hint. Inline functions used in a single
|
|
3322 @file{.c} file are easy. The function must already be defined to be
|
|
3323 @code{static}. Just add another @code{inline} keyword to the
|
|
3324 definition.
|
|
3325
|
|
3326 @example
|
|
3327 inline static int
|
|
3328 heavily_used_small_function (int arg)
|
|
3329 @{
|
|
3330 ...
|
|
3331 @}
|
|
3332 @end example
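
As a hedged illustration of why an inline function is the safer choice
when the result is used as an expression, compare a classic
maximum-style macro with a @code{static inline} equivalent.  All names
below (@code{TOY_MAX}, @code{toy_max}, @code{next_value}) are
hypothetical and exist only for this sketch.

```c
/* Expression macro: the larger argument is evaluated a second time. */
#define TOY_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Inline function: each argument is evaluated exactly once per call. */
static inline int
toy_max (int a, int b)
{
  return a > b ? a : b;
}

static int calls;   /* counts how often next_value () has run */

static int
next_value (void)
{
  return ++calls;
}

/* Number of next_value () calls caused by one use of the macro. */
static int
calls_made_by_macro (void)
{
  calls = 0;
  (void) TOY_MAX (next_value (), 0);
  return calls;
}

/* Number of next_value () calls caused by one use of the function. */
static int
calls_made_by_inline (void)
{
  calls = 0;
  (void) toy_max (next_value (), 0);
  return calls;
}
```

The macro cannot be repaired with a local variable here, because it has
to remain an expression; the inline function has no such restriction.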
|
|
3333
|
|
3334 Inline functions in header files are trickier, because we would like to
|
|
3335 make the following optimization if the function is @emph{not} inlined
|
|
3336 (for example, because we're compiling for debugging). We would like the
|
|
3337 function to be defined externally exactly once, and each calling
|
|
3338 translation unit would create an external reference to the function,
|
|
3339 instead of including a definition of the inline function in the object
|
|
3340 code of every translation unit that uses it. This optimization is
|
|
3341 currently only available for gcc. But you don't have to worry about the
|
|
3342 trickiness; just define your inline functions in header files using this
|
|
3343 pattern:
|
|
3344
|
|
3345 @example
|
|
3346 INLINE_HEADER int
|
|
3347 i_used_to_be_a_crufty_macro_but_look_at_me_now (int arg);
|
|
3348 INLINE_HEADER int
|
|
3349 i_used_to_be_a_crufty_macro_but_look_at_me_now (int arg)
|
|
3350 @{
|
|
3351 ...
|
|
3352 @}
|
|
3353 @end example
|
|
3354
|
|
3355 The declaration right before the definition is to prevent warnings when
|
|
3356 compiling with @code{gcc -Wmissing-declarations}. I consider issuing
|
|
3357 this warning for inline functions a gcc bug, but the gcc maintainers disagree.
|
|
3358
|
462
|
3359 @cindex inline functions, headers
|
|
3360 @cindex header files, inline functions
|
442
|
3361 Every header which contains inline functions, either directly by using
|
|
3362 @code{INLINE_HEADER} or indirectly by using @code{DECLARE_LRECORD}, must
|
|
3363 be added to @file{inline.c}'s includes to make the optimization
|
|
3364 described above work. (Optimization note: if all @code{INLINE_HEADER}
|
|
3365 functions are in fact inlined in all translation units, then the linker
|
|
3366 can just discard @file{inline.o}, since it contains only unreferenced code.)
|
|
3367
|
438
|
3368 To get started debugging XEmacs, take a look at the @file{.gdbinit} and
|
442
|
3369 @file{.dbxrc} files in the @file{src} directory. See the section in the
|
|
3370 XEmacs FAQ on How to Debug an XEmacs problem with a debugger.
|
428
|
3371
|
|
3372 After making source code changes, run @code{make check} to ensure that
|
442
|
3373 you haven't introduced any regressions. If you want to make XEmacs more
|
|
3374 reliable, please improve the test suite in @file{tests/automated}.
|
|
3375
|
|
3376 Did you make sure you didn't introduce any new compiler warnings?
|
|
3377
|
|
3378 Before submitting a patch, please try compiling at least once with
|
|
3379
|
|
3380 @example
|
607
|
3381 configure --with-mule --use-union-type --error-checking=all
|
442
|
3382 @end example
|
428
|
3383
|
|
3384 Here are things to know when you create a new source file:
|
|
3385
|
|
3386 @itemize @bullet
|
|
3387 @item
|
|
3388 All @file{.c} files should @code{#include <config.h>} first. Almost all
|
|
3389 @file{.c} files should @code{#include "lisp.h"} second.
|
|
3390
|
|
3391 @item
|
|
3392 Generated header files should be included using the @code{#include <...>} syntax,
|
|
3393 not the @code{#include "..."} syntax. The generated headers are:
|
|
3394
|
442
|
3395 @file{config.h sheap-adjust.h paths.h Emacs.ad.h}
|
428
|
3396
|
|
3397 The basic rule is that you should assume builds using @code{--srcdir}
|
|
3398 and that the @code{#include <...>} syntax needs to be used when the
|
|
3399 to-be-included generated file is in a potentially different directory
|
|
3400 @emph{at compile time}. The non-obvious C rule is that @code{#include "..."}
|
|
3401 means to search for the included file in the same directory as the
|
|
3402 including file, @emph{not} in the current directory.
|
|
3403
|
|
3404 @item
|
|
3405 Header files should @emph{not} include @code{<config.h>} and
|
|
3406 @code{"lisp.h"}. It is the responsibility of the @file{.c} files that
|
|
3407 use them to do so.
|
|
3408
|
|
3409 @end itemize
|
|
3410
|
462
|
3411 @cindex Lisp object types, creating
|
|
3412 @cindex creating Lisp object types
|
|
3413 @cindex object types, creating Lisp
|
442
|
3414 Here is a checklist of things to do when creating a new lisp object type
|
|
3415 named @var{foo}:
|
|
3416
|
|
3417 @enumerate
|
|
3418 @item
|
|
3419 create @var{foo}.h
|
|
3420 @item
|
|
3421 create @var{foo}.c
|
|
3422 @item
|
|
3423 add definitions of @code{syms_of_@var{foo}}, etc. to @file{@var{foo}.c}
|
|
3424 @item
|
|
3425 add declarations of @code{syms_of_@var{foo}}, etc. to @file{symsinit.h}
|
|
3426 @item
|
|
3427 add calls to @code{syms_of_@var{foo}}, etc. to @file{emacs.c}
|
|
3428 @item
|
|
3429 add definitions of macros like @code{CHECK_@var{FOO}} and
|
|
3430 @code{@var{FOO}P} to @file{@var{foo}.h}
|
|
3431 @item
|
|
3432 add the new type index to @code{enum lrecord_type}
|
|
3433 @item
|
|
3434 add a @code{DEFINE_LRECORD_IMPLEMENTATION} call to @file{@var{foo}.c}
|
|
3435 @item
|
|
3436 add an @code{INIT_LRECORD_IMPLEMENTATION} call to @code{syms_of_@var{foo}} in @file{@var{foo}.c}
|
|
3437 @end enumerate
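
The roles of the type index, the @code{@var{FOO}P} predicate and the
@code{CHECK_@var{FOO}} macro in the checklist above can be pictured with
a toy model.  This is a sketch under stated assumptions: every
@code{toy_} and @code{TOY_} name is hypothetical, and the real macros in
@file{lrecord.h} are defined differently (in particular, the real
check signals a Lisp error rather than bumping a counter).

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; the real definitions live in lrecord.h and differ. */
enum toy_lrecord_type
{
  toy_lrecord_type_symbol,
  toy_lrecord_type_foo,        /* the newly added type index */
  toy_lrecord_type_count
};

struct toy_lrecord_header
{
  enum toy_lrecord_type type;  /* every record-type object stores its type */
};

struct toy_foo
{
  struct toy_lrecord_header header;
  int payload;
};

/* Predicate in the spirit of FOOP: true if OBJ is a foo. */
#define TOY_FOOP(obj) ((obj)->type == toy_lrecord_type_foo)

/* Check in the spirit of CHECK_FOO.  This toy version only counts
   failures instead of signalling an error. */
static int toy_wrong_type_errors;

#define TOY_CHECK_FOO(obj) do {                         \
  const struct toy_lrecord_header *tc_obj = (obj);      \
  if (!TOY_FOOP (tc_obj))                               \
    toy_wrong_type_errors++;                            \
} while (0)

/* Constructor: stamps the object with its type index. */
static void
toy_init_foo (struct toy_foo *f, int payload)
{
  assert (f != NULL);
  f->header.type = toy_lrecord_type_foo;
  f->payload = payload;
}
```

Because the type is a field of the object itself, generic code can test
it without knowing anything else about @code{struct toy_foo}.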
|
428
|
3438
|
965
|
3439 @node Regression Testing XEmacs, CVS Techniques, Rules When Writing New C Code, Top
|
|
3440 @chapter Regression Testing XEmacs
|
|
3441 @cindex testing, regression
|
|
3442
|
|
3443 The source directory @file{tests/automated} contains XEmacs' automated
|
|
3444 test suite. The usual way of running all the tests is running
|
|
3445 @code{make check} from the top-level build directory.
|
|
3446
|
|
3447 The test suite is unfinished and it's still lacking some essential
|
|
3448 features. It is nevertheless recommended that you run the tests to
|
|
3449 confirm that XEmacs behaves correctly.
|
|
3450
|
|
3451 If you want to run a specific test case, you can do it from the
|
|
3452 command-line like this:
|
|
3453
|
|
3454 @example
|
|
3455 $ xemacs -batch -l test-harness.elc -f batch-test-emacs TEST-FILE
|
|
3456 @end example
|
|
3457
|
|
3458 If something goes wrong, you can run the test suite interactively by
|
|
3459 loading @file{test-harness.el} into a running XEmacs and typing
|
|
3460 @kbd{M-x test-emacs-test-file RET <filename> RET}. You will see a log of
|
|
3461 passed and failed tests, which should allow you to investigate the
|
|
3462 source of the error and ultimately fix the bug.
|
|
3463
|
|
3464 Adding a new test file is trivial: just create a new file in @file{tests/automated} and it
|
|
3465 will be run. There is no need to byte-compile any of the files in
|
|
3466 this directory---the test-harness will take care of any necessary
|
|
3467 byte-compilation.
|
|
3468
|
|
3469 Look at the existing test cases for examples of how to write tests.
|
|
3470 It all boils down to your imagination and judicious use of the macros
|
|
3471 @code{Assert}, @code{Check-Error}, @code{Check-Error-Message}, and
|
|
3472 @code{Check-Message}.
|
|
3473
|
|
3474 Here's a simple example checking case-sensitive and case-insensitive
|
|
3475 comparisons from @file{case-tests.el}.
|
|
3476
|
|
3477 @example
|
|
3478 (with-temp-buffer
|
|
3479 (insert "Test Buffer")
|
|
3480 (let ((case-fold-search t))
|
|
3481 (goto-char (point-min))
|
|
3482 (Assert (eq (search-forward "test buffer" nil t) 12))
|
|
3483 (goto-char (point-min))
|
|
3484 (Assert (eq (search-forward "Test buffer" nil t) 12))
|
|
3485 (goto-char (point-min))
|
|
3486 (Assert (eq (search-forward "Test Buffer" nil t) 12))
|
|
3487
|
|
3488 (setq case-fold-search nil)
|
|
3489 (goto-char (point-min))
|
|
3490 (Assert (not (search-forward "test buffer" nil t)))
|
|
3491 (goto-char (point-min))
|
|
3492 (Assert (not (search-forward "Test buffer" nil t)))
|
|
3493 (goto-char (point-min))
|
|
3494 (Assert (eq (search-forward "Test Buffer" nil t) 12))))
|
|
3495 @end example
|
|
3496
|
|
3497 This example could be inserted in a file in @file{tests/automated}, and
|
|
3498 it would be a complete test, automatically executed when you run
|
|
3499 @kbd{make check} after building XEmacs. More complex tests may require
|
|
3500 substantial temporary scaffolding to create the environment that elicits
|
|
3501 the bugs, but the top-level Makefile and @file{test-harness.el} handle
|
|
3502 the running and collection of results from the @code{Assert},
|
|
3503 @code{Check-Error}, @code{Check-Error-Message}, and @code{Check-Message}
|
|
3504 macros.
|
|
3505
|
|
3506 @node CVS Techniques, A Summary of the Various XEmacs Modules, Regression Testing XEmacs, Top
|
802
|
3507 @chapter CVS Techniques
|
|
3508 @cindex CVS techniques
|
|
3509
|
|
3510 @menu
|
|
3511 * Merging a Branch into the Trunk::
|
|
3512 @end menu
|
|
3513
|
|
3514 @node Merging a Branch into the Trunk
|
|
3515 @section Merging a Branch into the Trunk
|
|
3516 @cindex merging a branch into the trunk
|
|
3517
|
|
3518 @enumerate
|
|
3519 @item
|
|
3520 If you haven't already done a merge, you will be merging from the branch
|
|
3521 point; otherwise you'll be merging from the last merge point, which
|
|
3522 should be marked by a tag, e.g. @samp{last-sync-ben-mule-21-5}. In the
|
|
3523 former case, create the last-sync tag, e.g.
|
|
3524
|
|
3525 @example
|
|
3526 crw rtag -r ben-mule-21-5-bp last-sync-ben-mule-21-5 xemacs
|
|
3527 @end example
|
|
3528
|
|
3529 (You did create a branch point tag when you created the branch, didn't
|
|
3530 you?)
|
|
3531
|
|
3532 @item
|
|
3533 Check everything in on your branch.
|
|
3534
|
|
3535 @item
|
|
3536 Tag your branch with a pre-sync tag, e.g.
|
|
3537
|
|
3538 @example
|
|
3539 crw rtag -r ben-mule-21-5 ben-mule-21-5-pre-feb-20-2002-sync xemacs
|
|
3540 @end example
|
|
3541
|
|
3542 Note, you need to use rtag and specify a version with @samp{-r} (use
|
|
3543 @samp{-r HEAD} if necessary) so that removed files are handled correctly
|
|
3544 in some obscure cases. See section 4.8 of the CVS manual.
|
|
3545
|
|
3546 @item
|
|
3547 Tag the trunk so you have a stable place to merge up to in case people
|
|
3548 are asynchronously committing to the trunk, e.g.
|
|
3549
|
|
3550 @example
|
|
3551 crw rtag -r HEAD main-branch-ben-mule-21-5-syncpoint-feb-20-2002 xemacs
|
|
3552 crw rtag -F -r main-branch-ben-mule-21-5-syncpoint-feb-20-2002 next-sync-ben-mule-21-5 xemacs
|
|
3553 @end example
|
|
3554
|
|
3555 Use @samp{-F} in the second case because the name might already exist, e.g. if
|
|
3556 you've already done a merge. We make two tags because one is a
|
|
3557 permanent mark indicating a syncpoint when merging, and the other is a
|
|
3558 symbolic tag to make other operations easier.
|
|
3559
|
|
3560 @item
|
|
3561 Make a backup of your source tree (not totally necessary but useful for
|
|
3562 reference and peace of mind): Move one level up from the top directory
|
|
3563 of your branch and do, e.g.
|
|
3564
|
|
3565 @example
|
|
3566 cp -a mule mule-backup-2-23-02
|
|
3567 @end example
|
|
3568
|
|
3569 @item
|
|
3570 Now, we're ready to merge! Make sure you're in the top directory of
|
|
3571 your branch and do, e.g.
|
|
3572
|
|
3573 @example
|
|
3574 cvs update -j last-sync-ben-mule-21-5 -j next-sync-ben-mule-21-5
|
|
3575 @end example
|
|
3576
|
|
3577 @item
|
|
3578 Fix all merge conflicts. Get the sucker to compile and run.
|
|
3579
|
|
3580 @item
|
|
3581 Tag your branch with a post-sync tag, e.g.
|
|
3582
|
|
3583 @example
|
|
3584 crw rtag -r ben-mule-21-5 ben-mule-21-5-post-feb-20-2002-sync xemacs
|
|
3585 @end example
|
|
3586
|
|
3587 @item
|
|
3588 Update the last-sync tag, e.g.
|
|
3589
|
|
3590 @example
|
|
3591 crw rtag -F -r next-sync-ben-mule-21-5 last-sync-ben-mule-21-5 xemacs
|
|
3592 @end example
|
|
3593 @end enumerate
|
|
3594
|
|
3595
|
|
3596 @node A Summary of the Various XEmacs Modules, Allocation of Objects in XEmacs Lisp, CVS Techniques, Top
|
428
|
3597 @chapter A Summary of the Various XEmacs Modules
|
462
|
3598 @cindex modules, a summary of the various XEmacs
|
428
|
3599
|
|
3600 This is accurate as of XEmacs 20.0.
|
|
3601
|
|
3602 @menu
|
|
3603 * Low-Level Modules::
|
|
3604 * Basic Lisp Modules::
|
|
3605 * Modules for Standard Editing Operations::
|
|
3606 * Editor-Level Control Flow Modules::
|
|
3607 * Modules for the Basic Displayable Lisp Objects::
|
|
3608 * Modules for other Display-Related Lisp Objects::
|
|
3609 * Modules for the Redisplay Mechanism::
|
|
3610 * Modules for Interfacing with the File System::
|
|
3611 * Modules for Other Aspects of the Lisp Interpreter and Object System::
|
|
3612 * Modules for Interfacing with the Operating System::
|
|
3613 * Modules for Interfacing with X Windows::
|
|
3614 * Modules for Internationalization::
|
965
|
3615 * Modules for Regression Testing::
|
428
|
3616 @end menu
|
|
3617
|
462
|
3618 @node Low-Level Modules
|
428
|
3619 @section Low-Level Modules
|
462
|
3620 @cindex low-level modules
|
|
3621 @cindex modules, low-level
|
428
|
3622
|
|
3623 @example
|
|
3624 config.h
|
|
3625 @end example
|
|
3626
|
|
3627 This is automatically generated from @file{config.h.in} based on the
|
|
3628 results of configure tests and user-selected optional features and
|
|
3629 contains preprocessor definitions specifying the nature of the
|
|
3630 environment in which XEmacs is being compiled.
|
|
3631
|
|
3632
|
|
3633
|
|
3634 @example
|
|
3635 paths.h
|
|
3636 @end example
|
|
3637
|
|
3638 This is automatically generated from @file{paths.h.in} based on supplied
|
|
3639 configure values, and allows for non-standard installed configurations
|
|
3640 of the XEmacs directories. It's currently broken, though.
|
|
3641
|
|
3642
|
|
3643
|
|
3644 @example
|
|
3645 emacs.c
|
|
3646 signal.c
|
|
3647 @end example
|
|
3648
|
|
3649 @file{emacs.c} contains @code{main()} and other code that performs the most
|
|
3650 basic environment initializations and handles shutting down the XEmacs
|
|
3651 process (this includes @code{kill-emacs}, the normal way that XEmacs is
|
|
3652 exited; @code{dump-emacs}, which is used during the build process to
|
|
3653 write out the XEmacs executable; @code{run-emacs-from-temacs}, which can
|
|
3654 be used to start XEmacs directly when temacs has finished loading all
|
|
3655 the Lisp code; and emergency code to handle crashes [XEmacs tries to
|
|
3656 auto-save all files before it crashes]).
|
|
3657
|
|
3658 Low-level code that directly interacts with the Unix signal mechanism,
|
|
3659 however, is in @file{signal.c}. Note that this code does not handle system
|
|
3660 dependencies in interfacing to signals; that is handled using the
|
|
3661 @file{syssignal.h} header file, described in @ref{Modules for Interfacing with the Operating System}.
|
|
3662
|
|
3663
|
|
3664
|
|
3665 @example
|
|
3666 unexaix.c
|
|
3667 unexalpha.c
|
|
3668 unexapollo.c
|
|
3669 unexconvex.c
|
|
3670 unexec.c
|
|
3671 unexelf.c
|
|
3672 unexelfsgi.c
|
|
3673 unexencap.c
|
|
3674 unexenix.c
|
|
3675 unexfreebsd.c
|
|
3676 unexfx2800.c
|
|
3677 unexhp9k3.c
|
|
3678 unexhp9k800.c
|
|
3679 unexmips.c
|
|
3680 unexnext.c
|
|
3681 unexsol2.c
|
|
3682 unexsunos4.c
|
|
3683 @end example
|
|
3684
|
|
3685 These modules contain code for dumping out the XEmacs executable on various
|
|
3686 different systems. (This process is highly machine-specific and
|
|
3687 requires intimate knowledge of the executable format and the memory map
|
|
3688 of the process.) Only one of these modules is actually used; this is
|
|
3689 chosen by @file{configure}.
|
|
3690
|
|
3691
|
|
3692
|
|
3693 @example
|
442
|
3694 ecrt0.c
|
428
|
3695 lastfile.c
|
|
3696 pre-crt0.c
|
|
3697 @end example
|
|
3698
|
|
3699 These modules are used in conjunction with the dump mechanism. On some
|
|
3700 systems, an alternative version of the C startup code (the actual code
|
|
3701 that receives control from the operating system when the process is
|
|
3702 started, and which calls @code{main()}) is required so that the dumping
|
|
3703 process works properly; @file{ecrt0.c} provides this.
|
|
3704
|
|
3705 @file{pre-crt0.c} and @file{lastfile.c} should be the very first and
|
|
3706 very last file linked, respectively. (Actually, this is not really true.
|
|
3707 @file{lastfile.c} should be after all Emacs modules whose initialized
|
|
3708 data should be made constant, and before all other Emacs files and all
|
|
3709 libraries. In particular, the allocation modules @file{gmalloc.c},
|
|
3710 @file{alloca.c}, etc. are normally placed past @file{lastfile.c}, and
|
|
3711 all of the files that implement Xt widget classes @emph{must} be placed
|
|
3712 after @file{lastfile.c} because they contain various structures that
|
|
3713 must be statically initialized and into which Xt writes at various
|
|
3714 times.) @file{pre-crt0.c} and @file{lastfile.c} contain exported symbols
|
|
3715 that are used to determine the start and end of XEmacs' initialized
|
|
3716 data space when dumping.
|
|
3717
|
|
3718
|
|
3719
|
|
3720 @example
|
|
3721 alloca.c
|
|
3722 free-hook.c
|
|
3723 getpagesize.h
|
|
3724 gmalloc.c
|
|
3725 malloc.c
|
|
3726 mem-limits.h
|
|
3727 ralloc.c
|
|
3728 vm-limit.c
|
|
3729 @end example
|
|
3730
|
|
3731 These handle basic C allocation of memory. @file{alloca.c} is an emulation of
|
|
3732 the stack allocation function @code{alloca()} on machines that lack
|
|
3733 this. (XEmacs makes extensive use of @code{alloca()} in its code.)
|
|
3734
|
|
3735 @file{gmalloc.c} and @file{malloc.c} are two implementations of the standard C
|
|
3736 functions @code{malloc()}, @code{realloc()} and @code{free()}. They are
|
|
3737 often used in place of the standard system-provided @code{malloc()}
|
|
3738 because they usually provide a much faster implementation, at the
|
|
3739 expense of additional memory use. @file{gmalloc.c} is a newer implementation
|
|
3740 that is much more memory-efficient for large allocations than @file{malloc.c},
|
|
3741 and should always be preferred if it works. (At one point, @file{gmalloc.c}
|
|
3742 didn't work on some systems where @file{malloc.c} worked; but this should be
|
|
3743 fixed now.)
|
|
3744
|
|
3745 @cindex relocating allocator
|
|
3746 @file{ralloc.c} is the @dfn{relocating allocator}. It provides
|
|
3747 functions similar to @code{malloc()}, @code{realloc()} and @code{free()}
|
|
3748 that allocate memory that can be dynamically relocated in memory. The
|
|
3749 advantage of this is that allocated memory can be shuffled around to
|
|
3750 place all the free memory at the end of the heap, and the heap can then
|
|
3751 be shrunk, releasing the memory back to the operating system. The use
|
|
3752 of this can be controlled with the configure option @code{--rel-alloc};
|
|
3753 if enabled, memory allocated for buffers will be relocatable, so that if
|
|
3754 a very large file is visited and the buffer is later killed, the memory
|
|
3755 can be released to the operating system. (The disadvantage of this
|
|
3756 mechanism is that it can be very slow. On systems with the
|
|
3757 @code{mmap()} system call, the XEmacs version of @file{ralloc.c} uses
|
|
3758 this to move memory around without actually having to block-copy it,
|
|
3759 which can speed things up; but it can still cause noticeable performance
|
|
3760 degradation.)
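
The idea of relocatable memory can be sketched with a toy allocator
that remembers where each caller keeps its pointer, so the pointer can
be updated when the block moves.  This is an illustration only: the
fixed arena, the @code{toy_r_alloc} and @code{toy_r_free} names and
their signatures are assumptions made for the sketch, not the interface
of @file{ralloc.c}, and alignment is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy relocating allocator over a fixed arena; illustrative only. */
#define TOY_ARENA_SIZE 1024
#define TOY_MAX_BLOCKS 16

struct toy_rblock
{
  void **handle;   /* address of the caller's pointer variable */
  size_t size;
};

static char toy_arena[TOY_ARENA_SIZE];
static struct toy_rblock toy_blocks[TOY_MAX_BLOCKS];  /* in address order */
static int toy_nblocks;
static size_t toy_arena_used;

/* Allocate SIZE bytes and store the address in *HANDLE.  The allocator
   remembers HANDLE so it can fix up the pointer if the block moves.
   Returns 0 on success, -1 if there is no room. */
static int
toy_r_alloc (void **handle, size_t size)
{
  if (toy_nblocks == TOY_MAX_BLOCKS
      || size > TOY_ARENA_SIZE - toy_arena_used)
    return -1;
  *handle = toy_arena + toy_arena_used;
  toy_blocks[toy_nblocks].handle = handle;
  toy_blocks[toy_nblocks].size = size;
  toy_nblocks++;
  toy_arena_used += size;
  return 0;
}

/* Free the block at *HANDLE, then slide every later block down so that
   all free space ends up at the end of the arena. */
static void
toy_r_free (void **handle)
{
  int i, j;
  size_t freed;

  for (i = 0; i < toy_nblocks; i++)
    if (toy_blocks[i].handle == handle)
      break;
  assert (i < toy_nblocks);   /* HANDLE must come from toy_r_alloc */
  freed = toy_blocks[i].size;

  for (j = i + 1; j < toy_nblocks; j++)
    {
      char *old = *toy_blocks[j].handle;
      memmove (old - freed, old, toy_blocks[j].size);
      *toy_blocks[j].handle = old - freed;   /* relocate caller's pointer */
      toy_blocks[j - 1] = toy_blocks[j];
    }
  toy_nblocks--;
  toy_arena_used -= freed;
  *handle = NULL;
}
```

The block copy in @code{toy_r_free} is the cost that the paragraph
above warns about: every block after the freed one has to be moved.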
|
|
3761
|
|
3762 @file{free-hook.c} contains some debugging functions for checking for invalid
|
|
3763 arguments to @code{free()}.
|
|
3764
|
|
3765 @file{vm-limit.c} contains some functions that warn the user when memory is
|
|
3766 getting low. These are callback functions that are called by @file{gmalloc.c}
|
|
3767 and @file{malloc.c} at appropriate times.
|
|
3768
|
|
3769 @file{getpagesize.h} provides a uniform interface for retrieving the size of a
|
|
3770 page in virtual memory. @file{mem-limits.h} provides a uniform interface for
|
|
3771 retrieving the total amount of available virtual memory. Both are
|
|
3772 similar in spirit to the @file{sys*.h} files described in @ref{Modules for Interfacing with the Operating System}.
|
|
3773
|
|
3774
|
|
3775
|
|
3776 @example
|
|
3777 blocktype.c
|
|
3778 blocktype.h
|
|
3779 dynarr.c
|
|
3780 @end example
|
|
3781
|
|
3782 These implement a couple of basic C data types to facilitate memory
|
|
3783 allocation. The @code{Blocktype} type efficiently manages the
|
|
3784 allocation of fixed-size blocks by minimizing the number of times that
|
|
3785 @code{malloc()} and @code{free()} are called. It allocates memory in
|
|
3786 large chunks, subdivides the chunks into blocks of the proper size, and
|
|
3787 returns the blocks as requested. When blocks are freed, they are placed
|
|
3788 onto a linked list, so they can be efficiently reused. This data type
|
|
3789 is not much used in XEmacs currently, because it's a fairly new
|
|
3790 addition.
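
The allocation strategy just described can be sketched as a small
fixed-size block pool.  The names below (@code{toy_blocktype_alloc} and
friends) and the chunk size are hypothetical; this is not the
@code{Blocktype} interface from @file{blocktype.h}, and for brevity the
chunks are never returned to @code{malloc()}.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy fixed-size block pool; names and layout are illustrative only. */
#define TOY_BLOCKS_PER_CHUNK 64

struct toy_free_block { struct toy_free_block *next; };

struct toy_blocktype
{
  size_t block_size;                 /* size of each block, in bytes */
  struct toy_free_block *free_list;  /* blocks available for reuse */
  int chunks_allocated;              /* number of malloc () calls made */
};

static void
toy_blocktype_init (struct toy_blocktype *bt, size_t block_size)
{
  /* Round up so that a free block can hold the free-list link and
     stays pointer-aligned inside the chunk. */
  size_t unit = sizeof (struct toy_free_block);
  if (block_size < unit)
    block_size = unit;
  bt->block_size = (block_size + unit - 1) / unit * unit;
  bt->free_list = NULL;
  bt->chunks_allocated = 0;
}

static void *
toy_blocktype_alloc (struct toy_blocktype *bt)
{
  struct toy_free_block *b;

  if (bt->free_list == NULL)
    {
      /* Out of blocks: get one large chunk and thread it onto the list. */
      char *chunk = malloc (bt->block_size * TOY_BLOCKS_PER_CHUNK);
      int i;
      if (chunk == NULL)
        return NULL;
      bt->chunks_allocated++;
      for (i = 0; i < TOY_BLOCKS_PER_CHUNK; i++)
        {
          b = (struct toy_free_block *) (chunk + i * bt->block_size);
          b->next = bt->free_list;
          bt->free_list = b;
        }
    }
  b = bt->free_list;
  bt->free_list = b->next;
  return b;
}

/* Freed blocks go back on the list; malloc ()/free () are not involved. */
static void
toy_blocktype_free (struct toy_blocktype *bt, void *block)
{
  struct toy_free_block *b = block;
  assert (b != NULL);
  b->next = bt->free_list;
  bt->free_list = b;
}
```

A block that has just been freed is the first one handed out again, so
a free followed by an allocation costs only a few pointer operations.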
|
|
3791
|
|
3792 @cindex dynamic array
|
|
3793 The @code{Dynarr} type implements a @dfn{dynamic array}, which is
|
|
3794 similar to a standard C array but has no fixed limit on the number of
|
|
3795 elements it can contain. Dynamic arrays can hold elements of any type,
|
|
3796 and when you add a new element, the array automatically resizes itself
|
|
3797 if it isn't big enough. Dynarrs are extensively used in the redisplay
|
|
3798 mechanism.
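
A minimal growable array in the same spirit might look as follows.  The
@code{toy_dynarr} names, the doubling growth policy and the initial
capacity of 8 are assumptions made for this sketch; the real
@code{Dynarr} interface in @file{dynarr.c} is not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy growable array; not the real Dynarr interface. */
struct toy_dynarr
{
  void *base;       /* element storage */
  size_t elsize;    /* size of one element, in bytes */
  size_t count;     /* elements in use */
  size_t capacity;  /* elements allocated */
};

static void
toy_dynarr_init (struct toy_dynarr *d, size_t elsize)
{
  d->base = NULL;
  d->elsize = elsize;
  d->count = 0;
  d->capacity = 0;
}

/* Append a copy of *EL, growing the storage if it is full.
   Returns 0 on success, -1 if memory is exhausted. */
static int
toy_dynarr_add (struct toy_dynarr *d, const void *el)
{
  if (d->count == d->capacity)
    {
      size_t new_capacity = d->capacity ? d->capacity * 2 : 8;
      void *new_base = realloc (d->base, new_capacity * d->elsize);
      if (new_base == NULL)
        return -1;
      d->base = new_base;
      d->capacity = new_capacity;
    }
  memcpy ((char *) d->base + d->count * d->elsize, el, d->elsize);
  d->count++;
  return 0;
}

/* Address of element I, which must be in range. */
static void *
toy_dynarr_at (const struct toy_dynarr *d, size_t i)
{
  assert (i < d->count);
  return (char *) d->base + i * d->elsize;
}

static void
toy_dynarr_free (struct toy_dynarr *d)
{
  free (d->base);
  toy_dynarr_init (d, d->elsize);
}
```

Storing the element size in the array is what lets one implementation
hold elements of any type, at the price of untyped @code{void *}
accessors.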
|
|
3799
|
|
3800
|
|
3801
|
|
3802 @example
|
|
3803 inline.c
|
|
3804 @end example
|
|
3805
|
|
3806 This module is used in connection with inline functions (available in
|
|
3807 some compilers). Often, inline functions need to have a corresponding
|
|
3808 non-inline function that does the same thing. This module is where they
|
|
3809 reside. It contains no actual code, but defines some special flags that
|
|
3810 cause inline functions defined in header files to be rendered as actual
|
|
3811 functions. It then includes all header files that contain any inline
|
|
3812 function definitions, so that each one gets a real function equivalent.
|
|
3813
|
|
3814
|
|
3815
|
|
3816 @example
|
|
3817 debug.c
|
|
3818 debug.h
|
|
3819 @end example
|
|
3820
|
|
3821 These functions provide a system for doing internal consistency checks
|
|
3822 during code development. This system is not currently used; instead the
|
|
3823 simpler @code{assert()} macro is used along with the various checks
|
|
3824 provided by the @samp{--error-check-*} configuration options.
|
|
3825
|
|
3826
|
|
3827
|
|
3828 @example
|
|
3829 universe.h
|
|
3830 @end example
|
|
3831
|
|
3832 This is not currently used.
|
|
3833
|
|
3834
|
|
3835
|
462
|
3836 @node Basic Lisp Modules
|
428
|
3837 @section Basic Lisp Modules
|
462
|
3838 @cindex Lisp modules, basic
|
|
3839 @cindex modules, basic Lisp
|
428
|
3840
|
|
3841 @example
|
|
3842 lisp-disunion.h
|
|
3843 lisp-union.h
|
|
3844 lisp.h
|
|
3845 lrecord.h
|
|
3846 symsinit.h
|
|
3847 @end example
|
|
3848
|
|
3849 These are the basic header files for all XEmacs modules. Each module
|
|
3850 includes @file{lisp.h}, which brings the other header files in.
|
|
3851 @file{lisp.h} contains the definitions of the structures and extractor
|
|
3852 and constructor macros for the basic Lisp objects and various other
|
|
3853 basic definitions for the Lisp environment, as well as some
|
|
3854 general-purpose definitions (e.g. @code{min()} and @code{max()}).
|
|
3855 @file{lisp.h} includes either @file{lisp-disunion.h} or
|
|
3856 @file{lisp-union.h}, depending on whether @code{USE_UNION_TYPE} is
|
|
3857 defined. These files define the typedef of the Lisp object itself (as
|
|
3858 described above) and the low-level macros that hide the actual
|
|
3859 implementation of the Lisp object. All extractor and constructor macros
|
|
3860 for particular types of Lisp objects are defined in terms of these
|
|
3861 low-level macros.
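
How a pair of low-level macros can hide two different representations
of the same object may be easier to see in a toy example.  The tag
layout, the @code{Toy_Object} type and the @code{TOY_USE_UNION_TYPE}
switch below are all hypothetical; the actual encodings in
@file{lisp-union.h} and @file{lisp-disunion.h} are different.  The
sketch also assumes that right-shifting a negative signed integer is an
arithmetic shift, which holds for common compilers but is not
guaranteed by the C standard.

```c
#include <stdint.h>

/* Toy encoding: the low 2 bits are a type tag, the rest is the value.
   The real XEmacs tag layout is different. */
enum toy_tag { TOY_TAG_RECORD = 0, TOY_TAG_INT = 1, TOY_TAG_CHAR = 2 };

#ifdef TOY_USE_UNION_TYPE
/* Union flavor: the object is a distinct aggregate type. */
typedef union { uintptr_t bits; } Toy_Object;
# define TOY_BITS(obj) ((obj).bits)
static Toy_Object
toy_wrap (uintptr_t bits)
{
  Toy_Object o;
  o.bits = bits;
  return o;
}
#else
/* Plain-integer flavor: the object is just a machine word. */
typedef uintptr_t Toy_Object;
# define TOY_BITS(obj) (obj)
static Toy_Object
toy_wrap (uintptr_t bits)
{
  return bits;
}
#endif

/* Everything below is written only in terms of TOY_BITS and toy_wrap,
   so it compiles unchanged with either representation. */
#define TOY_TAG(obj)   ((enum toy_tag) (TOY_BITS (obj) & 3))
#define TOY_INTP(obj)  (TOY_TAG (obj) == TOY_TAG_INT)
#define TOY_CHARP(obj) (TOY_TAG (obj) == TOY_TAG_CHAR)

/* Constructor: pack N together with the integer tag. */
static Toy_Object
toy_make_int (intptr_t n)
{
  return toy_wrap (((uintptr_t) n << 2) | TOY_TAG_INT);
}

/* Extractor: drop the tag bits (arithmetic shift assumed). */
static intptr_t
toy_int_value (Toy_Object obj)
{
  return (intptr_t) TOY_BITS (obj) >> 2;
}
```

Only @code{TOY_BITS} and @code{toy_wrap} know which representation is
in use; the predicates, constructor and extractor do not change.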
|
|
3862
|
|
3863 As a general rule, all typedefs should go into the typedefs section of
|
|
3864 @file{lisp.h} rather than into a module-specific header file even if the
|
|
3865 structure is defined elsewhere. This allows function prototypes that
|
|
3866 use the typedef to be placed into other header files. Forward structure
|
|
3867 declarations (i.e. a simple declaration like @code{struct foo;} where
|
|
3868 the structure itself is defined elsewhere) should be placed into the
|
|
3869 typedefs section as necessary.
|
|
3870
|
|
3871 @file{lrecord.h} contains the basic structures and macros that implement
|
440
|
3872 all record-type Lisp objects---i.e. all objects whose type is a field
|
428
|
3873 in their C structure, which includes all objects except the few most
|
|
3874 basic ones.
|
|
3875
|
|
3876 @file{lisp.h} contains prototypes for most of the exported functions in
|
|
3877 the various modules. Lisp primitives defined using @code{DEFUN} that
|
|
3878 need to be called by C code should be declared using @code{EXFUN}.
|
|
3879 Other function prototypes should be placed either into the appropriate
|
|
3880 section of @file{lisp.h}, or into a module-specific header file,
|
|
3881 depending on how general-purpose the function is and whether it has
|
|
3882 special-purpose argument types requiring definitions not in
|
|
3883 @file{lisp.h}. All initialization functions are prototyped in
|
|
3884 @file{symsinit.h}.
|
|
3885
|
|
3886
|
|
3887
|
|
3888 @example
|
|
3889 alloc.c
|
|
3890 @end example
|
|
3891
|
|
3892 The large module @file{alloc.c} implements all of the basic allocation and
|
|
3893 garbage collection for Lisp objects. The most commonly used Lisp
|
|
3894 objects are allocated in chunks, similar to the Blocktype data type
|
|
3895 described above; others are allocated in individually @code{malloc()}ed
|
|
3896 blocks. This module provides the foundation on which all other aspects
|
|
3897 of the Lisp environment sit, and is the first module initialized at
|
|
3898 startup.
|
|
3899
|
|
3900 Note that @file{alloc.c} provides a series of generic functions that are
|
|
3901 not dependent on any particular object type, and interfaces to
|
|
3902 particular types of objects using a standardized interface of
|
|
3903 type-specific methods. This scheme is a fundamental principle of
|
|
3904 object-oriented programming and is heavily used throughout XEmacs. The
|
|
3905 great advantage of this is that it allows for a clean separation of
|
440
|
3906 functionality into different modules---new classes of Lisp objects, new
|
428
|
3907 event interfaces, new device types, new stream interfaces, etc. can be
|
|
3908 added transparently without affecting code anywhere else in XEmacs.
|
|
3909 Because the different subsystems are divided into general and specific
|
|
3910 code, adding a new subtype within a subsystem will in general not
|
|
3911 require changes to the generic subsystem code or affect any of the other
|
|
3912 subtypes in the subsystem; this provides a great deal of robustness to
|
|
3913 the XEmacs code.
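
The split between generic code and type-specific methods can be
sketched with a table of function pointers.  Everything here is a toy:
the @code{toy_type_methods} structure, its fields and the
@code{toy-int} subtype are invented for the illustration and do not
match the real per-type structures used by @file{alloc.c}.

```c
#include <assert.h>
#include <stdio.h>

/* Toy method table; the real per-type structure has different
   fields and names. */
struct toy_object;

struct toy_type_methods
{
  const char *name;
  void (*mark) (struct toy_object *obj);   /* may be NULL */
  int (*print) (const struct toy_object *obj, char *buf, size_t buflen);
};

struct toy_object
{
  const struct toy_type_methods *methods;  /* selects the subtype */
  int marked;
  int value;
};

/* Generic code: knows nothing about any particular type. */
static void
toy_generic_mark (struct toy_object *obj)
{
  if (!obj->marked)
    {
      obj->marked = 1;
      if (obj->methods->mark)
        obj->methods->mark (obj);   /* let the type mark its children */
    }
}

static int
toy_generic_print (const struct toy_object *obj, char *buf, size_t buflen)
{
  assert (obj->methods->print != NULL);
  return obj->methods->print (obj, buf, buflen);
}

/* Type-specific code for one hypothetical subtype. */
static int
toy_int_print (const struct toy_object *obj, char *buf, size_t buflen)
{
  return snprintf (buf, buflen, "#<toy-int %d>", obj->value);
}

static const struct toy_type_methods toy_int_methods =
  { "toy-int", NULL, toy_int_print };
```

Adding another subtype means writing another method table; neither
@code{toy_generic_mark} nor @code{toy_generic_print} has to change.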
|
|
3914
|
|
3915
|
|
3916 @example
|
|
3917 eval.c
|
|
3918 backtrace.h
|
|
3919 @end example
|
|
3920
|
|
3921 This module contains all of the functions to handle the flow of control.
|
|
3922 This includes the mechanisms of defining functions, calling functions,
|
|
3923 traversing stack frames, and binding variables; the control primitives
|
|
3924 and other special forms such as @code{while}, @code{if}, @code{eval},
|
|
3925 @code{let}, @code{and}, @code{or}, @code{progn}, etc.; handling of
|
|
3926 non-local exits, unwind-protects, and exception handlers; entering the
|
|
3927 debugger; methods for the subr Lisp object type; etc. It does
|
|
3928 @emph{not} include the @code{read} function, the @code{print} function,
|
|
3929 or the handling of symbols and obarrays.
|
|
3930
|
|
3931 @file{backtrace.h} contains some structures related to stack frames and the
|
|
3932 flow of control.
|
|
3933
|
|
3934
|
|
3935
|
|
3936 @example
|
|
3937 lread.c
|
|
3938 @end example
|
|
3939
|
|
3940 This module implements the Lisp reader and the @code{read} function,
|
|
3941 which converts text into Lisp objects, according to the read syntax of
|
|
3942 the objects, as described above. This is similar to the parser that is
|
|
3943 a part of all compilers.
|
|
3944
|
|
3945
|
|
3946
|
|
3947 @example
|
|
3948 print.c
|
|
3949 @end example
|
|
3950
|
|
3951 This module implements the Lisp print mechanism and the @code{print}
|
|
3952 function and related functions. This is the inverse of the Lisp reader:
|
|
3953 it converts Lisp objects to a printed, textual representation.
|
|
3954 (Hopefully something that can be read back in using @code{read} to get
|
|
3955 an equivalent object.)
|
|
3956
|
|
3957
|
|
3958
|
|
3959 @example
|
|
3960 general.c
|
|
3961 symbols.c
|
|
3962 symeval.h
|
|
3963 @end example
|
|
3964
|
|
3965 @file{symbols.c} implements the handling of symbols, obarrays, and
|
|
3966 retrieving the values of symbols. Much of the code is devoted to
|
|
3967 handling the special @dfn{symbol-value-magic} objects that define
|
440
|
3968 special types of variables---this includes buffer-local variables,
|
428
|
3969 variable aliases, variables that forward into C variables, etc. This
|
|
3970 module is initialized extremely early (right after @file{alloc.c}),
|
|
3971 because it is here that the basic symbols @code{t} and @code{nil} are
|
|
3972 created, and those symbols are used everywhere throughout XEmacs.
|
|
3973
|
|
3974 @file{symeval.h} contains the definitions of symbol structures and the
|
|
3975 @code{DEFVAR_LISP()} and related macros for declaring variables.
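
The notion of a variable that forwards into a C variable can be
illustrated with a toy symbol structure.  This is a hedged sketch: the
@code{toy_symbol} layout and the @code{toy_gc_threshold} variable are
hypothetical, integer values stand in for Lisp objects, and the real
symbol-value-magic machinery in @file{symbols.c} is considerably more
involved.

```c
#include <stddef.h>

/* Toy symbol: either holds its own value or forwards to a C variable. */
struct toy_symbol
{
  const char *name;
  int value;      /* used when FORWARD is NULL */
  int *forward;   /* if non-NULL, the C variable that holds the value */
};

/* Read the symbol's value, following the forwarding pointer if any. */
static int
toy_symbol_value (const struct toy_symbol *sym)
{
  return sym->forward != NULL ? *sym->forward : sym->value;
}

/* Set the symbol's value; for a forwarding symbol the C variable is
   written, so C code sees the change immediately. */
static void
toy_set (struct toy_symbol *sym, int new_value)
{
  if (sym->forward != NULL)
    *sym->forward = new_value;
  else
    sym->value = new_value;
}

/* A C variable exposed through a symbol, loosely in the spirit of the
   DEFVAR_LISP () family of macros. */
static int toy_gc_threshold = 1000;
static struct toy_symbol toy_sym_gc_threshold =
  { "toy-gc-threshold", 0, &toy_gc_threshold };
```

Both directions stay in sync because there is only one storage
location: the symbol never keeps a private copy of a forwarded value.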
|
|
3976
|
|
3977
|
|
3978
|
|
3979 @example
|
|
3980 data.c
|
|
3981 floatfns.c
|
|
3982 fns.c
|
|
3983 @end example
|
|
3984
|
|
3985 These modules implement the methods and standard Lisp primitives for all
|
|
3986 the basic Lisp object types other than symbols (which are described
|
|
3987 above). @file{data.c} contains all the predicates (primitives that return
|
|
3988 whether an object is of a particular type); the integer arithmetic
|
|
3989 functions; and the basic accessor and mutator primitives for the various
|
|
3990 object types. @file{fns.c} contains all the standard primitives for working
|
|
3991 with sequences (where, abstractly speaking, a sequence is an ordered set
|
|
3992 of objects, and can be represented by a list, string, vector, or
|
|
3993 bit-vector); it also contains @code{equal}, perhaps on the grounds that
|
|
3994 the bulk of the operation of @code{equal} is comparing sequences.
|
|
3995 @file{floatfns.c} contains methods and primitives for floats and floating-point
|
|
3996 arithmetic.
|
|
3997
|
|
3998
|
|
3999
|
|
4000 @example
|
|
4001 bytecode.c
|
|
4002 bytecode.h
|
|
4003 @end example
|
|
4004
|
|
4005 @file{bytecode.c} implements the byte-code interpreter and
|
|
4006 compiled-function objects, and @file{bytecode.h} contains associated
|
|
4007 structures. Note that the byte-code @emph{compiler} is written in Lisp.
|
|
4008
|
|
4009
|
|
4010
|
|
4011
|
462
|
4012 @node Modules for Standard Editing Operations
|
428
|
4013 @section Modules for Standard Editing Operations
|
462
|
4014 @cindex modules for standard editing operations
|
|
4015 @cindex editing operations, modules for standard
|
428
|
4016
|
|
4017 @example
|
|
4018 buffer.c
|
|
4019 buffer.h
|
|
4020 bufslots.h
|
|
4021 @end example
|
|
4022
|
|
4023 @file{buffer.c} implements the @dfn{buffer} Lisp object type. This
|
|
4024 includes functions that create and destroy buffers; retrieve buffers by
|
|
4025 name or by other properties; manipulate lists of buffers (remember that
|
|
4026 buffers are permanent objects and stored in various ordered lists);
|
|
4027 retrieve or change buffer properties; etc. It also contains the
|
|
4028 definitions of all the built-in buffer-local variables (which can be
|
|
4029 viewed as buffer properties). It does @emph{not} contain code to
|
|
4030 manipulate buffer-local variables (that's in @file{symbols.c}, described
|
|
4031 above); or code to manipulate the text in a buffer.
|
|
4032
|
|
4033 @file{buffer.h} defines the structures associated with a buffer and the various
|
|
4034 macros for retrieving text from a buffer and special buffer positions
|
|
4035 (e.g. @code{point}, the default location for text insertion). It also
|
|
4036 contains macros for working with buffer positions and converting between
|
|
4037 their representations as character offsets and as byte offsets (under
|
|
4038 MULE, they are different, because characters can be multi-byte). It is
|
|
4039 one of the largest header files.
|
|
4040
|
|
4041 @file{bufslots.h} defines the fields in the buffer structure that correspond to
|
|
4042 the built-in buffer-local variables. It is its own header file because
|
|
4043 it is included many times in @file{buffer.c}, as a way of iterating over all
|
|
4044 the built-in buffer-local variables.
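The ``include it many times'' idiom can be sketched in a single file with a list macro. This is only an illustration of the technique: the slot names and every @code{demo_}/@code{DEMO_} identifier are hypothetical, and the real code re-includes a header rather than expanding a list macro.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical slot list.  The real code keeps the list in its own
   header and re-includes it; a list macro gives the same effect here. */
#define DEMO_BUFFER_SLOTS(SLOT) \
  SLOT(fill_column)             \
  SLOT(tab_width)               \
  SLOT(left_margin)

/* Pass 1: declare one struct field per slot. */
#define DEMO_DECLARE_FIELD(name) int name;
struct demo_buffer { DEMO_BUFFER_SLOTS(DEMO_DECLARE_FIELD) };

/* Pass 2: build a table of slot names, in the same order. */
#define DEMO_SLOT_NAME(name) #name,
static const char *const demo_slot_names[] = {
  DEMO_BUFFER_SLOTS(DEMO_SLOT_NAME)
};

/* Pass 3: reset every slot without naming any of them by hand. */
#define DEMO_RESET_FIELD(name) b->name = 0;
static void demo_reset_buffer(struct demo_buffer *b)
{
  assert(b != NULL);
  DEMO_BUFFER_SLOTS(DEMO_RESET_FIELD)
}

static size_t demo_slot_count(void)
{
  return sizeof demo_slot_names / sizeof demo_slot_names[0];
}

/* Return the position of NAME in the slot list, or -1 if absent. */
static int demo_slot_index(const char *name)
{
  size_t i;
  assert(name != NULL);
  for (i = 0; i < demo_slot_count(); i++)
    if (strcmp(demo_slot_names[i], name) == 0)
      return (int) i;
  return -1;
}
```

Adding a slot to the list updates the structure, the name table, and the reset code together, which is the point of keeping the list in one place.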
|
|
4045
|
|
4046
|
|
4047
|
|
4048 @example
|
|
4049 insdel.c
|
|
4050 insdel.h
|
|
4051 @end example
|
|
4052
|
|
4053 @file{insdel.c} contains low-level functions for inserting and deleting text in
|
|
4054 a buffer, keeping track of changed regions for use by redisplay, and
|
|
4055 calling any before-change and after-change functions that may have been
|
|
4056 registered for the buffer. It also contains the actual functions that
|
|
4057 convert between byte offsets and character offsets.
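To see why the two kinds of offset differ, consider a minimal sketch of the conversion. It assumes a UTF-8-like encoding in which continuation bytes have the bit pattern @code{10xxxxxx}; the actual MULE internal encoding is different, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: a byte starts a character unless it looks like 10xxxxxx. */
static int demo_is_lead_byte(unsigned char c)
{
  return (c & 0xC0) != 0x80;
}

/* Byte offset of the character at index CHARPOS (clamped to NBYTES). */
static size_t demo_char_to_byte(const unsigned char *text, size_t nbytes,
                                size_t charpos)
{
  size_t bytepos = 0;
  assert(text != NULL);
  while (charpos > 0 && bytepos < nbytes) {
    bytepos++;                          /* step over the lead byte */
    while (bytepos < nbytes && !demo_is_lead_byte(text[bytepos]))
      bytepos++;                        /* and its continuation bytes */
    charpos--;
  }
  return bytepos;
}

/* Number of characters that start before byte offset BYTEPOS. */
static size_t demo_byte_to_char(const unsigned char *text, size_t nbytes,
                                size_t bytepos)
{
  size_t chars = 0, i;
  assert(text != NULL);
  if (bytepos > nbytes)
    bytepos = nbytes;
  for (i = 0; i < bytepos; i++)
    if (demo_is_lead_byte(text[i]))
      chars++;
  return chars;
}
```

Both functions scan linearly from the start of the text; a production implementation would normally cache known positions to avoid rescanning.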
|
|
4058
|
|
4059 @file{insdel.h} contains associated headers.
|
|
4060
|
|
4061
|
|
4062
|
|
4063 @example
|
|
4064 marker.c
|
|
4065 @end example
|
|
4066
|
|
4067 This module implements the @dfn{marker} Lisp object type, which
|
|
4068 conceptually is a pointer to a text position in a buffer that moves
|
|
4069 around as text is inserted and deleted, so as to remain in the same
|
|
4070 relative position. This module doesn't actually move the markers around
|
|
4071 -- that's handled in @file{insdel.c}. This module just creates them and
|
|
4072 implements the primitives for working with them. As markers are simple
|
|
4073 objects, this does not entail much.
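The adjustment that keeps a marker ``in the same relative position'' can be sketched as follows. This is a simplified model with hypothetical names; in particular, leaving a marker that sits exactly at the insertion point where it is was chosen for the sketch and is not taken from @file{insdel.c}.

```c
#include <assert.h>
#include <stddef.h>

struct demo_marker { size_t pos; };

/* LEN characters were inserted at position AT: markers after AT shift right. */
static void demo_adjust_for_insert(struct demo_marker *m, size_t count,
                                   size_t at, size_t len)
{
  size_t i;
  assert(m != NULL || count == 0);
  for (i = 0; i < count; i++)
    if (m[i].pos > at)
      m[i].pos += len;
}

/* The range [FROM, TO) was deleted: markers inside it collapse to FROM,
   markers after it shift left by the deleted length. */
static void demo_adjust_for_delete(struct demo_marker *m, size_t count,
                                   size_t from, size_t to)
{
  size_t i;
  assert(from <= to);
  assert(m != NULL || count == 0);
  for (i = 0; i < count; i++) {
    if (m[i].pos > to)
      m[i].pos -= to - from;
    else if (m[i].pos > from)
      m[i].pos = from;
  }
}
```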
|
|
4074
|
|
4075 Note that the standard arithmetic primitives (e.g. @code{+}) accept
|
|
4076 markers in place of integers and automatically substitute the value of
|
|
4077 @code{marker-position} for the marker, i.e. an integer describing the
|
|
4078 current buffer position of the marker.
|
|
4079
|
|
4080
|
|
4081
|
|
4082 @example
|
|
4083 extents.c
|
|
4084 extents.h
|
|
4085 @end example
|
|
4086
|
|
4087 This module implements the @dfn{extent} Lisp object type, which is like
|
|
4088 a marker that works over a range of text rather than a single position.
|
|
4089 Extents are also much more complex and powerful than markers and have a
|
|
4090 more efficient (and more algorithmically complex) implementation. The
|
|
4091 implementation is described in detail in comments in @file{extents.c}.
|
|
4092
|
|
4093 The code in @file{extents.c} works closely with @file{insdel.c} so that
|
|
4094 extents are properly moved around as text is inserted and deleted.
|
|
4095 There is also code in @file{extents.c} that provides information needed
|
|
4096 by the redisplay mechanism for efficient operation. (Remember that
|
|
4097 extents can have display properties that affect [sometimes drastically,
|
|
4098 as in the @code{invisible} property] the display of the text they
|
|
4099 cover.)
|
|
4100
|
|
4101
|
|
4102
|
|
4103 @example
|
|
4104 editfns.c
|
|
4105 @end example
|
|
4106
|
|
4107 @file{editfns.c} contains the standard Lisp primitives for working with
|
|
4108 a buffer's text, and calls the low-level functions in @file{insdel.c}.
|
|
4109 It also contains primitives for working with @code{point} (the default
|
|
4110 buffer insertion location).
|
|
4111
|
|
4112 @file{editfns.c} also contains functions for retrieving various
|
|
4113 characteristics from the external environment: the current time, the
|
|
4114 process ID of the running XEmacs process, the name of the user who ran
|
|
4115 this XEmacs process, etc. It's not clear why this code is in
|
|
4116 @file{editfns.c}.
|
|
4117
|
|
4118
|
|
4119
|
|
4120 @example
|
|
4121 callint.c
|
|
4122 cmds.c
|
|
4123 commands.h
|
|
4124 @end example
|
|
4125
|
|
4126 @cindex interactive
|
|
4127 These modules implement the basic @dfn{interactive} commands,
|
|
4128 i.e. user-callable functions. Commands, as opposed to other functions,
|
|
4129 have special ways of getting their parameters interactively (by querying
|
|
4130 the user), as opposed to having them passed in a normal function
|
|
4131 invocation. Many commands are not really meant to be called from other
|
|
4132 Lisp functions, because they modify global state in a way that's often
|
|
4133 undesired as part of other Lisp functions.
|
|
4134
|
|
4135 @file{callint.c} implements the mechanism for querying the user for
|
|
4136 parameters and calling interactive commands. The bulk of this module is
|
|
4137 code that parses the interactive spec that is supplied with an
|
|
4138 interactive command.
|
|
4139
|
|
4140 @file{cmds.c} implements the basic, most commonly used editing commands:
|
|
4141 commands to move around the current buffer and insert and delete
|
|
4142 characters. These commands are implemented using the Lisp primitives
|
|
4143 defined in @file{editfns.c}.
|
|
4144
|
|
4145 @file{commands.h} contains associated structure definitions and prototypes.
|
|
4146
|
|
4147
|
|
4148
|
|
4149 @example
|
|
4150 regex.c
|
|
4151 regex.h
|
|
4152 search.c
|
|
4153 @end example
|
|
4154
|
|
4155 @file{search.c} implements the Lisp primitives for searching for text in
|
|
4156 a buffer, and some of the low-level algorithms for doing this. In
|
|
4157 particular, the fast fixed-string Boyer-Moore search algorithm is
|
|
4158 implemented in @file{search.c}. The low-level algorithms for doing
|
|
4159 regular-expression searching, however, are implemented in @file{regex.c}
|
|
4160 and @file{regex.h}. These two modules are largely independent of
|
|
4161 XEmacs, and are similar to (and based upon) the regular-expression
|
|
4162 routines used in @file{grep} and other GNU utilities.
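For readers unfamiliar with the fixed-string algorithm, here is a minimal sketch of the simplified Horspool variant of Boyer-Moore. It is meant only to show the skip-table idea; it is not the code in @file{search.c}, which may differ in the variant used and in its handling of case folding and multi-byte text. The function name is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Return the offset of the first occurrence of PAT in TEXT, or -1. */
static long demo_horspool_search(const unsigned char *text, size_t n,
                                 const unsigned char *pat, size_t m)
{
  size_t shift[UCHAR_MAX + 1];
  size_t i, pos;

  assert(text != NULL && pat != NULL);
  if (m == 0)
    return 0;
  if (m > n)
    return -1;

  /* By default a mismatch lets us skip a whole pattern length. */
  for (i = 0; i <= UCHAR_MAX; i++)
    shift[i] = m;
  /* Characters that occur in the pattern (except its last position)
     allow only a shorter, alignment-preserving skip. */
  for (i = 0; i + 1 < m; i++)
    shift[pat[i]] = m - 1 - i;

  pos = 0;
  while (pos + m <= n) {
    size_t k = m;
    /* Compare right to left, as Boyer-Moore-style searches do. */
    while (k > 0 && text[pos + k - 1] == pat[k - 1])
      k--;
    if (k == 0)
      return (long) pos;
    /* Skip according to the text character under the pattern's last slot. */
    pos += shift[text[pos + m - 1]];
  }
  return -1;
}
```

The search is fast because most alignments are rejected after looking at a single character, and the skip table then advances the pattern by several positions at once.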
|
|
4163
|
|
4164
|
|
4165
|
|
4166 @example
|
|
4167 doprnt.c
|
|
4168 @end example
|
|
4169
|
|
4170 @file{doprnt.c} implements formatted-string processing, similar to
|
|
4171 the @code{printf()} function in C.
|
|
4172
|
|
4173
|
|
4174
|
|
4175 @example
|
|
4176 undo.c
|
|
4177 @end example
|
|
4178
|
|
4179 This module implements the undo mechanism for tracking buffer changes.
|
|
4180 Most of this could be implemented in Lisp.
|
|
4181
|
|
4182
|
|
4183
|
462
|
4184 @node Editor-Level Control Flow Modules
|
428
|
4185 @section Editor-Level Control Flow Modules
|
462
|
4186 @cindex control flow modules, editor-level
|
|
4187 @cindex modules, editor-level control flow
|
428
|
4188
|
|
4189 @example
|
|
4190 event-Xt.c
|
442
|
4191 event-msw.c
|
428
|
4192 event-stream.c
|
|
4193 event-tty.c
|
442
|
4194 events-mod.h
|
|
4195 gpmevent.c
|
|
4196 gpmevent.h
|
428
|
4197 events.c
|
|
4198 events.h
|
|
4199 @end example
|
|
4200
|
|
4201 These implement the handling of events (user input and other system
|
|
4202 notifications).
|
|
4203
|
|
4204 @file{events.c} and @file{events.h} define the @dfn{event} Lisp object
|
|
4205 type and primitives for manipulating it.
|
|
4206
|
|
4207 @file{event-stream.c} implements the basic functions for working with
|
|
4208 event queues, dispatching an event by looking it up in relevant keymaps
|
|
4209 and such, and handling timeouts; this includes the primitives
|
|
4210 @code{next-event} and @code{dispatch-event}, as well as related
|
|
4211 primitives such as @code{sit-for}, @code{sleep-for}, and
|
|
4212 @code{accept-process-output}. (@file{event-stream.c} is one of the
|
|
4213 hairiest and trickiest modules in XEmacs. Beware! You can easily mess
|
|
4214 things up here.)
|
|
4215
|
|
4216 @file{event-Xt.c} and @file{event-tty.c} implement the low-level
|
|
4217 interfaces for retrieving events from Xt (the X toolkit) and from TTYs
|
|
4218 (using @code{read()} and @code{select()}), respectively. The event
|
|
4219 interface enforces a clean separation between the specific code for
|
|
4220 interfacing with the operating system and the generic code for working
|
|
4221 with events, by defining an API of basic, low-level event methods;
|
|
4222 @file{event-Xt.c} and @file{event-tty.c} are two different
|
|
4223 implementations of this API. To add support for a new operating system
|
|
4224 (e.g. NeXTstep), one merely needs to provide another implementation of
|
|
4225 those API functions.
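The idea of an API of low-level event methods with several implementations can be sketched as a table of function pointers. Everything below is hypothetical (the structure, the method names, and the two toy back ends); it shows only the shape of such an interface, not the actual XEmacs event-stream structure.

```c
#include <assert.h>
#include <stddef.h>

struct demo_event { int kind; int value; };

/* The "API": generic code calls only through these pointers. */
struct demo_event_methods {
  int  (*event_pending_p)(void *state);
  void (*next_event)(void *state, struct demo_event *out);
};

/* Back end 1: events come from a fixed array of key codes. */
struct demo_queue { const int *keys; size_t count, next; };

static int demo_queue_pending(void *state)
{
  const struct demo_queue *q = state;
  return q->next < q->count;
}

static void demo_queue_next(void *state, struct demo_event *out)
{
  struct demo_queue *q = state;
  assert(q->next < q->count);
  out->kind = 1;
  out->value = q->keys[q->next++];
}

static const struct demo_event_methods demo_queue_methods = {
  demo_queue_pending, demo_queue_next
};

/* Back end 2: a countdown that produces timer-like events. */
struct demo_ticker { int remaining; };

static int demo_ticker_pending(void *state)
{
  const struct demo_ticker *t = state;
  return t->remaining > 0;
}

static void demo_ticker_next(void *state, struct demo_event *out)
{
  struct demo_ticker *t = state;
  assert(t->remaining > 0);
  out->kind = 2;
  out->value = t->remaining;
  t->remaining--;
}

static const struct demo_event_methods demo_ticker_methods = {
  demo_ticker_pending, demo_ticker_next
};

/* Generic code: drains any back end and sums the event values. */
static int demo_drain(const struct demo_event_methods *m, void *state)
{
  int sum = 0;
  struct demo_event ev;
  assert(m != NULL);
  while (m->event_pending_p(state)) {
    m->next_event(state, &ev);
    sum += ev.value;
  }
  return sum;
}
```

Supporting a new platform in this model means writing one more pair of functions and one more method table; @code{demo_drain} does not change.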
|
|
4226
|
|
4227 Note that the choice of whether to use @file{event-Xt.c} or
|
|
4228 @file{event-tty.c} is made at compile time! Or at the very latest, it
|
|
4229 is made at startup time. @file{event-Xt.c} handles events for
|
|
4230 @emph{both} X and TTY frames; @file{event-tty.c} is only used when X
|
|
4231 support is not compiled into XEmacs. The reason for this is that there
|
|
4232 is only one event loop in XEmacs: thus, it needs to be able to receive
|
|
4233 events from all different kinds of frames.
|
|
4234
|
|
4235
|
|
4236
|
|
4237 @example
|
|
4238 keymap.c
|
|
4239 keymap.h
|
|
4240 @end example
|
|
4241
|
|
4242 @file{keymap.c} and @file{keymap.h} define the @dfn{keymap} Lisp object
|
|
4243 type and associated methods and primitives. (Remember that keymaps are
|
|
4244 objects that associate event descriptions with functions to be called to
|
|
4245 ``execute'' those events; @code{dispatch-event} looks up events in the
|
|
4246 relevant keymaps.)
|
|
4247
|
|
4248
|
|
4249
|
|
4250 @example
|
442
|
4251 cmdloop.c
|
|
4252 @end example
|
|
4253
|
|
4254 @file{cmdloop.c} contains functions that implement the actual editor
|
440
|
4255 command loop---i.e. the event loop that cyclically retrieves and
|
428
|
4256 dispatches events. This code is also rather tricky, just like
|
|
4257 @file{event-stream.c}.
|
|
4258
|
|
4259
|
|
4260
|
|
4261 @example
|
|
4262 macros.c
|
|
4263 macros.h
|
|
4264 @end example
|
|
4265
|
|
4266 These two modules contain the basic code for defining keyboard macros.
|
|
4267 These functions don't actually do much; most of the code that handles keyboard
|
|
4268 macros is mixed in with the event-handling code in @file{event-stream.c}.
|
|
4269
|
|
4270
|
|
4271
|
|
4272 @example
|
|
4273 minibuf.c
|
|
4274 @end example
|
|
4275
|
|
4276 This contains some miscellaneous code related to the minibuffer (most of
|
|
4277 the minibuffer code was moved into Lisp by Richard Mlynarik). This
|
|
4278 includes the primitives for completion (although filename completion is
|
|
4279 in @file{dired.c}), the lowest-level interface to the minibuffer (if the
|
|
4280 command loop were cleaned up, this too could be in Lisp), and code for
|
|
4281 dealing with the echo area (this, too, was mostly moved into Lisp, and
|
|
4282 the only code remaining is code to call out to Lisp or provide simple
|
|
4283 bootstrapping implementations early in temacs, before the echo-area Lisp
|
|
4284 code is loaded).
|
|
4285
|
|
4286
|
|
4287
|
462
|
4288 @node Modules for the Basic Displayable Lisp Objects
|
428
|
4289 @section Modules for the Basic Displayable Lisp Objects
|
462
|
4290 @cindex modules for the basic displayable Lisp objects
|
|
4291 @cindex displayable Lisp objects, modules for the basic
|
|
4292 @cindex Lisp objects, modules for the basic displayable
|
|
4293 @cindex objects, modules for the basic displayable Lisp
|
428
|
4294
|
|
4295 @example
|
442
|
4296 console-msw.c
|
|
4297 console-msw.h
|
|
4298 console-stream.c
|
|
4299 console-stream.h
|
|
4300 console-tty.c
|
|
4301 console-tty.h
|
|
4302 console-x.c
|
|
4303 console-x.h
|
|
4304 console.c
|
|
4305 console.h
|
|
4306 @end example
|
|
4307
|
|
4308 These modules implement the @dfn{console} Lisp object type. A console
|
|
4309 contains multiple display devices, but only one keyboard and mouse.
|
|
4310 Most of the time, a console will contain exactly one device.
|
|
4311
|
|
4312 Consoles are the top of a Lisp object inclusion hierarchy. Consoles
|
|
4313 contain devices, which contain frames, which contain windows.
|
|
4314
|
|
4315
|
|
4316
|
|
4317 @example
|
|
4318 device-msw.c
|
428
|
4319 device-tty.c
|
|
4320 device-x.c
|
|
4321 device.c
|
|
4322 device.h
|
|
4323 @end example
|
|
4324
|
|
4325 These modules implement the @dfn{device} Lisp object type. This
|
|
4326 abstracts a particular screen or connection on which frames are
|
|
4327 displayed. As with Lisp objects, event interfaces, and other
|
|
4328 subsystems, the device code is separated into a generic component that
|
|
4329 contains a standardized interface (in the form of a set of methods) onto
|
|
4330 particular device types.
|
|
4331
|
|
4332 The device subsystem defines all the methods and provides method
|
|
4333 services for not only device operations but also for the frame, window,
|
|
4334 menubar, scrollbar, toolbar, and other displayable-object subsystems.
|
|
4335 The reason for this is that all of these subsystems have the same
|
|
4336 subtypes (X, TTY, NeXTstep, Microsoft Windows, etc.) as devices do.
|
|
4337
|
|
4338
|
|
4339
|
|
4340 @example
|
442
|
4341 frame-msw.c
|
428
|
4342 frame-tty.c
|
|
4343 frame-x.c
|
|
4344 frame.c
|
|
4345 frame.h
|
|
4346 @end example
|
|
4347
|
|
4348 Each device contains one or more frames in which objects (e.g. text) are
|
|
4349 displayed. A frame corresponds to a window in the window system;
|
|
4350 usually this is a top-level window but it could potentially be one of a
|
|
4351 number of overlapping child windows within a top-level window, using the
|
|
4352 MDI (Multiple Document Interface) protocol in Microsoft Windows or a
|
|
4353 similar scheme.
|
|
4354
|
|
4355 The @file{frame-*} files implement the @dfn{frame} Lisp object type and
|
|
4356 provide the generic and device-type-specific operations on frames
|
|
4357 (e.g. raising, lowering, resizing, moving, etc.).
|
|
4358
|
|
4359
|
|
4360
|
|
4361 @example
|
|
4362 window.c
|
|
4363 window.h
|
|
4364 @end example
|
|
4365
|
|
4366 @cindex window (in Emacs)
|
|
4367 @cindex pane
|
|
4368 Each frame consists of one or more non-overlapping @dfn{windows} (better
|
|
4369 known as @dfn{panes} in standard window-system terminology) in which a
|
|
4370 buffer's text can be displayed. Windows can also have scrollbars
|
|
4371 displayed around their edges.
|
|
4372
|
|
4373 @file{window.c} and @file{window.h} implement the @dfn{window} Lisp
|
|
4374 object type and provide code to manage windows. Since windows have no
|
|
4375 associated resources in the window system (the window system knows only
|
|
4376 about the frame; no child windows or anything are used for XEmacs
|
|
4377 windows), there is no device-type-specific code here; all of that code
|
|
4378 is part of the redisplay mechanism or the code for particular object
|
|
4379 types such as scrollbars.
|
|
4380
|
|
4381
|
|
4382
|
462
|
4383 @node Modules for other Display-Related Lisp Objects
|
428
|
4384 @section Modules for other Display-Related Lisp Objects
|
462
|
4385 @cindex modules for other display-related Lisp objects
|
|
4386 @cindex display-related Lisp objects, modules for other
|
|
4387 @cindex Lisp objects, modules for other display-related
|
428
|
4388
|
|
4389 @example
|
|
4390 faces.c
|
|
4391 faces.h
|
|
4392 @end example
|
|
4393
|
|
4394
|
|
4395
|
|
4396 @example
|
|
4397 bitmaps.h
|
442
|
4398 glyphs-eimage.c
|
|
4399 glyphs-msw.c
|
|
4400 glyphs-msw.h
|
|
4401 glyphs-widget.c
|
428
|
4402 glyphs-x.c
|
|
4403 glyphs-x.h
|
|
4404 glyphs.c
|
|
4405 glyphs.h
|
|
4406 @end example
|
|
4407
|
|
4408
|
|
4409
|
|
4410 @example
|
442
|
4411 objects-msw.c
|
|
4412 objects-msw.h
|
428
|
4413 objects-tty.c
|
|
4414 objects-tty.h
|
|
4415 objects-x.c
|
|
4416 objects-x.h
|
|
4417 objects.c
|
|
4418 objects.h
|
|
4419 @end example
|
|
4420
|
|
4421
|
|
4422
|
|
4423 @example
|
442
|
4424 menubar-msw.c
|
|
4425 menubar-msw.h
|
428
|
4426 menubar-x.c
|
|
4427 menubar.c
|
442
|
4428 menubar.h
|
|
4429 @end example
|
|
4430
|
|
4431
|
|
4432
|
|
4433 @example
|
|
4434 scrollbar-msw.c
|
|
4435 scrollbar-msw.h
|
428
|
4436 scrollbar-x.c
|
|
4437 scrollbar-x.h
|
|
4438 scrollbar.c
|
|
4439 scrollbar.h
|
|
4440 @end example
|
|
4441
|
|
4442
|
|
4443
|
|
4444 @example
|
442
|
4445 toolbar-msw.c
|
428
|
4446 toolbar-x.c
|
|
4447 toolbar.c
|
|
4448 toolbar.h
|
|
4449 @end example
|
|
4450
|
|
4451
|
|
4452
|
|
4453 @example
|
|
4454 font-lock.c
|
|
4455 @end example
|
|
4456
|
440
|
4457 This file provides C support for syntax highlighting---i.e.
|
428
|
4458 highlighting different syntactic constructs of a source file in
|
|
4459 different colors, for easy reading. The C support is provided so that
|
|
4460 this is fast.
|
|
4461
|
|
4462
|
|
4463
|
|
4464 @example
|
|
4465 dgif_lib.c
|
|
4466 gif_err.c
|
|
4467 gif_lib.h
|
|
4468 gifalloc.c
|
|
4469 @end example
|
|
4470
|
|
4471 These modules decode GIF-format image files, for use with glyphs.
|
442
|
4472 These files were removed due to Unisys patent infringement concerns.
|
|
4473
|
|
4474
|
|
4475
|
462
|
4476 @node Modules for the Redisplay Mechanism
|
428
|
4477 @section Modules for the Redisplay Mechanism
|
462
|
4478 @cindex modules for the redisplay mechanism
|
|
4479 @cindex redisplay mechanism, modules for the
|
428
|
4480
|
|
4481 @example
|
|
4482 redisplay-output.c
|
442
|
4483 redisplay-msw.c
|
428
|
4484 redisplay-tty.c
|
|
4485 redisplay-x.c
|
|
4486 redisplay.c
|
|
4487 redisplay.h
|
|
4488 @end example
|
|
4489
|
|
4490 These files provide the redisplay mechanism. As with many other
|
|
4491 subsystems in XEmacs, there is a clean separation between the general
|
|
4492 and device-specific support.
|
|
4493
|
|
4494 @file{redisplay.c} contains the bulk of the redisplay engine. These
|
|
4495 functions update the redisplay structures (which describe how the screen
|
|
4496 is to appear) to reflect any changes made to the state of any
|
|
4497 displayable objects (buffer, frame, window, etc.) since the last time
|
|
4498 that redisplay was called. These functions are highly optimized to
|
|
4499 avoid doing more work than necessary (since redisplay is called
|
|
4500 extremely often and is potentially a huge time sink), and depend heavily
|
|
4501 on notifications from the objects themselves that changes have occurred,
|
|
4502 so that redisplay doesn't explicitly have to check each possible object.
|
|
4503 The redisplay mechanism also contains a great deal of caching to further
|
|
4504 speed things up; some of this caching is contained within the various
|
|
4505 displayable objects.
|
|
4506
|
|
4507 @file{redisplay-output.c} goes through the redisplay structures and converts
|
|
4508 them into calls to device-specific methods to actually output the screen
|
|
4509 changes.
|
|
4510
|
|
4511 @file{redisplay-x.c} and @file{redisplay-tty.c} are two implementations
|
|
4512 of these redisplay output methods, for X frames and TTY frames,
|
|
4513 respectively.
|
|
4514
|
|
4515
|
|
4516
|
|
4517 @example
|
|
4518 indent.c
|
|
4519 @end example
|
|
4520
|
|
4521 This module contains various functions and Lisp primitives for
|
|
4522 converting between buffer positions and screen positions. These
|
|
4523 functions call the redisplay mechanism to do most of the work, and then
|
|
4524 examine the redisplay structures to get the necessary information. This
|
|
4525 module needs work.
|
|
4526
|
|
4527
|
|
4528
|
|
4529 @example
|
|
4530 termcap.c
|
|
4531 terminfo.c
|
|
4532 tparam.c
|
|
4533 @end example
|
|
4534
|
|
4535 These files contain functions for working with the termcap (BSD-style)
|
|
4536 and terminfo (System V style) databases of terminal capabilities and
|
|
4537 escape sequences, used when XEmacs is displaying in a TTY.
|
|
4538
|
|
4539
|
|
4540
|
|
4541 @example
|
|
4542 cm.c
|
|
4543 cm.h
|
|
4544 @end example
|
|
4545
|
|
4546 These files provide some miscellaneous TTY-output functions and should
|
|
4547 probably be merged into @file{redisplay-tty.c}.
|
|
4548
|
|
4549
|
|
4550
|
462
|
4551 @node Modules for Interfacing with the File System
|
428
|
4552 @section Modules for Interfacing with the File System
|
462
|
4553 @cindex modules for interfacing with the file system
|
|
4554 @cindex interfacing with the file system, modules for
|
|
4555 @cindex file system, modules for interfacing with the
|
428
|
4556
|
|
4557 @example
|
|
4558 lstream.c
|
|
4559 lstream.h
|
|
4560 @end example
|
|
4561
|
|
4562 These modules implement the @dfn{stream} Lisp object type. This is an
|
|
4563 internal-only Lisp object that implements a generic buffering stream.
|
|
4564 The idea is to provide a uniform interface onto all sources and sinks of
|
|
4565 data, including file descriptors, stdio streams, chunks of memory, Lisp
|
|
4566 buffers, Lisp strings, etc. That way, I/O functions can be written to
|
|
4567 the stream interface and can transparently handle all possible sources
|
|
4568 and sinks. (For example, the @code{read} function can read data from a
|
|
4569 file, a string, a buffer, or even a function that is called repeatedly
|
|
4570 to return data, without worrying about where the data is coming from or
|
|
4571 what-size chunks it is returned in.)
|
|
4572
|
|
4573 @cindex lstream
|
|
4574 Note that in the C code, streams are called @dfn{lstreams} (for ``Lisp
|
|
4575 streams'') to distinguish them from other kinds of streams, e.g. stdio
|
|
4576 streams and C++ I/O streams.
|
|
4577
|
|
4578 Similar to other subsystems in XEmacs, lstreams are separated into
|
|
4579 generic functions and a set of methods for the different types of
|
|
4580 lstreams. @file{lstream.c} provides implementations of many different
|
442
|
4581 types of streams; others are provided, e.g., in @file{file-coding.c}.
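The split between generic functions and per-type methods can be illustrated with a small reader-only stream. This is a sketch under stated assumptions: the structure and all @code{demo_} names are hypothetical and much simpler than the real lstream interface, which also buffers and supports writing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A stream is a read method plus private state for that method. */
struct demo_stream {
  size_t (*reader)(void *state, char *buf, size_t max);
  void *state;
};

/* Source 1: a block of memory, returned in chunks as large as asked for. */
struct demo_string_source { const char *data; size_t len, pos; };

static size_t demo_string_reader(void *state, char *buf, size_t max)
{
  struct demo_string_source *s = state;
  size_t n = s->len - s->pos;
  if (n > max)
    n = max;
  memcpy(buf, s->data + s->pos, n);
  s->pos += n;
  return n;
}

/* Source 2: a generator that hands out at most 3 bytes per call. */
struct demo_repeat_source { char ch; size_t remaining; };

static size_t demo_repeat_reader(void *state, char *buf, size_t max)
{
  struct demo_repeat_source *r = state;
  size_t n = r->remaining < 3 ? r->remaining : 3;
  if (n > max)
    n = max;
  memset(buf, r->ch, n);
  r->remaining -= n;
  return n;
}

/* Generic consumer: does not care where data comes from or in what
   size chunks it arrives.  Returns the number of bytes stored in OUT. */
static size_t demo_read_all(struct demo_stream *in, char *out, size_t cap)
{
  size_t total = 0;
  assert(in != NULL && out != NULL);
  while (total < cap) {
    size_t n = in->reader(in->state, out + total, cap - total);
    if (n == 0)
      break;                      /* end of data */
    total += n;
  }
  return total;
}
```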
|
428
|
4582
|
|
4583
|
|
4584
|
|
4585 @example
|
|
4586 fileio.c
|
|
4587 @end example
|
|
4588
|
|
4589 This implements the basic primitives for interfacing with the file
|
|
4590 system. This includes primitives for reading files into buffers,
|
|
4591 writing buffers into files, checking for the presence or accessibility
|
|
4592 of files, canonicalizing file names, etc. Note that these primitives
|
|
4593 are usually not invoked directly by the user: There is a great deal of
|
|
4594 higher-level Lisp code that implements the user commands such as
|
|
4595 @code{find-file} and @code{save-buffer}. This is similar to the
|
|
4596 distinction between the lower-level primitives in @file{editfns.c} and
|
|
4597 the higher-level user commands in @file{cmds.c} and
|
|
4598 @file{simple.el}.
|
|
4599
|
|
4600
|
|
4601
|
|
4602 @example
|
|
4603 filelock.c
|
|
4604 @end example
|
|
4605
|
|
4606 This file provides functions for detecting clashes between different
|
|
4607 processes (e.g. XEmacs and some external process, or two different
|
|
4608 XEmacs processes) modifying the same file. (XEmacs can optionally use
|
|
4609 the @file{lock/} subdirectory to provide a form of ``locking'' between
|
|
4610 different XEmacs processes.) This module is also used by the low-level
|
|
4611 functions in @file{insdel.c} to ensure that, if the first modification
|
|
4612 is being made to a buffer whose corresponding file has been externally
|
|
4613 modified, the user is made aware of this so that the buffer can be
|
|
4614 synched up with the external changes if necessary.
|
|
4615
|
|
4616
|
|
4617 @example
|
|
4618 filemode.c
|
|
4619 @end example
|
|
4620
|
|
4621 This file provides some miscellaneous functions that construct a
|
|
4622 @samp{rwxr-xr-x}-type permissions string (as might appear in an
|
|
4623 @file{ls}-style directory listing) given the information returned by the
|
|
4624 @code{stat()} system call.
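A minimal sketch of such a conversion is shown below. It assumes the traditional Unix layout of the nine permission bits (octal 0400 down to 0001), handles neither the file-type character nor the setuid, setgid, and sticky bits, and uses a hypothetical function name rather than the one in @file{filemode.c}.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill OUT (at least 10 bytes) with a string such as "rwxr-xr-x". */
static void demo_mode_string(unsigned mode, char *out)
{
  static const char flags[] = "rwxrwxrwx";
  int i;
  assert(out != NULL);
  for (i = 0; i < 9; i++)
    /* Bit 0400 is owner-read; each shift moves to the next flag. */
    out[i] = (mode & (0400u >> i)) ? flags[i] : '-';
  out[9] = '\0';
}
```

With a real file, the @code{mode} argument would come from the @code{st_mode} field filled in by @code{stat()}.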
|
|
4625
|
|
4626
|
|
4627
|
|
4628 @example
|
|
4629 dired.c
|
|
4630 ndir.h
|
|
4631 @end example
|
|
4632
|
|
4633 These files implement the XEmacs interface to directory searching. This
|
|
4634 includes a number of primitives for determining the files in a directory
|
|
4635 and for doing filename completion. (Remember that generic completion is
|
|
4636 handled by a different mechanism, in @file{minibuf.c}.)
|
|
4637
|
|
4638 @file{ndir.h} is a header file used for the directory-searching
|
|
4639 emulation functions provided in @file{sysdep.c} (see section J below),
|
|
4640 for systems that don't provide any directory-searching functions. (On
|
|
4641 those systems, directories can be read directly as files, and parsed.)
|
|
4642
|
|
4643
|
|
4644
|
|
4645 @example
|
|
4646 realpath.c
|
|
4647 @end example
|
|
4648
|
|
4649 This file provides an implementation of the @code{realpath()} function
|
|
4650 for expanding symbolic links, on systems that don't implement it or have
|
|
4651 a broken implementation.
|
|
4652
|
|
4653
|
|
4654
|
462
|
4655 @node Modules for Other Aspects of the Lisp Interpreter and Object System
|
428
|
4656 @section Modules for Other Aspects of the Lisp Interpreter and Object System
|
462
|
4657 @cindex modules for other aspects of the Lisp interpreter and object system
|
|
4658 @cindex Lisp interpreter and object system, modules for other aspects of the
|
|
4659 @cindex interpreter and object system, modules for other aspects of the Lisp
|
|
4660 @cindex object system, modules for other aspects of the Lisp interpreter and
|
428
|
4661
|
|
4662 @example
|
|
4663 elhash.c
|
|
4664 elhash.h
|
|
4665 hash.c
|
|
4666 hash.h
|
|
4667 @end example
|
|
4668
|
|
4669 These files provide two implementations of hash tables. Files
|
|
4670 @file{hash.c} and @file{hash.h} provide a generic C implementation of
|
|
4671 hash tables which can stand independently of XEmacs. Files
|
|
4672 @file{elhash.c} and @file{elhash.h} provide a separate implementation of
|
|
4673 hash tables that can store only Lisp objects and that knows about Lispy
|
|
4674 things like garbage collection; these files implement the @dfn{hash-table} Lisp
|
|
4675 object type.
|
|
4676
|
|
4677
|
|
4678 @example
|
|
4679 specifier.c
|
|
4680 specifier.h
|
|
4681 @end example
|
|
4682
|
|
4683 This module implements the @dfn{specifier} Lisp object type. This is
|
|
4684 primarily used for displayable properties, and allows for values that
|
|
4685 are specific to a particular buffer, window, frame, device, or device
|
|
4686 class, as well as for a default value. This is used, for example,
|
|
4687 to control the height of the horizontal scrollbar or the appearance of
|
|
4688 the @code{default}, @code{bold}, or other faces. The specifier object
|
|
4689 consists of a number of specifications, each of which maps from a
|
|
4690 buffer, window, etc. to a value. The function @code{specifier-instance}
|
|
4691 looks up a value given a window (from which a buffer, frame, and device
|
|
4692 can be derived).
|
|
4693
|
|
4694
|
|
4695 @example
|
|
4696 chartab.c
|
|
4697 chartab.h
|
|
4698 casetab.c
|
|
4699 @end example
|
|
4700
|
|
4701 @file{chartab.c} and @file{chartab.h} implement the @dfn{char table}
|
|
4702 Lisp object type, which maps from characters or certain sorts of
|
|
4703 character ranges to Lisp objects. The implementation of this object
|
|
4704 type is optimized for the internal representation of characters. Char
|
|
4705 tables come in different types, which affect the allowed object types to
|
|
4706 which a character can be mapped and also dictate certain other
|
|
4707 properties of the char table.
|
|
4708
|
|
4709 @cindex case table
|
|
4710 @file{casetab.c} implements one sort of char table, the @dfn{case
|
|
4711 table}, which maps characters to other characters of possibly different
|
|
4712 case. These are used by XEmacs to implement case-changing primitives
|
|
4713 and to do case-insensitive searching.
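The use of a case table for case-insensitive comparison can be sketched with plain byte-indexed arrays. This is an illustration only: it assumes single-byte characters with contiguous ASCII letters, whereas the real case table is a char table that covers the full character set, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct demo_case_table {
  unsigned char down[256];   /* character -> lower-case equivalent */
  unsigned char up[256];     /* character -> upper-case equivalent */
};

/* Start from the identity mapping, then fill in the ASCII letters. */
static void demo_case_table_init(struct demo_case_table *t)
{
  int c;
  assert(t != NULL);
  for (c = 0; c < 256; c++) {
    t->down[c] = (unsigned char) c;
    t->up[c] = (unsigned char) c;
  }
  for (c = 'a'; c <= 'z'; c++) {
    t->up[c] = (unsigned char) (c - 'a' + 'A');
    t->down[c - 'a' + 'A'] = (unsigned char) c;
  }
}

/* Compare two strings after mapping every character through DOWN. */
static int demo_equal_ignoring_case(const struct demo_case_table *t,
                                    const char *a, const char *b)
{
  assert(t != NULL && a != NULL && b != NULL);
  while (*a != '\0' && *b != '\0') {
    if (t->down[(unsigned char) *a] != t->down[(unsigned char) *b])
      return 0;
    a++;
    b++;
  }
  return *a == *b;           /* equal only if both strings ended */
}
```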
|
|
4714
|
|
4715
|
|
4716
|
|
4717 @example
|
|
4718 syntax.c
|
|
4719 syntax.h
|
|
4720 @end example
|
|
4721
|
|
4722 @cindex scanner
|
|
4723 This module implements @dfn{syntax tables}, another sort of char table
|
|
4724 that maps characters into syntax classes that define the syntax of these
|
|
4725 characters (e.g. a parenthesis belongs to a class of @samp{open}
|
|
4726 characters that have corresponding @samp{close} characters and can be
|
|
4727 nested). This module also implements the Lisp @dfn{scanner}, a set of
|
|
4728 primitives for scanning over text based on syntax tables. This is used,
|
|
4729 for example, to find the matching parenthesis in a command such as
|
|
4730 @code{forward-sexp}, and by @file{font-lock.c} to locate quoted strings,
|
|
4731 comments, etc.
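How a scanner can use a syntax table to find a matching parenthesis is sketched below. The table layout and names are hypothetical, only the @samp{open} and @samp{close} classes are modeled, and strings, comments, and escapes (which the real scanner must handle) are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum demo_syntax { DEMO_OTHER = 0, DEMO_OPEN, DEMO_CLOSE };

struct demo_syntax_entry {
  enum demo_syntax cls;
  unsigned char match;       /* for DEMO_OPEN: the expected closer */
};

struct demo_syntax_table { struct demo_syntax_entry e[256]; };

static void demo_syntax_pair(struct demo_syntax_table *t,
                             unsigned char open, unsigned char close)
{
  t->e[open].cls = DEMO_OPEN;
  t->e[open].match = close;
  t->e[close].cls = DEMO_CLOSE;
}

static void demo_syntax_init(struct demo_syntax_table *t)
{
  assert(t != NULL);
  memset(t, 0, sizeof *t);   /* every character starts as DEMO_OTHER */
  demo_syntax_pair(t, '(', ')');
  demo_syntax_pair(t, '[', ']');
}

/* Scan forward from START and return the index of the closer that
   balances the first opener seen, or -1 if the text is unbalanced. */
static long demo_scan_matching(const struct demo_syntax_table *t,
                               const char *text, size_t len, size_t start)
{
  unsigned char expected[64];    /* stack of closers still owed */
  size_t depth = 0, i;

  assert(t != NULL && text != NULL);
  for (i = start; i < len; i++) {
    unsigned char c = (unsigned char) text[i];
    const struct demo_syntax_entry *entry = &t->e[c];
    if (entry->cls == DEMO_OPEN) {
      if (depth == sizeof expected)
        return -1;               /* nesting too deep for this sketch */
      expected[depth++] = entry->match;
    } else if (entry->cls == DEMO_CLOSE) {
      if (depth == 0 || expected[depth - 1] != c)
        return -1;               /* stray or mismatched closer */
      if (--depth == 0)
        return (long) i;
    }
  }
  return -1;                     /* ran out of text while nested */
}
```

Because the scanner consults only the table, changing which characters count as openers and closers changes its behavior without touching the scanning loop.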
|
|
4732
|
|
4733
|
|
4734
|
|
4735 @example
|
|
4736 casefiddle.c
|
|
4737 @end example
|
|
4738
|
|
4739 This module implements various Lisp primitives for upcasing, downcasing
|
|
4740 and capitalizing strings or regions of buffers.
|
|
4741
|
|
4742
|
|
4743
|
|
4744 @example
|
|
4745 rangetab.c
|
|
4746 @end example
|
|
4747
|
|
4748 This module implements the @dfn{range table} Lisp object type, which
|
|
4749 provides for a mapping from ranges of integers to arbitrary Lisp
|
|
4750 objects.
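A mapping from integer ranges to values can be sketched as a sorted array searched by bisection. This is one possible representation, assumed for illustration; it is not necessarily how @file{rangetab.c} stores its ranges, the names are hypothetical, and a C string stands in for an arbitrary Lisp object.

```c
#include <assert.h>
#include <stddef.h>

struct demo_range {
  long first, last;          /* inclusive bounds */
  const char *value;         /* stand-in for a Lisp object */
};

/* ENTRIES must be sorted by FIRST and must not overlap.
   Return the value whose range contains KEY, or NULL if none does. */
static const char *demo_range_lookup(const struct demo_range *entries,
                                     size_t count, long key)
{
  size_t lo = 0, hi = count;
  assert(entries != NULL || count == 0);
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (key < entries[mid].first)
      hi = mid;              /* key lies before this range */
    else if (key > entries[mid].last)
      lo = mid + 1;          /* key lies after this range */
    else
      return entries[mid].value;
  }
  return NULL;
}
```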
|
|
4751
|
|
4752
|
|
4753
|
|
4754 @example
|
|
4755 opaque.c
|
|
4756 opaque.h
|
|
4757 @end example
|
|
4758
|
|
4759 This module implements the @dfn{opaque} Lisp object type, an
|
|
4760 internal-only Lisp object that encapsulates an arbitrary block of memory
|
|
4761 so that it can be managed by the Lisp allocation system. To create an
|
|
4762 opaque object, you call @code{make_opaque()}, passing a pointer to a
|
|
4763 block of memory. An object is created that is big enough to hold the
|
|
4764 memory, which is copied into the object's storage. The object will then
|
|
4765 stick around as long as you keep pointers to it, after which it will be
|
|
4766 automatically reclaimed.
|
|
4767
|
|
4768 @cindex mark method
|
|
4769 Opaque objects can also have an arbitrary @dfn{mark method} associated
|
|
4770 with them, in case the block of memory contains other Lisp objects that
|
|
4771 need to be marked for garbage-collection purposes. (If you need other
|
|
4772 object methods, such as a finalize method, you should just go ahead and
|
440
|
4773 create a new Lisp object type---it's not hard.)
|
428
|
4774
|
|
4775
|
|
4776
|
|
4777 @example
|
|
4778 abbrev.c
|
|
4779 @end example
|
|
4780
|
|
4781 This module provides a few primitives for doing dynamic abbreviation
|
|
4782 expansion. In XEmacs, most of the code for this has been moved into
|
|
4783 Lisp. Some C code remains for speed and because the primitive
|
|
4784 @code{self-insert-command} (which is executed for all self-inserting
|
|
4785 characters) hooks into the abbrev mechanism. (@code{self-insert-command}
|
|
4786 is itself in C only for speed.)
|
|
4787
|
|
4788
|
|
4789
|
|
4790 @example
|
|
4791 doc.c
|
|
4792 @end example
|
|
4793
|
|
4794 This module provides primitives for retrieving the documentation
|
|
4795 strings of functions and variables. These documentation strings contain
|
|
4796 certain special markers that get dynamically expanded (e.g. a
|
|
4797 reverse-lookup is performed on some named functions to retrieve their
|
|
4798 current key bindings). Some documentation strings (in particular, for
|
|
4799 the built-in primitives and pre-loaded Lisp functions) are stored
|
|
4800 externally in a file @file{DOC} in the @file{lib-src/} directory and
|
|
4801 need to be fetched from that file. (Part of the build stage involves
|
|
4802 building this file, and another part involves constructing an index for
|
|
4803 this file and embedding it into the executable, so that the functions in
|
|
4804 @file{doc.c} do not have to search the entire @file{DOC} file to find
|
|
4805 the appropriate documentation string.)
|
|
4806
|
|
4807
|
|
4808
|
|
4809 @example
|
|
4810 md5.c
|
|
4811 @end example
|
|
4812
|
|
4813 This module provides a Lisp primitive that implements the MD5 secure
|
|
4814 hashing scheme, used to create a large hash value of a string of data such that
|
|
4815 the data cannot be derived from the hash value. This is used for
|
|
4816 various security applications on the Internet.
|
|
4817
|
|
4818
|
|
4819
|
|
4820
|
462
|
4821 @node Modules for Interfacing with the Operating System
|
428
|
4822 @section Modules for Interfacing with the Operating System
|
462
|
4823 @cindex modules for interfacing with the operating system
|
|
4824 @cindex interfacing with the operating system, modules for
|
|
4825 @cindex operating system, modules for interfacing with the
|
428
|
4826
|
|
4827 @example
|
|
4828 callproc.c
|
|
4829 process.c
|
|
4830 process.h
|
|
4831 @end example
|
|
4832
|
|
4833 These modules allow XEmacs to spawn and communicate with subprocesses
|
|
4834 and network connections.
|
|
4835
|
|
4836 @cindex synchronous subprocesses
|
|
4837 @cindex subprocesses, synchronous
|
|
4838 @file{callproc.c} implements (through the @code{call-process}
|
|
4839 primitive) what are called @dfn{synchronous subprocesses}. This means
|
|
4840 that XEmacs runs a program, waits till it's done, and retrieves its
|
|
4841 output. A typical example might be calling the @file{ls} program to get
|
|
4842 a directory listing.
|
|
4843
|
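The run-wait-collect pattern can be sketched in plain C with the POSIX @code{popen()} and @code{pclose()} calls. This is only an illustration of the idea of a synchronous subprocess, not the way @file{callproc.c} is written; the helper name @code{run_and_capture} is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Run `command` through the shell, wait for it to finish, and copy at
   most cap-1 bytes of its standard output into buf (NUL-terminated).
   Returns the status reported by pclose(), or -1 if the command could
   not be started. */
static int
run_and_capture(const char *command, char *buf, size_t cap)
{
    FILE *pipe;
    size_t used = 0, got;
    char scratch[256];

    assert(command != NULL && buf != NULL && cap > 0);
    pipe = popen(command, "r");
    if (pipe == NULL)
        return -1;
    /* Reading until end-of-file blocks until the child has closed its
       output.  Output beyond the buffer's capacity is read and dropped
       so the child never stalls on a full pipe. */
    while ((got = fread(scratch, 1, sizeof scratch, pipe)) > 0) {
        size_t room = cap - 1 - used;
        size_t take = got < room ? got : room;
        memcpy(buf + used, scratch, take);
        used += take;
    }
    buf[used] = '\0';
    return pclose(pipe);   /* waits for the child to terminate */
}
```

Calling @code{run_and_capture("ls", buf, sizeof buf)} would fill @code{buf} with a directory listing, and the caller does nothing else until the listing is complete.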
|
4844 @cindex asynchronous subprocesses
|
|
4845 @cindex subprocesses, asynchronous
|
|
4846 @file{process.c} and @file{process.h} implement @dfn{asynchronous
|
|
4847 subprocesses}. This means that XEmacs starts a program and then
|
|
4848 continues normally, not waiting for the process to finish. Data can be
|
|
4849 sent to the process or retrieved from it as it's running. This is used
|
|
4850 for the @code{shell} command (which provides a front end onto a shell
|
|
4851 program such as @file{csh}), the mail and news readers implemented in
|
|
4852 XEmacs, etc. The result of calling @code{start-process} to start a
|
|
4853 subprocess is a process object, a particular kind of object used to
|
|
4854 communicate with the subprocess. You can send data to the process by
|
|
4855 passing the process object and the data to @code{send-process}, and you
|
|
4856 can specify what happens to data retrieved from the process by setting
|
|
4857 properties of the process object. (When the process sends data, XEmacs
|
|
4858 receives a process event, which says that there is data ready. When
|
|
4859 @code{dispatch-event} is called on this event, it reads the data from
|
|
4860 the process and does something with it, as specified by the process
|
|
4861 object's properties. Typically, this means inserting the data into a
|
|
4862 buffer or calling a function.) Another property of the process object is
|
|
4863 called the @dfn{sentinel}, which is a function that is called when the
|
|
4864 process terminates.
|
|
4865
|
|
4866 @cindex network connections
|
|
4867 Process objects are also used for network connections (connections to a
|
|
4868 process running on another machine). Network connections are started
|
|
4869 with @code{open-network-stream} but otherwise work just like
|
|
4870 subprocesses.
|
|
4871
|
|
4872
|
|
4873
|
|
4874 @example
|
|
4875 sysdep.c
|
|
4876 sysdep.h
|
|
4877 @end example
|
|
4878
|
|
4879 These modules implement most of the low-level, messy operating-system
|
|
4880 interface code. This includes various device control (ioctl) operations
|
|
4881 for file descriptors, TTY's, pseudo-terminals, etc. (usually this stuff
|
|
4882 is fairly system-dependent; thus the name of this module), and emulation
|
|
4883 of standard library functions and system calls on systems that don't
|
|
4884 provide them or have broken versions.
|
|
4885
|
|
4886
|
|
4887
|
|
4888 @example
|
|
4889 sysdir.h
|
|
4890 sysfile.h
|
|
4891 sysfloat.h
|
|
4892 sysproc.h
|
|
4893 syspwd.h
|
|
4894 syssignal.h
|
|
4895 systime.h
|
|
4896 systty.h
|
|
4897 syswait.h
|
|
4898 @end example
|
|
4899
|
|
4900 These header files provide consistent interfaces onto system-dependent
|
|
4901 header files and system calls. The idea is that, instead of including a
|
|
4902 standard header file like @file{<sys/param.h>} (which may or may not
|
|
4903 exist on various systems) or having to worry about whether all systems
|
|
4904 provide a particular preprocessor constant, or having to deal with the
|
|
4905 four different paradigms for manipulating signals, you just include the
|
|
4906 appropriate @file{sys*.h} header file, which includes all the right
|
|
4907 system header files, defines any missing preprocessor constants,
|
|
4908 provides a uniform interface onto system calls, etc.
|
|
4909
|
|
4910 @file{sysdir.h} provides a uniform interface onto directory-querying
|
|
4911 functions. (In some cases, this is in conjunction with emulation
|
|
4912 functions in @file{sysdep.c}.)
|
|
4913
|
|
4914 @file{sysfile.h} includes all the necessary header files for standard
|
|
4915 system calls (e.g. @code{read()}), ensures that all necessary
|
|
4916 @code{open()} and @code{stat()} preprocessor constants are defined, and
|
|
4917 possibly (usually) substitutes sugared versions of @code{read()},
|
|
4918 @code{write()}, etc. that automatically restart interrupted I/O
|
|
4919 operations.
|
|
4920
|
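The idea of a @code{read()} that restarts after an interruption can be sketched as follows. This is a generic POSIX idiom shown for illustration; the wrapper name @code{read_restarting} is hypothetical and is not the name used in @file{sysfile.h}.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Like read(), but if a signal interrupts the call before any data
   has been transferred (errno == EINTR), simply try again. */
static ssize_t
read_restarting(int fd, void *buf, size_t count)
{
    ssize_t n;

    assert(buf != NULL || count == 0);
    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);
    return n;   /* bytes read, 0 at end-of-file, or -1 on a real error */
}
```

Code that uses such a wrapper everywhere no longer needs to check for @code{EINTR} at each call site, which is the convenience the header is described as providing.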
|
4921 @file{sysfloat.h} includes the necessary header files for floating-point
|
|
4922 operations.
|
|
4923
|
|
4924 @file{sysproc.h} includes the necessary header files for calling
|
|
4925 @code{select()}, @code{fork()}, @code{execve()}, socket operations, and
|
|
4926 the like, and ensures that the @code{FD_*()} macros for descriptor-set
|
|
4927 manipulations are available.
|
|
4928
|
|
4929 @file{syspwd.h} includes the necessary header files for obtaining
|
|
4930 information from @file{/etc/passwd} (the functions are emulated under
|
|
4931 VMS).
|
|
4932
|
|
4933 @file{syssignal.h} includes the necessary header files for
|
|
4934 signal-handling and provides a uniform interface onto the different
|
|
4935 signal-handling and signal-blocking paradigms.
|
|
4936
|
|
4937 @file{systime.h} includes the necessary header files and provides
|
|
4938 uniform interfaces for retrieving the time of day, setting file
|
|
4939 access/modification times, getting the amount of time used by the XEmacs
|
|
4940 process, etc.
|
|
4941
|
|
4942 @file{systty.h} buffers against the infinitude of different ways of
|
|
4943 controlling TTY's.
|
|
4944
|
|
4945 @file{syswait.h} provides a uniform way of retrieving the exit status
|
|
4946 from a @code{wait()}ed-on process (some systems use a union, others use
|
|
4947 an int).
|
|
4948
|
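On a POSIX system where the status is an @code{int}, decoding it looks roughly like the sketch below. The function name @code{child_exit_code} is invented for this example; a portability header such as @file{syswait.h} would hide the difference on systems that use a union instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits immediately with `code`, wait for it, and
   return the decoded exit code.  Returns -1 if fork() or waitpid()
   fails or if the child did not exit normally. */
static int
child_exit_code(int code)
{
    pid_t pid;
    int status;

    assert(code >= 0 && code <= 255);
    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);                       /* child */
    if (waitpid(pid, &status, 0) != pid)   /* parent */
        return -1;
    /* The raw status packs several fields; the W* macros unpack it. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```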
|
4949
|
|
4950
|
|
4951 @example
|
|
4952 hpplay.c
|
|
4953 libsst.c
|
|
4954 libsst.h
|
|
4955 libst.h
|
|
4956 linuxplay.c
|
|
4957 nas.c
|
|
4958 sgiplay.c
|
|
4959 sound.c
|
|
4960 sunplay.c
|
|
4961 @end example
|
|
4962
|
|
4963 These files implement the ability to play various sounds on some types
|
|
4964 of computers. You have to configure your XEmacs with sound support in
|
|
4965 order to get this capability.
|
|
4966
|
|
4967 @file{sound.c} provides the generic interface. It implements various
|
|
4968 Lisp primitives and variables that let you specify which sounds should
|
|
4969 be played in certain conditions. (The conditions are identified by
|
|
4970 symbols, which are passed to @code{ding} to make a sound. Various
|
|
4971 standard functions call this function at certain times; if sound support
|
|
4972 does not exist, a simple beep results.)
|
|
4973
|
|
4974 @cindex native sound
|
|
4975 @cindex sound, native
|
|
4976 @file{sgiplay.c}, @file{sunplay.c}, @file{hpplay.c}, and
|
|
4977 @file{linuxplay.c} interface to the machine's speaker for various
|
|
4978 different kinds of machines. This is called @dfn{native} sound.
|
|
4979
|
|
4980 @cindex sound, network
|
|
4981 @cindex network sound
|
|
4982 @cindex NAS
|
|
4983 @file{nas.c} interfaces to a computer somewhere else on the network
|
|
4984 using the NAS (Network Audio Server) protocol, playing sounds on that
|
|
4985 machine. This allows you to run XEmacs on a remote machine, with its
|
|
4986 display set to your local machine, and have the sounds be made on your
|
|
4987 local machine, provided that you have a NAS server running on your local
|
|
4988 machine.
|
|
4989
|
|
4990 @file{libsst.c}, @file{libsst.h}, and @file{libst.h} provide some
|
|
4991 additional functions for playing sound on a Sun SPARC but are not
|
|
4992 currently in use.
|
|
4993
|
|
4994
|
|
4995
|
|
4996 @example
|
|
4997 tooltalk.c
|
|
4998 tooltalk.h
|
|
4999 @end example
|
|
5000
|
|
5001 These two modules implement an interface to the ToolTalk protocol, which
|
|
5002 is an interprocess communication protocol implemented on some versions
|
|
5003 of Unix. ToolTalk is a high-level protocol that allows processes to
|
|
5004 register themselves as providers of particular services; other processes
|
|
5005 can then request a service without knowing or caring exactly who is
|
|
5006 providing the service. It is similar in spirit to the DDE protocol
|
|
5007 provided under Microsoft Windows. ToolTalk is a part of the new CDE
|
|
5008 (Common Desktop Environment) specification and is used to connect the
|
|
5009 parts of the SPARCWorks development environment.
|
|
5010
|
|
5011
|
|
5012
|
|
5013 @example
|
|
5014 getloadavg.c
|
|
5015 @end example
|
|
5016
|
|
5017 This module provides the ability to retrieve the system's current load
|
|
5018 average. (The way to do this is highly system-specific, unfortunately,
|
|
5019 and requires a lot of special-case code.)
|
|
5020
|
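On systems whose C library already offers a @code{getloadavg()} function (glibc and the BSDs do), the query itself is short, as in this sketch. The wrapper name @code{one_minute_load} is hypothetical, and the special-case code mentioned above is exactly what is needed where no such library function exists.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>

/* Store the one-minute load average in *out.  Returns 0 on success
   and -1 if the system cannot report a load average. */
static int
one_minute_load(double *out)
{
    double samples[1];

    assert(out != NULL);
    /* getloadavg() returns the number of samples it filled in,
       or -1 if the load average is unobtainable. */
    if (getloadavg(samples, 1) < 1)
        return -1;
    *out = samples[0];
    return 0;
}
```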
|
5021
|
|
5022
|
|
5023 @example
|
|
5024 sunpro.c
|
|
5025 @end example
|
|
5026
|
|
5027 This module provides a small amount of code used internally at Sun to
|
|
5028 keep statistics on the usage of XEmacs.
|
|
5029
|
|
5030
|
|
5031
|
|
5032 @example
|
|
5033 broken-sun.h
|
|
5034 strcmp.c
|
|
5035 strcpy.c
|
|
5036 sunOS-fix.c
|
|
5037 @end example
|
|
5038
|
|
5039 These files provide replacement functions and prototypes to fix numerous
|
|
5040 bugs in early releases of SunOS 4.1.
|
|
5041
|
|
5042
|
|
5043
|
|
5044 @example
|
|
5045 hftctl.c
|
|
5046 @end example
|
|
5047
|
|
5048 This module provides some terminal-control code necessary on versions of
|
|
5049 AIX prior to 4.1.
|
|
5050
|
|
5051
|
|
5052
|
462
|
5053 @node Modules for Interfacing with X Windows
|
428
|
5054 @section Modules for Interfacing with X Windows
|
462
|
5055 @cindex modules for interfacing with X Windows
|
|
5056 @cindex interfacing with X Windows, modules for
|
|
5057 @cindex X Windows, modules for interfacing with
|
428
|
5058
|
|
5059 @example
|
|
5060 Emacs.ad.h
|
|
5061 @end example
|
|
5062
|
|
5063 A file generated from @file{Emacs.ad}, which contains XEmacs-supplied
|
|
5064 fallback resources (so that XEmacs has pretty defaults).
|
|
5065
|
|
5066
|
|
5067
|
|
5068 @example
|
|
5069 EmacsFrame.c
|
|
5070 EmacsFrame.h
|
|
5071 EmacsFrameP.h
|
|
5072 @end example
|
|
5073
|
|
5074 These modules implement an Xt widget class that encapsulates a frame.
|
|
5075 This is for ease in integrating with Xt. The EmacsFrame widget covers
|
|
5076 the entire X window except for the menubar; the scrollbars are
|
|
5077 positioned on top of the EmacsFrame widget.
|
|
5078
|
|
5079 @strong{Warning:} Abandon hope, all ye who enter here. This code took
|
|
5080 an ungodly amount of time to get right, and is likely to fall apart
|
|
5081 mercilessly at the slightest change. Such is life under Xt.
|
|
5082
|
|
5083
|
|
5084
|
|
5085 @example
|
|
5086 EmacsManager.c
|
|
5087 EmacsManager.h
|
|
5088 EmacsManagerP.h
|
|
5089 @end example
|
|
5090
|
|
5091 These modules implement a simple Xt manager (i.e. composite) widget
|
|
5092 class that simply lets its children set whatever geometry they want.
|
|
5093 It's amazing that Xt doesn't provide this standardly, but on second
|
|
5094 thought, it makes sense, considering how amazingly broken Xt is.
|
|
5095
|
|
5096
|
|
5097 @example
|
|
5098 EmacsShell-sub.c
|
|
5099 EmacsShell.c
|
|
5100 EmacsShell.h
|
|
5101 EmacsShellP.h
|
|
5102 @end example
|
|
5103
|
|
5104 These modules implement two Xt widget classes that are subclasses of
|
|
5105 the TopLevelShell and TransientShell classes. This is necessary to deal
|
|
5106 with more brokenness that Xt has sadistically thrust onto the backs of
|
|
5107 developers.
|
|
5108
|
|
5109
|
|
5110
|
|
5111 @example
|
|
5112 xgccache.c
|
|
5113 xgccache.h
|
|
5114 @end example
|
|
5115
|
|
5116 These modules provide functions for maintenance and caching of GC's
|
|
5117 (graphics contexts) under the X Window System. This code is junky and
|
|
5118 needs to be rewritten.
|
|
5119
|
|
5120
|
|
5121
|
|
5122 @example
|
442
|
5123 select-msw.c
|
|
5124 select-x.c
|
|
5125 select.c
|
|
5126 select.h
|
428
|
5127 @end example
|
|
5128
|
|
5129 @cindex selections
|
|
5130 This module provides an interface to the X Window System's concept of
|
|
5131 @dfn{selections}, the standard way for X applications to communicate
|
|
5132 with each other.
|
|
5133
|
|
5134
|
|
5135
|
|
5136 @example
|
|
5137 xintrinsic.h
|
|
5138 xintrinsicp.h
|
|
5139 xmmanagerp.h
|
|
5140 xmprimitivep.h
|
|
5141 @end example
|
|
5142
|
|
5143 These header files are similar in spirit to the @file{sys*.h} files and buffer
|
|
5144 against different implementations of Xt and Motif.
|
|
5145
|
|
5146 @itemize @bullet
|
|
5147 @item
|
|
5148 @file{xintrinsic.h} should be included in place of @file{<Intrinsic.h>}.
|
|
5149 @item
|
|
5150 @file{xintrinsicp.h} should be included in place of @file{<IntrinsicP.h>}.
|
|
5151 @item
|
|
5152 @file{xmmanagerp.h} should be included in place of @file{<XmManagerP.h>}.
|
|
5153 @item
|
|
5154 @file{xmprimitivep.h} should be included in place of @file{<XmPrimitiveP.h>}.
|
|
5155 @end itemize
|
|
5156
|
|
5157
|
|
5158
|
|
5159 @example
|
|
5160 xmu.c
|
|
5161 xmu.h
|
|
5162 @end example
|
|
5163
|
|
5164 These files provide an emulation of the Xmu library for those systems
|
|
5165 (e.g. HPUX) that don't provide it as a standard part of X.
|
|
5166
|
|
5167
|
|
5168
|
|
5169 @example
|
|
5170 ExternalClient-Xlib.c
|
|
5171 ExternalClient.c
|
|
5172 ExternalClient.h
|
|
5173 ExternalClientP.h
|
|
5174 ExternalShell.c
|
|
5175 ExternalShell.h
|
|
5176 ExternalShellP.h
|
|
5177 extw-Xlib.c
|
|
5178 extw-Xlib.h
|
|
5179 extw-Xt.c
|
|
5180 extw-Xt.h
|
|
5181 @end example
|
|
5182
|
|
5183 @cindex external widget
|
|
5184 These files provide the @dfn{external widget} interface, which allows an
|
|
5185 XEmacs frame to appear as a widget in another application. To do this,
|
|
5186 you have to configure with @samp{--external-widget}.
|
|
5187
|
|
5188 @file{ExternalShell*} provides the server (XEmacs) side of the
|
|
5189 connection.
|
|
5190
|
|
5191 @file{ExternalClient*} provides the client (other application) side of
|
|
5192 the connection. These files are not compiled into XEmacs but are
|
|
5193 compiled into libraries that are then linked into your application.
|
|
5194
|
|
5195 @file{extw-*} is common code that is used for both the client and server.
|
|
5196
|
|
5197 Don't touch this code; something is liable to break if you do.
|
|
5198
|
|
5199
|
|
5200
|
462
|
5201 @node Modules for Internationalization
|
428
|
5202 @section Modules for Internationalization
|
462
|
5203 @cindex modules for internationalization
|
|
5204 @cindex internationalization, modules for
|
428
|
5205
|
|
5206 @example
|
|
5207 mule-canna.c
|
|
5208 mule-ccl.c
|
|
5209 mule-charset.c
|
|
5210 mule-charset.h
|
442
|
5211 file-coding.c
|
|
5212 file-coding.h
|
428
|
5213 mule-mcpath.c
|
|
5214 mule-mcpath.h
|
|
5215 mule-wnnfns.c
|
|
5216 mule.c
|
|
5217 @end example
|
|
5218
|
|
5219 These files implement the MULE (Asian-language) support. Note that MULE
|
|
5220 actually provides a general interface for all sorts of languages, not
|
|
5221 just Asian languages (although they are generally the most complicated
|
|
5222 to support). This code is still in beta.
|
|
5223
|
442
|
5224 @file{mule-charset.*} and @file{file-coding.*} provide the heart of the
|
428
|
5225 XEmacs MULE support. @file{mule-charset.*} implements the @dfn{charset}
|
|
5226 Lisp object type, which encapsulates a character set (an ordered one- or
|
|
5227 two-dimensional set of characters, such as US ASCII or JISX0208 Japanese
|
|
5228 Kanji).
|
|
5229
|
442
|
5230 @file{file-coding.*} implements the @dfn{coding-system} Lisp object
|
428
|
5231 type, which encapsulates a method of converting between different
|
|
5232 encodings. An encoding is a representation of a stream of characters,
|
|
5233 possibly from multiple character sets, using a stream of bytes or words,
|
|
5234 and defines (e.g.) which escape sequences are used to specify particular
|
|
5235 character sets, how the indices for a character are converted into bytes
|
|
5236 (sometimes this involves setting the high bit; sometimes complicated
|
|
5237 rearranging of the values takes place, as in the Shift-JIS encoding),
|
|
5238 etc.
|
|
5239
|
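To make the two cases in the parenthesis concrete, here is a small sketch that converts one JIS X 0208 character, given as two 7-bit index bytes, into EUC-JP (which sets the high bit of each byte) and into Shift-JIS (which rearranges the values). The function names are made up for this example and the code is not taken from @file{file-coding.c}; it only shows the standard arithmetic.

```c
#include <assert.h>

/* EUC-JP: each 7-bit index byte simply gets its high bit set. */
static void
jisx0208_to_euc(unsigned char j1, unsigned char j2, unsigned char e[2])
{
    assert(j1 >= 0x21 && j1 <= 0x7E && j2 >= 0x21 && j2 <= 0x7E);
    e[0] = (unsigned char)(j1 | 0x80);
    e[1] = (unsigned char)(j2 | 0x80);
}

/* Shift-JIS: two JIS rows share one lead byte. */
static void
jisx0208_to_sjis(unsigned char j1, unsigned char j2, unsigned char s[2])
{
    assert(j1 >= 0x21 && j1 <= 0x7E && j2 >= 0x21 && j2 <= 0x7E);
    /* Lead bytes run 0x81..0x9F and then 0xE0.., skipping the range
       used for single-byte half-width katakana. */
    s[0] = (unsigned char)(((j1 - 1) >> 1) + (j1 <= 0x5E ? 0x71 : 0xB1));
    if (j1 & 1)
        /* Odd row: trail bytes 0x40..0x9E, skipping 0x7F. */
        s[1] = (unsigned char)(j2 + (j2 < 0x60 ? 0x1F : 0x20));
    else
        /* Even row: trail bytes 0x9F..0xFC. */
        s[1] = (unsigned char)(j2 + 0x7E);
}
```

For the character at JIS index bytes 0x30 0x21, the first function yields 0xB0 0xA1 and the second yields 0x88 0x9F.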
|
5240 @file{mule-ccl.c} provides the CCL (Code Conversion Language)
|
|
5241 interpreter. CCL is similar in spirit to Lisp byte code and is used to
|
|
5242 implement converters for custom encodings.
|
|
5243
|
|
5244 @file{mule-canna.c} and @file{mule-wnnfns.c} implement interfaces to
|
|
5245 external programs used to implement the Canna and WNN input methods,
|
|
5246 respectively. This is currently in beta.
|
|
5247
|
|
5248 @file{mule-mcpath.c} provides some functions to allow for pathnames
|
|
5249 containing extended characters. This code is fragmentary, obsolete, and
|
|
5250 completely non-working. Instead, @code{pathname-coding-system} is used
|
|
5251 to specify conversions of names of files and directories. The standard
|
|
5252 C I/O functions like @code{open()} are wrapped so that conversion occurs
|
|
5253 automatically.
|
|
5254
|
|
5255 @file{mule.c} provides a few miscellaneous things that should probably
|
|
5256 be elsewhere.
|
|
5257
|
|
5258
|
|
5259
|
|
5260 @example
|
|
5261 intl.c
|
|
5262 @end example
|
|
5263
|
|
5264 This provides some miscellaneous internationalization code for
|
|
5265 implementing message translation and interfacing to the Ximp input
|
|
5266 method. None of this code is currently working.
|
|
5267
|
|
5268
|
|
5269
|
|
5270 @example
|
|
5271 iso-wide.h
|
|
5272 @end example
|
|
5273
|
|
5274 This contains leftover code from an earlier implementation of
|
|
5275 Asian-language support, and is not currently used.
|
|
5276
|
|
5277
|
|
5278
|
|
5279
|
965
|
5280 @node Modules for Regression Testing
|
|
5281 @section Modules for Regression Testing
|
|
5282 @cindex modules for regression testing
|
|
5283 @cindex regression testing, modules for
|
|
5284
|
|
5285 @example
|
|
5286 test-harness.el
|
|
5287 base64-tests.el
|
|
5288 byte-compiler-tests.el
|
|
5289 case-tests.el
|
|
5290 ccl-tests.el
|
|
5291 c-tests.el
|
|
5292 database-tests.el
|
|
5293 extent-tests.el
|
|
5294 hash-table-tests.el
|
|
5295 lisp-tests.el
|
|
5296 md5-tests.el
|
|
5297 mule-tests.el
|
|
5298 regexp-tests.el
|
|
5299 symbol-tests.el
|
|
5300 syntax-tests.el
|
|
5301 @end example
|
|
5302
|
|
5303 @file{test-harness.el} defines the macros @code{Assert},
|
|
5304 @code{Check-Error}, @code{Check-Error-Message}, and
|
|
5305 @code{Check-Message}. The other files are test files, testing various
|
|
5306 XEmacs facilities.
|
|
5307
|
|
5308
|
|
5309
|
442
|
5310 @node Allocation of Objects in XEmacs Lisp, Dumping, A Summary of the Various XEmacs Modules, Top
|
428
|
5311 @chapter Allocation of Objects in XEmacs Lisp
|
462
|
5312 @cindex allocation of objects in XEmacs Lisp
|
|
5313 @cindex objects in XEmacs Lisp, allocation of
|
|
5314 @cindex Lisp objects, allocation of in XEmacs
|
428
|
5315
|
|
5316 @menu
|
|
5317 * Introduction to Allocation::
|
|
5318 * Garbage Collection::
|
|
5319 * GCPROing::
|
|
5320 * Garbage Collection - Step by Step::
|
|
5321 * Integers and Characters::
|
|
5322 * Allocation from Frob Blocks::
|
|
5323 * lrecords::
|
|
5324 * Low-level allocation::
|
|
5325 * Cons::
|
|
5326 * Vector::
|
|
5327 * Bit Vector::
|
|
5328 * Symbol::
|
|
5329 * Marker::
|
|
5330 * String::
|
|
5331 * Compiled Function::
|
|
5332 @end menu
|
|
5333
|
462
|
5334 @node Introduction to Allocation
|
428
|
5335 @section Introduction to Allocation
|
462
|
5336 @cindex allocation, introduction to
|
428
|
5337
|
|
5338 Emacs Lisp, like all Lisps, has garbage collection. This means that
|
|
5339 the programmer never has to explicitly free (destroy) an object; it
|
|
5340 happens automatically when the object becomes inaccessible. Most
|
|
5341 experts agree that garbage collection is a necessity in a modern,
|
|
5342 high-level language. Its omission from C stems from the fact that C was
|
|
5343 originally designed to be a nice abstract layer on top of assembly
|
|
5344 language, for writing kernels and basic system utilities rather than
|
|
5345 large applications.
|
|
5346
|
|
5347 Lisp objects can be created by any of a number of Lisp primitives.
|
|
5348 Most object types have one or a small number of basic primitives
|
|
5349 for creating objects. For conses, the basic primitive is @code{cons};
|
|
5350 for vectors, the primitives are @code{make-vector} and @code{vector}; for
|
|
5351 symbols, the primitives are @code{make-symbol} and @code{intern}; etc.
|
|
5352 Some Lisp objects, especially those that are primarily used internally,
|
|
5353 have no corresponding Lisp primitives. Every Lisp object, though,
|
|
5354 has at least one C primitive for creating it.
|
|
5355
|
442
|
5356 Recall from section (VII) that a Lisp object, as stored in a 32-bit or
|
|
5357 64-bit word, has a few tag bits, and a ``value'' that occupies the
|
|
5358 remainder of the bits. We can separate the different Lisp object types
|
|
5359 into three broad categories:
|
428
|
5360
|
|
5361 @itemize @bullet
|
|
5362 @item
|
|
5363 (a) Those for whom the value directly represents the contents of the
|
|
5364 Lisp object. Only two types are in this category: integers and
|
|
5365 characters. No special allocation or garbage collection is necessary
|
|
5366 for such objects. Lisp objects of these types do not need to be
|
|
5367 @code{GCPRO}ed.
|
|
5368 @end itemize
|
|
5369
|
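The idea of a value stored directly in the word, next to a few tag bits, can be sketched like this. The layout is an assumption made purely for illustration (two tag bits in the low end of the word, tag value 1 for integers); the actual tag assignment in XEmacs is not described here, and all @code{toy_} names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t toy_object;   /* one machine word */

enum {
    TOY_TAG_MASK   = 3,   /* assumed: the two low bits hold the tag */
    TOY_TAG_RECORD = 0,   /* assumed: aligned pointer to a heap object */
    TOY_TAG_INT    = 1    /* assumed: integer stored in the word itself */
};

/* Pack an integer into the word: the value occupies the upper bits. */
static toy_object
toy_make_int(intptr_t n)
{
    assert(n >= INTPTR_MIN / 4 && n <= (INTPTR_MAX - 1) / 4);
    return n * 4 + TOY_TAG_INT;
}

static int
toy_is_int(toy_object o)
{
    return ((uintptr_t)o & TOY_TAG_MASK) == TOY_TAG_INT;
}

/* Unpack: no memory is touched, so nothing needs to be allocated,
   marked, or freed for such an object. */
static intptr_t
toy_int_value(toy_object o)
{
    assert(toy_is_int(o));
    return (o - TOY_TAG_INT) / 4;
}
```

Because the whole object lives in the word, copying the word copies the object, which is why such values need no protection from the garbage collector.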
442
|
5370 In the remaining two categories, the type is stored in the object
|
|
5371 itself. The tag for all such objects is the generic @dfn{lrecord}
|
|
5372 (Lisp_Type_Record) tag. The first bytes of the object's structure are an
|
|
5373 integer (actually a char) characterising the object's type and some
|
|
5374 flags, in particular the mark bit used for garbage collection. A
|
|
5375 structure describing the type is accessible through the
|
|
5376 lrecord_implementation_table indexed with said integer. This structure
|
|
5377 includes the method pointers and a pointer to a string naming the type.
|
428
|
5378
|
|
5379 @itemize @bullet
|
|
5380 @item
|
442
|
5381 (b) Those lrecords that are allocated in frob blocks (see above). This
|
428
|
5382 includes the objects that are most common and relatively small, and
|
442
|
5383 includes conses, strings, subrs, floats, compiled functions, symbols,
|
428
|
5384 extents, events, and markers. With the cleanup of frob blocks done in
|
|
5385 19.12, it's not terribly hard to add more objects to this category, but
|
442
|
5386 it's a bit trickier than adding an object type to type (c) (esp. if the
|
428
|
5387 object needs a finalization method), and is not likely to save much
|
|
5388 space unless the object is small and there are many of them. (In fact,
|
|
5389 if there are very few of them, it might actually waste space.)
|
|
5390 @item
|
442
|
5391 (c) Those lrecords that are individually @code{malloc()}ed. These are
|
428
|
5392 called @dfn{lcrecords}. All other types are in this category. Adding a
|
|
5393 new type to this category is comparatively easy, and all types added
|
|
5394 since 19.8 (when the current allocation scheme was devised, by Richard
|
|
5395 Mlynarik), with the exception of the character type, have been in this
|
|
5396 category.
|
|
5397 @end itemize
|
|
5398
|
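The arrangement described above, where each heap object begins with a small header whose type index selects an entry in a table of type descriptions, can be sketched as follows. The structure and function names here are hypothetical stand-ins and the field layout is an assumption; only the general shape follows the text.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed header layout: a small type index plus a mark flag. */
struct toy_header {
    unsigned char type;   /* index into toy_implementation_table */
    unsigned char mark;   /* set during the mark phase of GC */
};

/* Per-type description: a name and (in a real system) method pointers. */
struct toy_implementation {
    const char *name;
    void (*mark_children)(void *obj);
};

enum { TOY_TYPE_CONS, TOY_TYPE_FLOAT, TOY_TYPE_COUNT };

/* Every object type starts with the header. */
struct toy_cons  { struct toy_header header; void *car; void *cdr; };
struct toy_float { struct toy_header header; double value; };

static const struct toy_implementation
toy_implementation_table[TOY_TYPE_COUNT] = {
    { "cons",  NULL },
    { "float", NULL }
};

/* Given a pointer to any object, find its type name via the header. */
static const char *
toy_type_name(const void *obj)
{
    const struct toy_header *h = obj;

    assert(h != NULL && h->type < TOY_TYPE_COUNT);
    return toy_implementation_table[h->type].name;
}
```

The point of the indirection is that code holding only a generic pointer can still discover what kind of object it has and which methods apply to it.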
|
5399 Note that bit vectors are a bit of a special case. They are
|
442
|
5400 simple lrecords as in category (b), but are individually @code{malloc()}ed
|
428
|
5401 like vectors. You can basically view them as exactly like vectors
|
|
5402 except that their type is stored in lrecord fashion rather than
|
|
5403 in directly-tagged fashion.
|
|
5404
|
442
|
5405
|
462
|
5406 @node Garbage Collection
|
428
|
5407 @section Garbage Collection
|
|
5408 @cindex garbage collection
|
|
5409
|
|
5410 @cindex mark and sweep
|
|
5411 Garbage collection is simple in theory but tricky to implement.
|
|
5412 Emacs Lisp uses the oldest garbage collection method, called
|
|
5413 @dfn{mark and sweep}. Garbage collection begins by starting with
|
|
5414 all accessible locations (i.e. all variables and other slots where
|
|
5415 Lisp objects might occur) and recursively traversing all objects
|
|
5416 accessible from those slots, marking each one that is found.
|
|
5417 We then go through all of memory and free each object that is
|
|
5418 not marked, and unmark each object that is marked. Note
|
|
5419 that ``all of memory'' means all currently allocated objects.
|
|
5420 Traversing all these objects means traversing all frob blocks,
|
|
5421 all vectors (which are chained in one big list), and all
|
|
5422 lcrecords (which are likewise chained).
|
|
5423
|
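The two phases can be shown with a toy collector. This is a deliberately tiny sketch, not the XEmacs collector: every object has exactly two child slots, all objects sit on one chain (much as the text says vectors and lcrecords are chained), and every @code{toy_} name is invented for the example.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_obj {
    int marked;
    struct toy_obj *child[2];         /* objects reachable from this one */
    struct toy_obj *next_allocated;   /* chain of every allocated object */
};

static struct toy_obj *all_objects;   /* head of the allocation chain */

static struct toy_obj *
toy_alloc(struct toy_obj *a, struct toy_obj *b)
{
    struct toy_obj *o = malloc(sizeof *o);

    assert(o != NULL);
    o->marked = 0;
    o->child[0] = a;
    o->child[1] = b;
    o->next_allocated = all_objects;
    all_objects = o;
    return o;
}

/* Mark phase: recursively mark everything reachable from o.  The
   check on `marked` also stops the recursion on cyclic structures. */
static void
toy_mark(struct toy_obj *o)
{
    if (o == NULL || o->marked)
        return;
    o->marked = 1;
    toy_mark(o->child[0]);
    toy_mark(o->child[1]);
}

/* Sweep phase: free unmarked objects and unmark the survivors.
   Returns the number of objects freed. */
static int
toy_sweep(void)
{
    int freed = 0;
    struct toy_obj **link = &all_objects;

    while (*link != NULL) {
        struct toy_obj *o = *link;
        if (o->marked) {
            o->marked = 0;
            link = &o->next_allocated;
        } else {
            *link = o->next_allocated;
            free(o);
            freed++;
        }
    }
    return freed;
}

static int
toy_collect(struct toy_obj **roots, int nroots)
{
    int i;

    for (i = 0; i < nroots; i++)
        toy_mark(roots[i]);
    return toy_sweep();
}
```

Note that an unreachable object is freed even if it points at a live one; only reachability from the roots matters.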
442
|
5424 Garbage collection can be invoked explicitly by calling
|
|
5425 @code{garbage-collect} but is also called automatically by @code{eval},
|
|
5426 once a certain amount of memory has been allocated since the last
|
|
5427 garbage collection (according to @code{gc-cons-threshold}).
|
|
5428
|
|
5429
|
462
|
5430 @node GCPROing
|
428
|
5431 @section @code{GCPRO}ing
|
462
|
5432 @cindex @code{GCPRO}ing
|
|
5433 @cindex garbage collection protection
|
|
5434 @cindex protection, garbage collection
|
428
|
5435
|
|
5436 @code{GCPRO}ing is one of the ugliest and trickiest parts of Emacs
|
|
5437 internals. The basic idea is that whenever garbage collection
|
|
5438 occurs, all in-use objects must be reachable somehow or
|
|
5439 other from one of the roots of accessibility. The roots
|
|
5440 of accessibility are:
|
|
5441
|
|
5442 @enumerate
|
|
5443 @item
|
442
|
5444 All objects that have been @code{staticpro()}d or
|
|
5445 @code{staticpro_nodump()}ed. This is used for any global C variables
|
|
5446 that hold Lisp objects. A call to @code{staticpro()} happens implicitly
|
|
5447 as a result of any symbols declared with @code{defsymbol()} and any
|
|
5448 variables declared with @code{DEFVAR_FOO()}. You need to explicitly
|
|
5449 call @code{staticpro()} (in the @code{vars_of_foo()} method of a module)
|
|
5450 for other global C variables holding Lisp objects. (This typically
|
|
5451 includes internal lists and such things.) Use
|
|
5452 @code{staticpro_nodump()} only in the rare cases when you do not want
|
|
5453 the pointed variable to be saved at dump time but rather recompute it at
|
|
5454 startup.
|
428
|
5455
|
|
5456 Note that @code{obarray} is one of the @code{staticpro()}d things.
|
|
5457 Therefore, all functions and variables get marked through this.
|
|
5458 @item
|
|
5459 Any shadowed bindings that are sitting on the @code{specpdl} stack.
|
|
5460 @item
|
|
5461 Any objects sitting in currently active (Lisp) stack frames,
|
|
5462 catches, and condition cases.
|
|
5463 @item
|
|
5464 A couple of special-case places where active objects are
|
|
5465 located.
|
|
5466 @item
|
|
5467 Anything currently marked with @code{GCPRO}.
|
|
5468 @end enumerate
|
|
5469
|
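The last root in the list above can be pictured as a linked list of small records that live in C stack frames, each pointing at one or more local variables. The sketch below models that idea with hypothetical @code{toy_} names; it is not the definition of the @code{GCPRO} macros, and unlike them it pushes and pops one record at a time.

```c
#include <assert.h>
#include <stddef.h>

typedef void *toy_object;

struct toy_gcpro {
    struct toy_gcpro *next;   /* record that was the head before this one */
    toy_object *var;          /* address of the first protected variable */
    int nvars;                /* number of contiguous protected variables */
};

static struct toy_gcpro *toy_gcprolist;   /* head of the chain */

/* Push a record protecting nvars variables starting at var. */
static void
toy_protect(struct toy_gcpro *rec, toy_object *var, int nvars)
{
    assert(rec != NULL && var != NULL && nvars > 0);
    rec->next = toy_gcprolist;
    rec->var = var;
    rec->nvars = nvars;
    toy_gcprolist = rec;
}

/* Pop a record.  Records must be released in reverse order, before
   the stack frame that holds them goes away. */
static void
toy_unprotect(struct toy_gcpro *rec)
{
    assert(toy_gcprolist == rec);
    toy_gcprolist = rec->next;
}

/* A collector would walk the chain and mark every protected slot;
   this stand-in just counts the slots it would visit. */
static int
toy_count_protected(void)
{
    int total = 0;
    const struct toy_gcpro *p;

    for (p = toy_gcprolist; p != NULL; p = p->next)
        total += p->nvars;
    return total;
}
```

If a function returned without popping its records, the chain would keep pointing into a dead stack frame, which is the kind of failure that unpaired protection causes.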
|
5470 Marking with @code{GCPRO} is necessary because some C functions (quite
|
|
5471 a lot, in fact) allocate objects during their operation. Quite
|
|
5472 frequently, there will be no other pointer to the object while the
|
|
5473 function is running, and if a garbage collection occurs and the object
|
|
5474 needs to be referenced again, bad things will happen. The solution is
|
|
5475 to mark those objects with @code{GCPRO}. Unfortunately this is easy to
|
|
5476 forget, and there is basically no way around this problem. Here are
|
|
5477 some rules, though:
|
|
5478
|
|
5479 @enumerate
|
|
5480 @item
|
|
5481 For every @code{GCPRO@var{n}}, there have to be declarations of
|
|
5482 @code{struct gcpro gcpro1, gcpro2}, etc.
|
|
5483
|
|
5484 @item
|
|
5485 You @emph{must} @code{UNGCPRO} anything that's @code{GCPRO}ed, and you
|
|
5486 @emph{must not} @code{UNGCPRO} if you haven't @code{GCPRO}ed. Getting
|
|
5487 either of these wrong will lead to crashes, often in completely random
|
|
5488 places unrelated to where the problem lies.
|
|
5489
|
|
5490 @item
|
|
5491 The way this actually works is that all currently active @code{GCPRO}s
|
|
5492 are chained through the @code{struct gcpro} local variables, with the
|
|
5493 variable @code{gcprolist} pointing to the head of the list and the nth
|
|
5494 local @code{gcpro} variable pointing to the first @code{gcpro} variable
|
|
5495 in the next enclosing stack frame. Each @code{GCPRO}ed thing is an
|
|
5496 lvalue, and the @code{struct gcpro} local variable contains a pointer to
|
|
5497 this lvalue. This is why things will mess up badly if you don't pair up
|
440
|
5498 the @code{GCPRO}s and @code{UNGCPRO}s---you will end up with
|
428
|
5499 @code{gcprolist}s containing pointers to @code{struct gcpro}s or local
|
|
5500 @code{Lisp_Object} variables in no-longer-active stack frames.
|
|
5501
|
|
5502 @item
|
|
5503 It is actually possible for a single @code{struct gcpro} to
|
|
5504 protect a contiguous array of any number of values, rather than
|
|
5505 just a single lvalue. To effect this, call @code{GCPRO@var{n}} as usual on
|
|
5506 the first object in the array and then set @code{gcpro@var{n}.nvars}.
|
|
5507
|
|
5508 @item
|
|
5509 @strong{Strings are relocated.} What this means in practice is that the
|
|
5510 pointer obtained using @code{XSTRING_DATA()} is liable to change at any
|
|
5511 time, and you should never keep it around past any function call, or
|
|
5512 pass it as an argument to any function that might cause a garbage
|
|
5513 collection. This is why a number of functions accept either a
|
|
5514 ``non-relocatable'' @code{char *} pointer or a relocatable Lisp string,
|
|
5515 and only access the Lisp string's data at the very last minute. In some
|
|
5516 cases, you may end up having to @code{alloca()} some space and copy the
|
|
5517 string's data into it.
|
|
5518
|
|
5519 @item
|
|
5520 By convention, if you have to nest @code{GCPRO}s, use @code{NGCPRO@var{n}}
|
|
5521 (along with @code{struct gcpro ngcpro1, ngcpro2}, etc.), @code{NNGCPRO@var{n}},
|
|
5522 etc. This avoids compiler warnings about shadowed locals.
|
|
5523
|
|
5524 @item
|
|
5525 It is @emph{always} better to err on the side of extra @code{GCPRO}s
|
|
5526 rather than too few. The extra cycles spent on this are
|
|
5527 almost never going to make a whit of difference in the
|
|
5528 speed of anything.
|
|
5529
|
|
5530 @item
|
|
5531 The general rule to follow is that caller, not callee, @code{GCPRO}s.
|
|
5532 That is, you should not have to explicitly @code{GCPRO} any Lisp objects
|
|
5533 that are passed in as parameters.
|
|
5534
|
|
5535 One exception from this rule is if you ever plan to change the parameter
|
|
5536 value, and store a new object in it. In that case, you @emph{must}
|
|
5537 @code{GCPRO} the parameter, because otherwise the new object will not be
|
|
5538 protected.
|
|
5539
|
|
5540 So, if you create any Lisp objects (remember, this happens in all sorts
|
|
5541 of circumstances, e.g. with @code{Fcons()}, etc.), you are responsible
|
|
5542 for @code{GCPRO}ing them, unless you are @emph{absolutely sure} that
|
|
5543 there's no possibility that a garbage-collection can occur while you
|
|
5544 need to use the object. Even then, consider @code{GCPRO}ing.
|
|
5545
|
|
5546 @item
|
|
5547 A garbage collection can occur whenever anything calls @code{Feval}, or
|
|
5548 whenever a QUIT can occur where execution can continue past
|
|
5549 this. (Remember, this is almost anywhere.)
|
|
5550
|
|
5551 @item
|
|
5552 If you have the @emph{least smidgeon of doubt} about whether
|
|
5553 you need to @code{GCPRO}, you should @code{GCPRO}.
|
|
5554
|
|
5555 @item
|
|
5556 Beware of @code{GCPRO}ing something that is uninitialized. If you have
|
|
5557 any shade of doubt about this, initialize all your variables to @code{Qnil}.
|
|
5558
|
|
5559 @item
|
|
5560 Be careful of traps, like calling @code{Fcons()} in the argument to
|
|
5561 another function. By the ``caller protects'' law, you should be
|
|
5562 @code{GCPRO}ing the newly-created cons, but you aren't. A certain
|
|
5563 number of functions that are commonly called on freshly created stuff
|
|
5564 (e.g. @code{nconc2()}, @code{Fsignal()}), break the ``caller protects''
|
|
5565 law and go ahead and @code{GCPRO} their arguments so as to simplify
|
|
5566 things, but make sure to check if it's OK whenever doing something like
|
|
5567 this.
|
|
5568
|
|
5569 @item
|
|
5570 Once again, remember to @code{GCPRO}! Bugs resulting from insufficient
|
|
5571 @code{GCPRO}ing are intermittent and extremely difficult to track down,
|
|
5572 often showing up in crashes inside of @code{garbage-collect} or in
|
|
5573 weirdly corrupted objects or even in incorrect values in a totally
|
|
5574 different section of code.
|
|
5575 @end enumerate
|
|
5576
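The chaining described in the list above can be pictured with a small stand-alone model. This is only a sketch: @code{TOY_GCPRO}, @code{TOY_UNGCPRO}, @code{count_protected_slots}, @code{demo_slots_seen_by_gc} and the @code{long} stand-in for @code{Lisp_Object} are invented for the illustration, and the real macros differ in detail. It assumes only what the text states: each record points at an lvalue, the records are chained from @code{gcprolist}, and @code{nvars} lets one record cover a contiguous array.

```c
#include <assert.h>
#include <stddef.h>

typedef long Lisp_Object;   /* stand-in; the real type is more involved */

/* Simplified model of a GCPRO record.  It lives on the C stack and
   points at the protected lvalue(s). */
struct gcpro {
  struct gcpro *next;   /* next record further out in the chain */
  Lisp_Object *var;     /* address of the first protected lvalue */
  int nvars;            /* number of contiguous protected values */
};

struct gcpro *gcprolist;    /* head of the chain of active records */

/* Hypothetical toy equivalents of GCPROn / UNGCPRO. */
#define TOY_GCPRO(rec, v) \
  ((rec).next = gcprolist, (rec).var = &(v), (rec).nvars = 1, \
   gcprolist = &(rec))
#define TOY_UNGCPRO(rec) (gcprolist = (rec).next)

/* Walk the chain the way a mark phase would and count the slots found. */
int count_protected_slots(void)
{
  int n = 0;
  struct gcpro *p;
  for (p = gcprolist; p != NULL; p = p->next)
    n += p->nvars;
  return n;
}

/* Protect one local and one 3-element array, then report what a
   garbage collection happening at this point would see. */
int demo_slots_seen_by_gc(void)
{
  Lisp_Object obj = 0;
  Lisp_Object args[3] = { 0, 0, 0 };
  struct gcpro gcpro1, gcpro2;
  int seen;

  TOY_GCPRO(gcpro1, obj);
  TOY_GCPRO(gcpro2, args[0]);
  gcpro2.nvars = 3;               /* one record now covers all of args[] */
  assert(gcprolist == &gcpro2);   /* the innermost record is the head */

  seen = count_protected_slots();

  /* Restore the head to its value from before gcpro1 was pushed.  This
     drops both records; returning without it would leave gcprolist
     pointing into a dead stack frame. */
  TOY_UNGCPRO(gcpro1);
  return seen;
}
```

In this model, forgetting the final @code{TOY_UNGCPRO} leaves @code{gcprolist} pointing at @code{gcpro2} after the function has returned, which is exactly the dangling-frame situation the rules warn about.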
|
965
|
5577 If you don't understand whether to @code{GCPRO} in a particular
|
|
5578 instance, ask on the mailing lists. A general hint is that @code{prog1}
|
|
5579 is the canonical example.
|
|
5580
|
428
|
5581 @cindex garbage collection, conservative
|
|
5582 @cindex conservative garbage collection
|
|
5583 Given the extremely error-prone nature of the @code{GCPRO} scheme, and
|
|
5584 the difficulty of tracking down the resulting bugs, it should be considered a deficiency
|
|
5585 in the XEmacs code. A solution to this problem would involve
|
|
5586 implementing so-called @dfn{conservative} garbage collection for the C
|
|
5587 stack. That involves looking through all of stack memory and treating
|
|
5588 anything that looks like a reference to an object as a reference. This
|
|
5589 will result in a few objects not getting collected when they should, but
|
|
5590 it obviates the need for @code{GCPRO}ing, and allows garbage collection
|
|
5591 to happen at any point at all, such as during object allocation.
|
|
5592
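As a rough illustration of what ``treating anything that looks like a reference as a reference'' means, the following sketch scans an array of machine words standing in for stack memory. Everything in it is hypothetical (@code{toy_heap}, @code{word_to_object}, @code{conservative_scan}); it assumes a heap consisting of one contiguous array of equally sized objects and, as a design choice of the sketch, accepts only words that point exactly at the start of an object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_OBJECTS 8

struct toy_object { int marked; long payload; };

/* Hypothetical heap: all objects live in one contiguous array. */
struct toy_object toy_heap[HEAP_OBJECTS];

/* Return the object WORD refers to if WORD looks like a reference,
   i.e. if it is the exact address of a heap slot; otherwise NULL. */
struct toy_object *word_to_object(uintptr_t word)
{
  uintptr_t lo = (uintptr_t) &toy_heap[0];
  uintptr_t hi = (uintptr_t) &toy_heap[HEAP_OBJECTS];

  if (word < lo || word >= hi)
    return NULL;
  if ((word - lo) % sizeof (struct toy_object) != 0)
    return NULL;
  return (struct toy_object *) word;
}

/* Scan NWORDS words of "stack" memory and mark every object that some
   word appears to reference.  Returns the number of newly marked
   objects. */
int conservative_scan(const uintptr_t *words, size_t nwords)
{
  int newly_marked = 0;
  size_t i;

  for (i = 0; i < nwords; i++) {
    struct toy_object *obj = word_to_object(words[i]);
    if (obj != NULL && !obj->marked) {
      obj->marked = 1;
      newly_marked++;
    }
  }
  return newly_marked;
}
```

An integer on the stack whose bit pattern happens to equal a slot address would keep that slot alive. This is the source of the ``few objects not getting collected when they should'' mentioned above.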
|
462
|
5593 @node Garbage Collection - Step by Step
|
428
|
5594 @section Garbage Collection - Step by Step
|
462
|
5595 @cindex garbage collection - step by step
|
428
|
5596
|
|
5597 @menu
|
|
5598 * Invocation::
|
|
5599 * garbage_collect_1::
|
|
5600 * mark_object::
|
|
5601 * gc_sweep::
|
|
5602 * sweep_lcrecords_1::
|
|
5603 * compact_string_chars::
|
|
5604 * sweep_strings::
|
|
5605 * sweep_bit_vectors_1::
|
|
5606 @end menu
|
|
5607
|
462
|
5608 @node Invocation
|
428
|
5609 @subsection Invocation
|
|
5610 @cindex garbage collection, invocation
|
|
5611
|
|
5612 The first thing that anyone should know about garbage collection is:
|
442
|
5613 when and how the garbage collector is invoked. One might think that this
|
428
|
5614 could happen every time new memory is allocated, e.g. new objects are
|
|
5615 created, but this is @emph{not} the case. Instead, we have the following
|
|
5616 situation:
|
|
5617
|
|
5618 The entry point of any process of garbage collection is an invocation
|
|
5619 of the function @code{garbage_collect_1} in file @code{alloc.c}. The
|
|
5620 invocation can occur @emph{explicitly} by calling the function
|
|
5621 @code{Fgarbage_collect} (in addition this function provides information
|
442
|
5622 about the freed memory), or can occur @emph{implicitly} in four different
|
428
|
5623 situations:
|
|
5624 @enumerate
|
|
5625 @item
|
|
5626 In function @code{main_1} in file @code{emacs.c}. This function is called
|
|
5627 at each startup of XEmacs. The garbage collection is invoked after all
|
|
5628 initial creations are completed, but only if a special internal error
|
|
5629 checking-constant @code{ERROR_CHECK_GC} is defined.
|
|
5630 @item
|
|
5631 In function @code{disksave_object_finalization} in file
|
|
5632 @code{alloc.c}. The only purpose of this function is to clear the
|
442
|
5633 objects from memory which need not be stored with XEmacs when we dump out
|
428
|
5634 an executable. This is only done by @code{Fdump_emacs} or by
|
|
5635 @code{Fdump_emacs_data} respectively (both in @code{emacs.c}). The
|
|
5636 actual clearing is accomplished by making these objects unreachable and
|
|
5637 starting a garbage collection. The function is only used while building
|
|
5638 XEmacs.
|
|
5639 @item
|
|
5640 In function @code{Feval / eval} in file @code{eval.c}. Each time the
|
|
5641 well known and often used function eval is called to evaluate a form,
|
|
5642 one of the first things that can happen is a call of
|
|
5643 @code{garbage_collect_1}. There exist three global variables,
|
|
5644 @code{consing_since_gc} (counts the created cons-cells since the last
|
|
5645 garbage collection), @code{gc_cons_threshold} (a specified threshold
|
|
5646 after which a garbage collection occurs) and @code{always_gc}. If
|
|
5647 @code{always_gc} is set or if the threshold is exceeded, the garbage
|
|
5648 collection will start.
|
|
5649 @item
|
|
5650 In function @code{Ffuncall / funcall} in file @code{eval.c}. This
|
|
5651 function evaluates calls of elisp functions and works analogously to
|
|
5652 @code{Feval}.
|
|
5653 @end enumerate
|
|
5654
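The check described for @code{Feval} and @code{Ffuncall} can be sketched as follows. The three variable names are taken from the text; the functions @code{toy_note_allocation}, @code{toy_eval_entry} and @code{toy_garbage_collect_1}, the counter @code{gc_runs} and the initial threshold value are assumptions made for the illustration, not the actual code in @code{eval.c}.

```c
#include <assert.h>

long consing_since_gc;             /* amount consed since the last GC */
long gc_cons_threshold = 500000;   /* arbitrary value for this sketch */
int always_gc;                     /* debugging aid: collect every time */

int gc_runs;                       /* hypothetical: counts collections */

/* Stand-in for the real collector; it only resets the counter. */
void toy_garbage_collect_1(void)
{
  gc_runs++;
  consing_since_gc = 0;
}

/* Allocation merely records how much was consed.  It never starts a
   collection by itself. */
void toy_note_allocation(long amount)
{
  consing_since_gc += amount;
}

/* What the entry of Feval / Ffuncall is described as doing. */
void toy_eval_entry(void)
{
  if (always_gc || consing_since_gc > gc_cons_threshold)
    toy_garbage_collect_1();
}
```

The point of the arrangement is visible in the sketch: however much is allocated, nothing is collected until evaluation is entered the next time.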
|
|
5655 The upshot is that garbage collection can basically occur everywhere
|
|
5656 @code{Feval} or @code{Ffuncall} is used - either directly or
|
442
|
5657 through another function. Since calls to these two functions are hidden
|
|
5658 in various other functions, many calls to @code{garbage_collect_1} are
|
|
5659 not obviously foreseeable, and therefore unexpected. Instances where
|
|
5660 they are used that are worth remembering are various elisp commands, as
|
|
5661 for example @code{or}, @code{and}, @code{if}, @code{cond}, @code{while},
|
|
5662 @code{setq}, etc., miscellaneous @code{gui_item_...} functions,
|
|
5663 everything related to @code{eval} (@code{Feval_buffer}, @code{call0},
|
|
5664 ...) and inside @code{Fsignal}. The latter is used to handle signals, as
|
444
|
5665 for example the ones raised by every @code{QUIT}-macro triggered after
|
442
|
5666 pressing Ctrl-g.
|
|
5667
|
462
|
5668 @node garbage_collect_1
|
428
|
5669 @subsection @code{garbage_collect_1}
|
|
5670 @cindex @code{garbage_collect_1}
|
|
5671
|
|
5672 We can now describe exactly what happens after the invocation takes
|
|
5673 place.
|
|
5674 @enumerate
|
|
5675 @item
|
442
|
5676 There are several cases in which the garbage collector is left immediately:
|
428
|
5677 when we are already garbage collecting (@code{gc_in_progress}), when
|
|
5678 the garbage collection is somehow forbidden
|
|
5679 (@code{gc_currently_forbidden}), when we are currently displaying something
|
|
5680 (@code{in_display}) or when we are preparing for the armageddon of the
|
|
5681 whole system (@code{preparing_for_armageddon}).
|
|
5682 @item
|
|
5683 Next the correct frame in which to put
|
|
5684 all the output occurring during garbage collecting is determined. In
|
|
5685 order to be able to restore the old display's state after displaying the
|
|
5686 message, some data about the current cursor position has to be
|
442
|
5687 saved. The variables @code{pre_gc_cursor} and @code{cursor_changed} take
|
428
|
5688 care of that.
|
|
5689 @item
|
|
5690 The state of @code{gc_currently_forbidden} must be restored after
|
|
5691 the garbage collection, no matter what happens during the process. We
|
|
5692 accomplish this by @code{record_unwind_protect}ing the suitable function
|
|
5693 @code{restore_gc_inhibit} together with the current value of
|
442
|
5694 @code{gc_currently_forbidden}.
|
428
|
5695 @item
|
|
5696 If we are currently running an interactive XEmacs session, the next step
|
|
5697 is simply to show the garbage collector's cursor/message.
|
|
5698 @item
|
|
5699 The following steps are the intrinsic steps of the garbage collector,
|
|
5700 therefore @code{gc_in_progress} is set.
|
|
5701 @item
|
|
5702 For debugging purposes, it is possible to copy the current C stack
|
|
5703 frame. However, this seems to be a currently unused feature.
|
|
5704 @item
|
|
5705 Before actually starting to go over all live objects, references to
|
|
5706 objects that are no longer used are pruned. We only have to do this for events
|
|
5707 (@code{clear_event_resource}) and for specifiers
|
442
|
5708 (@code{cleanup_specifiers}).
|
428
|
5709 @item
|
|
5710 Now the mark phase begins and marks all accessible elements. The
|
|
5711 function
|
|
5712 @code{mark_object} is called individually for each slot that serves as a
|
|
5713 root of accessibility, and goes out from
|
|
5714 there to mark all reachable objects. The roots are listed here in the
|
|
5715 order in which they are processed:
|
|
5716 @itemize @bullet
|
|
5717 @item
|
|
5718 all constant symbols and static variables that are registered via
|
452
|
5719 @code{staticpro}@ in the dynarr @code{staticpros}.
|
442
|
5720 @xref{Adding Global Lisp Variables}.
|
428
|
5721 @item
|
|
5722 all Lisp objects that are created in C functions and that must be
|
|
5723 protected from being freed. They are registered in the global
|
|
5724 list @code{gcprolist}.
|
|
5725 @xref{GCPROing}.
|
442
|
5726 @item
|
428
|
5727 all local variables (i.e. their name fields @code{symbol} and old
|
|
5728 values @code{old_values}) that are bound during the evaluation by the Lisp
|
|
5729 engine. They are stored in @code{specbinding} structs pushed on a stack
|
|
5730 called @code{specpdl}.
|
|
5731 @xref{Dynamic Binding; The specbinding Stack; Unwind-Protects}.
|
|
5732 @item
|
|
5733 all catch blocks that the Lisp engine encounters during the evaluation
|
|
5734 cause the creation of structs @code{catchtag} inserted in the list
|
|
5735 @code{catchlist}. Their tag (@code{tag}) and value (@code{val}) fields
|
|
5736 are freshly created objects and therefore have to be marked.
|
|
5737 @xref{Catch and Throw}.
|
|
5738 @item
|
442
|
5739 every function application pushes new structs @code{backtrace}
|
|
5740 on the call stack of the Lisp engine (@code{backtrace_list}). The unique
|
428
|
5741 parts that have to be marked are the fields for each function
|
|
5742 (@code{function}) and all their arguments (@code{args}).
|
|
5743 @xref{Evaluation}.
|
|
5744 @item
|
442
|
5745 all objects that are used by the redisplay engine that must not be freed
|
428
|
5746 are marked by a special function called @code{mark_redisplay} (in
|
|
5747 @code{redisplay.c}).
|
|
5748 @item
|
|
5749 all objects created for profiling purposes are allocated by C functions
|
|
5750 instead of using the lisp allocation mechanisms. In order to receive the
|
|
5751 right ones during the sweep phase, they also have to be marked
|
|
5752 manually. That is done by the function @code{mark_profiling_info}.
|
|
5753 @end itemize
|
|
5754 @item
|
436
|
5755 Hash tables in XEmacs belong to a kind of special objects that
|
428
|
5756 make use of a concept often called 'weak pointers'.
|
|
5757 To make a long story short, this kind of pointer is not followed
|
|
5758 when determining the live objects during garbage collection.
|
|
5759 Any object referenced only by weak pointers is collected
|
|
5760 anyway, and the reference to it is cleared. In hash tables there are
|
|
5761 different usage patterns of them, manifesting in different types of hash
|
|
5762 tables, namely 'non-weak', 'weak', 'key-weak' and 'value-weak'
|
442
|
5763 (internally also 'key-car-weak' and 'value-car-weak') hash tables, each
|
|
5764 clearing entries depending on different conditions. More information can
|
428
|
5765 be found in the documentation to the function @code{make-hash-table}.
|
|
5766
|
|
5767 Because there are complicated dependency rules about when and what to
|
|
5768 mark while processing weak hash tables, the standard @code{marker}
|
|
5769 method is only active if it is marking non-weak hash tables. As soon as
|
|
5770 a weak component is in the table, the hash table entries are ignored
|
|
5771 while marking. Instead, their marking is done separately by the
|
|
5772 function @code{finish_marking_weak_hash_tables}. This function iterates
|
|
5773 over each hash table entry @code{hentries} for each weak hash table in
|
|
5774 @code{Vall_weak_hash_tables}. Depending on the type of a table, the
|
442
|
5775 appropriate action is performed.
|
428
|
5776 If a table is acting as @code{HASH_TABLE_KEY_WEAK}, and a key is already marked,
|
442
|
5777 everything reachable from the @code{value} component is marked. If it is
|
428
|
5778 acting as a @code{HASH_TABLE_VALUE_WEAK} and the value component is
|
442
|
5779 already marked, the marking starts only from the
|
428
|
5780 @code{key} component.
|
442
|
5781 If it is a @code{HASH_TABLE_KEY_CAR_WEAK} and the car
|
428
|
5782 of the key entry is already marked, we mark both the @code{key} and
|
|
5783 @code{value} components.
|
|
5784 Finally, if the table is of the type @code{HASH_TABLE_VALUE_CAR_WEAK}
|
|
5785 and the car of the value component is already marked, again both the
|
|
5786 @code{key} and the @code{value} components get marked.
|
|
5787
|
|
5788 Similarly, there are lists with comparable properties called weak
|
|
5789 lists. They come in different types called
|
|
5790 @code{simple}, @code{assoc}, @code{key-assoc} and
|
|
5791 @code{value-assoc}. You can find further details about them in the
|
|
5792 description to the function @code{make-weak-list}. The scheme of their
|
442
|
5793 marking is similar: all weak lists are listed in @code{Vall_weak_lists},
|
428
|
5794 therefore we iterate over them. The marking is advanced until we hit an
|
442
|
5795 already marked pair. Then we know that during a former run all
|
428
|
5796 the rest has been marked completely. Again, depending on the special
|
|
5797 type of the weak list, our jobs differ. If it is a @code{WEAK_LIST_SIMPLE}
|
|
5798 and the elem is marked, we mark the @code{cons} part. If it is a
|
|
5799 @code{WEAK_LIST_ASSOC} and not a pair or a pair with both marked car and
|
|
5800 cdr, we mark the @code{cons} and the @code{elem}. If it is a
|
|
5801 @code{WEAK_LIST_KEY_ASSOC} and not a pair or a pair with a marked car of
|
|
5802 the elem, we mark the @code{cons} and the @code{elem}. Finally, if it is
|
|
5803 a @code{WEAK_LIST_VALUE_ASSOC} and not a pair or a pair with a marked
|
|
5804 cdr of the elem, we mark both the @code{cons} and the @code{elem}.
|
|
5805
|
|
5806 Marking objects reachable from weak hash tables and weak lists can
|
|
5807 cause other objects to get marked, which in turn may require further
|
442
|
5808 marking of other weak objects. Both finishing functions are therefore
|
428
|
5809 rerun as long as previously unmarked objects get freshly marked.
|
|
5810
|
|
5811 @item
|
|
5812 After completing the special marking for the weak hash tables and for the weak
|
|
5813 lists, all entries that point to objects that are going to be swept in
|
|
5814 the further process are useless, and therefore have to be removed from
|
|
5815 the table or the list.
|
|
5816
|
|
5817 The function @code{prune_weak_hash_tables} does the job for weak hash
|
|
5818 tables. Totally unmarked hash tables are removed from the list
|
|
5819 @code{Vall_weak_hash_tables}. The other ones are treated more carefully
|
442
|
5820 by scanning over all entries and removing one as soon as one of
|
428
|
5821 the components @code{key} and @code{value} is unmarked.
|
|
5822
|
|
5823 The same idea applies to the weak lists. It is accomplished by
|
|
5824 @code{prune_weak_lists}: An unmarked list is pruned from
|
|
5825 @code{Vall_weak_lists} immediately. A marked list is treated more
|
|
5826 carefully by going over it and removing just the unmarked pairs.
|
|
5827
|
|
5828 @item
|
|
5829 The function @code{prune_specifiers} checks all listed specifiers held
|
442
|
5830 in @code{Vall_specifiers} and removes the ones from the lists that are
|
428
|
5831 unmarked.
|
|
5832
|
|
5833 @item
|
|
5834 All syntax tables are stored in a list called
|
442
|
5835 @code{Vall_syntax_tables}. The function @code{prune_syntax_tables} walks
|
428
|
5836 through it and unlinks the tables that are unmarked.
|
|
5837
|
|
5838 @item
|
|
5839 Next comes the complete sweeping, which is done by the function
|
|
5840 @code{gc_sweep}, which does the bulk of the work.
|
|
5841 @item
|
|
5842 Then, all the variables related to garbage collection are
|
442
|
5843 reset. @code{consing_since_gc} - the counter of the created cells since
|
428
|
5844 the last garbage collection - is set back to 0, and
|
|
5845 @code{gc_in_progress} is not @code{true} anymore.
|
|
5846 @item
|
442
|
5847 In case the session is interactive, the displayed cursor and message are
|
428
|
5848 removed again.
|
|
5849 @item
|
|
5850 The state of @code{gc_currently_forbidden} is restored to the former value by
|
|
5851 unwinding the stack.
|
|
5852 @item
|
|
5853 A small memory reserve is always held back that can be reached by
|
|
5854 @code{breathing_space}. If nothing more is left, we create a new reserve
|
442
|
5855 and exit.
|
428
|
5856 @end enumerate
|
|
5857
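The repeated marking of weak tables and the subsequent pruning, as described in the steps above, can be sketched for the key-weak case. The sketch is a toy: @code{toy_obj}, @code{toy_entry} and all function names are hypothetical, each object has at most one outgoing reference, and the table is a plain array. It only illustrates why the finishing pass has to be rerun until nothing new gets marked.

```c
#include <assert.h>
#include <stddef.h>

struct toy_obj { int marked; struct toy_obj *ref; };
struct toy_entry { struct toy_obj *key, *value; };

/* Mark O and everything reachable from it through REF.
   Returns 1 if at least one object was newly marked. */
int toy_mark(struct toy_obj *o)
{
  int did_mark = 0;
  while (o != NULL && !o->marked) {
    o->marked = 1;
    did_mark = 1;
    o = o->ref;
  }
  return did_mark;
}

/* One pass over a key-weak table: whenever a key is already marked,
   mark everything reachable from the value. */
int toy_finish_marking_key_weak(struct toy_entry *e, size_t n)
{
  int did_mark = 0;
  size_t i;
  for (i = 0; i < n; i++)
    if (e[i].key->marked && toy_mark(e[i].value))
      did_mark = 1;
  return did_mark;
}

/* Remove every entry with an unmarked key or value.
   Returns the new number of entries. */
size_t toy_prune(struct toy_entry *e, size_t n)
{
  size_t i, kept = 0;
  for (i = 0; i < n; i++)
    if (e[i].key->marked && e[i].value->marked)
      e[kept++] = e[i];
  return kept;
}

/* Rerun the finishing pass until it marks nothing new, then prune. */
size_t toy_process_weak_table(struct toy_entry *e, size_t n)
{
  while (toy_finish_marking_key_weak(e, n))
    ;
  return toy_prune(e, n);
}
```

With the entries (b, c), (a, b) and (x, y), where only a is marked beforehand, the first pass marks b and only the second pass can mark c. A single pass would wrongly prune the first entry.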
|
462
|
5858 @node mark_object
|
428
|
5859 @subsection @code{mark_object}
|
|
5860 @cindex @code{mark_object}
|
|
5861
|
|
5862 The first thing that is checked while marking an object is whether the
|
|
5863 object is a real Lisp object (@code{Lisp_Type_Record}) or just an integer
|
|
5864 or a character. Integers and characters are the only two types that are
|
|
5865 stored directly - without another level of indirection, and therefore they
|
442
|
5866 don't have to be marked and collected.
|
428
|
5867 @xref{How Lisp Objects Are Represented in C}.
|
|
5868
|
|
5869 The second case is the one we have to handle: we are
|
|
5870 dealing with a pointer to a Lisp object. Even then, there are three
|
|
5871 situations in which nothing has to be done while marking. The
|
|
5872 object is read only which prevents it from being garbage collected,
|
|
5873 i.e. marked (@code{C_READONLY_RECORD_HEADER}). The object in question is
|
|
5874 already marked, and need not be marked a second time (checked by
|
|
5875 @code{MARKED_RECORD_HEADER_P}). Or it is a special, unmarkable object
|
|
5876 (@code{UNMARKABLE_RECORD_HEADER_P}; apparently, these are objects that
|
442
|
5877 sit in some const space, and can therefore not be marked, see
|
428
|
5878 @code{this_one_is_unmarkable} in @code{alloc.c}).
|
|
5879
|
|
5880 Now, the actual marking is feasible. We do so by first using the macro
|
|
5881 @code{MARK_RECORD_HEADER} to mark the object itself (actually the
|
|
5882 special flag in the lrecord header), and calling its special marker
|
|
5883 "method" @code{marker} if available. The marker method marks every
|
442
|
5884 other object that is in reach from our current object. Note, that these
|
428
|
5885 marker methods should not call @code{mark_object} recursively, but
|
|
5886 instead should return the next object from where further marking has to
|
|
5887 be performed.
|
|
5888
|
|
5889 In case another object was returned, as mentioned before, we reiterate
|
|
5890 the whole @code{mark_object} process beginning with this next object.
|
|
5891
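The loop structure described above, in which a marker method hands back the next object instead of recursing, might look like the following sketch. The record layout, @code{toy_mark_object} and @code{toy_mark_next} are hypothetical; in particular every toy record has at most one outgoing reference, and a null pointer plays the role of ``not a record''.

```c
#include <assert.h>
#include <stddef.h>

struct toy_record;
typedef struct toy_record *(*toy_marker) (struct toy_record *);

struct toy_record {
  int marked;
  int readonly;               /* read-only records are never marked */
  toy_marker marker;          /* marker method, may be NULL */
  struct toy_record *next;    /* the only outgoing reference here */
};

/* Marker method of the toy type: nothing to mark directly, just return
   the object from which marking has to continue. */
struct toy_record *toy_mark_next(struct toy_record *r)
{
  return r->next;
}

/* Mark OBJ and whatever its marker methods lead to, without recursion.
   Returns the number of records newly marked. */
int toy_mark_object(struct toy_record *obj)
{
  int newly_marked = 0;

  while (obj != NULL) {
    if (obj->readonly || obj->marked)
      break;                  /* nothing to do for this object */
    obj->marked = 1;
    newly_marked++;
    if (obj->marker == NULL)
      break;
    obj = obj->marker(obj);   /* continue with the returned object */
  }
  return newly_marked;
}
```

Because the continuation is a loop rather than a nested call, an arbitrarily long chain of records does not deepen the C stack, and the @code{marked} test also terminates the walk on cyclic structures.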
|
462
|
5892 @node gc_sweep
|
428
|
5893 @subsection @code{gc_sweep}
|
|
5894 @cindex @code{gc_sweep}
|
|
5895
|
442
|
5896 The job of this function is to free all unmarked records from memory. As
|
428
|
5897 we know, there are different types of objects implemented and managed, and
|
|
5898 consequently different ways to free them from memory.
|
|
5899 @xref{Introduction to Allocation}.
|
|
5900
|
|
5901 We start with all objects stored through @code{lcrecords}. All
|
|
5902 bulkier objects are allocated and handled using that scheme of
|
|
5903 @code{lcrecords}. Each object is @code{malloc}ed separately
|
|
5904 instead of placing it in one of the contiguous frob blocks. The types
|
442
|
5905 that are currently stored
|
438
|
5906 using @code{alloc_lcrecord} and
|
428
|
5907 @code{make_lcrecord_list} are: vectors, buffers,
|
|
5908 char-table, char-table-entry, console, weak-list, database, device,
|
|
5909 ldap, hash-table, command-builder, extent-auxiliary, extent-info, face,
|
|
5910 coding-system, frame, image-instance, glyph, popup-data, gui-item,
|
|
5911 keymap, charset, color_instance, font_instance, opaque, opaque-list,
|
|
5912 process, range-table, specifier, symbol-value-buffer-local,
|
|
5913 symbol-value-lisp-magic, symbol-value-varalias, toolbar-button,
|
|
5914 tooltalk-message, tooltalk-pattern, window, and window-configuration. We
|
|
5915 take care of them in the first place
|
|
5916 in order to be able to handle and to finalize items stored in them more
|
|
5917 easily. The function @code{sweep_lcrecords_1} as described below is
|
|
5918 doing the whole job for us.
|
|
5919 For a description of the internals, see @ref{lrecords}.
|
|
5920
|
|
5921 Our next candidates are the other objects that behave quite differently
|
|
5922 than everything else: the strings. They consist of two parts, a
|
442
|
5923 fixed-size portion (@code{struct Lisp_String}) holding the string's
|
428
|
5924 length, its property list and a pointer to the second part, and the
|
|
5925 actual string data, which is stored in string-chars blocks comparable to
|
|
5926 frob blocks. In this block, the data is not only freed, but also a
|
|
5927 compression of holes is made, i.e. all strings are relocated together.
|
|
5928 @xref{String}. This compacting phase is performed by the function
|
|
5929 @code{compact_string_chars}, the actual sweeping by the function
|
|
5930 @code{sweep_strings}; both are described below.
|
|
5931
|
|
5932 After that, the other types are swept step by step using functions
|
|
5933 @code{sweep_conses}, @code{sweep_bit_vectors_1},
|
|
5934 @code{sweep_compiled_functions}, @code{sweep_floats},
|
|
5935 @code{sweep_symbols}, @code{sweep_extents}, @code{sweep_markers} and
|
|
5936 @code{sweep_events}. They are the fixed-size types cons, floats,
|
|
5937 compiled-functions, symbol, marker, extent, and event stored in
|
|
5938 so-called "frob blocks", and therefore we can basically do the same on
|
|
5939 every type of object, using the same macros, defined specifically to
|
442
|
5940 handle everything with respect to fixed-size blocks. The only fixed-size
|
428
|
5941 type that is not handled here is the fixed-size portion of strings,
|
|
5942 because we took special care of it earlier.
|
|
5943
|
|
5944 The only big exception is bit vectors, which are stored differently and
|
442
|
5945 therefore treated differently by the function @code{sweep_bit_vectors_1}
|
428
|
5946 described later.
|
|
5947
|
|
5948 First, we need some brief information about how
|
|
5949 these fixed-size types are managed in general, in order to understand
|
|
5950 how the sweeping is done. They all have a fixed size, and are therefore
|
|
5951 stored in big blocks of memory - allocated at once - that can hold a
|
|
5952 certain amount of objects of one type. The macro
|
|
5953 @code{DECLARE_FIXED_TYPE_ALLOC} creates the suitable structures for
|
442
|
5954 every type. More precisely, we have the block struct
|
428
|
5955 (holding a pointer to the previous block @code{prev} and the
|
|
5956 objects in @code{block[]}), a pointer to current block
|
|
5957 (@code{current_..._block}) and its last index
|
|
5958 (@code{current_..._block_index}), and a pointer to the free list that
|
|
5959 will be created. There is also a macro @code{FIXED_TYPE_FROM_BLOCK} plus some
|
|
5960 related macros that are used to obtain a new object, either from
|
|
5961 the free list @code{ALLOCATE_FIXED_TYPE_1} if there is an unused object
|
|
5962 of that type stored or by allocating a completely new block using
|
|
5963 @code{ALLOCATE_FIXED_TYPE_FROM_BLOCK}.
|
|
5964
|
|
5965 The rest works as follows: all of them define a
|
|
5966 macro @code{UNMARK_...} that is used to unmark the object. They define a
|
|
5967 macro @code{ADDITIONAL_FREE_...} that defines additional work that has
|
|
5968 to be done when converting an object from in use to not in use (so far,
|
|
5969 only markers use it in order to unchain them). Then, they all call
|
442
|
5970 the macro @code{SWEEP_FIXED_TYPE_BLOCK} instantiated with their type name
|
428
|
5971 and their struct name.
|
|
5972
|
|
5973 This call in particular does the following: we go over all blocks
|
|
5974 starting with the current one and moving towards the oldest.
|
|
5975 For each block, we look at every object in it. If the object is already
|
|
5976 freed (checked with @code{FREE_STRUCT_P} using the first pointer of the
|
442
|
5977 object), or if it is
|
428
|
5978 set to read only (@code{C_READONLY_RECORD_HEADER_P}), nothing must be
|
|
5979 done. If it is unmarked (checked with @code{MARKED_RECORD_HEADER_P}), it
|
|
5980 is put in the free list and set free (using the macro
|
442
|
5981 @code{FREE_FIXED_TYPE}); otherwise it stays in the block, but is unmarked
|
428
|
5982 (by @code{UNMARK_...}). While going through one block, we note if the
|
|
5983 whole block is empty. If so, the whole block is freed (using
|
|
5984 @code{xfree}) and the free list state is set to the state it had before
|
|
5985 handling this block.
|
|
5986
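The block-wise sweep can be sketched with a toy fixed-size type. All names here (@code{toy_cell}, @code{toy_block}, @code{toy_add_block}, @code{toy_sweep}) are hypothetical and the real @code{SWEEP_FIXED_TYPE_BLOCK} macro is considerably more elaborate. Two simplifications are assumed: every cell of a new block starts out in use, and the sketch rebuilds the free list from scratch during the sweep, so that dropping an empty block only requires restoring the list head saved before that block was visited.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CELLS_PER_BLOCK 4

struct toy_cell {
  int in_use;
  int marked;
  struct toy_cell *next_free;   /* link used while on the free list */
};

struct toy_block {
  struct toy_block *prev;       /* next older block */
  struct toy_cell cells[CELLS_PER_BLOCK];
};

struct toy_block *current_block;  /* newest block */
struct toy_cell *free_list;

/* Add a new block whose cells are all considered in use. */
struct toy_block *toy_add_block(void)
{
  struct toy_block *b = calloc(1, sizeof *b);
  int i;
  assert(b != NULL);
  for (i = 0; i < CELLS_PER_BLOCK; i++)
    b->cells[i].in_use = 1;
  b->prev = current_block;
  current_block = b;
  return b;
}

/* Sweep all blocks, newest first.  Returns the number of blocks that
   turned out to be completely empty and were released. */
int toy_sweep(void)
{
  struct toy_block **link = &current_block;
  struct toy_block *b;
  int released = 0;

  free_list = NULL;
  while ((b = *link) != NULL) {
    struct toy_cell *saved_free_list = free_list;
    int live = 0, i;

    for (i = 0; i < CELLS_PER_BLOCK; i++) {
      struct toy_cell *c = &b->cells[i];
      if (c->in_use && c->marked) {
        c->marked = 0;            /* survivor: unmark for the next GC */
        live++;
      } else {
        c->in_use = 0;            /* garbage, or free already */
        c->next_free = free_list;
        free_list = c;
      }
    }

    if (live == 0) {
      *link = b->prev;            /* unlink the empty block ... */
      free_list = saved_free_list; /* ... forget its cells ... */
      free(b);                    /* ... and release it */
      released++;
    } else
      link = &b->prev;
  }
  return released;
}
```

Restoring @code{saved_free_list} is what the text calls setting the free list ``to the state it had before handling this block''. Without it the free list would keep pointing into released memory.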
|
462
|
5987 @node sweep_lcrecords_1
|
428
|
5988 @subsection @code{sweep_lcrecords_1}
|
|
5989 @cindex @code{sweep_lcrecords_1}
|
|
5990
|
|
5991 After nullifying the complete lcrecord statistics, we go over all
|
442
|
5992 lcrecords two separate times. They are all chained together in a list with
|
|
5993 a head called @code{all_lcrecords}.
|
|
5994
|
|
5995 The first loop calls for each object its @code{finalizer} method, but only
|
428
|
5996 in the case that it is not read only
|
|
5997 (@code{C_READONLY_RECORD_HEADER_P}), it is not marked
|
|
5998 (@code{MARKED_RECORD_HEADER_P}), it is not already in a free list (list of
|
|
5999 freed objects, field @code{free}) and finally it owns a finalizer
|
|
6000 method.
|
442
|
6001
|
|
6002 The second loop actually frees the appropriate objects again by iterating
|
|
6003 through the whole list. In case an object is read only or marked, it
|
428
|
6004 has to persist, otherwise it is manually freed by calling
|
|
6005 @code{xfree}. During this loop, the lcrecord statistics are kept up to
|
442
|
6006 date by calling @code{tick_lcrecord_stats} with the right arguments.
|
|
6007
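The two loops can be sketched as follows. The names @code{toy_lcrecord}, @code{toy_alloc_lcrecord}, @code{toy_sweep_lcrecords}, @code{toy_count_finalizer} and @code{finalizer_calls} are hypothetical, and the statistics bookkeeping is left out. Unmarking the survivors in the second loop is an assumption of the sketch, made by analogy with the other sweep functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_lcrecord {
  struct toy_lcrecord *next;
  int marked;
  int readonly;
  int freed;                    /* already sitting in a free list */
  void (*finalizer) (struct toy_lcrecord *);
};

struct toy_lcrecord *all_lcrecords;   /* head of the chain */
int finalizer_calls;                  /* hypothetical: for the demo only */

void toy_count_finalizer(struct toy_lcrecord *r)
{
  (void) r;
  finalizer_calls++;
}

/* Allocate a record separately and chain it into all_lcrecords. */
struct toy_lcrecord *toy_alloc_lcrecord(void (*fin) (struct toy_lcrecord *))
{
  struct toy_lcrecord *r = calloc(1, sizeof *r);
  assert(r != NULL);
  r->finalizer = fin;
  r->next = all_lcrecords;
  all_lcrecords = r;
  return r;
}

/* Returns the number of records that survive the sweep. */
int toy_sweep_lcrecords(void)
{
  struct toy_lcrecord *r, **link;
  int survivors = 0;

  /* First loop: finalize everything that is about to go away.
     No record is freed yet. */
  for (r = all_lcrecords; r != NULL; r = r->next)
    if (!r->readonly && !r->marked && !r->freed && r->finalizer != NULL)
      r->finalizer(r);

  /* Second loop: keep read-only and marked records, free the rest. */
  link = &all_lcrecords;
  while ((r = *link) != NULL) {
    if (r->readonly || r->marked) {
      r->marked = 0;            /* assumption: unmark for the next GC */
      survivors++;
      link = &r->next;
    } else {
      *link = r->next;
      free(r);
    }
  }
  return survivors;
}
```

One plausible benefit of separating the loops, visible in the sketch, is that a finalizer never runs after some other dying record has already been freed.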
|
462
|
6008 @node compact_string_chars
|
428
|
6009 @subsection @code{compact_string_chars}
|
|
6010 @cindex @code{compact_string_chars}
|
|
6011
|
|
6012 The purpose of this function is to compact all the data parts of the
|
|
6013 strings that are held in so-called @code{string_chars_block}, i.e. the
|
|
6014 strings that do not exceed a certain maximal length.
|
|
6015
|
|
6016 The procedure with which this is done is as follows. We are keeping two
|
|
6017 positions in the @code{string_chars_block}s using two pointer/integer
|
|
6018 pairs, namely @code{from_sb}/@code{from_pos} and
|
|
6019 @code{to_sb}/@code{to_pos}. They stand for the current positions from
|
442
|
6020 which and to which the string being handled is copied.
|
428
|
6021
|
|
6022 While going over all chained @code{string_chars_block}s and their held
|
|
6023 strings, starting at @code{first_string_chars_block}, both pointers
|
|
6024 are advanced and possibly a string is copied from @code{from_sb} to
|
|
6025 @code{to_sb}, depending on the status of the pointed at strings.
|
|
6026
|
|
6027 More precisely, we can distinguish between the following actions.
|
|
6028 @itemize @bullet
|
|
6029 @item
|
|
6030 The string at @code{from_sb}'s position could be marked as free, which
|
442
|
6031 is indicated by an invalid value in the pointer that should point back
|
428
|
6032 to the fixed size string object, and which is checked by
|
|
6033 @code{FREE_STRUCT_P}. In this case, the @code{from_sb}/@code{from_pos}
|
|
6034 is advanced to the next string, and nothing has to be copied.
|
|
6035 @item
|
|
6036 Also, if a string object itself is unmarked, nothing has to be
|
|
6037 copied. We likewise advance the @code{from_sb}/@code{from_pos}
|
|
6038 pair as described above.
|
|
6039 @item
|
442
|
6040 In all other cases, we have a marked string at hand. The string data
|
428
|
6041 must be moved from the from-position to the to-position. In case
|
|
6042 there is not enough space in the current @code{to_sb}-block, we advance
|
|
6043 this pointer to the beginning of the next block before copying. In case the
|
|
6044 from and to positions are different, we perform the
|
|
6045 actual copying using the library function @code{memmove}.
|
|
6046 @end itemize
|
|
6047
|
|
6048 After compacting, the pointer to the current
|
|
6049 @code{string_chars_block}, sitting in @code{current_string_chars_block},
|
|
6050 is reset to the last block to which we moved a string,
|
|
6051 i.e. @code{to_sb}, and all remaining blocks (we know that they just
|
|
6052 carry garbage) are explicitly @code{xfree}d.
|
|
6053
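The from/to scheme can be sketched for the simplified case of a single block, so that @code{from_sb}/@code{from_pos} and @code{to_sb}/@code{to_pos} collapse into two offsets. The layout (a small header holding a back pointer and a length, followed by the characters) and all names beginning with @code{toy_} are assumptions made for the illustration, as is the convention that a null back pointer denotes a free entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BLOCK_SIZE 256

/* Fixed-size part of a string. */
struct toy_string { int marked; size_t len; char *data; };

/* Header stored in front of the characters.  A null owner stands for a
   free entry in this sketch. */
struct toy_header { struct toy_string *owner; size_t len; };

unsigned char toy_chars[TOY_BLOCK_SIZE];   /* the single chars block */
size_t toy_chars_used;

/* Store TEXT in the block and attach it to S. */
void toy_alloc_chars(struct toy_string *s, const char *text)
{
  struct toy_header h;
  h.owner = s;
  h.len = strlen(text);
  assert(toy_chars_used + sizeof h + h.len <= TOY_BLOCK_SIZE);
  memcpy(toy_chars + toy_chars_used, &h, sizeof h);
  s->len = h.len;
  s->data = (char *) toy_chars + toy_chars_used + sizeof h;
  memcpy(s->data, text, h.len);
  toy_chars_used += sizeof h + h.len;
}

/* Slide the data of all marked strings towards the start of the block
   and fix up the data pointers of their owners. */
void toy_compact_string_chars(void)
{
  size_t from_pos = 0, to_pos = 0;

  while (from_pos < toy_chars_used) {
    struct toy_header h;
    size_t size;

    memcpy(&h, toy_chars + from_pos, sizeof h);
    size = sizeof h + h.len;

    if (h.owner != NULL && h.owner->marked) {
      if (from_pos != to_pos)     /* source and target may overlap */
        memmove(toy_chars + to_pos, toy_chars + from_pos, size);
      h.owner->data = (char *) toy_chars + to_pos + sizeof h;
      to_pos += size;
    }
    from_pos += size;             /* free or unmarked: just skip it */
  }
  toy_chars_used = to_pos;
}
```

The update of @code{data} in the owner is the relocation that makes it unsafe to hold on to a pointer obtained with @code{XSTRING_DATA()} across anything that might trigger a garbage collection.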
|
462
|
6054 @node sweep_strings
|
428
|
6055 @subsection @code{sweep_strings}
|
|
6056 @cindex @code{sweep_strings}
|
|
6057
|
|
6058 The sweeping for the fixed sized string objects is essentially exactly
|
|
6059 the same as it is for all other fixed size types. As before, the freeing
|
|
6060 into the suitable free list is done by using the macro
|
|
6061 @code{SWEEP_FIXED_TYPE_BLOCK} after defining the right macros
|
|
6062 @code{UNMARK_string} and @code{ADDITIONAL_FREE_string}. These two
|
|
6063 definitions are a little bit special compared to the ones used
|
|
6064 for the other fixed size types.
|
|
6065
|
442
|
@code{UNMARK_string} is defined the same way as for the other types,
except for some additional code that updates the bookkeeping
information.
|
|
6068
|
|
For strings, @code{ADDITIONAL_FREE_string} has to do something in
addition: if the string was not allocated in a
@code{string_chars_block} because it exceeded the maximal length, and
was therefore @code{malloc}ed separately, we now also @code{xfree}
it explicitly.
|
|
6074
|
462
|
6075 @node sweep_bit_vectors_1
|
428
|
6076 @subsection @code{sweep_bit_vectors_1}
|
|
6077 @cindex @code{sweep_bit_vectors_1}
|
|
6078
|
|
Bit vectors are also one of the rare types that are @code{malloc}ed
individually. Consequently, while sweeping, all bit vectors that are no
longer needed must be freed by hand. This is done in the expected way:
since they are all registered in a list called
@code{all_bit_vectors}, all elements of that list are traversed,
all unmarked bit vectors are unlinked and released by calling
@code{xfree}, and all remaining ones become unmarked.
In addition, the bookkeeping information used for the garbage
collector's output is updated.
|
|
6088
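The traversal described above can be sketched as follows. This is only
an illustration under stated assumptions: @code{struct toy_bit_vector},
@code{toy_push()} and @code{toy_sweep()} are hypothetical names, plain
@code{free()} stands in for @code{xfree}, and the real code in
@file{alloc.c} also updates the bookkeeping counters.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a bit vector header; not the real struct. */
struct toy_bit_vector
{
  struct toy_bit_vector *next;  /* chain through the "all objects" list */
  int marked;                   /* set by the mark phase */
};

/* Push a new node onto the list and return the new head. */
struct toy_bit_vector *
toy_push (struct toy_bit_vector *head, int marked)
{
  struct toy_bit_vector *v = malloc (sizeof *v);
  if (!v)
    abort ();
  v->next = head;
  v->marked = marked;
  return v;
}

/* Sweep the list headed by *head: unlink and free every unmarked node,
   clear the mark on every survivor.  Returns the number of survivors. */
int
toy_sweep (struct toy_bit_vector **head)
{
  struct toy_bit_vector **prev = head;
  int live = 0;

  while (*prev)
    {
      struct toy_bit_vector *v = *prev;
      if (v->marked)
        {
          v->marked = 0;        /* survivor: unmark for the next GC */
          live++;
          prev = &v->next;
        }
      else
        {
          *prev = v->next;      /* garbage: unlink, then release */
          free (v);
        }
    }
  return live;
}
```

Keeping a pointer to the previous link, rather than to the previous
node, lets the head of the list be unlinked without a special case.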
|
462
|
6089 @node Integers and Characters
|
428
|
6090 @section Integers and Characters
|
462
|
6091 @cindex integers and characters
|
|
6092 @cindex characters, integers and
|
428
|
6093
|
|
6094 Integer and character Lisp objects are created from integers using the
|
|
6095 macros @code{XSETINT()} and @code{XSETCHAR()} or the equivalent
|
|
6096 functions @code{make_int()} and @code{make_char()}. (These are actually
|
|
6097 macros on most systems.) These functions basically just do some moving
|
|
6098 of bits around, since the integral value of the object is stored
|
|
6099 directly in the @code{Lisp_Object}.
|
|
6100
|
|
6101 @code{XSETINT()} and the like will truncate values given to them that
|
|
6102 are too big; i.e. you won't get the value you expected but the tag bits
|
|
6103 will at least be correct.
|
|
6104
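The following sketch illustrates the idea of storing an integer
directly in a tagged word, including the truncation just mentioned.
The one-bit tag layout and the names @code{toy_make_int()},
@code{toy_int_value()} and @code{toy_intp()} are assumptions made for
this example; they are not the actual @code{Lisp_Object}
representation.

```c
#include <stdint.h>

/* Hypothetical tagged word: low bit 1 means "integer". */
typedef intptr_t toy_object;

/* Shift the value left by one and set the tag bit.  The top bit of
   VALUE is lost, which is the truncation described in the text.  The
   conversion back to a signed type is implementation-defined in ISO C,
   but wraps on common two's-complement platforms. */
toy_object
toy_make_int (intptr_t value)
{
  return (toy_object) (((uintptr_t) value << 1) | 1u);
}

/* Recover the integer: a tagged word is 2*value + 1. */
intptr_t
toy_int_value (toy_object obj)
{
  return (obj - 1) / 2;
}

/* Type predicate: check only the tag bit. */
int
toy_intp (toy_object obj)
{
  return (obj & 1) != 0;
}
```

A value that needs the full word width comes back changed, yet
@code{toy_intp()} still reports an integer, because only the tag bit is
inspected.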
|
462
|
6105 @node Allocation from Frob Blocks
|
428
|
6106 @section Allocation from Frob Blocks
|
462
|
6107 @cindex allocation from frob blocks
|
|
6108 @cindex frob blocks, allocation from
|
428
|
6109
|
|
6110 The uninitialized memory required by a @code{Lisp_Object} of a particular type
|
|
6111 is allocated using
|
|
6112 @code{ALLOCATE_FIXED_TYPE()}. This only occurs inside of the
|
|
6113 lowest-level object-creating functions in @file{alloc.c}:
|
|
6114 @code{Fcons()}, @code{make_float()}, @code{Fmake_byte_code()},
|
|
6115 @code{Fmake_symbol()}, @code{allocate_extent()},
|
|
6116 @code{allocate_event()}, @code{Fmake_marker()}, and
|
|
6117 @code{make_uninit_string()}. The idea is that, for each type, there are
|
|
6118 a number of frob blocks (each 2K in size); each frob block is divided up
|
|
6119 into object-sized chunks. Each frob block will have some of these
|
|
6120 chunks that are currently assigned to objects, and perhaps some that are
|
|
6121 free. (If a frob block has nothing but free chunks, it is freed at the
|
|
6122 end of the garbage collection cycle.) The free chunks are stored in a
|
|
6123 free list, which is chained by storing a pointer in the first four bytes
|
|
6124 of the chunk. (Except for the free chunks at the end of the last frob
|
|
6125 block, which are handled using an index which points past the end of the
|
|
6126 last-allocated chunk in the last frob block.)
|
|
6127 @code{ALLOCATE_FIXED_TYPE()} first tries to retrieve a chunk from the
|
|
6128 free list; if that fails, it calls
|
|
6129 @code{ALLOCATE_FIXED_TYPE_FROM_BLOCK()}, which looks at the end of the
|
|
6130 last frob block for space, and creates a new frob block if there is
|
|
6131 none. (There are actually two versions of these macros, one of which is
|
|
6132 more defensive but less efficient and is used for error-checking.)
|
|
6133
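A minimal sketch of this allocation strategy, assuming a single object
type, might look like the code below. All names (@code{toy_allocate()},
@code{toy_release()}, @code{union toy_chunk}, and so on) are
hypothetical; the real macros are type-parameterized and also handle
freeing empty blocks and error checking.

```c
#include <stdlib.h>
#include <stddef.h>

#define TOY_BLOCK_BYTES 2048

/* A chunk either holds an object or, while free, a free-list link
   stored at its beginning. */
union toy_chunk
{
  union toy_chunk *next_free;
  struct { long car, cdr; } obj;
};

#define TOY_CHUNKS_PER_BLOCK \
  ((TOY_BLOCK_BYTES - sizeof (void *)) / sizeof (union toy_chunk))

struct toy_block
{
  struct toy_block *prev;
  union toy_chunk chunks[TOY_CHUNKS_PER_BLOCK];
};

static struct toy_block *toy_current_block;
/* Index past the last-allocated chunk of the current block; starting
   "full" forces a block to be created on first use. */
static size_t toy_next_index = TOY_CHUNKS_PER_BLOCK;
static union toy_chunk *toy_free_list;

void *
toy_allocate (void)
{
  if (toy_free_list)            /* 1. try the free list */
    {
      union toy_chunk *c = toy_free_list;
      toy_free_list = c->next_free;
      return c;
    }
  if (toy_next_index == TOY_CHUNKS_PER_BLOCK)  /* 2. last block full */
    {
      struct toy_block *b = malloc (sizeof *b);
      if (!b)
        abort ();
      b->prev = toy_current_block;
      toy_current_block = b;
      toy_next_index = 0;
    }
  return &toy_current_block->chunks[toy_next_index++];
}

/* Put a chunk back on the free list. */
void
toy_release (void *p)
{
  union toy_chunk *c = p;
  c->next_free = toy_free_list;
  toy_free_list = c;
}
```

Allocation and release are both constant-time, which is the main
attraction of the scheme.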
|
462
|
6134 @node lrecords
|
428
|
6135 @section lrecords
|
462
|
6136 @cindex lrecords
|
428
|
6137
|
|
6138 [see @file{lrecord.h}]
|
|
6139
|
|
6140 All lrecords have at the beginning of their structure a @code{struct
|
442
|
6141 lrecord_header}. This just contains a type number and some flags,
|
|
6142 including the mark bit. All builtin type numbers are defined as
|
|
6143 constants in @code{enum lrecord_type}, to allow the compiler to generate
|
|
more efficient code for @code{@var{type}P}. The type number, through the
@code{lrecord_implementations_table}, gives access to a @code{struct
|
428
|
6146 lrecord_implementation}, which is a structure containing method pointers
|
|
6147 and such. There is one of these for each type, and it is a global,
|
|
6148 constant, statically-declared structure that is declared in the
|
442
|
6149 @code{DEFINE_LRECORD_IMPLEMENTATION()} macro.
|
|
6150
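The relationship between the header, the type number and the
implementation structure can be illustrated roughly as follows. The
field layout and every name beginning with @code{toy_} are assumptions
for this sketch; see @file{lrecord.h} for the real definitions.

```c
#include <stddef.h>

/* Hypothetical type numbers, standing in for enum lrecord_type. */
enum toy_type { toy_type_float, toy_type_marker, toy_type_count };

/* Every object starts with this header. */
struct toy_header
{
  unsigned int type : 8;   /* index into the implementation table */
  unsigned int mark : 1;   /* mark bit used by the collector */
};

/* Per-type information; the real structure holds method pointers. */
struct toy_implementation
{
  const char *name;
  size_t static_size;
};

struct toy_float  { struct toy_header header; double value; };
struct toy_marker { struct toy_header header; long position; };

const struct toy_implementation toy_float_impl =
  { "float", sizeof (struct toy_float) };
const struct toy_implementation toy_marker_impl =
  { "marker", sizeof (struct toy_marker) };

/* One global, constant entry per type, indexed by type number. */
const struct toy_implementation *const toy_table[toy_type_count] =
  { &toy_float_impl, &toy_marker_impl };

/* Any object can be viewed through its leading header. */
const struct toy_implementation *
toy_implementation_of (const void *obj)
{
  const struct toy_header *h = obj;
  return toy_table[h->type];
}
```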
|
|
6151 Simple lrecords (of type (b) above) just have a @code{struct
|
428
|
6152 lrecord_header} at their beginning. lcrecords, however, actually have a
|
|
6153 @code{struct lcrecord_header}. This, in turn, has a @code{struct
|
|
6154 lrecord_header} at its beginning, so sanity is preserved; but it also
|
|
6155 has a pointer used to chain all lcrecords together, and a special ID
|
|
6156 field used to distinguish one lcrecord from another. (This field is used
|
|
6157 only for debugging and could be removed, but the space gain is not
|
|
6158 significant.)
|
|
6159
|
|
6160 Simple lrecords are created using @code{ALLOCATE_FIXED_TYPE()}, just
|
|
6161 like for other frob blocks. The only change is that the implementation
|
|
6162 pointer must be initialized correctly. (The implementation structure for
|
|
6163 an lrecord, or rather the pointer to it, is named @code{lrecord_float},
|
|
6164 @code{lrecord_extent}, @code{lrecord_buffer}, etc.)
|
|
6165
|
|
6166 lcrecords are created using @code{alloc_lcrecord()}. This takes a
|
|
6167 size to allocate and an implementation pointer. (The size needs to be
|
|
6168 passed because some lcrecords, such as window configurations, are of
|
|
6169 variable size.) This basically just @code{malloc()}s the storage,
|
|
6170 initializes the @code{struct lcrecord_header}, and chains the lcrecord
|
|
6171 onto the head of the list of all lcrecords, which is stored in the
|
|
6172 variable @code{all_lcrecords}. The calls to @code{alloc_lcrecord()}
|
|
6173 generally occur in the lowest-level allocation function for each lrecord
|
|
6174 type.
|
|
6175
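As a rough sketch of what such an allocation function does, consider
the code below. The structures and the names
@code{toy_alloc_lcrecord()} and @code{toy_all_lcrecords} are
hypothetical, and a plain type number is passed where the real function
takes an implementation pointer.

```c
#include <stdlib.h>
#include <stddef.h>

struct toy_lheader
{
  unsigned int type;
  unsigned int mark;
};

/* The lcrecord header embeds the plain header as its first member. */
struct toy_lcheader
{
  struct toy_lheader lheader;
  struct toy_lcheader *next;   /* chain of all lcrecords */
  unsigned int uid;            /* debugging aid */
};

struct toy_lcheader *toy_all_lcrecords;
static unsigned int toy_uid_counter;

/* Allocate SIZE bytes (at least a header), initialize the header and
   push the record onto the head of the global list. */
void *
toy_alloc_lcrecord (size_t size, unsigned int type)
{
  struct toy_lcheader *h;

  if (size < sizeof *h)
    abort ();
  h = malloc (size);
  if (!h)
    abort ();
  h->lheader.type = type;
  h->lheader.mark = 0;
  h->uid = toy_uid_counter++;
  h->next = toy_all_lcrecords;
  toy_all_lcrecords = h;
  return h;
}
```

Because every record is on one list, the sweep stage can find all
lcrecords without knowing their individual types in advance.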
|
|
6176 Whenever you create an lrecord, you need to call either
|
|
6177 @code{DEFINE_LRECORD_IMPLEMENTATION()} or
|
|
6178 @code{DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION()}. This needs to be
|
442
|
6179 specified in a @file{.c} file, at the top level. What this actually
|
|
6180 does is define and initialize the implementation structure for the
|
|
6181 lrecord. (And possibly declares a function @code{error_check_foo()} that
|
|
6182 implements the @code{XFOO()} macro when error-checking is enabled.) The
|
|
6183 arguments to the macros are the actual type name (this is used to
|
|
6184 construct the C variable name of the lrecord implementation structure
|
|
6185 and related structures using the @samp{##} macro concatenation
|
|
6186 operator), a string that names the type on the Lisp level (this may not
|
|
6187 be the same as the C type name; typically, the C type name has
|
|
6188 underscores, while the Lisp string has dashes), various method pointers,
|
|
6189 and the name of the C structure that contains the object. The methods
|
|
6190 are used to encapsulate type-specific information about the object, such
|
|
6191 as how to print it or mark it for garbage collection, so that it's easy
|
|
6192 to add new object types without having to add a specific case for each
|
|
6193 new type in a bunch of different places.
|
428
|
6194
|
|
6195 The difference between @code{DEFINE_LRECORD_IMPLEMENTATION()} and
|
|
6196 @code{DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION()} is that the former is
|
|
6197 used for fixed-size object types and the latter is for variable-size
|
|
6198 object types. Most object types are fixed-size; some complex
|
|
6199 types, however (e.g. window configurations), are variable-size.
|
|
6200 Variable-size object types have an extra method, which is called
|
|
6201 to determine the actual size of a particular object of that type.
|
|
6202 (Currently this is only used for keeping allocation statistics.)
|
|
6203
|
|
6204 For the purpose of keeping allocation statistics, the allocation
|
|
6205 engine keeps a list of all the different types that exist. Note that,
|
|
6206 since @code{DEFINE_LRECORD_IMPLEMENTATION()} is a macro that is
|
442
|
6207 specified at top-level, there is no way for it to initialize the global
|
|
6208 data structures containing type information, like
|
|
@code{lrecord_implementations_table}. For this reason a call to
@code{INIT_LRECORD_IMPLEMENTATION} must be added to the same source file
that contains @code{DEFINE_LRECORD_IMPLEMENTATION}, not at top level
but inside one of the init functions, typically
@code{syms_of_@var{foo}()}. @code{INIT_LRECORD_IMPLEMENTATION} must be
called before an object of this type is used.
|
|
6215
|
|
6216 The type number is also used to index into an array holding the number
|
|
6217 of objects of each type and the total memory allocated for objects of
|
|
6218 that type. The statistics in this array are computed during the sweep
|
|
6219 stage. These statistics are returned by the call to
|
|
6220 @code{garbage-collect}.
|
428
|
6221
|
|
6222 Note that for every type defined with a @code{DEFINE_LRECORD_*()}
|
|
6223 macro, there needs to be a @code{DECLARE_LRECORD_IMPLEMENTATION()}
|
|
6224 somewhere in a @file{.h} file, and this @file{.h} file needs to be
|
|
6225 included by @file{inline.c}.
|
|
6226
|
|
6227 Furthermore, there should generally be a set of @code{XFOOBAR()},
|
|
6228 @code{FOOBARP()}, etc. macros in a @file{.h} (or occasionally @file{.c})
|
|
6229 file. To create one of these, copy an existing model and modify as
|
|
6230 necessary.
|
|
6231
|
442
|
6232 @strong{Please note:} If you define an lrecord in an external
|
|
6233 dynamically-loaded module, you must use @code{DECLARE_EXTERNAL_LRECORD},
|
|
6234 @code{DEFINE_EXTERNAL_LRECORD_IMPLEMENTATION}, and
|
|
6235 @code{DEFINE_EXTERNAL_LRECORD_SEQUENCE_IMPLEMENTATION} instead of the
|
|
6236 non-EXTERNAL forms. These macros will dynamically add new type numbers
|
|
6237 to the global enum that records them, whereas the non-EXTERNAL forms
|
|
6238 assume that the programmer has already inserted the correct type numbers
|
|
6239 into the enum's code at compile-time.
|
|
6240
|
428
|
6241 The various methods in the lrecord implementation structure are:
|
|
6242
|
|
6243 @enumerate
|
|
6244 @item
|
|
6245 @cindex mark method
|
|
6246 A @dfn{mark} method. This is called during the marking stage and passed
|
|
6247 a function pointer (usually the @code{mark_object()} function), which is
|
|
6248 used to mark an object. All Lisp objects that are contained within the
|
|
6249 object need to be marked by applying this function to them. The mark
|
444
|
6250 method should also return a Lisp object, which should be either @code{nil} or
|
428
|
6251 an object to mark. (This can be used in lieu of calling
|
|
6252 @code{mark_object()} on the object, to reduce the recursion depth, and
|
|
6253 consequently should be the most heavily nested sub-object, such as a
|
|
6254 long list.)
|
|
6255
|
|
6256 @strong{Please note:} When the mark method is called, garbage collection
|
|
6257 is in progress, and special precautions need to be taken when accessing
|
|
6258 objects; see section (B) above.
|
|
6259
|
|
6260 If your mark method does not need to do anything, it can be
|
|
6261 @code{NULL}.
|
|
6262
|
|
6263 @item
|
|
6264 A @dfn{print} method. This is called to create a printed representation
|
|
6265 of the object, whenever @code{princ}, @code{prin1}, or the like is
|
|
6266 called. It is passed the object, a stream to which the output is to be
|
|
6267 directed, and an @code{escapeflag} which indicates whether the object's
|
|
6268 printed representation should be @dfn{escaped} so that it is
|
|
6269 readable. (This corresponds to the difference between @code{princ} and
|
|
6270 @code{prin1}.) Basically, @dfn{escaped} means that strings will have
|
|
6271 quotes around them and confusing characters in the strings such as
|
|
6272 quotes, backslashes, and newlines will be backslashed; and that special
|
|
6273 care will be taken to make symbols print in a readable fashion
|
|
6274 (e.g. symbols that look like numbers will be backslashed). Other
|
|
6275 readable objects should perhaps pass @code{escapeflag} on when
|
|
6276 sub-objects are printed, so that readability is preserved when necessary
|
|
6277 (or if not, always pass in a 1 for @code{escapeflag}). Non-readable
|
|
6278 objects should in general ignore @code{escapeflag}, except that some use
|
|
6279 it as an indication that more verbose output should be given.
|
|
6280
|
|
6281 Sub-objects are printed using @code{print_internal()}, which takes
|
|
6282 exactly the same arguments as are passed to the print method.
|
|
6283
|
|
6284 Literal C strings should be printed using @code{write_c_string()},
|
|
6285 or @code{write_string_1()} for non-null-terminated strings.
|
|
6286
|
|
Print methods for objects that do not have a readable representation
should check the @code{print_readably} flag and signal an error if it is set.
|
|
6289
|
|
6290 If you specify NULL for the print method, the
|
|
6291 @code{default_object_printer()} will be used.
|
|
6292
|
|
6293 @item
|
|
6294 A @dfn{finalize} method. This is called at the beginning of the sweep
|
|
6295 stage on lcrecords that are about to be freed, and should be used to
|
|
6296 perform any extra object cleanup. This typically involves freeing any
|
|
6297 extra @code{malloc()}ed memory associated with the object, releasing any
|
|
6298 operating-system and window-system resources associated with the object
|
|
6299 (e.g. pixmaps, fonts), etc.
|
|
6300
|
|
6301 The finalize method can be NULL if nothing needs to be done.
|
|
6302
|
|
WARNING #1: The finalize method is also called at the end of the dump
phase, this time with the @code{for_disksave} parameter set to non-zero. The
|
|
6305 object is @emph{not} about to disappear, so you have to make sure to
|
|
6306 @emph{not} free any extra @code{malloc()}ed memory if you're going to
|
|
6307 need it later. (Also, signal an error if there are any operating-system
|
|
6308 and window-system resources here, because they can't be dumped.)
|
|
6309
|
|
6310 Finalize methods should, as a rule, set to zero any pointers after
|
|
6311 they've been freed, and check to make sure pointers are not zero before
|
|
6312 freeing. Although I'm pretty sure that finalize methods are not called
|
|
6313 twice on the same object (except for the @code{for_disksave} proviso),
|
|
6314 we've gotten nastily burned in some cases by not doing this.
|
|
6315
|
|
WARNING #2: The finalize method is @emph{only} called for
lcrecords, @emph{not} for simple lrecords. If you need a
|
|
6318 finalize method for simple lrecords, you have to stick
|
|
6319 it in the @code{ADDITIONAL_FREE_foo()} macro in @file{alloc.c}.
|
|
6320
|
|
6321 WARNING #3: Things are in an @emph{extremely} bizarre state
|
|
6322 when @code{ADDITIONAL_FREE_foo()} is called, so you have to
|
|
6323 be incredibly careful when writing one of these functions.
|
|
6324 See the comment in @code{gc_sweep()}. If you ever have to add
|
|
6325 one of these, consider using an lcrecord or dealing with
|
|
6326 the problem in a different fashion.
|
|
6327
|
|
6328 @item
|
|
6329 An @dfn{equal} method. This compares the two objects for similarity,
|
|
6330 when @code{equal} is called. It should compare the contents of the
|
|
6331 objects in some reasonable fashion. It is passed the two objects and a
|
|
6332 @dfn{depth} value, which is used to catch circular objects. To compare
|
|
6333 sub-Lisp-objects, call @code{internal_equal()} and bump the depth value
|
|
6334 by one. If this value gets too high, a @code{circular-object} error
|
|
6335 will be signaled.
|
|
6336
|
|
6337 If this is NULL, objects are @code{equal} only when they are @code{eq},
|
|
6338 i.e. identical.
|
|
6339
|
|
6340 @item
|
|
6341 A @dfn{hash} method. This is used to hash objects when they are to be
|
|
6342 compared with @code{equal}. The rule here is that if two objects are
|
|
6343 @code{equal}, they @emph{must} hash to the same value; i.e. your hash
|
|
6344 function should use some subset of the sub-fields of the object that are
|
|
6345 compared in the ``equal'' method. If you specify this method as
|
|
6346 @code{NULL}, the object's pointer will be used as the hash, which will
|
|
6347 @emph{fail} if the object has an @code{equal} method, so don't do this.
|
|
6348
|
|
6349 To hash a sub-Lisp-object, call @code{internal_hash()}. Bump the
|
|
6350 depth by one, just like in the ``equal'' method.
|
|
6351
|
|
6352 To convert a Lisp object directly into a hash value (using
|
|
6353 its pointer), use @code{LISP_HASH()}. This is what happens when
|
|
6354 the hash method is NULL.
|
|
6355
|
|
6356 To hash two or more values together into a single value, use
|
|
6357 @code{HASH2()}, @code{HASH3()}, @code{HASH4()}, etc.
|
|
6358
|
|
6359 @item
|
|
6360 @dfn{getprop}, @dfn{putprop}, @dfn{remprop}, and @dfn{plist} methods.
|
|
6361 These are used for object types that have properties. I don't feel like
|
|
6362 documenting them here. If you create one of these objects, you have to
|
|
6363 use different macros to define them,
|
|
6364 i.e. @code{DEFINE_LRECORD_IMPLEMENTATION_WITH_PROPS()} or
|
|
6365 @code{DEFINE_LRECORD_SEQUENCE_IMPLEMENTATION_WITH_PROPS()}.
|
|
6366
|
|
6367 @item
|
|
A @dfn{size_in_bytes} method, when the object is of variable size
(i.e. declared with a @code{_SEQUENCE_IMPLEMENTATION} macro). This should
|
|
6370 simply return the object's size in bytes, exactly as you might expect.
|
|
6371 For an example, see the methods for window configurations and opaques.
|
|
6372 @end enumerate
|
|
6373
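To illustrate the rule that the hash method must agree with the equal
method, here is a small sketch for a made-up object type. The type
@code{struct toy_range}, the combiner @code{toy_hash2()} and the method
names are hypothetical; a real method would take @code{Lisp_Object}
arguments and use @code{internal_equal()}, @code{internal_hash()} and
the @code{HASH2()} family.

```c
/* A made-up object: two compared fields and one that is ignored. */
struct toy_range
{
  long start;
  long end;
  const char *label;   /* cosmetic only; not part of equality */
};

/* Hypothetical combiner standing in for HASH2(). */
unsigned long
toy_hash2 (unsigned long a, unsigned long b)
{
  return a * 31UL + b;
}

/* Equal method: compares start and end only.  DEPTH would be passed
   on, incremented, when comparing sub-objects; this type has none. */
int
toy_range_equal (const struct toy_range *a, const struct toy_range *b,
                 int depth)
{
  (void) depth;
  return a->start == b->start && a->end == b->end;
}

/* Hash method: uses a subset of the fields the equal method compares,
   so that equal objects always hash to the same value. */
unsigned long
toy_range_hash (const struct toy_range *r, int depth)
{
  (void) depth;
  return toy_hash2 ((unsigned long) r->start, (unsigned long) r->end);
}
```

Had the hash included @code{label}, two objects that compare equal
could land in different hash buckets, which is exactly the failure the
rule is meant to prevent.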
|
462
|
6374 @node Low-level allocation
|
428
|
6375 @section Low-level allocation
|
462
|
6376 @cindex low-level allocation
|
|
6377 @cindex allocation, low-level
|
428
|
6378
|
|
6379 Memory that you want to allocate directly should be allocated using
|
|
6380 @code{xmalloc()} rather than @code{malloc()}. This implements
|
|
6381 error-checking on the return value, and once upon a time did some more
|
|
6382 vital stuff (i.e. @code{BLOCK_INPUT}, which is no longer necessary).
|
|
6383 Free using @code{xfree()}, and realloc using @code{xrealloc()}. Note
|
|
6384 that @code{xmalloc()} will do a non-local exit if the memory can't be
|
|
6385 allocated. (Many functions, however, do not expect this, and thus XEmacs
|
|
6386 will likely crash if this happens. @strong{This is a bug.} If you can,
|
|
6387 you should strive to make your function handle this OK. However, it's
|
|
6388 difficult in the general circumstance, perhaps requiring extra
|
|
6389 unwind-protects and such.)
|
|
6390
|
|
6391 Note that XEmacs provides two separate replacements for the standard
|
|
6392 @code{malloc()} library function. These are called @dfn{old GNU malloc}
|
|
6393 (@file{malloc.c}) and @dfn{new GNU malloc} (@file{gmalloc.c}),
|
|
6394 respectively. New GNU malloc is better in pretty much every way than
|
|
6395 old GNU malloc, and should be used if possible. (It used to be that on
|
|
6396 some systems, the old one worked but the new one didn't. I think this
|
|
6397 was due specifically to a bug in SunOS, which the new one now works
|
|
6398 around; so I don't think the old one ever has to be used any more.) The
|
|
6399 primary difference between both of these mallocs and the standard system
|
|
6400 malloc is that they are much faster, at the expense of increased space.
|
|
6401 The basic idea is that memory is allocated in fixed chunks of powers of
|
|
6402 two. This allows for basically constant malloc time, since the various
|
|
6403 chunks can just be kept on a number of free lists. (The standard system
|
|
6404 malloc typically allocates arbitrary-sized chunks and has to spend some
|
|
6405 time, sometimes a significant amount of time, walking the heap looking
|
|
6406 for a free block to use and cleaning things up.) The new GNU malloc
|
|
6407 improves on things by allocating large objects in chunks of 4096 bytes
|
|
6408 rather than in ever larger powers of two, which results in ever larger
|
|
6409 wastage. There is a slight speed loss here, but it's of doubtful
|
|
6410 significance.
|
|
6411
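The difference in wastage between the two sizing policies can be seen
in a small sketch. The minimum chunk size, the exact threshold at which
page-sized rounding begins, and both function names are assumptions for
this example and are not taken from @file{malloc.c} or
@file{gmalloc.c}.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE 4096

/* "Old" policy: round every request up to a power of two. */
size_t
toy_old_chunk_size (size_t n)
{
  size_t s = 8;                     /* assumed minimum chunk size */
  while (s < n && s <= SIZE_MAX / 2)
    s <<= 1;
  return s;
}

/* "New" policy: powers of two for small requests, whole 4096-byte
   pages for large ones. */
size_t
toy_new_chunk_size (size_t n)
{
  if (n <= TOY_PAGE)
    return toy_old_chunk_size (n);
  return (n + TOY_PAGE - 1) / TOY_PAGE * TOY_PAGE;
}
```

For a 9000-byte request, the first policy reserves 16384 bytes while
the second reserves three pages, i.e. 12288 bytes.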
|
|
6412 NOTE: Apparently there is a third-generation GNU malloc that is
|
|
6413 significantly better than the new GNU malloc, and should probably
|
|
6414 be included in XEmacs.
|
|
6415
|
|
There is also the relocating allocator, @file{ralloc.c}. This actually
moves blocks of memory around so that the @code{sbrk()} pointer can be
shrunk and virtual memory released back to the system. On some systems,
this is a big win. On all systems, it causes a noticeable (and
sometimes huge) speed penalty, so I turn it off by default.
@file{ralloc.c} only works with the new GNU malloc in @file{gmalloc.c}.
There are also two versions of @file{ralloc.c}, one of which uses @code{mmap()}
rather than block copies to move data around. This purports to
|
|
6424 be faster, although that depends on the amount of data that would
|
|
6425 have had to be block copied and the system-call overhead for
|
|
6426 @code{mmap()}. I don't know exactly how this works, except that the
|
|
6427 relocating-allocation routines are pretty much used only for
|
|
6428 the memory allocated for a buffer, which is the biggest consumer
|
|
6429 of space, esp. of space that may get freed later.
|
|
6430
|
|
6431 Note that the GNU mallocs have some ``memory warning'' facilities.
|
|
6432 XEmacs taps into them and issues a warning through the standard
|
|
6433 warning system, when memory gets to 75%, 85%, and 95% full.
|
|
6434 (On some systems, the memory warnings are not functional.)
|
|
6435
|
|
6436 Allocated memory that is going to be used to make a Lisp object
|
442
|
6437 is created using @code{allocate_lisp_storage()}. This just calls
|
|
6438 @code{xmalloc()}. It used to verify that the pointer to the memory can
|
|
6439 fit into a Lisp word, before the current Lisp object representation was
|
|
6440 introduced. @code{allocate_lisp_storage()} is called by
|
|
6441 @code{alloc_lcrecord()}, @code{ALLOCATE_FIXED_TYPE()}, and the vector
|
|
6442 and bit-vector creation routines. These routines also call
|
|
6443 @code{INCREMENT_CONS_COUNTER()} at the appropriate times; this keeps
|
|
6444 statistics on how much memory is allocated, so that garbage-collection
|
|
6445 can be invoked when the threshold is reached.
|
|
6446
|
462
|
6447 @node Cons
|
428
|
6448 @section Cons
|
462
|
6449 @cindex cons
|
428
|
6450
|
|
6451 Conses are allocated in standard frob blocks. The only thing to
|
|
6452 note is that conses can be explicitly freed using @code{free_cons()}
|
|
6453 and associated functions @code{free_list()} and @code{free_alist()}. This
|
|
6454 immediately puts the conses onto the cons free list, and decrements
|
|
6455 the statistics on memory allocation appropriately. This is used
|
|
6456 to good effect by some extremely commonly-used code, to avoid
|
|
6457 generating extra objects and thereby triggering GC sooner.
|
|
6458 However, you have to be @emph{extremely} careful when doing this.
|
|
6459 If you mess this up, you will get BADLY BURNED, and it has happened
|
|
6460 before.
|
|
6461
|
462
|
6462 @node Vector
|
428
|
6463 @section Vector
|
462
|
6464 @cindex vector
|
428
|
6465
|
|
6466 As mentioned above, each vector is @code{malloc()}ed individually, and
|
|
6467 all are threaded through the variable @code{all_vectors}. Vectors are
|
|
6468 marked strangely during garbage collection, by kludging the size field.
|
|
6469 Note that the @code{struct Lisp_Vector} is declared with its
|
|
6470 @code{contents} field being a @emph{stretchy} array of one element. It
|
|
6471 is actually @code{malloc()}ed with the right size, however, and access
|
|
6472 to any element through the @code{contents} array works fine.
|
|
6473
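The allocation pattern can be sketched as below. Note one deliberate
substitution: the sketch uses a C99 flexible array member, which is the
standard-conforming modern form of the one-element ``stretchy'' array
described above. The names @code{struct toy_vector} and
@code{toy_make_vector()} are hypothetical.

```c
#include <stdlib.h>
#include <stddef.h>

struct toy_vector
{
  size_t size;
  struct toy_vector *next;   /* chain of all vectors */
  long contents[];           /* storage follows the header */
};

/* Allocate a vector with room for N elements, each set to INIT. */
struct toy_vector *
toy_make_vector (size_t n, long init)
{
  size_t bytes = offsetof (struct toy_vector, contents) + n * sizeof (long);
  struct toy_vector *v = malloc (bytes);
  size_t i;

  if (!v)
    abort ();
  v->size = n;
  v->next = NULL;
  for (i = 0; i < n; i++)
    v->contents[i] = init;
  return v;
}
```

The header and the elements live in one @code{malloc()}ed region, so a
single pointer is enough to reach both.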
|
462
|
6474 @node Bit Vector
|
428
|
6475 @section Bit Vector
|
462
|
6476 @cindex bit vector
|
|
6477 @cindex vector, bit
|
428
|
6478
|
|
6479 Bit vectors work exactly like vectors, except for more complicated
|
|
6480 code to access an individual bit, and except for the fact that bit
|
|
6481 vectors are lrecords while vectors are not. (The only difference here is
|
|
6482 that there's an lrecord implementation pointer at the beginning and the
|
|
6483 tag field in bit vector Lisp words is ``lrecord'' rather than
|
|
6484 ``vector''.)
|
|
6485
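The ``more complicated code to access an individual bit'' amounts to
splitting the bit index into a word index and a bit position. A minimal
sketch, assuming the bits are stored in an array of
@code{unsigned long} and using the hypothetical names
@code{toy_bit_ref()} and @code{toy_bit_set()}:

```c
#include <limits.h>
#include <stddef.h>

#define TOY_BITS_PER_WORD (sizeof (unsigned long) * CHAR_BIT)

/* Return bit I (0 or 1) of the bit array WORDS. */
int
toy_bit_ref (const unsigned long *words, size_t i)
{
  return (int) ((words[i / TOY_BITS_PER_WORD]
                 >> (i % TOY_BITS_PER_WORD)) & 1UL);
}

/* Set bit I of WORDS to 1 if VALUE is non-zero, else to 0. */
void
toy_bit_set (unsigned long *words, size_t i, int value)
{
  unsigned long mask = 1UL << (i % TOY_BITS_PER_WORD);

  if (value)
    words[i / TOY_BITS_PER_WORD] |= mask;
  else
    words[i / TOY_BITS_PER_WORD] &= ~mask;
}
```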
|
462
|
6486 @node Symbol
|
428
|
6487 @section Symbol
|
462
|
6488 @cindex symbol
|
428
|
6489
|
442
|
6490 Symbols are also allocated in frob blocks. Symbols in the awful
|
|
6491 horrible obarray structure are chained through their @code{next} field.
|
428
|
6492
|
|
6493 Remember that @code{intern} looks up a symbol in an obarray, creating
|
|
6494 one if necessary.
|
|
6495
|
462
|
6496 @node Marker
|
428
|
6497 @section Marker
|
462
|
6498 @cindex marker
|
428
|
6499
|
|
6500 Markers are allocated in frob blocks, as usual. They are kept
|
|
6501 in a buffer unordered, but in a doubly-linked list so that they
|
|
6502 can easily be removed. (Formerly this was a singly-linked list,
|
|
6503 but in some cases garbage collection took an extraordinarily
|
|
6504 long time due to the O(N^2) time required to remove lots of
|
|
6505 markers from a buffer.) Markers are removed from a buffer in
|
|
6506 the finalize stage, in @code{ADDITIONAL_FREE_marker()}.
|
|
6507
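The benefit of the doubly-linked list is that a marker can be unlinked
without searching the list for its predecessor. A minimal sketch, with
hypothetical structures and the hypothetical functions
@code{toy_attach()} and @code{toy_detach()}:

```c
#include <stddef.h>

struct toy_marker
{
  struct toy_marker *prev;
  struct toy_marker *next;
};

struct toy_buffer
{
  struct toy_marker *markers;   /* unordered list of markers */
};

/* Add marker M at the head of buffer B's list. */
void
toy_attach (struct toy_buffer *b, struct toy_marker *m)
{
  m->prev = NULL;
  m->next = b->markers;
  if (b->markers)
    b->markers->prev = m;
  b->markers = m;
}

/* Remove marker M from buffer B's list in constant time. */
void
toy_detach (struct toy_buffer *b, struct toy_marker *m)
{
  if (m->next)
    m->next->prev = m->prev;
  if (m->prev)
    m->prev->next = m->next;
  else
    b->markers = m->next;
  m->prev = m->next = NULL;
}
```

With a singly-linked list, each removal would first have to walk the
list to find the predecessor, which is where the O(N^2) behavior for
removing many markers came from.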
|
462
|
6508 @node String
|
428
|
6509 @section String
|
462
|
6510 @cindex string
|
428
|
6511
|
|
6512 As mentioned above, strings are a special case. A string is logically
|
|
6513 two parts, a fixed-size object (containing the length, property list,
|
|
6514 and a pointer to the actual data), and the actual data in the string.
|
|
6515 The fixed-size object is a @code{struct Lisp_String} and is allocated in
|
|
6516 frob blocks, as usual. The actual data is stored in special
|
|
6517 @dfn{string-chars blocks}, which are 8K blocks of memory.
|
|
6518 Currently-allocated strings are simply laid end to end in these
|
|
6519 string-chars blocks, with a pointer back to the @code{struct Lisp_String}
|
|
6520 stored before each string in the string-chars block. When a new string
|
|
6521 needs to be allocated, the remaining space at the end of the last
|
|
6522 string-chars block is used if there's enough, and a new string-chars
|
|
6523 block is created otherwise.
|
|
6524
|
|
6525 There are never any holes in the string-chars blocks due to the string
|
|
6526 compaction and relocation that happens at the end of garbage collection.
|
|
6527 During the sweep stage of garbage collection, when objects are
|
|
6528 reclaimed, the garbage collector goes through all string-chars blocks,
|
|
6529 looking for unused strings. Each chunk of string data is preceded by a
|
|
6530 pointer to the corresponding @code{struct Lisp_String}, which indicates
|
|
6531 both whether the string is used and how big the string is, i.e. how to
|
|
6532 get to the next chunk of string data. Holes are compressed by
|
|
6533 block-copying the next string into the empty space and relocating the
|
|
6534 pointer stored in the corresponding @code{struct Lisp_String}.
|
|
6535 @strong{This means you have to be careful with strings in your code.}
|
|
6536 See the section above on @code{GCPRO}ing.
|
|
6537
|
|
6538 Note that there is one situation not handled: a string that is too big
|
|
6539 to fit into a string-chars block. Such strings, called @dfn{big
|
|
6540 strings}, are all @code{malloc()}ed as their own block. (#### Although it
|
|
6541 would make more sense for the threshold for big strings to be somewhat
|
|
6542 lower, e.g. 1/2 or 1/4 the size of a string-chars block. It seems that
|
440
|
6543 this was indeed the case formerly---indeed, the threshold was set at
|
|
6544 1/8---but Mly forgot about this when rewriting things for 19.8.)
|
428
|
6545
|
|
6546 Note also that the string data in string-chars blocks is padded as
|
|
6547 necessary so that proper alignment constraints on the @code{struct
|
|
6548 Lisp_String} back pointers are maintained.
|
|
6549
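The padding rule can be sketched as a size computation: the space a
string occupies in a string-chars block is the back pointer plus the
data, rounded up to the alignment of the back pointer. The structure
layout, the extra terminating byte and the name @code{toy_fullsize()}
are assumptions made for this sketch.

```c
#include <stddef.h>

struct toy_string;   /* the fixed-size object; opaque here */

/* Assumed layout of one entry in a string-chars block. */
struct toy_string_chars
{
  struct toy_string *string;   /* back pointer */
  unsigned char chars[1];      /* data starts here */
};

#define TOY_ALIGN sizeof (struct toy_string *)

/* Bytes occupied by a string of LEN bytes, including the back pointer,
   an assumed terminating zero byte, and padding to TOY_ALIGN. */
size_t
toy_fullsize (size_t len)
{
  size_t raw = offsetof (struct toy_string_chars, chars) + len + 1;
  return (raw + TOY_ALIGN - 1) / TOY_ALIGN * TOY_ALIGN;
}
```

Since every entry has a size that is a multiple of the alignment, the
back pointer of the following entry is always properly aligned.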
|
|
6550 Finally, strings can be resized. This happens in Mule when a
|
|
6551 character is substituted with a different-length character, or during
|
|
6552 modeline frobbing. (You could also export this to Lisp, but it's not
|
|
6553 done so currently.) Resizing a string is a potentially tricky process.
|
|
6554 If the change is small enough that the padding can absorb it, nothing
|
|
6555 other than a simple memory move needs to be done. Keep in mind,
|
|
6556 however, that the string can't shrink too much because the offset to the
|
|
6557 next string in the string-chars block is computed by looking at the
|
|
6558 length and rounding to the nearest multiple of four or eight. If the
|
|
6559 string would shrink or expand beyond the correct padding, new string
|
|
6560 data needs to be allocated at the end of the last string-chars block and
|
|
6561 the data moved appropriately. This leaves some dead string data, which
|
|
6562 is marked by putting a special marker of 0xFFFFFFFF in the @code{struct
|
|
6563 Lisp_String} pointer before the data (there's no real @code{struct
|
|
6564 Lisp_String} to point to and relocate), and storing the size of the dead
|
|
6565 string data (which would normally be obtained from the now-non-existent
|
|
6566 @code{struct Lisp_String}) at the beginning of the dead string data gap.
|
|
6567 The string compactor recognizes this special 0xFFFFFFFF marker and
|
|
6568 handles it correctly.
|
|
6569
|
462
|
6570 @node Compiled Function
|
428
|
6571 @section Compiled Function
|
462
|
6572 @cindex compiled function
|
|
6573 @cindex function, compiled
|
428
|
6574
|
|
6575 Not yet documented.
|
|
6576
|
442
|
6577
|
|
6578 @node Dumping, Events and the Event Loop, Allocation of Objects in XEmacs Lisp, Top
|
|
6579 @chapter Dumping
|
462
|
6580 @cindex dumping
|
442
|
6581
|
|
6582 @section What is dumping and its justification
|
462
|
6583 @cindex dumping and its justification, what is
|
442
|
6584
|
|
The C code of XEmacs is just a Lisp engine with a lot of built-in
primitives useful for writing an editor. The editor itself is written
mostly in Lisp, and represents around 100K lines of code. Loading and
executing the initialization of all this code takes a bit of time (five
to ten times the usual startup time of current XEmacs) and requires
having all the Lisp source files around. Having to reload them each
time the editor is started would not be acceptable.
|
|
6592
|
|
The traditional solution to this problem is called dumping: the build
process first creates the Lisp engine under the name @file{temacs}, then
runs it until it has finished loading and initializing all the Lisp
code, and finally creates a new executable called @file{xemacs}
that includes both the object code in @file{temacs} and all the contents of
the memory after the initialization.
|
|
6599
|
|
6600 This solution, while working, has a huge problem: the creation of the
|
|
6601 new executable from the actual contents of memory is an extremely
|
|
6602 system-specific process, quite error-prone, and which interferes with a
|
|
6603 lot of system libraries (like malloc). It is even getting worse
|
|
6604 nowadays with libraries using constructors which are automatically
|
|
6605 called when the program is started (even before main()) which tend to
|
|
6606 crash when they are called multiple times, once before dumping and once
|
|
6607 after (IRIX 6.x libz.so pulls in some C++ image libraries through
|
|
6608 dependencies which have this problem). Writing the dumper is also one
|
|
6609 of the most difficult parts of porting XEmacs to a new operating system.
|
|
6610 Basically, `dumping' is an operation that is just not officially
|
|
6611 supported on many operating systems.
|
|
6612
|
|
6613 The aim of the portable dumper is to solve the same problem as the
|
|
6614 system-specific dumper, that is to be able to reload quickly, using only
|
|
6615 a small number of files, the fully initialized lisp part of the editor,
|
|
6616 without any system-specific hacks.
|
|
6617
|
|
6618 @menu
|
|
6619 * Overview::
|
|
6620 * Data descriptions::
|
|
6621 * Dumping phase::
|
|
6622 * Reloading phase::
|
|
6623 * Remaining issues::
|
|
6624 @end menu
|
|
6625
|
462
|
6626 @node Overview
|
442
|
6627 @section Overview
|
462
|
6628 @cindex dumping overview
|
442
|
6629
|
|
6630 The portable dumping system has to:
|
|
6631
|
|
6632 @enumerate
|
|
6633 @item
|
|
6634 At dump time, write all initialized, non-quickly-rebuildable data to a
|
|
6635 file [Note: currently named @file{xemacs.dmp}, but the name will
|
|
6636 change], along with all the information needed for reloading.
|
|
6637
|
|
6638 @item
|
|
6639 When starting xemacs, reload the dump file, relocate it to its new
|
|
6640 starting address if needed, and reinitialize all pointers to this
|
|
6641 data. Also, rebuild all the quickly rebuildable data.
|
|
6642 @end enumerate
|
|
6643
|
462
|
6644 @node Data descriptions
|
442
|
6645 @section Data descriptions
|
462
|
6646 @cindex dumping data descriptions
|
442
|
6647
|
|
6648 The most complex task of the dumper is to be able to write lisp objects
|
|
6649 (lrecords) and C structs to disk and reload them at a different address,
|
|
6650 updating all the pointers they include in the process. This is done by
|
|
6651 using external data descriptions that give information about the layout
|
|
6652 of the structures in memory.
|
|
6653
|
|
6654 The specification of these descriptions is in lrecord.h. A description
|
|
6655 of an lrecord is an array of struct lrecord_description. Each of these
|
|
6656 structs includes a type, an offset in the structure and some optional
|
|
6657 parameters depending on the type. For instance, here is the string
|
|
6658 description:
|
|
6659
|
|
6660 @example
|
|
6661 static const struct lrecord_description string_description[] = @{
|
|
6662   @{ XD_BYTECOUNT,       offsetof (Lisp_String, size) @},
|
|
6663   @{ XD_OPAQUE_DATA_PTR, offsetof (Lisp_String, data), XD_INDIRECT(0, 1) @},
|
|
6664   @{ XD_LISP_OBJECT,     offsetof (Lisp_String, plist) @},
|
|
6665   @{ XD_END @}
|
|
6666 @};
|
|
6667 @end example
|
|
6668
|
|
6669 The first line indicates a member of type Bytecount, which is used by
|
|
6670 the next, indirect directive. The second means "there is a pointer to
|
|
6671 some opaque data in the field @code{data}". The length of said data is
|
|
6672 given by the expression @code{XD_INDIRECT(0, 1)}, which means "the value
|
|
6673 in the 0th line of the description (welcome to C) plus one". The third
|
|
6674 line means "there is a Lisp_Object member @code{plist} in the Lisp_String
|
|
6675 structure". @code{XD_END} then ends the description.
|
|
6676
|
|
6677 This gives us all the information we need to move around what is pointed
|
|
6678 to by a structure (C or lrecord) and, by transitivity, everything that
|
|
6679 it points to. The only missing information for dumping is the size of
|
|
6680 the structure. For lrecords, this is part of the
|
|
6681 lrecord_implementation, so we don't need to duplicate it. For C
|
|
6682 structures we use a struct struct_description, which includes a size
|
|
6683 field and a pointer to an associated array of lrecord_description.
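
To make the idea concrete, here is a minimal sketch of description-driven
pointer discovery.  It is not the code from @file{lrecord.h}: the
@code{field_kind} codes, @code{struct field_desc}, @code{struct node} and
@code{collect_pointers()} are hypothetical names invented for this
illustration, and only plain pointer fields are modeled.

@example
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified field kinds; the real XD_ codes are
   declared in lrecord.h.  */
enum field_kind @{ FK_INT, FK_POINTER, FK_END @};

struct field_desc @{
  enum field_kind kind;
  size_t offset;              /* offset of the field in the struct */
@};

struct node @{
  int value;
  struct node *next;
@};

static const struct field_desc node_description[] = @{
  @{ FK_INT,     offsetof (struct node, value) @},
  @{ FK_POINTER, offsetof (struct node, next) @},
  @{ FK_END,     0 @}
@};

/* Copy every pointer stored in OBJ, as listed by DESC, into OUT
   (at most MAX of them).  Returns the number of pointers copied.  */
static int
collect_pointers (const void *obj, const struct field_desc *desc,
                  void **out, int max)
@{
  int n = 0;
  for (; desc->kind != FK_END; desc++)
    if (desc->kind == FK_POINTER && n < max)
      @{
        memcpy (&out[n], (const char *) obj + desc->offset,
                sizeof (void *));
        n++;
      @}
  return n;
@}
@end example

The walker never needs to know the C type of the object; everything it
uses comes from the description array, which is what lets one generic
routine follow pointers in any described structure.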
|
|
6684
|
462
|
6685 @node Dumping phase
|
442
|
6686 @section Dumping phase
|
462
|
6687 @cindex dumping phase
|
442
|
6688
|
|
6689 Dumping is done by calling the function pdump() (in dumper.c) which is
|
|
6690 invoked from Fdump_emacs (in emacs.c). This function performs a number
|
|
6691 of tasks.
|
|
6692
|
|
6693 @menu
|
|
6694 * Object inventory::
|
|
6695 * Address allocation::
|
|
6696 * The header::
|
|
6697 * Data dumping::
|
|
6698 * Pointers dumping::
|
|
6699 @end menu
|
|
6700
|
462
|
6701 @node Object inventory
|
442
|
6702 @subsection Object inventory
|
462
|
6703 @cindex dumping object inventory
|
442
|
6704
|
|
6705 The first task is to build the list of the objects to dump. This
|
|
6706 includes:
|
|
6707
|
|
6708 @itemize @bullet
|
|
6709 @item lisp objects
|
|
6710 @item C structures
|
|
6711 @end itemize
|
|
6712
|
|
6713 We end up with one @code{pdump_entry_list_elmt} per object group (arrays
|
|
6714 of C structs are kept together) which includes a pointer to the first
|
|
6715 object of the group, the per-object size and the count of objects in the
|
|
6716 group, along with some other information which is initialized later.
|
|
6717
|
|
6718 These entries are linked together in @code{pdump_entry_list} structures
|
|
6719 and can be enumerated through any of:
|
|
6720
|
|
6721 @enumerate
|
|
6722 @item
|
|
6723 the @code{pdump_object_table}, an array of @code{pdump_entry_list}, one
|
|
6724 per lrecord type, indexed by type number.
|
|
6725
|
|
6726 @item
|
|
6727 the @code{pdump_opaque_data_list}, used for the opaque data which does
|
|
6728 not include pointers, and hence does not need descriptions.
|
|
6729
|
|
6730 @item
|
|
6731 the @code{pdump_struct_table}, which is a vector of
|
|
6732 @code{struct_description}/@code{pdump_entry_list} pairs, used for
|
|
6733 non-opaque C structures.
|
|
6734 @end enumerate
|
|
6735
|
|
6736 This uses a marking strategy similar to the garbage collector. Some
|
|
6737 differences though:
|
|
6738
|
|
6739 @enumerate
|
|
6740 @item
|
|
6741 We do not use the mark bit (which does not exist for C structures
|
448
|
6742 anyway); we use a big hash table instead.
|
442
|
6743
|
|
6744 @item
|
|
6745 We do not use the mark function of lrecords but instead rely on the
|
|
6746 external descriptions. This happens essentially because we need to
|
|
6747 follow pointers to C structures and opaque data in addition to
|
|
6748 Lisp_Object members.
|
|
6749 @end enumerate
|
|
6750
|
452
|
6751 This is done by @code{pdump_register_object()}, which handles Lisp_Object
|
|
6752 variables, and @code{pdump_register_struct()} which handles C structures,
|
|
6753 which both delegate the description management to @code{pdump_register_sub()}.
|
442
|
6754
|
|
6755 The hash table doubles as a map from objects to pdump_entry_list_elmt (i.e.
|
|
6756 allows us to look up a pdump_entry_list_elmt with the object it points
|
|
6757 to). Entries are added with @code{pdump_add_entry()} and looked up with
|
|
6758 @code{pdump_get_entry()}. There is no need for entry removal. The hash
|
448
|
6759 value is computed quite simply from the object pointer by
|
442
|
6760 @code{pdump_make_hash()}.
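
As an illustration only, a pointer-keyed table with add and lookup but no
removal might look like the sketch below.  The names @code{make_hash()},
@code{get_entry()}, @code{add_entry()} and @code{TABLE_SIZE}, the chained
buckets and the shift-and-mask hash are all assumptions made for this
example; they are not taken from @file{dumper.c}.

@example
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 1024     /* hypothetical; a power of two */

struct entry @{
  const void *obj;          /* key: address of the registered object */
  size_t size;              /* payload remembered for that object */
  struct entry *next;       /* chaining within a bucket */
@};

static struct entry *table[TABLE_SIZE];

/* Hash an address by dropping its low bits, which are mostly zero
   because of alignment.  */
static unsigned
make_hash (const void *obj)
@{
  return (unsigned) (((uintptr_t) obj >> 3) & (TABLE_SIZE - 1));
@}

/* Return the entry registered for OBJ, or NULL if there is none.  */
static struct entry *
get_entry (const void *obj)
@{
  struct entry *e;
  for (e = table[make_hash (obj)]; e; e = e->next)
    if (e->obj == obj)
      return e;
  return NULL;
@}

/* Register E unless its object is already known.
   Returns 1 if E was added, 0 otherwise.  */
static int
add_entry (struct entry *e)
@{
  unsigned h;
  if (get_entry (e->obj))
    return 0;
  h = make_hash (e->obj);
  e->next = table[h];
  table[h] = e;
  return 1;
@}
@end example

A failed @code{add_entry()} plays the role of the ``already marked'' test
of a garbage collector: an object reached a second time is not
registered again.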
|
|
6761
|
|
6762 The roots for the marking are:
|
|
6763
|
|
6764 @enumerate
|
|
6765 @item
|
|
6766 the @code{staticpro}'ed variables (there is a special @code{staticpro_nodump()}
|
|
6767 call for protected variables we do not want to dump).
|
|
6768
|
|
6769 @item
|
452
|
6770 the variables registered via @code{dump_add_root_object}
|
|
6771 (@code{staticpro()} is equivalent to @code{staticpro_nodump()} +
|
|
6772 @code{dump_add_root_object()}).
|
|
6773
|
|
6774 @item
|
|
6775 the variables registered via @code{dump_add_root_struct_ptr}, each of
|
|
6776 which points to a C structure.
|
442
|
6777 @end enumerate
|
|
6778
|
|
6779 This does not include the GCPRO'ed variables, the specbinds, the
|
|
6780 catchtags, the backlist, the redisplay or the profiling info, since we
|
|
6781 do not want to rebuild the actual chain of lisp calls that leads up to
|
|
6782 the dump-emacs call, only the global variables.
|
|
6783
|
|
6784 Weak lists and weak hash tables are dumped as if they were their
|
|
6785 non-weak equivalent (without changing their type, of course). This has
|
|
6786 not yet been a problem.
|
|
6787
|
462
|
6788 @node Address allocation
|
442
|
6789 @subsection Address allocation
|
462
|
6790 @cindex dumping address allocation
|
442
|
6791
|
|
6792
|
|
6793 The next step is to allocate the offsets of each of the objects in the
|
|
6794 final dump file. This is done by @code{pdump_allocate_offset()} which
|
|
6795 is called indirectly by @code{pdump_scan_by_alignment()}.
|
|
6796
|
|
6797 The strategy to deal with alignment problems uses these facts:
|
|
6798
|
|
6799 @enumerate
|
|
6800 @item
|
|
6801 real world alignment requirements are powers of two.
|
|
6802
|
|
6803 @item
|
|
6804 the C compiler is required to adjust the size of a struct so that you
|
448
|
6805 can have an array of them next to each other. This means you can have an
|
442
|
6806 upper bound on the alignment requirements of a given structure by
|
|
6807 looking at the largest power of two of which its size is a multiple.
|
|
6808
|
|
6809 @item
|
|
6810 the non-variant part of variable size lrecords has an alignment
|
|
6811 requirement of 4.
|
|
6812 @end enumerate
|
|
6813
|
|
6814 Hence, for each lrecord type, C struct type or opaque data block the
|
|
6815 alignment requirement is computed as a power of two, with a minimum of
|
|
6816 2^2 for lrecords. @code{pdump_scan_by_alignment()} then scans all the
|
|
6817 @code{pdump_entry_list_elmt}'s, the ones with the highest requirements
|
|
6818 first. This ensures the best packing.
|
|
6819
|
|
6820 The maximum alignment requirement we take into account is 2^8.
|
|
6821
|
|
6822 @code{pdump_allocate_offset()} only has to do a linear allocation,
|
448
|
6823 starting at offset 256 (this leaves room for the header and keeps the
|
442
|
6824 alignments happy).
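
The reasoning above can be sketched as follows.  This is a simplified
model, not the code of @code{pdump_allocate_offset()}: the names
@code{align_log2_from_size()}, @code{struct item} and
@code{allocate_offsets()} are hypothetical, and the minimum of 2^2 for
lrecords is left out.

@example
#include <stddef.h>

#define MAX_ALIGN_LOG2 8    /* cap at 2^8, as in the text */
#define BASE_OFFSET 256     /* first offset after the header */

/* Exponent of the largest power of two (at most 2^MAX_ALIGN_LOG2)
   that divides SIZE.  SIZE must be nonzero.  */
static int
align_log2_from_size (size_t size)
@{
  int k = 0;
  while (k < MAX_ALIGN_LOG2 && (size & ((size_t) 1 << k)) == 0)
    k++;
  return k;
@}

struct item @{
  size_t size;              /* total size of the object group */
  int align_log2;           /* filled in by allocate_offsets() */
  size_t offset;            /* filled in by allocate_offsets() */
@};

/* Assign file offsets to ITEMS, highest alignment first.
   Returns the first free offset after the last item.  */
static size_t
allocate_offsets (struct item *items, int n)
@{
  size_t cur = BASE_OFFSET;
  int k, i;
  for (i = 0; i < n; i++)
    items[i].align_log2 = align_log2_from_size (items[i].size);
  for (k = MAX_ALIGN_LOG2; k >= 0; k--)
    for (i = 0; i < n; i++)
      if (items[i].align_log2 == k)
        @{
          items[i].offset = cur;    /* already aligned, no padding */
          cur += items[i].size;
        @}
  return cur;
@}
@end example

No padding is ever inserted: the base offset is a multiple of 2^8, and
every item placed before a given one has a size that is a multiple of
that item's alignment, so the running offset stays suitably aligned.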
|
|
6825
|
462
|
6826 @node The header
|
442
|
6827 @subsection The header
|
462
|
6828 @cindex dumping, the header
|
442
|
6829
|
|
6830 The next step creates the file and writes a header with a signature and
|
452
|
6831 some miscellaneous information in it.  The @code{reloc_address} field, which
|
|
6832 indicates at which address the file should be loaded if we want to avoid
|
|
6833 post-reload relocation, is set to 0. It then seeks to offset 256 (base
|
|
6834 offset for the objects).
|
442
|
6835
|
462
|
6836 @node Data dumping
|
442
|
6837 @subsection Data dumping
|
462
|
6838 @cindex data dumping
|
|
6839 @cindex dumping, data
|
442
|
6840
|
|
6841 The data is dumped, in the same order as the addresses were allocated, by
|
|
6842 @code{pdump_dump_data()}, called from @code{pdump_scan_by_alignment()}.
|
|
6843 This function copies the data to a temporary buffer, relocates all
|
|
6844 pointers in the object to the addresses allocated in step Address
|
|
6845 Allocation, and writes it to the file. Using the same order means that,
|
|
6846 if we are careful with lrecords whose size is not a multiple of 4, we
|
|
6847 are assured that each object is always written at the offset in the file
|
|
6848 allocated in step Address Allocation.
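
The copy-then-relocate step can be illustrated with a small sketch.  It
is a simplification under stated assumptions: instead of patching a raw
byte copy as @code{pdump_dump_data()} does, it fills a separate on-disk
structure, and the names @code{struct cell}, @code{struct dumped_cell}
and @code{dump_cell()} are hypothetical.  All cells are assumed to live
in one array that is written contiguously starting at file offset
@code{base}.

@example
#include <stddef.h>
#include <stdint.h>

struct cell @{               /* the object as it lives in memory */
  int value;
  struct cell *next;
@};

struct dumped_cell @{        /* the same object as written to the file */
  int value;
  uintptr_t next_offset;    /* file offset of the pointee; 0 if none */
@};

/* Build the on-disk form of CELLS[I].  The live object is left
   untouched; only the copy holds a file offset instead of an
   address.  BASE must be nonzero so that 0 can mean "no pointee".  */
static struct dumped_cell
dump_cell (const struct cell *cells, int i, uintptr_t base)
@{
  struct dumped_cell out;
  out.value = cells[i].value;
  out.next_offset = cells[i].next
    ? base + (uintptr_t) (cells[i].next - cells)
             * sizeof (struct dumped_cell)
    : 0;
  return out;
@}
@end example

Working on a copy matters because the running process keeps using the
original objects while the dump file is being written.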
|
|
6849
|
462
|
6850 @node Pointers dumping
|
442
|
6851 @subsection Pointers dumping
|
462
|
6852 @cindex pointers dumping
|
|
6853 @cindex dumping, pointers
|
442
|
6854
|
|
6855 A number of tables needed to properly reassign the global pointers are
|
|
6856 then written. They are:
|
|
6857
|
|
6858 @enumerate
|
|
6859 @item
|
452
|
6860 the pdump_root_struct_ptrs dynarr
|
|
6861 @item
|
|
6862 the pdump_opaques dynarr
|
442
|
6863 @item
|
|
6864 a vector of all the offsets to the objects in the file that include a
|
|
6865 description (for faster relocation at reload time)
|
|
6866 @item
|
452
|
6867 the pdump_root_objects and pdump_weak_object_chains dynarrs.
|
442
|
6868 @end enumerate
|
|
6869
|
452
|
6870 For each of the dynarrs we write both the pointer to the variables and
|
442
|
6871 the relocated offset of the object they point to. Since these variables
|
|
6872 are global, the pointers are still valid when restarting the program and
|
|
6873 are used to regenerate the global pointers.
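
A minimal sketch of such a table and of its use at reload time is given
below.  The names @code{struct root_entry} and @code{restore_roots()}
are hypothetical, and the real dynarrs store more than this pair.

@example
#include <stddef.h>

struct root_entry @{
  void **variable;          /* address of the global pointer variable */
  size_t offset;            /* offset of its pointee in the dump file */
@};

/* Point every registered global at its object inside IMAGE, the
   address at which the dump file has been loaded.  */
static void
restore_roots (const struct root_entry *roots, size_t n, char *image)
@{
  size_t i;
  for (i = 0; i < n; i++)
    *roots[i].variable = image + roots[i].offset;
@}
@end example

This only works because the addresses of the global variables
themselves are fixed by the executable and therefore identical in the
dumping and in the reloading process.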
|
|
6874
|
452
|
6875 The @code{pdump_weak_object_chains} dynarr is a special case. The
|
|
6876 variables it points to are the head of weak linked lists of lisp objects
|
|
6877 of the same type. Not all objects of this list are dumped so the
|
|
6878 relocated pointer we associate with them points to the first dumped
|
|
6879 object of the list, or Qnil if none is available. This is also the
|
|
6880 reason why they are not used as roots for the purpose of object
|
|
6881 enumeration.
|
|
6882
|
|
6883 Some very important information like the @code{staticpros} and
|
|
6884 @code{lrecord_implementations_table} is handled indirectly using
|
|
6885 @code{dump_add_opaque} or @code{dump_add_root_struct_ptr}.
|
442
|
6886
|
|
6887 This is the end of the dumping part.
|
|
6888
|
462
|
6889 @node Reloading phase
|
442
|
6890 @section Reloading phase
|
462
|
6891 @cindex reloading phase
|
|
6892 @cindex dumping, reloading phase
|
442
|
6893
|
|
6894 @subsection File loading
|
462
|
6895 @cindex dumping, file loading
|
442
|
6896
|
|
6897 The file is mmap'ed in memory (which ensures a PAGESIZE alignment, at
|
|
6898 least 4096), or if mmap is unavailable or fails, a 256-byte aligned
|
|
6899 malloc is done and the file is loaded.
|
|
6900
|
|
6901 Some variables are reinitialized from the values found in the header.
|
|
6902
|
|
6903 The difference between the actual loading address and the reloc_address
|
|
6904 is computed and will be used for all the relocations.
|
|
6905
|
|
6906
|
452
|
6907 @subsection Putting back the pdump_opaques
|
462
|
6908 @cindex dumping, putting back the pdump_opaques
|
452
|
6909
|
|
6910 The memory contents are restored in the obvious and trivial way.
|
|
6911
|
|
6912
|
|
6913 @subsection Putting back the pdump_root_struct_ptrs
|
462
|
6914 @cindex dumping, putting back the pdump_root_struct_ptrs
|
452
|
6915
|
|
6916 The variables pointed to by pdump_root_struct_ptrs in the dump phase are
|
|
6917 reset to the right relocated object addresses.
|
442
|
6918
|
|
6919
|
|
6920 @subsection Object relocation
|
462
|
6921 @cindex dumping, object relocation
|
442
|
6922
|
|
6923 All the objects are relocated using their description and their offset
|
|
6924 by @code{pdump_reloc_one}. This step is unnecessary if the
|
|
6925 reloc_address is equal to the file loading address.
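
As a rough sketch of what relocating by a constant difference means,
assume the dumped pointers were written as addresses that are valid
when the image sits at @code{reloc_address}.  The names
@code{struct dumped_cell} and @code{reloc_cells()} are hypothetical;
@code{pdump_reloc_one} itself is driven by the data descriptions rather
than by a fixed structure.

@example
#include <stddef.h>
#include <stdint.h>

struct dumped_cell @{
  int value;
  uintptr_t next;   /* valid if the image is at reloc_address; 0 if none */
@};

/* Add DELTA (actual load address minus reloc_address, computed
   modulo the pointer width) to every non-null pointer field of the
   N cells in IMAGE.  Nothing is written when DELTA is zero.  */
static void
reloc_cells (struct dumped_cell *image, size_t n, uintptr_t delta)
@{
  size_t i;
  if (delta == 0)
    return;
  for (i = 0; i < n; i++)
    if (image[i].next != 0)
      image[i].next += delta;
@}
@end example

The early return for a zero difference is the case that keeps the
loaded pages unmodified.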
|
|
6926
|
|
6927
|
452
|
6928 @subsection Putting back the pdump_root_objects and pdump_weak_object_chains
|
462
|
6929 @cindex dumping, putting back the pdump_root_objects and pdump_weak_object_chains
|
452
|
6930
|
|
6931 Same as Putting back the pdump_root_struct_ptrs.
|
442
|
6932
|
|
6933
|
|
6934 @subsection Reorganize the hash tables
|
462
|
6935 @cindex dumping, reorganize the hash tables
|
442
|
6936
|
|
6937 Since some of the hash values in the lisp hash tables are
|
|
6938 address-dependent, their layout is now wrong. So we go through each of
|
|
6939 them and have them reorganized by calling @code{pdump_reorganize_hash_table}.
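
The following sketch shows why an address-keyed table has to be rebuilt
once its keys have moved.  It is a generic open-addressing table
written for this illustration; @code{bucket_of()}, @code{reorganize()},
@code{contains()} and @code{NBUCKETS} are hypothetical names and do not
describe the layout of XEmacs Lisp hash tables.

@example
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 8

/* The bucket depends on the address itself, so it changes whenever
   the object is loaded somewhere else.  */
static size_t
bucket_of (const void *key)
@{
  return (size_t) (((uintptr_t) key >> 4) % NBUCKETS);
@}

/* Rebuild TABLE (linear probing) from the N keys in KEYS; at most
   NBUCKETS - 1 keys are stored so that lookups always terminate.  */
static void
reorganize (const void **table, const void **keys, size_t n)
@{
  size_t i, b;
  for (b = 0; b < NBUCKETS; b++)
    table[b] = NULL;
  for (i = 0; i < n && i < NBUCKETS - 1; i++)
    @{
      b = bucket_of (keys[i]);
      while (table[b] != NULL)
        b = (b + 1) % NBUCKETS;
      table[b] = keys[i];
    @}
@}

/* Return 1 if KEY is stored in TABLE, 0 otherwise.  */
static int
contains (const void **table, const void *key)
@{
  size_t b = bucket_of (key);
  while (table[b] != NULL)
    @{
      if (table[b] == key)
        return 1;
      b = (b + 1) % NBUCKETS;
    @}
  return 0;
@}
@end example

A lookup starts probing at @code{bucket_of (key)}; if the table had
been filled using the old addresses, that starting point would be
wrong and present keys could be reported as missing.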
|
|
6940
|
462
|
6941 @node Remaining issues
|
442
|
6942 @section Remaining issues
|
462
|
6943 @cindex dumping, remaining issues
|
442
|
6944
|
|
6945 The build process will have to start a post-dump xemacs, ask it the
|
|
6946 loading address (which will, hopefully, be always the same between
|
|
6947 different xemacs invocations) and relocate the file to the new address.
|
|
6948 This way the object relocation phase will not have to be done, which
|
|
6949 means no writes in the objects and that, because of the use of mmap, the
|
|
6950 dumped data will be shared between all the xemacs processes running on the
|
|
6951 computer.
|
|
6952
|
|
6953 Some executable signature will be necessary to ensure that a given dump
|
|
6954 file is really associated with a given executable, or random crashes
|
|
6955 will occur.  Maybe a random number set at compile or configure time through
|
|
6956 a define. This will also allow for having differently-compiled xemacsen
|
|
6957 on the same system (mule and no-mule come to mind).
|
|
6958
|
|
6959 The DOC file contents should probably end up in the dump file.
|
|
6960
|
|
6961
|
|
6962 @node Events and the Event Loop, Evaluation; Stack Frames; Bindings, Dumping, Top
|
428
|
6963 @chapter Events and the Event Loop
|
462
|
6964 @cindex events and the event loop
|
|
6965 @cindex event loop, events and the
|
428
|
6966
|
|
6967 @menu
|
|
6968 * Introduction to Events::
|
|
6969 * Main Loop::
|
|
6970 * Specifics of the Event Gathering Mechanism::
|
|
6971 * Specifics About the Emacs Event::
|
|
6972 * The Event Stream Callback Routines::
|
|
6973 * Other Event Loop Functions::
|
|
6974 * Converting Events::
|
|
6975 * Dispatching Events; The Command Builder::
|
|
6976 @end menu
|
|
6977
|
462
|
6978 @node Introduction to Events
|
428
|
6979 @section Introduction to Events
|
462
|
6980 @cindex events, introduction to
|
428
|
6981
|
|
6982 An event is an object that encapsulates information about an
|
|
6983 interesting occurrence in the operating system. Events are
|
|
6984 generated either by user action, direct (e.g. typing on the
|
|
6985 keyboard or moving the mouse) or indirect (moving another
|
|
6986 window, thereby generating an expose event on an Emacs frame),
|
|
6987 or as a result of some other typically asynchronous action happening,
|
|
6988 such as output from a subprocess being ready or a timer expiring.
|
|
6989 Events come into the system in an asynchronous fashion (typically
|
|
6990 through a callback being called) and are converted into a
|
|
6991 synchronous event queue (first-in, first-out) in a process that
|
|
6992 we will call @dfn{collection}.
|
|
6993
|
|
6994 Note that each application has its own event queue. (It is
|
|
6995 immaterial whether the collection process directly puts the
|
|
6996 events in the proper application's queue, or puts them into
|
|
6997 a single system queue, which is later split up.)
|
|
6998
|
|
6999 The most basic level of event collection is done by the
|
|
7000 operating system or window system. Typically, XEmacs does
|
|
7001 its own event collection as well. Often there are multiple
|
|
7002 layers of collection in XEmacs, with events from various
|
|
7003 sources being collected into a queue, which is then combined
|
|
7004 with other sources to go into another queue (i.e. a second
|
|
7005 level of collection), with perhaps another level on top of
|
|
7006 this, etc.
|
|
7007
|
|
7008 XEmacs has its own types of events (called @dfn{Emacs events}),
|
|
7009 which provide an abstract layer on top of the system-dependent
|
|
7010 nature of the most basic events that are received. Part of the
|
|
7011 complex nature of the XEmacs event collection process involves
|
|
7012 converting from the operating-system events into the proper
|
440
|
7013 Emacs events---there may not be a one-to-one correspondence.
|
428
|
7014
|
|
7015 Emacs events are documented in @file{events.h}; I'll discuss them
|
|
7016 later.
|
|
7017
|
462
|
7018 @node Main Loop
|
428
|
7019 @section Main Loop
|
462
|
7020 @cindex main loop
|
|
7021 @cindex events, main loop
|
428
|
7022
|
|
7023 The @dfn{command loop} is the top-level loop that the editor is always
|
|
7024 running. It loops endlessly, calling @code{next-event} to retrieve an
|
|
7025 event and @code{dispatch-event} to execute it. @code{dispatch-event} does
|
|
7026 the appropriate thing with non-user events (process, timeout,
|
|
7027 magic, eval, mouse motion); this involves calling a Lisp handler
|
|
7028 function, redrawing a newly-exposed part of a frame, reading
|
|
7029 subprocess output, etc. For user events, @code{dispatch-event}
|
|
7030 looks up the event in relevant keymaps or menubars; when a
|
|
7031 full key sequence or menubar selection is reached, the appropriate
|
|
7032 function is executed. @code{dispatch-event} may have to keep state
|
|
7033 across calls; this is done in the ``command-builder'' structure
|
|
7034 associated with each console (remember, there's usually only
|
|
7035 one console), and the engine that looks up keystrokes and
|
|
7036 constructs full key sequences is called the @dfn{command builder}.
|
|
7037 This is documented elsewhere.
|
|
7038
|
|
7039 The guts of the command loop are in @code{command_loop_1()}. This
|
440
|
7040 function doesn't catch errors, though---that's the job of
|
428
|
7041 @code{command_loop_2()}, which is a condition-case (i.e. error-trapping)
|
|
7042 wrapper around @code{command_loop_1()}. @code{command_loop_1()} never
|
|
7043 returns, but may get thrown out of.
|
|
7044
|
|
7045 When an error occurs, @code{cmd_error()} is called, which usually
|
|
7046 invokes the Lisp error handler in @code{command-error}; however, a
|
|
7047 default error handler is provided if @code{command-error} is @code{nil}
|
|
7048 (e.g. during startup). The purpose of the error handler is simply to
|
|
7049 display the error message and do associated cleanup; it does not need to
|
|
7050 throw anywhere. When the error handler finishes, the condition-case in
|
|
7051 @code{command_loop_2()} will finish and @code{command_loop_2()} will
|
|
7052 reinvoke @code{command_loop_1()}.
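
The control flow described above can be modeled in plain C.  The sketch
below is only an analogy: it uses @code{setjmp()}/@code{longjmp()} in
place of the Lisp condition-case machinery, represents commands as
integers, and lets a zero command end the loop so that the example
terminates.  The names @code{loop_1()}, @code{loop_2()},
@code{next_command} and @code{errors_handled} are hypothetical.

@example
#include <setjmp.h>

static jmp_buf error_catch;   /* stands in for the condition-case */
static int next_command;      /* index of the next command to run */
static int errors_handled;

/* Model of command_loop_1(): run COMMANDS in order.  A negative
   command signals an error by jumping out; 0 ends the loop.  */
static void
loop_1 (const int *commands)
@{
  for (;;)
    @{
      int cmd = commands[next_command++];
      if (cmd == 0)
        return;
      if (cmd < 0)
        longjmp (error_catch, 1);
    @}
@}

/* Model of command_loop_2(): trap the error, run the "handler",
   then reinvoke the inner loop.  Returns the number of errors.  */
static int
loop_2 (const int *commands)
@{
  next_command = 0;
  errors_handled = 0;
  if (setjmp (error_catch) != 0)
    errors_handled++;         /* handler: report and clean up */
  loop_1 (commands);          /* (re)enter the inner loop */
  return errors_handled;
@}
@end example

The state that must survive an error is kept in file-scope variables
here, since automatic variables modified between @code{setjmp()} and
@code{longjmp()} have indeterminate values unless they are declared
@code{volatile}.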
|
|
7053
|
|
7054 @code{command_loop_2()} is invoked from three places: from
|
|
7055 @code{initial_command_loop()} (called from @code{main()} at the end of
|
|
7056 internal initialization), from the Lisp function @code{recursive-edit},
|
|
7057 and from @code{call_command_loop()}.
|
|
7058
|
|
7059 @code{call_command_loop()} is called when a macro is started and when
|
|
7060 the minibuffer is entered; normal termination of the macro or minibuffer
|
|
7061 causes a throw out of the recursive command loop. (To
|
|
7062 @code{execute-kbd-macro} for macros and @code{exit} for minibuffers.
|
|
7063 Note also that the low-level minibuffer-entering function,
|
|
7064 @code{read-minibuffer-internal}, provides its own error handling and
|
|
7065 does not need @code{command_loop_2()}'s error encapsulation; so it tells
|
|
7066 @code{call_command_loop()} to invoke @code{command_loop_1()} directly.)
|
|
7067
|
|
7068 Note that both @code{read-minibuffer-internal} and @code{recursive-edit} set up a
|
|
7069 catch for @code{exit}; this is why @code{abort-recursive-edit}, which
|
|
7070 throws to this catch, exits out of either one.
|
|
7071
|
|
7072 @code{initial_command_loop()}, called from @code{main()}, sets up a
|
|
7073 catch for @code{top-level} when invoking @code{command_loop_2()},
|
|
7074 allowing functions to throw all the way to the top level if they really
|
|
7075 need to. Before invoking @code{command_loop_2()},
|
|
7076 @code{initial_command_loop()} calls @code{top_level_1()}, which handles
|
|
7077 all of the startup stuff (creating the initial frame, handling the
|
|
7078 command-line options, loading the user's @file{.emacs} file, etc.). The
|
|
7079 function that actually does this is in Lisp and is pointed to by the
|
|
7080 variable @code{top-level}; normally this function is
|
|
7081 @code{normal-top-level}. @code{top_level_1()} is just an error-handling
|
|
7082 wrapper similar to @code{command_loop_2()}. Note also that
|
|
7083 @code{initial_command_loop()} sets up a catch for @code{top-level} when
|
|
7084 invoking @code{top_level_1()}, just like when it invokes
|
|
7085 @code{command_loop_2()}.
|
|
7086
|
462
|
7087 @node Specifics of the Event Gathering Mechanism
|
428
|
7088 @section Specifics of the Event Gathering Mechanism
|
462
|
7089 @cindex event gathering mechanism, specifics of the
|
428
|
7090
|
|
7091 Here is an approximate diagram of the collection processes
|
|
7092 at work in XEmacs, under TTY's (TTY's are simpler than X
|
|
7093 so we'll look at this first):
|
|
7094
|
|
7095 @noindent
|
|
7096 @example
|
|
 asynch.     asynch.    asynch.   asynch.                 [Collectors in
kbd events  kbd events  process   process                    the OS]
    |           |       output    output
    |           |          |         |
    |           |          |         |     SIGINT,        [signal handlers
    |           |          |         |     SIGQUIT,          in XEmacs]
    V           V          V         V     SIGWINCH,
   file        file       file      file   SIGALRM
   desc.       desc.      desc.     desc.     |
   (TTY)       (TTY)      (pipe)    (pipe)    |
    |           |          |         |       fake   timeouts
    |           |          |         |       file      |
    |           |          |         |       desc.     |
    |           |          |         |       (pipe)    |
    |           |          |         |        |        |
    |           |          |         |        |        |
    |           |          |         |        |        |
    V           V          V         V        V        V
    ------>-----------<----------------<----------------
                  |
                  |
                  | [collected using select() in emacs_tty_next_event()
                  |  and converted to the appropriate Emacs event]
                  |
                  |
                  V          (above this line is TTY-specific)
                Emacs -----------------------------------------------
                event (below this line is the generic event mechanism)
                  |
                  |
was there     if not, call
a SIGINT?  emacs_tty_next_event()
    |             |
    |             |
    |             |
    V             V
    --->------<----
           |
           |     [collected in event_stream_next_event();
           |      SIGINT is converted using maybe_read_quit_event()]
           V
         Emacs
         event
           |
           \---->------>----- maybe_kbd_translate() ---->---\
                                                            |
                                                            |
                                                            |
     command event queue                                    |
                                                   if not from command
  (contains events that were                        event queue, call
  read earlier but not processed,               event_stream_next_event()
  typically when waiting in a                               |
  sit-for, sleep-for, etc. for                              |
  a particular event to be received)                        |
               |                                            |
               |                                            |
               V                                            V
               ---->------------------------------------<----
                                               |
                                               | [collected in
                                               |  next_event_internal()]
                                               |
 unread-     unread-       event from          |
 command-    command-       keyboard       else, call
 events      event           macro   next_event_internal()
   |           |               |               |
   |           |               |               |
   |           |               |               |
   V           V               V               V
   --------->----------------------<------------
                     |
                     |      [collected in `next-event', which may loop
                     |       more than once if the event it gets is on
                     |       a dead frame, device, etc.]
                     |
                     |
                     V
            feed into top-level event loop,
           which repeatedly calls `next-event'
           and then dispatches the event
           using `dispatch-event'
|
|
7179 @end example
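
The ``fake file desc.'' box in the diagram turns signals into something
@code{select()} can wait on.  The sketch below shows the general
technique (often called a self-pipe) using only standard POSIX calls.
It is not the code of @file{event-tty.c}; the names @code{signal_pipe},
@code{on_signal()}, @code{init_signal_pipe()} and
@code{wait_for_event()} are hypothetical.

@example
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/* [0] is read by the event loop, [1] is written by the handler.  */
static int signal_pipe[2];

static void
on_signal (int signo)
@{
  int saved_errno = errno;
  unsigned char byte = (unsigned char) signo;
  ssize_t r = write (signal_pipe[1], &byte, 1); /* async-signal-safe */
  (void) r;
  errno = saved_errno;
@}

/* Create the pipe and route SIGNO through it.  Returns 0 on success.  */
static int
init_signal_pipe (int signo)
@{
  struct sigaction sa;
  if (pipe (signal_pipe) != 0)
    return -1;
  memset (&sa, 0, sizeof sa);
  sa.sa_handler = on_signal;
  sigemptyset (&sa.sa_mask);
  return sigaction (signo, &sa, NULL);
@}

/* Block until INPUT_FD or the signal pipe is readable.  Returns the
   signal number if a signal arrived, 0 if INPUT_FD has data, and -1
   on error.  */
static int
wait_for_event (int input_fd)
@{
  int maxfd = input_fd > signal_pipe[0] ? input_fd : signal_pipe[0];
  for (;;)
    @{
      fd_set readfds;
      unsigned char byte;
      FD_ZERO (&readfds);
      FD_SET (input_fd, &readfds);
      FD_SET (signal_pipe[0], &readfds);
      if (select (maxfd + 1, &readfds, NULL, NULL, NULL) < 0)
        @{
          if (errno == EINTR)
            continue;           /* interrupted by the signal itself */
          return -1;
        @}
      if (FD_ISSET (signal_pipe[0], &readfds))
        return read (signal_pipe[0], &byte, 1) == 1 ? (int) byte : -1;
      if (FD_ISSET (input_fd, &readfds))
        return 0;
    @}
@}
@end example

With this arrangement the handler itself does almost nothing, and the
signal is processed synchronously by the same loop that handles
keyboard and subprocess input.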
|
|
7180
|
|
7181 Notice the separation between TTY-specific and generic event mechanism.
|
|
7182 When using the Xt-based event loop, the TTY-specific stuff is replaced
|
|
7183 but the rest stays the same.
|
|
7184
|
|
7185 It's also important to realize that only one kind of
|
|
7186 system-specific event loop can be operating at a time, and it must be able
|
|
7187 to receive all kinds of events simultaneously. For the two existing
|
|
7188 event loops (implemented in @file{event-tty.c} and @file{event-Xt.c},
|
|
7189 respectively), the TTY event loop @emph{only} handles TTY consoles,
|
|
7190 while the Xt event loop handles @emph{both} TTY and X consoles. This
|
|
7191 situation is different from all of the output handlers, where you simply
|
|
7192 have one per console type.
|
|
7193
|
|
7194 Here's the Xt Event Loop Diagram (notice that below a certain point,
|
|
7195 it's the same as the above diagram):
|
|
7196
|
|
7197 @example
|
|
asynch. asynch. asynch. asynch.                      [Collectors in
  kbd     kbd   process process                         the OS]
 events  events  output  output
   |       |       |       |
   |       |       |       |    asynch.   asynch.    [Collectors in the
   |       |       |       |       X         X        OS and X Window System]
   |       |       |       |     events    events
   |       |       |       |       |         |
   |       |       |       |       |         |
   |       |       |       |       |         |     SIGINT,   [signal handlers
   |       |       |       |       |         |     SIGQUIT,     in XEmacs]
   |       |       |       |       |         |     SIGWINCH,
   |       |       |       |       |         |     SIGALRM
   |       |       |       |       |         |       |
   |       |       |       |       |         |       |
   |       |       |       |       |         |       |      timeouts
   |       |       |       |       |         |       |         |
   |       |       |       |       |         |       |         |
   |       |       |       |       |         |       V         |
   V       V       V       V       V         V      fake       |
  file    file    file    file    file      file    file       |
  desc.   desc.   desc.   desc.   desc.     desc.   desc.      |
  (TTY)   (TTY)  (pipe)  (pipe)  (socket)  (socket) (pipe)     |
   |       |       |       |       |         |       |         |
   |       |       |       |       |         |       |         |
   |       |       |       |       |         |       |         |
   V       V       V       V       V         V       V         V
   --->----------------------------------------<---------<------
        |              |               |
        |              |               |[collected using select() in
        |              |               | _XtWaitForSomething(), called
        |              |               | from XtAppProcessEvent(), called
        |              |               | in emacs_Xt_next_event();
        |              |               | dispatched to various callbacks]
        |              |               |
        |              |               |
   emacs_Xt_     p_s_callback(),       | [popup_selection_callback]
   event_handler()  x_u_v_s_callback(),| [x_update_vertical_scrollbar_
        |           x_u_h_s_callback(),|  callback]
        |           search_callback()  | [x_update_horizontal_scrollbar_
        |              |               |  callback]
        |              |               |
        |              |               |
   enqueue_Xt_       signal_special_   |
   dispatch_event()  Xt_user_event()   |
   [maybe multiple     |               |
    times, maybe 0     |               |
    times]             |               |
        |            enqueue_Xt_       |
        |            dispatch_event()  |
        |              |               |
        |              |               |
        V              V               |
        -->----------<--               |
               |                       |
               |                       |
            dispatch          Xt_what_callback()
            event                 sets flags
            queue                      |
               |                       |
               |                       |
               |                       |
               |                       |
               ---->-----------<--------
                   |
                   |
                   |     [collected and converted as appropriate in
                   |            emacs_Xt_next_event()]
                   |
                   |
                   V          (above this line is Xt-specific)
                 Emacs ------------------------------------------------
                 event (below this line is the generic event mechanism)
                   |
                   |
 was there     if not, call
 a SIGINT?  emacs_Xt_next_event()
    |              |
    |              |
    |              |
    V              V
    --->-------<----
           |
           |     [collected in event_stream_next_event();
           |      SIGINT is converted using maybe_read_quit_event()]
           V
         Emacs
         event
           |
           \---->------>----- maybe_kbd_translate() -->-----\
                                                            |
                                                            |
                                                            |
     command event queue                                    |
                                                   if not from command
  (contains events that were                        event queue, call
  read earlier but not processed,               event_stream_next_event()
  typically when waiting in a                               |
  sit-for, sleep-for, etc. for                              |
  a particular event to be received)                        |
               |                                            |
               |                                            |
               V                                            V
               ---->----------------------------------<------
                                               |
                                               | [collected in
                                               |  next_event_internal()]
                                               |
 unread-     unread-       event from          |
 command-    command-       keyboard       else, call
 events      event           macro   next_event_internal()
   |           |               |               |
   |           |               |               |
   |           |               |               |
   V           V               V               V
   --------->----------------------<------------
                     |
                     |      [collected in `next-event', which may loop
                     |       more than once if the event it gets is on
                     |       a dead frame, device, etc.]
                     |
                     |
                     V
            feed into top-level event loop,
           which repeatedly calls `next-event'
           and then dispatches the event
           using `dispatch-event'
|
|
7325 @end example

@node Specifics About the Emacs Event
@section Specifics About the Emacs Event
@cindex event, specifics about the Lisp object

@node The Event Stream Callback Routines
@section The Event Stream Callback Routines
@cindex event stream callback routines, the
@cindex callback routines, the event stream

@node Other Event Loop Functions
@section Other Event Loop Functions
@cindex event loop functions, other

@code{detect_input_pending()} and @code{input-pending-p} look for
input by calling @code{event_stream->event_pending_p} and looking in
@code{[V]unread-command-event} and the @code{command_event_queue} (they
do not check for an executing keyboard macro, though).

@code{discard-input} cancels any command events pending (and any
keyboard macros currently executing), and puts the others onto the
@code{command_event_queue}. The source contains a comment about a
``race condition'' here, which is not a good sign.

@code{next-command-event} and @code{read-char} are higher-level
interfaces to @code{next-event}. @code{next-command-event} gets the
next @dfn{command} event (i.e. keypress, mouse event, menu selection,
or scrollbar action), calling @code{dispatch-event} on any others.
@code{read-char} calls @code{next-command-event} and uses
@code{event_to_character()} to return the character equivalent. With
the right kind of input method support, it is possible for
@code{(read-char)} to return a Kanji character.

@node Converting Events
@section Converting Events
@cindex converting events
@cindex events, converting

@code{character_to_event()}, @code{event_to_character()},
@code{event-to-character}, and @code{character-to-event} convert between
characters and the keypress events corresponding to those characters.
If the event is not a keypress, @code{event_to_character()} returns -1
and @code{event-to-character} returns @code{nil}. These functions
convert between the character representation and the split-up event
representation (keysym plus modifier keys).

@node Dispatching Events; The Command Builder
@section Dispatching Events; The Command Builder
@cindex dispatching events; the command builder
@cindex events; the command builder, dispatching
@cindex command builder, dispatching events; the

Not yet documented.

@node Evaluation; Stack Frames; Bindings, Symbols and Variables, Events and the Event Loop, Top
@chapter Evaluation; Stack Frames; Bindings
@cindex evaluation; stack frames; bindings
@cindex stack frames; bindings, evaluation;
@cindex bindings, evaluation; stack frames;

@menu
* Evaluation::
* Dynamic Binding; The specbinding Stack; Unwind-Protects::
* Simple Special Forms::
* Catch and Throw::
@end menu

@node Evaluation
@section Evaluation
@cindex evaluation

@code{Feval()} evaluates the form (a Lisp object) that is passed to
it. Note that evaluation is only non-trivial for two types of objects:
symbols and conses. A symbol is evaluated simply by calling
@code{symbol-value} on it and returning the value.

Evaluating a cons means calling a function. First, @code{eval} checks
to see if garbage collection is necessary, and calls
@code{garbage_collect_1()} if so. It then increases the evaluation
depth by 1 (@code{lisp_eval_depth}, which is always less than
@code{max_lisp_eval_depth}) and adds an element to the linked list of
@code{struct backtrace} structures (@code{backtrace_list}). Each such
structure contains a pointer to the function being called plus a list
of the function's arguments. Initially these values are stored
unevalled, and as they are evaluated, the backtrace structure is
updated. Garbage collection pays attention to the objects pointed to in
the backtrace structures (garbage collection might happen while a
function is being called or while an argument is being evaluated, and
there could easily be no other references to the arguments in the
argument list; once an argument is evaluated, however, the unevalled
version is not needed by @code{eval}, and so the backtrace structure is
changed).

At this point, the function to be called is determined by looking at
the car of the cons (if this is a symbol, its function definition is
retrieved and the process repeated). The function should then consist
of either a @code{Lisp_Subr} (built-in function written in C), a
@code{Lisp_Compiled_Function} object, or a cons whose car is one of the
symbols @code{autoload}, @code{macro} or @code{lambda}.

If the function is a @code{Lisp_Subr}, the Lisp object points to a
@code{struct Lisp_Subr} (created by @code{DEFUN()}), which contains a
pointer to the C function, a minimum and maximum number of arguments
(or possibly the special constants @code{MANY} or @code{UNEVALLED}), a
pointer to the symbol referring to that subr, and a couple of other
things. If the subr wants its arguments @code{UNEVALLED}, they are
passed raw as a list. Otherwise, an array of evaluated arguments is
created and put into the backtrace structure, and either passed whole
(@code{MANY}) or each argument is passed as a C argument.
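
The argument-passing conventions for a subr can be sketched in plain C
as follows. This is an illustration only, not the XEmacs source: the
names @code{toy_subr}, @code{toy_call_subr}, @code{TOY_MANY} and
@code{TOY_NIL} are hypothetical, Lisp objects are modelled as
@code{long}, at most two fixed arguments are supported, and the
@code{UNEVALLED} case is left out.

```c
#include <assert.h>
#include <stddef.h>

typedef long toy_obj;             /* hypothetical stand-in for Lisp_Object */
#define TOY_NIL  0                /* stand-in for nil */
#define TOY_MANY (-2)             /* "pass the whole argument array" */

struct toy_subr
{
  int min_args;
  int max_args;                   /* 0, 1, 2 or TOY_MANY */
  union
  {
    toy_obj (*a0) (void);
    toy_obj (*a1) (toy_obj);
    toy_obj (*a2) (toy_obj, toy_obj);
    toy_obj (*many) (int nargs, toy_obj *args);
  } fn;
};

/* Call SUBR with NARGS already-evaluated ARGS, storing the value in
   *RESULT.  Returns 0 on success, -1 on a wrong number of arguments.  */
int
toy_call_subr (const struct toy_subr *subr, int nargs, toy_obj *args,
               toy_obj *result)
{
  toy_obj padded[2] = { TOY_NIL, TOY_NIL };
  int i;

  assert (subr != NULL && result != NULL);
  if (nargs < subr->min_args)
    return -1;
  if (subr->max_args == TOY_MANY)
    {
      *result = subr->fn.many (nargs, args);   /* array passed whole */
      return 0;
    }
  if (nargs > subr->max_args || subr->max_args > 2)
    return -1;

  for (i = 0; i < nargs; i++)     /* omitted optional arguments stay nil */
    padded[i] = args[i];

  switch (subr->max_args)         /* each argument becomes a C argument */
    {
    case 0:  *result = subr->fn.a0 (); break;
    case 1:  *result = subr->fn.a1 (padded[0]); break;
    default: *result = subr->fn.a2 (padded[0], padded[1]); break;
    }
  return 0;
}

/* Two sample "subrs" for trying the dispatcher out.  */
toy_obj toy_add2 (toy_obj a, toy_obj b) { return a + b; }

toy_obj
toy_sum (int nargs, toy_obj *args)
{
  toy_obj total = 0;
  int i;
  for (i = 0; i < nargs; i++)
    total += args[i];
  return total;
}
```

A subr declared with one required and one optional argument would be
described here as @code{@{ 1, 2, ... @}} with @code{fn.a2} set; calling
it with a single argument passes @code{TOY_NIL} for the missing one.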

If the function is a @code{Lisp_Compiled_Function},
@code{funcall_compiled_function()} is called. If the function is a
lambda expression, @code{funcall_lambda()} is called. If the function
is a macro, [..... fill in] is done. If the function is an autoload,
@code{do_autoload()} is called to load the definition and then eval
starts over [explain this more].

When @code{Feval()} exits, the evaluation depth is reduced by one, the
debugger is called if appropriate, and the current backtrace structure
is removed from the list.

Both @code{funcall_compiled_function()} and @code{funcall_lambda()} need
to go through the list of formal parameters to the function and bind
them to the actual arguments, checking for @code{&rest} and
@code{&optional} symbols in the formal parameters and making sure the
number of actual arguments is correct.
@code{funcall_compiled_function()} can do this a little more
efficiently, since the formal parameter list can be checked for sanity
when the compiled function object is created.
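
The argument-count part of that check can be sketched as follows. This
is a simplified illustration, not the XEmacs code: formal parameters
are modelled as C strings, the names @code{toy_arity},
@code{toy_parse_formals} and @code{toy_nargs_ok} are hypothetical, and
the rule that exactly one name must follow @code{&rest} is an assumption
made for this sketch. Parsing the list once and keeping the result is
the kind of work a compiled function object can do ahead of time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_arity
{
  int required;   /* formals before &optional */
  int optional;   /* formals after &optional */
  int rest;       /* nonzero if a &rest formal is present */
};

/* Scan the N names in FORMALS.  Returns 0 on success, or -1 if the list
   is malformed by this sketch's rules.  */
int
toy_parse_formals (const char *const *formals, int n, struct toy_arity *out)
{
  int i, seen_optional = 0;

  assert (formals != NULL && out != NULL);
  out->required = out->optional = out->rest = 0;
  for (i = 0; i < n; i++)
    {
      if (strcmp (formals[i], "&rest") == 0)
        {
          if (i != n - 2)         /* exactly one name must follow */
            return -1;
          out->rest = 1;
          return 0;
        }
      else if (strcmp (formals[i], "&optional") == 0)
        {
          if (seen_optional)      /* a second &optional is rejected */
            return -1;
          seen_optional = 1;
        }
      else if (seen_optional)
        out->optional++;
      else
        out->required++;
    }
  return 0;
}

/* Is NARGS an acceptable number of actual arguments?  */
int
toy_nargs_ok (const struct toy_arity *a, int nargs)
{
  if (nargs < a->required)
    return 0;
  return a->rest || nargs <= a->required + a->optional;
}
```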

@code{funcall_lambda()} simply calls @code{Fprogn()} to execute the
body of the lambda expression.

@code{funcall_compiled_function()} calls the real byte-code interpreter
@code{execute_optimized_program()} on the byte-code instructions, which
are converted into an internal form for faster execution.

When a compiled function is executed for the first time by
@code{funcall_compiled_function()}, or during the dump phase of building
XEmacs, the byte-code instructions are converted from a
@code{Lisp_String} (which is inefficient to access, especially in the
presence of MULE) into a @code{Lisp_Opaque} object containing an array
of unsigned char, which can be directly executed by the byte-code
interpreter. At this time the byte code is also analyzed for validity
and transformed into a more optimized form, so that
@code{execute_optimized_program()} can really fly.

Here are some of the optimizations performed by the internal byte-code
transformer:
@enumerate
@item
References to the @code{constants} array are checked for out-of-range
indices, so that the byte interpreter doesn't have to.
@item
References to the @code{constants} array that will be used as a Lisp
variable are checked to be valid non-constant symbols (i.e. not
@code{t}, @code{nil}, or a keyword), so that the byte interpreter
doesn't have to.
@item
The maximum number of variable bindings in the byte-code is
pre-computed, so that space on the @code{specpdl} stack can be
pre-reserved once for the whole function execution.
@item
All byte-code jumps are relative to the current program counter instead
of the start of the program, thereby saving a register.
@item
One-byte relative jumps are converted from the byte-code form of unsigned
chars offset by 127 to machine-friendly signed chars.
@end enumerate

Of course, this transformation of the @code{instructions} should not be
visible to the user, so @code{Fcompiled_function_instructions()} needs
to know how to convert the optimized opaque object back into a Lisp
string that is identical to the original string from the @file{.elc}
file. (Actually, the resulting string may, in rare cases, contain
slightly different, yet equivalent, byte code.)

@code{Ffuncall()} implements Lisp @code{funcall}. @code{(funcall fun
x1 x2 x3 ...)} is equivalent to @code{(eval (list fun (quote x1) (quote
x2) (quote x3) ...))}. @code{Ffuncall()} contains its own code to do
the evaluation, however, and is very similar to @code{Feval()}.

From the performance point of view, it is worth knowing that most of the
time in Lisp evaluation is spent executing @code{Lisp_Subr} and
@code{Lisp_Compiled_Function} objects via @code{Ffuncall()} (not
@code{Feval()}).

@code{Fapply()} implements Lisp @code{apply}, which is very similar to
@code{funcall} except that if the last argument is a list, the result is the
same as if each of the arguments in the list had been passed separately.
@code{Fapply()} expands the last argument if it is a list, then calls
@code{Ffuncall()} to do the work.

@code{apply1()}, @code{call0()}, @code{call1()}, @code{call2()}, and
@code{call3()} call a function, passing it the argument(s) given (the
arguments are given as separate C arguments rather than being passed as
an array). @code{apply1()} uses @code{Fapply()} while the others use
@code{Ffuncall()} to do the real work.

@node Dynamic Binding; The specbinding Stack; Unwind-Protects
@section Dynamic Binding; The specbinding Stack; Unwind-Protects
@cindex dynamic binding; the specbinding stack; unwind-protects
@cindex binding; the specbinding stack; unwind-protects, dynamic
@cindex specbinding stack; unwind-protects, dynamic binding; the
@cindex unwind-protects, dynamic binding; the specbinding stack;

@example
struct specbinding
@{
  Lisp_Object symbol;
  Lisp_Object old_value;
  Lisp_Object (*func) (Lisp_Object); /* for unwind-protect */
@};
@end example

@code{struct specbinding} is used for local-variable bindings and
unwind-protects. @code{specpdl} holds an array of @code{struct
specbinding} structures, @code{specpdl_ptr} points to the beginning of
the free bindings in the array, @code{specpdl_size} specifies the total
number of binding slots in the array, and @code{max_specpdl_size}
specifies the maximum number of bindings the array can be expanded to
hold. @code{grow_specpdl()} increases the size of the @code{specpdl}
array, multiplying its size by 2 but never exceeding
@code{max_specpdl_size} (except that if this number is less than 400,
it is first set to 400).
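
The growth policy just described amounts to the following arithmetic.
This is a minimal sketch under stated assumptions:
@code{next_specpdl_size} is a hypothetical helper, not a function in
XEmacs, and it covers only the size computation; reallocating the array
and reporting an overflow are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Return the size the binding array should grow to, given its
   CURRENT_SIZE and the configured limit *MAX_SIZE.  The limit itself
   is raised to 400 first if it is smaller than that.  */
size_t
next_specpdl_size (size_t current_size, size_t *max_size)
{
  size_t new_size;

  assert (max_size != NULL && current_size > 0);
  if (*max_size < 400)           /* the limit is first set to 400 */
    *max_size = 400;

  new_size = current_size * 2;   /* double the array ... */
  if (new_size > *max_size)      /* ... but never exceed the limit */
    new_size = *max_size;
  return new_size;
}
```

Doubling keeps the number of reallocations logarithmic in the final
stack depth, while the cap bounds the damage done by runaway recursion.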

@code{specbind()} binds a symbol to a value and is used for local
variables and @code{let} forms. The symbol and its old value (which
might be @code{Qunbound}, indicating no prior value) are recorded in the
@code{specpdl} array, and @code{specpdl_ptr} is advanced to the next
free slot.

@code{record_unwind_protect()} implements an @dfn{unwind-protect},
which, when placed around a section of code, ensures that some specified
cleanup routine will be executed even if the code exits abnormally
(e.g. through a @code{throw} or quit). @code{record_unwind_protect()}
simply adds a new specbinding to the @code{specpdl} array and stores the
appropriate information in it. The cleanup routine can either be a C
function, which is stored in the @code{func} field, or a @code{progn}
form, which is stored in the @code{old_value} field.

@code{unbind_to()} removes specbindings from the @code{specpdl} array
until the specified position is reached. Each specbinding can be one of
three types:

@enumerate
@item
an unwind-protect with a C cleanup function (@code{func} is not 0, and
@code{old_value} holds an argument to be passed to the function);
@item
an unwind-protect with a Lisp form (@code{func} is 0, @code{symbol}
is @code{nil}, and @code{old_value} holds the form to be executed with
@code{Fprogn()}); or
@item
a local-variable binding (@code{func} is 0, @code{symbol} is not
@code{nil}, and @code{old_value} holds the old value, which is stored as
the symbol's value).
@end enumerate
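
How binding, unwind-protect and unwinding fit together can be sketched
with a toy stack. Everything prefixed @code{toy_} is hypothetical:
values are modelled as @code{long}, a symbol is just a struct with a
value slot, the stack has a fixed size, and the second case above (a
Lisp form run with @code{Fprogn()}) is not modelled.

```c
#include <assert.h>
#include <stddef.h>

typedef long toy_value;                  /* stand-in for Lisp_Object */
struct toy_symbol { toy_value value; };  /* a symbol's value slot only */

struct toy_specbinding
{
  struct toy_symbol *symbol;      /* NULL plays the role of nil */
  toy_value old_value;
  void (*func) (toy_value);       /* non-NULL: C cleanup function */
};

#define TOY_SPECPDL_SIZE 64
struct toy_specbinding toy_specpdl[TOY_SPECPDL_SIZE];
int toy_specpdl_depth;            /* index of the first free slot */

/* Bind SYM to VALUE, saving the old value for toy_unbind_to().  */
void
toy_specbind (struct toy_symbol *sym, toy_value value)
{
  struct toy_specbinding *b;

  assert (toy_specpdl_depth < TOY_SPECPDL_SIZE);
  b = &toy_specpdl[toy_specpdl_depth++];
  b->symbol = sym;
  b->old_value = sym->value;
  b->func = NULL;
  sym->value = value;
}

/* Arrange for FUNC (ARG) to run when this entry is unwound.  */
void
toy_record_unwind_protect (void (*func) (toy_value), toy_value arg)
{
  struct toy_specbinding *b;

  assert (func != NULL && toy_specpdl_depth < TOY_SPECPDL_SIZE);
  b = &toy_specpdl[toy_specpdl_depth++];
  b->symbol = NULL;
  b->old_value = arg;
  b->func = func;
}

/* Pop entries, newest first, until the stack is COUNT deep again.  */
void
toy_unbind_to (int count)
{
  while (toy_specpdl_depth > count)
    {
      struct toy_specbinding *b = &toy_specpdl[--toy_specpdl_depth];
      if (b->func)
        b->func (b->old_value);            /* C cleanup function */
      else if (b->symbol)
        b->symbol->value = b->old_value;   /* restore the variable */
    }
}

/* Sample cleanup function: remembers the argument it was called with.  */
toy_value toy_last_cleanup_arg;
void toy_note_cleanup (toy_value arg) { toy_last_cleanup_arg = arg; }
```

A caller records @code{toy_specpdl_depth} before making bindings and
passes that count to @code{toy_unbind_to()} afterwards, so that nested
bindings of the same symbol are undone in reverse order.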

@node Simple Special Forms
@section Simple Special Forms
@cindex special forms, simple

@code{or}, @code{and}, @code{if}, @code{cond}, @code{progn},
@code{prog1}, @code{prog2}, @code{setq}, @code{quote}, @code{function},
@code{let*}, @code{let}, @code{while}

All of these are very simple and work as expected, calling
@code{Feval()} or @code{Fprogn()} as necessary and (in the case of
@code{let} and @code{let*}) using @code{specbind()} to create bindings
and @code{unbind_to()} to undo the bindings when finished.

Note that, with the exception of @code{Fprogn}, these functions are
typically called in real life only in interpreted code, since the byte
compiler knows how to convert calls to these functions directly into
byte code.

@node Catch and Throw
@section Catch and Throw
@cindex catch and throw
@cindex throw, catch and

@example
struct catchtag
@{
  Lisp_Object tag;
  Lisp_Object val;
  struct catchtag *next;
  struct gcpro *gcpro;
  jmp_buf jmp;
  struct backtrace *backlist;
  int lisp_eval_depth;
  int pdlcount;
@};
@end example

@code{catch} is a Lisp special form that places a catch around a body
of code. A catch is a means of non-local exit from the code. When a
catch is created, a tag is specified; executing a @code{throw} to this
tag exits from the body of code enclosed by the catch, and the value of
the @code{catch} form is the value given in the call to @code{throw}.
If no such @code{throw} occurs, the body is executed normally.

Information pertaining to a catch is held in a @code{struct catchtag},
which is placed at the head of a linked list pointed to by
@code{catchlist}. @code{internal_catch()} is passed a C function to
call (@code{Fprogn()} when Lisp @code{catch} is called) and arguments to
give it, and places a catch around the function. Each @code{struct
catchtag} is held in the stack frame of the @code{internal_catch()}
instance that created the catch.

@code{internal_catch()} is fairly straightforward. It stores into the
@code{struct catchtag} the tag name and the current values of
@code{backtrace_list}, @code{lisp_eval_depth}, @code{gcprolist}, and the
offset into the @code{specpdl} array, sets a jump point with @code{_setjmp()}
(storing the jump point into the @code{struct catchtag}), and calls the
function. Control will return to @code{internal_catch()} either when
the function exits normally or through a @code{_longjmp()} to this jump
point. In the latter case, @code{throw} will store the value to be
returned into the @code{struct catchtag} before jumping. When it's
done, @code{internal_catch()} removes the @code{struct catchtag} from
the catchlist and returns the proper value.

@code{Fthrow()} goes up through the catchlist until it finds one with
a matching tag. It then calls @code{unbind_catch()} to restore
everything to what it was when the appropriate catch was set, stores the
return value in the @code{struct catchtag}, and jumps (with
@code{_longjmp()}) to its jump point.

@code{unbind_catch()} removes all catches from the catchlist until it
finds the correct one. Some of the catches might have been placed for
error-trapping, and if so, the appropriate entries on the handlerlist
must be removed (see ``errors''). @code{unbind_catch()} also restores
the values of @code{gcprolist}, @code{backtrace_list}, and
@code{lisp_eval_depth}, and calls @code{unbind_to()} to undo any
specbindings created since the catch.
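
The control flow of catch and throw can be sketched with the standard C
@code{setjmp()} and @code{longjmp()}. This is a toy model, not the
XEmacs implementation: all @code{toy_} names are hypothetical, tags and
values are plain @code{int}s, and the restoration of
@code{gcprolist}, @code{backtrace_list}, the evaluation depth and the
@code{specpdl} stack is omitted.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

struct toy_catchtag
{
  int tag;
  volatile int val;               /* written by the thrower */
  struct toy_catchtag *next;
  jmp_buf jmp;
};

struct toy_catchtag *toy_catchlist;   /* innermost catch first */

/* Call FUNC (ARG) inside a catch for TAG.  Returns FUNC's value, or
   the thrown value if toy_throw (TAG, ...) ran during FUNC.  */
int
toy_internal_catch (int tag, int (*func) (int), int arg)
{
  struct toy_catchtag c;          /* lives in this stack frame */
  int result;

  assert (func != NULL);
  c.tag = tag;
  c.val = 0;
  c.next = toy_catchlist;
  toy_catchlist = &c;

  if (setjmp (c.jmp) == 0)
    result = func (arg);          /* normal exit */
  else
    result = c.val;               /* arrived here through longjmp() */

  toy_catchlist = c.next;         /* remove this catch */
  return result;
}

/* Unwind to the innermost catch for TAG and make it return VALUE.
   Returns normally only if there is no such catch.  */
void
toy_throw (int tag, int value)
{
  struct toy_catchtag *c;

  for (c = toy_catchlist; c; c = c->next)
    if (c->tag == tag)
      {
        toy_catchlist = c;        /* discard the inner catches */
        c->val = value;
        longjmp (c->jmp, 1);
      }
}

/* Sample bodies for trying the sketch out.  */
int toy_body_normal (int x) { return x + 1; }
int toy_body_throwing (int x) { toy_throw (7, x * 2); return -1; }
int toy_body_nested (int x)
{
  return toy_internal_catch (8, toy_body_throwing, x);
}
```

In the nested case the throw to tag 7 passes straight through the inner
catch for tag 8, whose stack frame is simply abandoned by the jump.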


@node Symbols and Variables, Buffers and Textual Representation, Evaluation; Stack Frames; Bindings, Top
@chapter Symbols and Variables
@cindex symbols and variables
@cindex variables, symbols and

@menu
* Introduction to Symbols::
* Obarrays::
* Symbol Values::
@end menu

@node Introduction to Symbols
@section Introduction to Symbols
@cindex symbols, introduction to

A symbol is basically just an object with four fields: a name (a
string), a value (some Lisp object), a function (some Lisp object), and
a property list (usually a list of alternating keyword/value pairs).
What makes symbols special is that there is usually only one symbol with
a given name, and the symbol is referred to by name. This makes a
symbol a convenient way of calling up data by name, i.e. of implementing
variables. (The variable's value is stored in the @dfn{value slot}.)
Similarly, functions are referenced by name, and the definition of the
function is stored in a symbol's @dfn{function slot}. This means that
there can be a distinct function and variable with the same name. The
property list is used as a more general mechanism of associating
additional values with particular names, and once again the namespace is
independent of the function and variable namespaces.

@node Obarrays
@section Obarrays
@cindex obarrays

The identity of symbols with their names is accomplished through a
structure called an obarray, which is just a poorly-implemented hash
table mapping from strings to the symbol whose name is that string. (I
say ``poorly implemented'' because an obarray appears in Lisp as a
vector with some hidden fields rather than as its own opaque type. This
is an Emacs Lisp artifact that should be fixed.)

Obarrays are implemented as a vector of some fixed size (which should
be a prime for best results), where each ``bucket'' of the vector
contains one or more symbols, threaded through a hidden @code{next}
field in the symbol. Lookup of a symbol in an obarray, and adding a
symbol to an obarray, is accomplished through standard hash-table
techniques.

The standard Lisp function for working with symbols and obarrays is
@code{intern}. This looks up a symbol in an obarray given its name; if
it's not found, a new symbol is automatically created with the specified
name, added to the obarray, and returned. This is what happens when the
Lisp reader encounters a symbol (or more precisely, encounters the name
of a symbol) in some text that it is reading. There is a standard
obarray called @code{obarray} that is used for this purpose, although
the Lisp programmer is free to create their own obarrays and
@code{intern} symbols in them.
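
The bucket structure and the lookup-or-create behaviour of
@code{intern} can be sketched as a small chained hash table. This is an
illustration only: the @code{toy_} names, the bucket count of 31 and the
hash function are all assumptions of the sketch and are not taken from
the XEmacs source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_OBARRAY_SIZE 31       /* a prime, as recommended above */

struct toy_ob_symbol
{
  char *name;
  struct toy_ob_symbol *next;     /* hidden chain within one bucket */
};

struct toy_obarray
{
  struct toy_ob_symbol *buckets[TOY_OBARRAY_SIZE];
};

/* Simple string hash (an arbitrary choice for this sketch).  */
unsigned
toy_hash (const char *s)
{
  unsigned h = 0;
  while (*s)
    h = h * 31 + (unsigned char) *s++;
  return h % TOY_OBARRAY_SIZE;
}

/* Return the symbol named NAME in OB, or NULL if there is none.  */
struct toy_ob_symbol *
toy_lookup (struct toy_obarray *ob, const char *name)
{
  struct toy_ob_symbol *sym;

  for (sym = ob->buckets[toy_hash (name)]; sym; sym = sym->next)
    if (strcmp (sym->name, name) == 0)
      return sym;
  return NULL;
}

/* Like intern: look NAME up, creating and adding the symbol if it is
   not there yet.  */
struct toy_ob_symbol *
toy_intern (struct toy_obarray *ob, const char *name)
{
  struct toy_ob_symbol *sym = toy_lookup (ob, name);
  unsigned h;

  if (sym)
    return sym;

  sym = malloc (sizeof *sym);
  assert (sym != NULL);
  sym->name = malloc (strlen (name) + 1);
  assert (sym->name != NULL);
  strcpy (sym->name, name);

  h = toy_hash (name);
  sym->next = ob->buckets[h];     /* push onto the bucket's chain */
  ob->buckets[h] = sym;
  return sym;
}
```

Interning the same name twice in one obarray yields the same pointer,
while interning it in a second obarray yields a distinct symbol with the
same name.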

Note that, once a symbol is in an obarray, it stays there until
something is done about it, and the standard obarray @code{obarray}
always stays around, so once you use any particular variable name, a
corresponding symbol will stay around in @code{obarray} until you exit
XEmacs.

Note that @code{obarray} itself is a variable, and as such there is a
symbol in @code{obarray} whose name is @code{"obarray"} and which
contains @code{obarray} as its value.

Note also that this call to @code{intern} occurs only when in the Lisp
reader, not when the code is executed (at which point the symbol is
already around, stored as such in the definition of the function).

You can create your own obarray using @code{make-vector} (this is
horrible but is an artifact) and intern symbols into that obarray.
Doing that will result in two or more symbols with the same name.
However, at most one of these symbols is in the standard @code{obarray}:
You cannot have two symbols of the same name in any particular obarray.
Note that you cannot add a symbol to an obarray in any fashion other
than using @code{intern}: i.e. you can't take an existing symbol and put
it in an existing obarray. Nor can you change the name of an existing
symbol. (Since obarrays are vectors, you can violate the consistency of
things by storing directly into the vector, but let's ignore that
possibility.)

Usually symbols are created by @code{intern}, but if you really want,
you can explicitly create a symbol using @code{make-symbol}, giving it
some name. The resulting symbol is not in any obarray (i.e. it is
@dfn{uninterned}), and you can't add it to any obarray. Therefore its
primary purpose is as a symbol to use in macros to avoid namespace
pollution. It can also be used as a carrier of information, but cons
cells could probably be used just as well.

You can also use @code{intern-soft} to look up a symbol but not create
a new one, and @code{unintern} to remove a symbol from an obarray. This
returns the removed symbol. (Remember: You can't put the symbol back
into any obarray.) Finally, @code{mapatoms} maps over all of the symbols
in an obarray.

@node Symbol Values
@section Symbol Values
@cindex symbol values
@cindex values, symbol

The value field of a symbol normally contains a Lisp object. However,
a symbol can be @dfn{unbound}, meaning that it logically has no value.
This is internally indicated by storing a special Lisp object, called
@dfn{the unbound marker} and stored in the global variable
@code{Qunbound}. The unbound marker is of a special Lisp object type
called @dfn{symbol-value-magic}. It is impossible for the Lisp
programmer to directly create or access any object of this type.

@strong{You must not let any ``symbol-value-magic'' object escape to
the Lisp level.} Printing any of these objects will cause the message
@samp{INTERNAL EMACS BUG} to appear as part of the print representation.
(You may see this normally when you call @code{debug_print()} from the
debugger on a Lisp object.) If you let one of these objects escape to
the Lisp level, you will violate a number of assumptions contained in
the C code and cause the unbound marker to malfunction.

When a symbol is created, its value field (and function field) are set
to @code{Qunbound}. The Lisp programmer can restore these conditions
later using @code{makunbound} or @code{fmakunbound}, and can query to
see whether the value or function fields are @dfn{bound} (i.e. have a
value other than @code{Qunbound}) using @code{boundp} and
@code{fboundp}. The fields are set to a normal Lisp object using
@code{set} (or @code{setq}) and @code{fset}.
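
The role of the unbound marker as a private sentinel can be sketched as
follows. The @code{toy_} names are hypothetical and a Lisp object is
modelled as a plain pointer; in particular, the sketch reports an
unbound symbol through a return code, whereas Lisp signals an error when
the value of an unbound variable is requested.

```c
#include <assert.h>
#include <stddef.h>

typedef const void *toy_obj;      /* stand-in for Lisp_Object */

/* One private object plays the role of Qunbound.  */
static const char toy_unbound_marker = 0;
#define TOY_QUNBOUND ((toy_obj) &toy_unbound_marker)

struct toy_val_symbol
{
  toy_obj value;
  toy_obj function;
};

/* A freshly created symbol has both fields unbound.  */
void
toy_init_symbol (struct toy_val_symbol *s)
{
  s->value = TOY_QUNBOUND;
  s->function = TOY_QUNBOUND;
}

int toy_boundp (const struct toy_val_symbol *s)
{ return s->value != TOY_QUNBOUND; }

int toy_fboundp (const struct toy_val_symbol *s)
{ return s->function != TOY_QUNBOUND; }

void toy_makunbound (struct toy_val_symbol *s)
{ s->value = TOY_QUNBOUND; }

/* Callers may only store ordinary objects, never the marker.  */
void
toy_set (struct toy_val_symbol *s, toy_obj v)
{
  assert (v != TOY_QUNBOUND);
  s->value = v;
}

/* Fetch the value without ever handing the marker to the caller.
   Returns 1 and stores into *OUT if S is bound, else returns 0.  */
int
toy_symbol_value (const struct toy_val_symbol *s, toy_obj *out)
{
  if (s->value == TOY_QUNBOUND)
    return 0;
  *out = s->value;
  return 1;
}
```

Because every accessor checks for the marker, code outside this small
interface never sees it, which is the discipline the warning above asks
for.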

Other symbol-value-magic objects are used as special markers to
indicate variables that have non-normal properties. This includes any
variables that are tied into C variables (setting the variable magically
sets some global variable in the C code, and likewise for retrieving the
variable's value), variables that magically tie into slots in the
current buffer, variables that are buffer-local, etc. The
symbol-value-magic object is stored in the value cell in place of
a normal object, and the code to retrieve a symbol's value
(i.e. @code{symbol-value}) knows how to do special things with them.
This means that you should not just fetch the value cell directly if you
want a symbol's value.

The exact workings of this are rather complex and involved and are
well-documented in comments in @file{buffer.c}, @file{symbols.c}, and
@file{lisp.h}.

@node Buffers and Textual Representation, MULE Character Sets and Encodings, Symbols and Variables, Top
@chapter Buffers and Textual Representation
@cindex buffers and textual representation
@cindex textual representation, buffers and

@menu
* Introduction to Buffers::     A buffer holds a block of text such as a file.
* The Text in a Buffer::        Representation of the text in a buffer.
* Buffer Lists::                Keeping track of all buffers.
* Markers and Extents::         Tagging locations within a buffer.
* Bufbytes and Emchars::        Representation of individual characters.
* The Buffer Object::           The Lisp object corresponding to a buffer.
@end menu

@node Introduction to Buffers
@section Introduction to Buffers
@cindex buffers, introduction to

A buffer is logically just a Lisp object that holds some text.
In this, it is like a string, but a buffer is optimized for
frequent insertion and deletion, while a string is not. Furthermore:

@enumerate
@item
Buffers are @dfn{permanent} objects, i.e. once you create them, they
remain around, and need to be explicitly deleted before they go away.
@item
Each buffer has a unique name, which is a string. Buffers are
normally referred to by name. In this respect, they are like
symbols.
@item
Buffers have a default insertion position, called @dfn{point}.
Inserting text (unless you explicitly give a position) goes at point,
and moves point forward past the text. This is what is going on when
you type text into Emacs.
@item
Buffers have lots of extra properties associated with them.
@item
Buffers can be @dfn{displayed}. What this means is that there
exist a number of @dfn{windows}, which are objects that correspond
to some visible section of your display, and each window has
an associated buffer, and the current contents of the buffer
are shown in that section of the display. The redisplay mechanism
(which takes care of doing this) knows how to look at the
text of a buffer and come up with some reasonable way of displaying
this. Many of the properties of a buffer control how the
buffer's text is displayed.
@item
One buffer is distinguished and called the @dfn{current buffer}. It is
stored in the variable @code{current_buffer}. Buffer operations operate
on this buffer by default. When you are typing text into a buffer, the
buffer you are typing into is always @code{current_buffer}. Switching
to a different window changes the current buffer. Note that Lisp code
can temporarily change the current buffer using @code{set-buffer} (often
enclosed in a @code{save-excursion} so that the former current buffer
gets restored when the code is finished). However, calling
@code{set-buffer} will NOT cause a permanent change in the current
buffer. The reason for this is that the top-level event loop sets
@code{current_buffer} to the buffer of the selected window, each time
it finishes executing a user command.
@end enumerate

Make sure you understand the distinction between @dfn{current buffer}
and @dfn{buffer of the selected window}, and the distinction between
@dfn{point} of the current buffer and @dfn{window-point} of the selected
window. (This latter distinction is explained in detail in the section
on windows.)

@node The Text in a Buffer
@section The Text in a Buffer
@cindex text in a buffer, the
@cindex buffer, the text in a

The text in a buffer consists of a sequence of zero or more
characters. A @dfn{character} is an integer that logically represents
a letter, number, space, or other unit of text. Most of the characters
that you will typically encounter belong to the ASCII set of characters,
but there are also characters for various sorts of accented letters,
special symbols, Chinese and Japanese characters (e.g. Kanji, Katakana,
etc.), Cyrillic and Greek letters, etc. The actual number of possible
characters is quite large.

For now, we can view a character as some non-negative integer that
has some shape that defines how it typically appears (e.g. as an
uppercase A). (The exact way in which a character appears depends on the
font used to display the character.) The internal type of characters in
the C code is an @code{Emchar}; this is just an @code{int}, but using a
symbolic type makes the code clearer.

Between every two adjacent characters in a buffer, and also before the
first character and after the last, is a @dfn{buffer position} or
@dfn{character position}. We can speak of the character before or after
a particular buffer position, and when you insert a character at a
particular position, all characters after that position end up at new
positions. When we speak of the character @dfn{at} a position, we
really mean the character after the position. (This ambiguity between a
buffer position being ``between'' characters and ``on'' a character is
rampant in Emacs.)

Buffer positions are numbered starting at 1. This means that
position 1 is before the first character, and position 0 is not
valid. If there are N characters in a buffer, then buffer
position N+1 is after the last one, and position N+2 is not valid.
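
The numbering rules can be made concrete with a few helper functions
over a plain C array. These helpers are hypothetical and exist only to
illustrate the convention; they assume one byte per character and
ignore the gap, so they say nothing about how XEmacs actually stores
buffer text.

```c
#include <assert.h>
#include <stddef.h>

/* Is POS a valid buffer position for text of NCHARS characters?
   Valid positions run from 1 (before the first character) to
   NCHARS + 1 (after the last one).  */
int
toy_valid_position (long pos, long nchars)
{
  assert (nchars >= 0);
  return pos >= 1 && pos <= nchars + 1;
}

/* The character "at" POS, i.e. the one after that position, or -1 if
   there is none.  TEXT is an ordinary 0-based array of NCHARS bytes.  */
int
toy_char_at (const unsigned char *text, long nchars, long pos)
{
  if (pos < 1 || pos > nchars)
    return -1;
  return text[pos - 1];
}

/* The character before POS, or -1 if there is none.  */
int
toy_char_before (const unsigned char *text, long nchars, long pos)
{
  if (pos < 2 || pos > nchars + 1)
    return -1;
  return text[pos - 2];
}
```

For the three-character text @code{abc}, position 1 has @code{a} after
it and nothing before it, and position 4 has @code{c} before it and
nothing after it.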
|
|
7905
|
|
7906 The internal makeup of the Emchar integer varies depending on whether
|
|
7907 we have compiled with MULE support. If not, the Emchar integer is an
|
|
7908 8-bit integer with possible values from 0 - 255. 0 - 127 are the
|
|
7909 standard ASCII characters, while 128 - 255 are the characters from the
|
|
7910 ISO-8859-1 character set. If we have compiled with MULE support, an
|
|
7911 Emchar is a 19-bit integer, with the various bits having meanings
|
|
7912 according to a complex scheme that will be detailed later. The
|
|
7913 characters numbered 0 - 255 still have the same meanings as for the
|
|
7914 non-MULE case, though.
|
|
7915
|
|
7916 Internally, the text in a buffer is represented in a fairly simple
|
|
7917 fashion: as a contiguous array of bytes, with a @dfn{gap} of some size
|
|
7918 in the middle. Although the gap is of some substantial size in bytes,
|
|
7919 there is no text contained within it: From the perspective of the text
|
|
7920 in the buffer, it does not exist. The gap logically sits at some buffer
|
|
7921 position, between two characters (or possibly at the beginning or end of
|
|
7922 the buffer). Insertion of text in a buffer at a particular position is
|
|
7923 always accomplished by first moving the gap to that position
|
|
7924 (i.e. through some block moving of text), then writing the text into the
|
|
7925 beginning of the gap, thereby shrinking the gap. If the gap shrinks
|
|
7926 down to nothing, a new gap is created. (What actually happens is that a
|
|
7927 new gap is ``created'' at the end of the buffer's text, which requires
|
|
7928 nothing more than changing a couple of indices; then the gap is
|
|
7929 ``moved'' to the position where the insertion needs to take place by
|
|
7930 moving up in memory all the text after that position.) Similarly,
|
|
7931 deletion occurs by moving the gap to the place where the text is to be
|
|
7932 deleted, and then simply expanding the gap to include the deleted text.
|
|
7933 (@dfn{Expanding} and @dfn{shrinking} the gap as just described means
|
|
7934 just that the internal indices that keep track of where the gap is
|
|
7935 located are changed.)
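
The gap mechanism described above can be sketched in a few lines of C. This is only an illustration under simplifying assumptions: the structure and every @code-less name beginning with sketch_ are hypothetical and do not appear in the XEmacs sources, the capacity is fixed, offsets are 0-based, and one character is assumed to be one byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity gap buffer; not the real XEmacs structure. */
enum { SKETCH_CAPACITY = 32 };

struct sketch_gap_buffer {
  unsigned char mem[SKETCH_CAPACITY]; /* text with the gap somewhere inside */
  int gap_start;                      /* 0-based offset of first gap byte */
  int gap_size;                       /* number of unused bytes in the gap */
};

void sketch_init(struct sketch_gap_buffer *b)
{
  b->gap_start = 0;
  b->gap_size = SKETCH_CAPACITY;      /* empty buffer: everything is gap */
}

int sketch_length(const struct sketch_gap_buffer *b)
{
  return SKETCH_CAPACITY - b->gap_size;
}

/* Move the gap so that it begins at 0-based text offset POS. */
void sketch_move_gap(struct sketch_gap_buffer *b, int pos)
{
  assert(pos >= 0 && pos <= sketch_length(b));
  if (pos < b->gap_start) {
    /* Text between POS and the gap moves up to the far side of the gap. */
    memmove(b->mem + pos + b->gap_size, b->mem + pos,
            (size_t) (b->gap_start - pos));
  } else if (pos > b->gap_start) {
    /* Text just after the gap moves down into the gap's beginning. */
    memmove(b->mem + b->gap_start, b->mem + b->gap_start + b->gap_size,
            (size_t) (pos - b->gap_start));
  }
  b->gap_start = pos;
}

/* Insert: move the gap to POS, write into its beginning, shrink it. */
int sketch_insert(struct sketch_gap_buffer *b, int pos,
                  const char *text, int len)
{
  if (len > b->gap_size)
    return -1;               /* a real buffer would create a new gap here */
  sketch_move_gap(b, pos);
  memcpy(b->mem + b->gap_start, text, (size_t) len);
  b->gap_start += len;
  b->gap_size -= len;
  return 0;
}

/* Delete: move the gap to POS and expand it over the deleted text. */
void sketch_delete(struct sketch_gap_buffer *b, int pos, int len)
{
  assert(len >= 0 && pos + len <= sketch_length(b));
  sketch_move_gap(b, pos);
  b->gap_size += len;        /* only the indices change; no bytes move */
}

/* Copy the text (without the gap) into OUT as a NUL-terminated string. */
void sketch_contents(const struct sketch_gap_buffer *b, char *out)
{
  int after = SKETCH_CAPACITY - b->gap_start - b->gap_size;
  memcpy(out, b->mem, (size_t) b->gap_start);
  memcpy(out + b->gap_start, b->mem + b->gap_start + b->gap_size,
         (size_t) after);
  out[b->gap_start + after] = '\0';
}
```

Note that in this sketch deletion never copies the deleted bytes anywhere: once the gap sits in front of them, growing the gap size is enough to make them disappear from the text.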
|
|
7936
|
|
7937 Note that the total amount of memory allocated for a buffer's text never
|
|
7938 decreases while the buffer is live. Therefore, if you load up a
|
|
7939 20-megabyte file and then delete all but one character, there will be a
|
|
7940 20-megabyte gap, which won't get any smaller (except by inserting
|
|
7941 characters back again). Once the buffer is killed, the memory allocated
|
|
7942 for the buffer text will be freed, but it will still be sitting on the
|
|
7943 heap, taking up virtual memory, and will not be released back to the
|
|
7944 operating system. (However, if you have compiled XEmacs with rel-alloc,
|
|
7945 the situation is different. In this case, the space @emph{will} be
|
|
7946 released back to the operating system. However, this tends to result in a
|
|
7947 noticeable speed penalty.)
|
|
7948
|
|
7949 Astute readers may notice that the text in a buffer is represented as
|
|
7950 an array of @emph{bytes}, while (at least in the MULE case) an Emchar is
|
|
7951 a 19-bit integer, which clearly cannot fit in a byte. This means (of
|
|
7952 course) that the text in a buffer uses a different representation from
|
|
7953 an Emchar: specifically, the 19-bit Emchar becomes a series of one to
|
|
7954 four bytes. The conversion between these two representations is complex
|
|
7955 and will be described later.
|
|
7956
|
|
7957 In the non-MULE case, everything is very simple: An Emchar
|
|
7958 is an 8-bit value, which fits neatly into one byte.
|
|
7959
|
|
7960 If we are given a buffer position and want to retrieve the
|
|
7961 character at that position, we need to follow these steps:
|
|
7962
|
|
7963 @enumerate
|
|
7964 @item
|
|
7965 Pretend there's no gap, and convert the buffer position into a @dfn{byte
|
|
7966 index} that indexes to the appropriate byte in the buffer's stream of
|
|
7967 textual bytes. By convention, byte indices begin at 1, just like buffer
|
|
7968 positions. In the non-MULE case, byte indices and buffer positions are
|
|
7969 identical, since one character equals one byte.
|
|
7970 @item
|
|
7971 Convert the byte index into a @dfn{memory index}, which takes the gap
|
|
7972 into account. The memory index is a direct index into the block of
|
|
7973 memory that stores the text of a buffer. This basically just involves
|
|
7974 checking to see if the byte index is past the gap, and if so, adding the
|
|
7975 size of the gap to it. By convention, memory indices begin at 1, just
|
|
7976 like buffer positions and byte indices, and when referring to the
|
|
7977 position that is @dfn{at} the gap, we always use the memory position at
|
|
7978 the @emph{beginning}, not at the end, of the gap.
|
|
7979 @item
|
|
7980 Fetch the appropriate bytes at the determined memory position.
|
|
7981 @item
|
|
7982 Convert these bytes into an Emchar.
|
|
7983 @end enumerate
|
|
7984
|
|
7985 In the non-MULE case, steps (3) and (4) boil down to a simple one-byte
|
|
7986 memory access.
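
The four steps can be sketched as follows for the non-MULE case. The typedef names come from the text; the structure and the sketch_ functions are hypothetical and only illustrate the conventions stated above (all indices start at 1, and a position exactly at the gap maps to the beginning of the gap). The real macros may differ in detail.

```c
#include <assert.h>

typedef int Bufpos;             /* character position, starts at 1 */
typedef int Bytind;             /* byte index, starts at 1 */
typedef int Memind;             /* memory index, starts at 1 */
typedef unsigned char Bufbyte;
typedef int Emchar;

/* Hypothetical view of a buffer's text block (non-MULE build assumed). */
struct sketch_text {
  const Bufbyte *mem;   /* memory block; mem[0] holds memory index 1 */
  Memind gap_start;     /* memory index of the beginning of the gap */
  int gap_size;         /* size of the gap in bytes */
  int nbytes;           /* bytes of text, not counting the gap */
};

/* Step 1: without MULE, one character is one byte. */
Bytind sketch_bufpos_to_bytind(Bufpos pos)
{
  return pos;
}

/* Step 2: a position past the gap is shifted by the gap size.  A
   position exactly at the gap maps to the gap's beginning. */
Memind sketch_bytind_to_memind(const struct sketch_text *t, Bytind bi)
{
  return bi > t->gap_start ? bi + t->gap_size : bi;
}

/* Steps 3 and 4: fetch the character after position POS.  When POS is
   at the gap, that character lives just past the end of the gap. */
Emchar sketch_char_at(const struct sketch_text *t, Bufpos pos)
{
  Bytind bi;
  Memind mi;
  assert(pos >= 1 && pos <= t->nbytes);
  bi = sketch_bufpos_to_bytind(pos);
  mi = bi >= t->gap_start ? bi + t->gap_size : bi;
  return t->mem[mi - 1];
}
```

For example, with the text abcd stored as two bytes, a three-byte gap, and two more bytes, position 3 has memory index 3 as a position, but the character at position 3 is fetched from memory index 6.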
|
|
7987
|
|
7988 Note that we have defined three types of positions in a buffer:
|
|
7989
|
|
7990 @enumerate
|
|
7991 @item
|
|
7992 @dfn{buffer positions} or @dfn{character positions}, typedef @code{Bufpos}
|
|
7993 @item
|
|
7994 @dfn{byte indices}, typedef @code{Bytind}
|
|
7995 @item
|
|
7996 @dfn{memory indices}, typedef @code{Memind}
|
|
7997 @end enumerate
|
|
7998
|
|
7999 All three typedefs are just @code{int}s, but defining them this way makes
|
|
8000 things a lot clearer.
|
|
8001
|
|
8002 Most code works with buffer positions. In particular, all Lisp code
|
|
8003 that refers to text in a buffer uses buffer positions. Lisp code does
|
|
8004 not know that byte indices or memory indices exist.
|
|
8005
|
|
8006 Finally, we have a typedef for the bytes in a buffer. This is a
|
|
8007 @code{Bufbyte}, which is an @code{unsigned char}. Referring to them as
|
|
8008 Bufbytes underscores the fact that we are working with a string of bytes
|
|
8009 in the internal Emacs buffer representation rather than in one of a
|
|
8010 number of possible alternative representations (e.g. EUC-encoded text,
|
|
8011 etc.).
|
|
8012
|
462
|
8013 @node Buffer Lists
|
428
|
8014 @section Buffer Lists
|
462
|
8015 @cindex buffer lists
|
428
|
8016
|
|
8017 Recall from earlier that buffers are @dfn{permanent} objects, i.e. that
|
|
8018 they remain around until explicitly deleted. This entails that there is
|
|
8019 a list of all the buffers in existence. This list is actually an
|
|
8020 assoc-list (mapping from the buffer's name to the buffer) and is stored
|
|
8021 in the global variable @code{Vbuffer_alist}.
|
|
8022
|
|
8023 The order of the buffers in the list is important: the buffers are
|
|
8024 ordered approximately from most-recently-used to least-recently-used.
|
|
8025 Switching to a buffer using @code{switch-to-buffer},
|
|
8026 @code{pop-to-buffer}, etc. and switching windows using
|
|
8027 @code{other-window}, etc. usually brings the new current buffer to the
|
|
8028 front of the list. @code{switch-to-buffer}, @code{other-buffer},
|
|
8029 etc. look at the beginning of the list to find an alternative buffer to
|
|
8030 suggest. You can also explicitly move a buffer to the end of the list
|
|
8031 using @code{bury-buffer}.
|
|
8032
|
|
8033 In addition to the global ordering in @code{Vbuffer_alist}, each frame
|
|
8034 has its own ordering of the list. These lists always contain the same
|
|
8035 elements as in @code{Vbuffer_alist} although possibly in a different
|
|
8036 order. @code{buffer-list} normally returns the list for the selected
|
|
8037 frame. This allows you to work in separate frames without things
|
|
8038 interfering with each other.
|
|
8039
|
|
8040 The standard way to look up a buffer given a name is
|
|
8041 @code{get-buffer}, and the standard way to create a new buffer is
|
|
8042 @code{get-buffer-create}, which looks up a buffer with a given name,
|
|
8043 creating a new one if necessary. These operations correspond exactly
|
|
8044 to the symbol operations @code{intern-soft} and @code{intern},
|
|
8045 respectively. You can also force a new buffer to be created using
|
|
8046 @code{generate-new-buffer}, which takes a name and (if necessary) makes
|
|
8047 a unique name from this by appending a number, and then creates the
|
|
8048 buffer. This is basically like the symbol operation @code{gensym}.
|
|
8049
|
462
|
8050 @node Markers and Extents
|
428
|
8051 @section Markers and Extents
|
462
|
8052 @cindex markers and extents
|
|
8053 @cindex extents, markers and
|
428
|
8054
|
|
8055 Among the things associated with a buffer are things that are
|
|
8056 logically attached to certain buffer positions. This can be used to
|
|
8057 keep track of a buffer position when text is inserted and deleted, so
|
|
8058 that it remains at the same spot relative to the text around it; to
|
|
8059 assign properties to particular sections of text; etc. There are two
|
|
8060 such objects that are useful in this regard: they are @dfn{markers} and
|
|
8061 @dfn{extents}.
|
|
8062
|
|
8063 A @dfn{marker} is simply a flag placed at a particular buffer
|
|
8064 position, which is moved around as text is inserted and deleted.
|
|
8065 Markers are used for all sorts of purposes, such as the @code{mark} that
|
|
8066 is the other end of textual regions to be cut, copied, etc.
|
|
8067
|
|
8068 An @dfn{extent} is similar to two markers plus some associated
|
|
8069 properties, and is used to keep track of regions in a buffer as text is
|
|
8070 inserted and deleted, and to add properties (e.g. fonts) to particular
|
|
8071 regions of text. The external interface of extents is explained
|
|
8072 elsewhere.
|
|
8073
|
|
8074 The important thing here is that markers and extents simply contain
|
|
8075 buffer positions in them as integers, and every time text is inserted or
|
|
8076 deleted, these positions must be updated. In order to minimize the
|
|
8077 amount of shuffling that needs to be done, the positions in markers and
|
442
|
8078 extents (there's one per marker, two per extent) are stored in Meminds.
|
428
|
8079 This means that they only need to be moved when the text is physically
|
|
8080 moved in memory; since the gap structure tries to minimize this, it also
|
|
8081 minimizes the number of marker and extent indices that need to be
|
|
8082 adjusted. Look in @file{insdel.c} for the details of how this works.
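
To see why storing memory indices saves work, consider the following sketch. It assumes a non-MULE build (so byte indices equal buffer positions), and the structure and sketch_ names are hypothetical; the real bookkeeping in insdel.c is more involved.

```c
#include <assert.h>

typedef int Bufpos;
typedef int Memind;

/* Hypothetical gap bookkeeping for one buffer. */
struct sketch_gap {
  Memind gap_start;  /* memory index of the beginning of the gap */
  int gap_size;      /* size of the gap in bytes */
};

/* A stored memory index is turned back into a buffer position by
   discounting the gap when the index lies past it. */
Bufpos sketch_marker_position(const struct sketch_gap *g, Memind stored)
{
  return stored > g->gap_start ? stored - g->gap_size : stored;
}

/* Inserting LEN bytes at the gap only changes the gap indices; no
   stored marker or extent index is touched. */
void sketch_insert_at_gap(struct sketch_gap *g, int len)
{
  assert(len >= 0 && len <= g->gap_size);
  g->gap_start += len;
  g->gap_size -= len;
}
```

After an insertion of four bytes at the gap, a marker stored past the gap reports a buffer position four higher than before, although its stored value never changed. Only moving the gap itself would require adjusting the stored indices of markers lying in the moved text.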
|
|
8083
|
|
8084 One other important distinction is that markers are @dfn{temporary}
|
|
8085 while extents are @dfn{permanent}. This means that markers disappear as
|
|
8086 soon as there are no more pointers to them, and correspondingly, there
|
|
8087 is no way to determine what markers are in a buffer if you are just
|
|
8088 given the buffer. Extents remain in a buffer until they are detached
|
|
8089 (which could happen as a result of text being deleted) or the buffer is
|
|
8090 deleted, and primitives do exist to enumerate the extents in a buffer.
|
|
8091
|
462
|
8092 @node Bufbytes and Emchars
|
428
|
8093 @section Bufbytes and Emchars
|
462
|
8094 @cindex Bufbytes and Emchars
|
|
8095 @cindex Emchars, Bufbytes and
|
428
|
8096
|
|
8097 Not yet documented.
|
|
8098
|
462
|
8099 @node The Buffer Object
|
428
|
8100 @section The Buffer Object
|
462
|
8101 @cindex buffer object, the
|
|
8102 @cindex object, the buffer
|
428
|
8103
|
|
8104 Buffers contain fields not directly accessible by the Lisp programmer.
|
|
8105 We describe them here, naming them by the names used in the C code.
|
|
8106 Many are accessible indirectly in Lisp programs via Lisp primitives.
|
|
8107
|
|
8108 @table @code
|
|
8109 @item name
|
|
8110 The buffer name is a string that names the buffer. It is guaranteed to
|
446
|
8111 be unique. @xref{Buffer Names,,, lispref, XEmacs Lisp Reference
|
428
|
8112 Manual}.
|
|
8113
|
|
8114 @item save_modified
|
|
8115 This field contains the time when the buffer was last saved, as an
|
446
|
8116 integer. @xref{Buffer Modification,,, lispref, XEmacs Lisp Reference
|
428
|
8117 Manual}.
|
|
8118
|
|
8119 @item modtime
|
|
8120 This field contains the modification time of the visited file. It is
|
|
8121 set when the file is written or read. Every time the buffer is written
|
|
8122 to the file, this field is compared to the modification time of the
|
446
|
8123 file. @xref{Buffer Modification,,, lispref, XEmacs Lisp Reference
|
428
|
8124 Manual}.
|
|
8125
|
|
8126 @item auto_save_modified
|
|
8127 This field contains the time when the buffer was last auto-saved.
|
|
8128
|
|
8129 @item last_window_start
|
|
8130 This field contains the @code{window-start} position in the buffer as of
|
|
8131 the last time the buffer was displayed in a window.
|
|
8132
|
|
8133 @item undo_list
|
|
8134 This field points to the buffer's undo list. @xref{Undo,,, lispref,
|
446
|
8135 XEmacs Lisp Reference Manual}.
|
428
|
8136
|
|
8137 @item syntax_table_v
|
|
8138 This field contains the syntax table for the buffer. @xref{Syntax
|
446
|
8139 Tables,,, lispref, XEmacs Lisp Reference Manual}.
|
428
|
8140
|
|
8141 @item downcase_table
|
|
8142 This field contains the conversion table for converting text to lower
|
446
|
8143 case. @xref{Case Tables,,, lispref, XEmacs Lisp Reference Manual}.
|
428
|
8144
|
|
8145 @item upcase_table
|
|
8146 This field contains the conversion table for converting text to upper
|
446
|
8147 case. @xref{Case Tables,,, lispref, XEmacs Lisp Reference Manual}.
|
428
|
8148
|
|
8149 @item case_canon_table
|
|
8150 This field contains the conversion table for canonicalizing text for
|
|
8151 case-folding search. @xref{Case Tables,,, lispref, XEmacs Lisp
|
446
|
8152 Reference Manual}.
|
428
|
8153
|
|
8154 @item case_eqv_table
|
|
8155 This field contains the equivalence table for case-folding search.
|
446
|
8156 @xref{Case Tables,,, lispref, XEmacs Lisp Reference Manual}.
|
428
|
8157
|
|
8158 @item display_table
|
|
8159 This field contains the buffer's display table, or @code{nil} if it
|
|
8160 doesn't have one. @xref{Display Tables,,, lispref, XEmacs Lisp
|
446
|
8161 Reference Manual}.
|
428
|
8162
|
|
8163 @item markers
|
|
8164 This field contains the chain of all markers that currently point into
|
|
8165 the buffer. Deletion of text in the buffer, and motion of the buffer's
|
|
8166 gap, must check each of these markers and perhaps update it.
|
446
|
8167 @xref{Markers,,, lispref, XEmacs Lisp Reference Manual}.
|
428
|
8168
|
|
8169 @item backed_up
|
|
8170 This field is a flag that tells whether a backup file has been made for
|
|
8171 the visited file of this buffer.
|
|
8172
|
|
8173 @item mark
|
|
8174 This field contains the mark for the buffer. The mark is a marker,
|
|
8175 hence it is also included on the list @code{markers}. @xref{The Mark,,,
|
446
|
8176 lispref, XEmacs Lisp Reference Manual}.
|
428
|
8177
|
|
8178 @item mark_active
|
|
8179 This field is non-@code{nil} if the buffer's mark is active.
|
|
8180
|
|
8181 @item local_var_alist
|
|
8182 This field contains the association list describing the variables local
|
|
8183 in this buffer, and their values, with the exception of local variables
|
|
8184 that have special slots in the buffer object. (Those slots are omitted
|
|
8185 from this table.) @xref{Buffer-Local Variables,,, lispref, XEmacs Lisp
|
446
|
8186 Reference Manual}.
|
428
|
8187
|
|
8188 @item modeline_format
|
|
8189 This field contains a Lisp object which controls how to display the mode
|
|
8190 line for this buffer. @xref{Modeline Format,,, lispref, XEmacs Lisp
|
446
|
8191 Reference Manual}.
|
428
|
8192
|
|
8193 @item base_buffer
|
|
8194 This field holds the buffer's base buffer (if it is an indirect buffer),
|
|
8195 or @code{nil}.
|
|
8196 @end table
|
|
8197
|
|
8198 @node MULE Character Sets and Encodings, The Lisp Reader and Compiler, Buffers and Textual Representation, Top
|
|
8199 @chapter MULE Character Sets and Encodings
|
462
|
8200 @cindex Mule character sets and encodings
|
|
8201 @cindex character sets and encodings, Mule
|
|
8202 @cindex encodings, Mule character sets and
|
428
|
8203
|
|
8204 Recall that there are two primary ways that text is represented in
|
|
8205 XEmacs. The @dfn{buffer} representation sees the text as a series of
|
|
8206 bytes (Bufbytes), with a variable number of bytes used per character.
|
|
8207 The @dfn{character} representation sees the text as a series of integers
|
|
8208 (Emchars), one per character. The character representation is a cleaner
|
|
8209 representation from a theoretical standpoint, and is thus used in many
|
|
8210 cases when lots of manipulations on a string need to be done. However,
|
|
8211 the buffer representation is the standard representation used in both
|
|
8212 Lisp strings and buffers, and because of this, it is the ``default''
|
|
8213 representation that text comes in. The reason for using this
|
|
8214 representation is that it's compact and is compatible with ASCII.
|
|
8215
|
|
8216 @menu
|
|
8217 * Character Sets::
|
|
8218 * Encodings::
|
|
8219 * Internal Mule Encodings::
|
|
8220 * CCL::
|
|
8221 @end menu
|
|
8222
|
462
|
8223 @node Character Sets
|
428
|
8224 @section Character Sets
|
462
|
8225 @cindex character sets
|
428
|
8226
|
|
8227 A character set (or @dfn{charset}) is an ordered set of characters. A
|
|
8228 particular character in a charset is indexed using one or more
|
|
8229 @dfn{position codes}, which are non-negative integers. The number of
|
|
8230 position codes needed to identify a particular character in a charset is
|
|
8231 called the @dfn{dimension} of the charset. In XEmacs/Mule, all charsets
|
|
8232 have dimension 1 or 2, and the size of all charsets (except for a few
|
|
8233 special cases) is either 94, 96, 94 by 94, or 96 by 96. The range of
|
|
8234 position codes used to index characters from any of these types of
|
|
8235 character sets is as follows:
|
|
8236
|
|
8237 @example
|
|
8238 Charset type Position code 1 Position code 2
|
|
8239 ------------------------------------------------------------
|
|
8240 94 33 - 126 N/A
|
|
8241 96 32 - 127 N/A
|
|
8242 94x94 33 - 126 33 - 126
|
|
8243 96x96 32 - 127 32 - 127
|
|
8244 @end example
|
|
8245
|
|
8246 Note that in the above cases position codes do not start at an
|
|
8247 expected value such as 0 or 1. The reason for this will become clear
|
|
8248 later.
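
The table of position-code ranges translates directly into a validity check. The following is a small sketch; the enumeration and function names are hypothetical and chosen only for this illustration.

```c
#include <assert.h>

/* Hypothetical tags for the four regular charset types. */
enum sketch_charset_type { SKETCH_94, SKETCH_96, SKETCH_94x94, SKETCH_96x96 };

/* Number of position codes needed to identify a character. */
int sketch_dimension(enum sketch_charset_type t)
{
  return (t == SKETCH_94 || t == SKETCH_96) ? 1 : 2;
}

/* Return 1 if PC is a valid position code for charset type T.  The
   same range applies to both position codes of a dimension-2 set. */
int sketch_valid_pc(enum sketch_charset_type t, int pc)
{
  if (t == SKETCH_94 || t == SKETCH_94x94)
    return pc >= 33 && pc <= 126;
  return pc >= 32 && pc <= 127;
}

/* Zero-based offset of a valid position code within its range. */
int sketch_pc_offset(enum sketch_charset_type t, int pc)
{
  assert(sketch_valid_pc(t, pc));
  return pc - ((t == SKETCH_94 || t == SKETCH_94x94) ? 33 : 32);
}

/* Total number of slots in a charset of type T. */
int sketch_charset_size(enum sketch_charset_type t)
{
  int per_dim = (t == SKETCH_94 || t == SKETCH_94x94) ? 94 : 96;
  return sketch_dimension(t) == 1 ? per_dim : per_dim * per_dim;
}
```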
|
|
8249
|
|
8250 For example, Latin-1 is a 96-character charset, and JISX0208 (the
|
|
8251 Japanese national character set) is a 94x94-character charset.
|
|
8252
|
|
8253 [Note that, although the ranges above define the @emph{valid} position
|
|
8254 codes for a charset, some of the slots in a particular charset may in
|
|
8255 fact be empty. This is the case for JISX0208, for example, where (e.g.)
|
|
8256 all the slots whose first position code is in the range 118 - 126 are
|
|
8257 empty.]
|
|
8258
|
|
8259 There are three charsets that do not follow the above rules. All of
|
|
8260 them have one dimension, and have ranges of position codes as follows:
|
|
8261
|
|
8262 @example
|
|
8263 Charset name Position code 1
|
|
8264 ------------------------------------
|
|
8265 ASCII 0 - 127
|
|
8266 Control-1 0 - 31
|
|
8267 Composite 0 - some large number
|
|
8268 @end example
|
|
8269
|
|
8270 (The upper bound of the position code for composite characters has not
|
|
8271 yet been determined, but it will probably be at least 16,383).
|
|
8272
|
|
8273 ASCII is the union of two subsidiary character sets: Printing-ASCII
|
|
8274 (the printing ASCII character set, consisting of position codes 33 -
|
|
8275 126, as for a standard 94-character charset) and Control-ASCII (the
|
|
8276 non-printing characters that would appear in a binary file with codes 0
|
|
8277 - 32 and 127).
|
|
8278
|
|
8279 Control-1 contains the non-printing characters that would appear in a
|
|
8280 binary file with codes 128 - 159.
|
|
8281
|
|
8282 Composite contains characters that are generated by overstriking one
|
|
8283 or more characters from other charsets.
|
|
8284
|
|
8285 Note that some characters in ASCII, and all characters in Control-1,
|
|
8286 are @dfn{control} (non-printing) characters. These have no printed
|
|
8287 representation but instead control some other function of the printing
|
|
8288 (e.g. TAB, code 9, moves the current character position to the next tab
|
|
8289 stop). All other characters in all charsets are @dfn{graphic}
|
|
8290 (printing) characters.
|
|
8291
|
|
8292 When a binary file is read in, the bytes in the file are assigned to
|
|
8293 character sets as follows:
|
|
8294
|
|
8295 @example
|
|
8296 Bytes Character set Range
|
|
8297 --------------------------------------------------
|
|
8298 0 - 127 ASCII 0 - 127
|
|
8299 128 - 159 Control-1 0 - 31
|
|
8300 160 - 255 Latin-1 32 - 127
|
|
8301 @end example
|
|
8302
|
|
8303 This is a bit ad-hoc but gets the job done.
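
The assignment in the table above amounts to the following sketch. The type and function names are hypothetical; only the byte ranges are taken from the table.

```c
#include <assert.h>

/* Hypothetical tags for the three charsets used for binary files. */
enum sketch_cs { SKETCH_CS_ASCII, SKETCH_CS_CONTROL_1, SKETCH_CS_LATIN_1 };

struct sketch_cs_char {
  enum sketch_cs charset;
  int pc1;                    /* position code within that charset */
};

/* Map one byte of a binary file to a charset and position code. */
struct sketch_cs_char sketch_classify_byte(int byte)
{
  struct sketch_cs_char c;
  assert(byte >= 0 && byte <= 255);
  if (byte < 128) {
    c.charset = SKETCH_CS_ASCII;      /* bytes 0 - 127 keep their value */
    c.pc1 = byte;
  } else if (byte < 160) {
    c.charset = SKETCH_CS_CONTROL_1;  /* bytes 128 - 159 map to 0 - 31 */
    c.pc1 = byte - 128;
  } else {
    c.charset = SKETCH_CS_LATIN_1;    /* bytes 160 - 255 map to 32 - 127 */
    c.pc1 = byte - 128;
  }
  return c;
}
```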
|
|
8304
|
462
|
8305 @node Encodings
|
428
|
8306 @section Encodings
|
462
|
8307 @cindex encodings, Mule
|
|
8308 @cindex Mule encodings
|
428
|
8309
|
|
8310 An @dfn{encoding} is a way of numerically representing characters from
|
|
8311 one or more character sets. If an encoding only encompasses one
|
|
8312 character set, then the position codes for the characters in that
|
|
8313 character set could be used directly. This is not possible, however, if
|
|
8314 more than one character set is to be used in the encoding.
|
|
8315
|
|
8316 For example, the conversion detailed above between bytes in a binary
|
|
8317 file and characters is effectively an encoding that encompasses the
|
|
8318 three character sets ASCII, Control-1, and Latin-1 in a stream of 8-bit
|
|
8319 bytes.
|
|
8320
|
|
8321 Thus, an encoding can be viewed as a way of encoding characters from a
|
|
8322 specified group of character sets using a stream of bytes, each of which
|
|
8323 contains a fixed number of bits (but not necessarily 8, as in the common
|
|
8324 usage of ``byte'').
|
|
8325
|
|
8326 Here are descriptions of a couple of common
|
|
8327 encodings:
|
|
8328
|
|
8329 @menu
|
|
8330 * Japanese EUC (Extended Unix Code)::
|
|
8331 * JIS7::
|
|
8332 @end menu
|
|
8333
|
462
|
8334 @node Japanese EUC (Extended Unix Code)
|
428
|
8335 @subsection Japanese EUC (Extended Unix Code)
|
462
|
8336 @cindex Japanese EUC (Extended Unix Code)
|
|
8337 @cindex EUC (Extended Unix Code), Japanese
|
|
8338 @cindex Extended Unix Code, Japanese EUC
|
428
|
8339
|
|
8340 This encompasses the character sets Printing-ASCII, Japanese-JISX0208,
|
|
8341 and Japanese-JISX0201-Kana (half-width katakana, the right half of
|
|
8342 JISX0201). It uses 8-bit bytes.
|
|
8343
|
|
8344 Note that Printing-ASCII and Japanese-JISX0201-Kana are 94-character
|
|
8345 charsets, while Japanese-JISX0208 is a 94x94-character charset.
|
|
8346
|
|
8347 The encoding is as follows:
|
|
8348
|
|
8349 @example
|
|
8350 Character set Representation (PC=position-code)
|
|
8351 ------------- --------------
|
|
8352 Printing-ASCII PC1
|
|
8353 Japanese-JISX0201-Kana 0x8E | PC1 + 0x80
|
|
8354 Japanese-JISX0208 PC1 + 0x80 | PC2 + 0x80
|
|
8355 Japanese-JISX0212 0x8F | PC1 + 0x80 | PC2 + 0x80
|
|
8356 @end example
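
As an illustration, here is a sketch of an encoder for the three charsets named in the description above (Printing-ASCII, Japanese-JISX0201-Kana and Japanese-JISX0208). The enumeration and function names are hypothetical, and no error recovery is attempted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags for the charsets covered by this sketch. */
enum sketch_euc_charset {
  SKETCH_EUC_PRINTING_ASCII,
  SKETCH_EUC_JISX0201_KANA,
  SKETCH_EUC_JISX0208
};

/* Write the EUC bytes for one character into OUT (room for at least
   two bytes) and return the number of bytes written.  PC2 is ignored
   for the one-dimensional charsets. */
size_t sketch_euc_encode(enum sketch_euc_charset cs, int pc1, int pc2,
                         unsigned char *out)
{
  assert(pc1 >= 33 && pc1 <= 126);
  switch (cs) {
  case SKETCH_EUC_PRINTING_ASCII:
    out[0] = (unsigned char) pc1;
    return 1;
  case SKETCH_EUC_JISX0201_KANA:
    out[0] = 0x8E;                          /* prefix byte for kana */
    out[1] = (unsigned char) (pc1 + 0x80);
    return 2;
  case SKETCH_EUC_JISX0208:
    assert(pc2 >= 33 && pc2 <= 126);
    out[0] = (unsigned char) (pc1 + 0x80);  /* both bytes get the */
    out[1] = (unsigned char) (pc2 + 0x80);  /* high bit set */
    return 2;
  }
  return 0;
}
```

Because every byte of a non-ASCII character has its high bit set, the decoder never needs to remember any state between characters, in contrast to a modal encoding.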
|
|
8357
|
|
8358
|
462
|
8359 @node JIS7
|
428
|
8360 @subsection JIS7
|
462
|
8361 @cindex JIS7
|
428
|
8362
|
|
8363 This encompasses the character sets Printing-ASCII,
|
|
8364 Japanese-JISX0201-Roman (the left half of JISX0201; this character set
|
|
8365 is very similar to Printing-ASCII and is a 94-character charset),
|
|
8366 Japanese-JISX0208, and Japanese-JISX0201-Kana. It uses 7-bit bytes.
|
|
8367
|
|
8368 Unlike Japanese EUC, this is a @dfn{modal} encoding, which
|
|
8369 means that there are multiple states that the encoding can
|
|
8370 be in, which affect how the bytes are to be interpreted.
|
|
8371 Special sequences of bytes (called @dfn{escape sequences})
|
|
8372 are used to change states.
|
|
8373
|
|
8374 The encoding is as follows:
|
|
8375
|
|
8376 @example
|
|
8377 Character set Representation (PC=position-code)
|
|
8378 ------------- --------------
|
|
8379 Printing-ASCII PC1
|
|
8380 Japanese-JISX0201-Roman PC1
|
|
8381 Japanese-JISX0201-Kana PC1
|
|
8382 Japanese-JISX0208 PC1 PC2
|
|
8383
|
|
8384
|
|
8385 Escape sequence ASCII equivalent Meaning
|
|
8386 --------------- ---------------- -------
|
|
8387 0x1B 0x28 0x4A ESC ( J invoke Japanese-JISX0201-Roman
|
|
8388 0x1B 0x28 0x49 ESC ( I invoke Japanese-JISX0201-Kana
|
|
8389 0x1B 0x24 0x42 ESC $ B invoke Japanese-JISX0208
|
|
8390 0x1B 0x28 0x42 ESC ( B invoke Printing-ASCII
|
|
8391 @end example
|
|
8392
|
|
8393 Initially, Printing-ASCII is invoked.
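
A decoder for a modal encoding has to carry the current state from byte to byte. The following sketch decodes the four escape sequences listed above; all names in it are hypothetical, and bytes outside the printing range (such as newlines) are simply passed through as position codes rather than handled specially.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags for the charset currently invoked. */
enum sketch_jis7_state {
  SKETCH_JIS7_ASCII, SKETCH_JIS7_ROMAN, SKETCH_JIS7_KANA,
  SKETCH_JIS7_JISX0208
};

struct sketch_jis7_char {
  enum sketch_jis7_state charset;
  int pc1, pc2;                /* pc2 is 0 for one-byte charsets */
};

/* Decode LEN bytes from IN into OUT (capacity MAX characters).
   Returns the number of characters, or -1 on a truncated or unknown
   escape sequence, a truncated character, or lack of room. */
int sketch_jis7_decode(const unsigned char *in, size_t len,
                       struct sketch_jis7_char *out, int max)
{
  enum sketch_jis7_state state = SKETCH_JIS7_ASCII; /* initial state */
  size_t i = 0;
  int n = 0;
  assert(in != NULL && out != NULL);
  while (i < len) {
    if (in[i] == 0x1B) {                 /* ESC: change state */
      if (len - i < 3)
        return -1;
      if (in[i + 1] == 0x28 && in[i + 2] == 0x4A)
        state = SKETCH_JIS7_ROMAN;
      else if (in[i + 1] == 0x28 && in[i + 2] == 0x49)
        state = SKETCH_JIS7_KANA;
      else if (in[i + 1] == 0x24 && in[i + 2] == 0x42)
        state = SKETCH_JIS7_JISX0208;
      else if (in[i + 1] == 0x28 && in[i + 2] == 0x42)
        state = SKETCH_JIS7_ASCII;
      else
        return -1;
      i += 3;
      continue;
    }
    if (n == max)
      return -1;
    out[n].charset = state;
    out[n].pc1 = in[i++];
    out[n].pc2 = 0;
    if (state == SKETCH_JIS7_JISX0208) { /* two bytes per character */
      if (i == len)
        return -1;
      out[n].pc2 = in[i++];
    }
    n++;
  }
  return n;
}
```

The same byte value, for instance 0x42, is an ASCII letter in one state and half of a JISX0208 character in another, which is exactly what makes the encoding modal.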
|
|
8394
|
462
|
8395 @node Internal Mule Encodings
|
428
|
8396 @section Internal Mule Encodings
|
462
|
8397 @cindex internal Mule encodings
|
|
8398 @cindex Mule encodings, internal
|
|
8399 @cindex encodings, internal Mule
|
428
|
8400
|
|
8401 In XEmacs/Mule, each character set is assigned a unique number, called a
|
|
8402 @dfn{leading byte}. This is used in the encodings of a character.
|
|
8403 Leading bytes are in the range 0x80 - 0xFF (except for ASCII, which has
|
|
8404 a leading byte of 0), although some leading bytes are reserved.
|
|
8405
|
|
8406 Charsets whose leading byte is in the range 0x80 - 0x9F are called
|
|
8407 @dfn{official} and are used for built-in charsets. Other charsets are
|
|
8408 called @dfn{private} and have leading bytes in the range 0xA0 - 0xFF;
|
|
8409 these are user-defined charsets.
|
|
8410
|
|
8411 More specifically:
|
|
8412
|
|
8413 @example
|
|
8414 Character set Leading byte
|
|
8415 ------------- ------------
|
|
8416 ASCII 0
|
|
8417 Composite 0x80
|
|
8418 Dimension-1 Official 0x81 - 0x8D
|
|
8419 (0x8E is free)
|
|
8420 Control-1 0x8F
|
|
8421 Dimension-2 Official 0x90 - 0x99
|
|
8422 (0x9A - 0x9D are free;
|
|
8423 0x9E and 0x9F are reserved)
|
|
8424 Dimension-1 Private 0xA0 - 0xEF
|
|
8425 Dimension-2 Private 0xF0 - 0xFF
|
|
8426 @end example
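
The table can be read as a classification function on leading bytes. The sketch below uses hypothetical names; the ranges are copied from the table, and everything not listed there is reported as unassigned.

```c
#include <assert.h>

/* Hypothetical classification of a leading byte. */
enum sketch_lb_class {
  SKETCH_LB_ASCII,
  SKETCH_LB_COMPOSITE,
  SKETCH_LB_DIM1_OFFICIAL,
  SKETCH_LB_CONTROL_1,
  SKETCH_LB_DIM2_OFFICIAL,
  SKETCH_LB_DIM1_PRIVATE,
  SKETCH_LB_DIM2_PRIVATE,
  SKETCH_LB_UNASSIGNED        /* free, reserved, or out of range */
};

enum sketch_lb_class sketch_classify_leading_byte(int lb)
{
  assert(lb >= 0 && lb <= 0xFF);
  if (lb == 0x00)               return SKETCH_LB_ASCII;
  if (lb == 0x80)               return SKETCH_LB_COMPOSITE;
  if (lb >= 0x81 && lb <= 0x8D) return SKETCH_LB_DIM1_OFFICIAL;
  if (lb == 0x8F)               return SKETCH_LB_CONTROL_1;
  if (lb >= 0x90 && lb <= 0x99) return SKETCH_LB_DIM2_OFFICIAL;
  if (lb >= 0xA0 && lb <= 0xEF) return SKETCH_LB_DIM1_PRIVATE;
  if (lb >= 0xF0)               return SKETCH_LB_DIM2_PRIVATE;
  /* 0x01 - 0x7F, the free 0x8E and 0x9A - 0x9D, reserved 0x9E, 0x9F */
  return SKETCH_LB_UNASSIGNED;
}
```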
|
|
8427
|
|
8428 There are two internal encodings for characters in XEmacs/Mule. One is
|
|
8429 called @dfn{string encoding} and is an 8-bit encoding that is used for
|
|
8430 representing characters in a buffer or string. It uses 1 to 4 bytes per
|
|
8431 character. The other is called @dfn{character encoding} and is a 19-bit
|
|
8432 encoding that is used for representing characters individually in a
|
|
8433 variable.
|
|
8434
|
|
8435 (In the following descriptions, we'll ignore composite characters for
|
|
8436 the moment. We also give a general (structural) overview first,
|
|
8437 followed later by the exact details.)
|
|
8438
|
|
8439 @menu
|
|
8440 * Internal String Encoding::
|
|
8441 * Internal Character Encoding::
|
|
8442 @end menu
|
|
8443
|
462
|
8444 @node Internal String Encoding
|
428
|
8445 @subsection Internal String Encoding
|
462
|
8446 @cindex internal string encoding
|
|
8447 @cindex string encoding, internal
|
|
8448 @cindex encoding, internal string
|
428
|
8449
|
|
8450 ASCII characters are encoded using their position code directly. Other
|
|
8451 characters are encoded using their leading byte followed by their
|
|
8452 position code(s) with the high bit set. Characters in private character
|
|
8453 sets have their leading byte prefixed with a @dfn{leading byte prefix},
|
|
8454 which is either 0x9E or 0x9F. (No character sets are ever assigned these
|
|
8455 leading bytes.) Specifically:
|
|
8456
|
|
8457 @example
|
|
8458 Character set           Encoding (PC=position-code, LB=leading-byte)
|
|
8459 -------------           --------
|
|
8460 ASCII                   PC1  |
|
|
8461 Control-1               LB   | PC1 + 0xA0 |
|
|
8462 Dimension-1 official    LB   | PC1 + 0x80 |
|
|
8463 Dimension-1 private     0x9E | LB         | PC1 + 0x80 |
|
|
8464 Dimension-2 official    LB   | PC1 + 0x80 | PC2 + 0x80 |
|
|
8465 Dimension-2 private     0x9F | LB         | PC1 + 0x80 | PC2 + 0x80
|
|
8466 @end example
|
|
8467
|
|
8468 The basic characteristic of this encoding is that the first byte
|
|
8469 of all characters is in the range 0x00 - 0x9F, and the second and
|
|
8470 following bytes of all characters are in the range 0xA0 - 0xFF.
|
|
8471 This means that it is impossible to get out of sync, or more
|
|
8472 specifically:
|
|
8473
|
|
8474 @enumerate
|
|
8475 @item
|
|
8476 Given any byte position, the beginning of the character it is
|
|
8477 within can be determined in constant time.
|
|
8478 @item
|
|
8479 Given any byte position at the beginning of a character, the
|
|
8480 beginning of the next character can be determined in constant
|
|
8481 time.
|
|
8482 @item
|
|
8483 Given any byte position at the beginning of a character, the
|
|
8484 beginning of the previous character can be determined in constant
|
|
8485 time.
|
|
8486 @item
|
|
8487 Textual searches can simply treat encoded strings as if they
|
|
8488 were encoded in a one-byte-per-character fashion rather than
|
|
8489 the actual multi-byte encoding.
|
|
8490 @end enumerate
|
|
8491
|
|
8492 None of the standard non-modal encodings meet all of these
|
|
8493 conditions. For example, EUC satisfies only (2) and (3), while
|
|
8494 Shift-JIS and Big5 (not yet described) satisfy only (2). (All
|
|
8495 non-modal encodings must satisfy (2), in order to be unambiguous.)
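
Properties (1) through (3) follow from a single test on each byte, as the sketch below tries to show. The function names are hypothetical, only characters of official charsets are encoded, and the leading bytes used in any example are arbitrary values from the official ranges rather than the assignments of particular charsets.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char Bufbyte;

/* Bytes below 0xA0 can only begin a character. */
int sketch_is_first_byte(Bufbyte b)
{
  return b < 0xA0;
}

/* Encode a character of an official charset: the leading byte, then
   the position code(s) with the high bit set.  PC2 is ignored for
   dimension-1 leading bytes.  Returns the number of bytes written. */
size_t sketch_encode_official(int lb, int pc1, int pc2, Bufbyte *out)
{
  size_t n = 0;
  assert((lb >= 0x81 && lb <= 0x8D) || (lb >= 0x90 && lb <= 0x99));
  out[n++] = (Bufbyte) lb;
  out[n++] = (Bufbyte) (pc1 + 0x80);
  if (lb >= 0x90)                       /* dimension-2 leading byte */
    out[n++] = (Bufbyte) (pc2 + 0x80);
  return n;
}

/* Property 1: offset of the first byte of the character containing
   offset I.  The loop runs at most three times, since a character
   occupies at most four bytes. */
size_t sketch_char_start(const Bufbyte *s, size_t i)
{
  while (i > 0 && !sketch_is_first_byte(s[i]))
    i--;
  return i;
}

/* Property 2: offset of the character following the one at I. */
size_t sketch_next_char(const Bufbyte *s, size_t len, size_t i)
{
  assert(i < len && sketch_is_first_byte(s[i]));
  i++;
  while (i < len && !sketch_is_first_byte(s[i]))
    i++;
  return i;
}

/* Property 3: offset of the character preceding the one at I. */
size_t sketch_prev_char(const Bufbyte *s, size_t i)
{
  assert(i > 0);
  return sketch_char_start(s, i - 1);
}
```

Since each loop is bounded by the maximum character length of four bytes, every one of these operations takes constant time regardless of the length of the string.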
|
|
8496
|
462
|
8497 @node Internal Character Encoding
|
428
|
8498 @subsection Internal Character Encoding
|
462
|
8499 @cindex internal character encoding
|
|
8500 @cindex character encoding, internal
|
|
8501 @cindex encoding, internal character
|
428
|
8502
|
|
8503 One 19-bit word represents a single character. The word is
|
|
8504 separated into three fields:
|
|
8505
|
|
8506 @example
|
|
8507 Bit number: 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
|
|
8508 <------------> <------------------> <------------------>
|
|
8509 Field: 1 2 3
|
|
8510 @end example
|
|
8511
|
|
8512 Note that fields 2 and 3 hold 7 bits each, while field 1 holds 5 bits.
|
|
8513
|
|
8514 @example
|
|
8515 Character set           Field 1         Field 2         Field 3
|
|
8516 -------------           -------         -------         -------
|
|
8517 ASCII                      0               0              PC1
|
|
8518    range:                                              (00 - 7F)
|
|
8519 Control-1                  0               1              PC1
|
|
8520    range:                                              (00 - 1F)
|
|
8521 Dimension-1 official       0            LB - 0x80         PC1
|
|
8522    range:                              (01 - 0D)       (20 - 7F)
|
|
8523 Dimension-1 private        0            LB - 0x80         PC1
|
|
8524    range:                              (20 - 6F)       (20 - 7F)
|
|
8525 Dimension-2 official    LB - 0x8F         PC1             PC2
|
|
8526    range:               (01 - 0A)       (20 - 7F)       (20 - 7F)
|
|
8527 Dimension-2 private     LB - 0xE1         PC1             PC2
|
|
8528    range:               (0F - 1E)       (20 - 7F)       (20 - 7F)
|
|
8529 Composite                 0x1F             ?               ?
|
|
8530 @end example
|
|
8531
|
|
8532 Note that character codes 0 - 255 are the same as the ``binary encoding''
|
|
8533 described above.
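
Packing and unpacking the three fields is a matter of shifts and masks. The sketch below assumes the layout shown in the bit diagram (field 1 in bits 18 - 14, field 2 in bits 13 - 7, field 3 in bits 6 - 0); the function names are hypothetical, and composite and private charsets are left out.

```c
#include <assert.h>

typedef int Emchar;

/* Pack the fields: field 1 is 5 bits, fields 2 and 3 are 7 bits. */
Emchar sketch_make_emchar(int f1, int f2, int f3)
{
  assert(f1 >= 0 && f1 < 32);
  assert(f2 >= 0 && f2 < 128);
  assert(f3 >= 0 && f3 < 128);
  return (f1 << 14) | (f2 << 7) | f3;
}

int sketch_field1(Emchar c) { return (c >> 14) & 0x1F; }
int sketch_field2(Emchar c) { return (c >> 7) & 0x7F; }
int sketch_field3(Emchar c) { return c & 0x7F; }

/* Dimension-1 official: field 1 = 0, field 2 = LB - 0x80. */
Emchar sketch_dim1_official(int lb, int pc1)
{
  assert(lb >= 0x81 && lb <= 0x8D);
  return sketch_make_emchar(0, lb - 0x80, pc1);
}

/* Dimension-2 official: field 1 = LB - 0x8F. */
Emchar sketch_dim2_official(int lb, int pc1, int pc2)
{
  assert(lb >= 0x90 && lb <= 0x99);
  return sketch_make_emchar(lb - 0x8F, pc1, pc2);
}
```

With this layout an ASCII character is its own code, a Control-1 character with position code 5 becomes 133, and a character with leading byte 0x81 and position code 0x69 becomes 233, which agrees with the remark about codes 0 - 255.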
|
|
8534
|
462
|
8535 @node CCL
|
428
|
8536 @section CCL
|
462
|
8537 @cindex CCL
|
428
|
8538
|
|
8539 @example
|
|
8540 CCL PROGRAM SYNTAX:
|
|
8541 CCL_PROGRAM := (CCL_MAIN_BLOCK
|
|
8542 [ CCL_EOF_BLOCK ])
|
|
8543
|
|
8544 CCL_MAIN_BLOCK := CCL_BLOCK
|
|
8545 CCL_EOF_BLOCK := CCL_BLOCK
|
|
8546
|
|
8547 CCL_BLOCK := STATEMENT | (STATEMENT [STATEMENT ...])
|
|
8548 STATEMENT :=
|
|
8549 SET | IF | BRANCH | LOOP | REPEAT | BREAK
|
|
8550 | READ | WRITE
|
|
8551
|
|
8552 SET := (REG = EXPRESSION) | (REG SELF_OP EXPRESSION)
|
|
8553 | INT-OR-CHAR
|
|
8554
|
|
8555 EXPRESSION := ARG | (EXPRESSION OP ARG)
|
|
8556
|
|
8557 IF := (if EXPRESSION CCL_BLOCK CCL_BLOCK)
|
|
8558 BRANCH := (branch EXPRESSION CCL_BLOCK [CCL_BLOCK ...])
|
|
8559 LOOP := (loop STATEMENT [STATEMENT ...])
|
|
8560 BREAK := (break)
|
|
8561 REPEAT := (repeat)
|
|
8562 | (write-repeat [REG | INT-OR-CHAR | string])
|
|
8563 | (write-read-repeat REG [INT-OR-CHAR | string | ARRAY]?)
|
|
8564 READ := (read REG) | (read REG REG)
|
|
8565 | (read-if REG ARITH_OP ARG CCL_BLOCK CCL_BLOCK)
|
|
8566 | (read-branch REG CCL_BLOCK [CCL_BLOCK ...])
|
|
8567 WRITE := (write REG) | (write REG REG)
|
|
8568 | (write INT-OR-CHAR) | (write STRING) | STRING
|
|
8569 | (write REG ARRAY)
|
|
8570 END := (end)
|
|
8571
|
|
8572 REG := r0 | r1 | r2 | r3 | r4 | r5 | r6 | r7
|
|
8573 ARG := REG | INT-OR-CHAR
|
|
8574 OP := + | - | * | / | % | & | '|' | ^ | << | >> | <8 | >8 | //
|
|
8575 | < | > | == | <= | >= | !=
|
|
8576 SELF_OP :=
|
|
8577 += | -= | *= | /= | %= | &= | '|=' | ^= | <<= | >>=
|
|
8578 ARRAY := '[' INT-OR-CHAR ... ']'
|
|
8579 INT-OR-CHAR := INT | CHAR
|
|
8580
|
|
8581 MACHINE CODE:
|
|
8582
|
|
8583 The machine code consists of a vector of 32-bit words.
|
|
8584 The first such word specifies the start of the EOF section of the code;
|
|
8585 this is the code executed to handle any stuff that needs to be done
|
|
8586 (e.g. designating back to ASCII and left-to-right mode) after all
|
|
8587 other encoded/decoded data has been written out. This is not used for
|
|
8588 charset CCL programs.
|
|
8589
|
442
|
8590 REGISTER: 0..7 -- referred to by RRR or rrr
|
428
|
8591
|
|
8592 OPERATOR BIT FIELD (27-bit): XXXXXXXXXXXXXXX RRR TTTTT
|
|
8593 TTTTT (5-bit): operator type
|
|
8594 RRR (3-bit): register number
|
|
8595 XXXXXXXXXXXXXXX (15-bit):
|
|
8596 CCCCCCCCCCCCCCC: constant or address
|
|
8597 000000000000rrr: register number
|
|
8598
|
|
8599 AAAAA: 00000 +
|
|
8600 00001 -
|
|
8601 00010 *
|
|
8602 00011 /
|
|
8603 00100 %
|
|
8604 00101 &
|
|
8605 00110 |
|
|
8606 00111 ^
|
|
8607
|
|
8608 01000 <<
|
|
8609 01001 >>
|
|
8610 01010 <8
|
|
8611 01011 >8
|
|
8612 01100 //
|
|
8613 01101 not used
|
|
8614 01110 not used
|
|
8615 01111 not used
|
|
8616
|
|
8617 10000 <
|
|
8618 10001 >
|
|
8619 10010 ==
|
|
8620 10011 <=
|
|
8621 10100 >=
|
|
8622 10101 !=
|
|
8623
|
|
8624 OPERATORS: TTTTT RRR XX..
|
|
8625
|
|
8626 SetCS: 00000 RRR C...C RRR = C...C
|
|
8627 SetCL: 00001 RRR ..... RRR = c...c
|
|
8628 c.............c
|
|
8629 SetR: 00010 RRR ..rrr RRR = rrr
|
|
8630 SetA: 00011 RRR ..rrr RRR = array[rrr]
|
|
8631 C.............C size of array = C...C
|
|
8632 c.............c contents = c...c
|
|
8633
|
|
8634 Jump: 00100 000 c...c jump to c...c
|
|
8635 JumpCond: 00101 RRR c...c if (!RRR) jump to c...c
|
|
8636 WriteJump: 00110 RRR c...c Write1 RRR, jump to c...c
|
|
8637 WriteReadJump: 00111 RRR c...c Write1, Read1 RRR, jump to c...c
|
|
8638 WriteCJump: 01000 000 c...c Write1 C...C, jump to c...c
|
|
8639 C...C
|
|
8640 WriteCReadJump: 01001 RRR c...c Write1 C...C, Read1 RRR,
|
|
8641 C.............C and jump to c...c
|
|
8642 WriteSJump: 01010 000 c...c WriteS, jump to c...c
|
|
8643 C.............C
|
|
8644 S.............S
|
|
8645 ...
|
|
8646 WriteSReadJump: 01011 RRR c...c WriteS, Read1 RRR, jump to c...c
|
|
8647 C.............C
|
|
8648 S.............S
|
|
8649 ...
|
|
8650 WriteAReadJump: 01100 RRR c...c WriteA, Read1 RRR, jump to c...c
|
|
8651 C.............C size of array = C...C
|
|
8652 c.............c contents = c...c
|
|
8653 ...
|
|
8654 Branch: 01101 RRR C...C if (RRR >= 0 && RRR < C..)
|
|
8655 c.............c branch to (RRR+1)th address
|
|
8656 Read1: 01110 RRR ... read 1-byte to RRR
|
|
8657 Read2: 01111 RRR ..rrr read 2-byte to RRR and rrr
|
|
8658 ReadBranch: 10000 RRR C...C Read1 and Branch
|
|
8659 c.............c
|
|
8660 ...
|
|
8661 Write1: 10001 RRR ..... write 1-byte RRR
|
|
8662 Write2: 10010 RRR ..rrr write 2-byte RRR and rrr
|
|
8663 WriteC: 10011 000 ..... write 1-char C...C
|
|
8664 C.............C
|
|
8665 WriteS: 10100 000 ..... write C..-byte of string
|
|
8666 C.............C
|
|
8667 S.............S
|
|
8668 ...
|
|
8669 WriteA: 10101 RRR ..... write array[RRR]
|
|
8670 C.............C size of array = C...C
|
|
8671 c.............c contents = c...c
|
|
8672 ...
|
|
8673 End: 10110 000 ..... terminate the execution
|
|
8674
|
|
8675 SetSelfCS: 10111 RRR C...C RRR AAAAA= C...C
|
|
8676 ..........AAAAA
|
|
8677 SetSelfCL: 11000 RRR ..... RRR AAAAA= c...c
|
|
8678 c.............c
|
|
8679 ..........AAAAA
|
|
8680 SetSelfR: 11001 RRR ..rrr RRR AAAAA= rrr
|
|
8681 ..........AAAAA
|
|
8682 SetExprCL: 11010 RRR ..rrr RRR = rrr AAAAA c...c
|
|
8683 c.............c
|
|
8684 ..........AAAAA
|
|
8685 SetExprR: 11011 RRR ..rrr RRR = rrr AAAAA Rrr
|
|
8686 ............Rrr
|
|
8687 ..........AAAAA
|
|
8688 JumpCondC: 11100 RRR c...c if !(RRR AAAAA C..) jump to c...c
|
|
8689 C.............C
|
|
8690 ..........AAAAA
|
|
8691 JumpCondR: 11101 RRR c...c if !(RRR AAAAA rrr) jump to c...c
|
|
8692 ............rrr
|
|
8693 ..........AAAAA
|
|
8694 ReadJumpCondC: 11110 RRR c...c Read1 and JumpCondC
|
|
8695 C.............C
|
|
8696 ..........AAAAA
|
|
8697 ReadJumpCondR: 11111 RRR c...c Read1 and JumpCondR
|
|
8698 ............rrr
|
|
8699 ..........AAAAA
|
|
8700 @end example
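The operator word layout above can be sketched as a pair of pack and unpack helpers. This is only an illustration: it assumes the fields are packed exactly as drawn, with TTTTT in the low-order bits, RRR above it, and the 15-bit field above that. The @code{toy_ccl_} names are hypothetical and do not appear in the XEmacs sources.

```c
#include <assert.h>
#include <stdint.h>

/* One decoded CCL operator word. */
struct toy_ccl_op {
    unsigned type;   /* TTTTT: operator type, 5 bits */
    unsigned reg;    /* RRR: register number, 3 bits */
    unsigned field;  /* X...X: constant, address, or register number, 15 bits */
};

/* Assumption: fields are packed as drawn, TTTTT in the low-order bits. */
struct toy_ccl_op toy_ccl_decode(uint32_t word)
{
    struct toy_ccl_op op;
    op.type  = word & 0x1F;
    op.reg   = (word >> 5) & 0x07;
    op.field = (word >> 8) & 0x7FFF;
    return op;
}

/* Inverse of toy_ccl_decode; out-of-range inputs are masked. */
uint32_t toy_ccl_encode(unsigned type, unsigned reg, unsigned field)
{
    return ((uint32_t) (field & 0x7FFF) << 8)
         | ((uint32_t) (reg & 0x07) << 5)
         | (uint32_t) (type & 0x1F);
}
```

Under these assumptions, a SetCS of the constant 65 into register 3 is the word @code{toy_ccl_encode (0x00, 3, 65)}, and decoding that word returns the same three fields.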
|
|
8701
|
|
8702 @node The Lisp Reader and Compiler, Lstreams, MULE Character Sets and Encodings, Top
|
|
8703 @chapter The Lisp Reader and Compiler
|
462
|
8704 @cindex Lisp reader and compiler, the
|
|
8705 @cindex reader and compiler, the Lisp
|
|
8706 @cindex compiler, the Lisp reader and
|
428
|
8707
|
|
8708 Not yet documented.
|
|
8709
|
|
8710 @node Lstreams, Consoles; Devices; Frames; Windows, The Lisp Reader and Compiler, Top
|
|
8711 @chapter Lstreams
|
462
|
8712 @cindex lstreams
|
428
|
8713
|
|
8714 An @dfn{lstream} is an internal Lisp object that provides a generic
|
|
8715 buffering stream implementation. Conceptually, you send data to the
|
|
8716 stream or read data from the stream, not caring what's on the other end
|
|
8717 of the stream. The other end could be another stream, a file
|
|
8718 descriptor, a stdio stream, a fixed block of memory, a reallocating
|
|
8719 block of memory, etc. The main purpose of the stream is to provide a
|
|
8720 standard interface and to do buffering. Macros are defined to read or
|
|
8721 write characters, so the calling functions do not have to worry about
|
|
8722 blocking data together in order to achieve efficiency.
|
|
8723
|
|
8724 @menu
|
|
8725 * Creating an Lstream:: Creating an lstream object.
|
|
8726 * Lstream Types:: Different sorts of things that are streamed.
|
|
8727 * Lstream Functions:: Functions for working with lstreams.
|
|
8728 * Lstream Methods:: Creating new lstream types.
|
|
8729 @end menu
|
|
8730
|
462
|
8731 @node Creating an Lstream
|
428
|
8732 @section Creating an Lstream
|
462
|
8733 @cindex lstream, creating an
|
428
|
8734
|
|
8735 Lstreams come in different types, depending on what is being interfaced
|
|
8736 to. Although the primitive for creating new lstreams is
|
|
8737 @code{Lstream_new()}, generally you do not call this directly. Instead,
|
|
8738 you call some type-specific creation function, which creates the lstream
|
|
8739 and initializes it as appropriate for the particular type.
|
|
8740
|
|
8741 All lstream creation functions take a @var{mode} argument, specifying
|
|
8742 what mode the lstream should be opened as. This controls whether the
|
|
8743 lstream is for input or output, and optionally whether data should be
|
|
8744 blocked up in units of MULE characters. Note that some types of
|
|
8745 lstreams can only be opened for input; others only for output; and
|
|
8746 others can be opened either way. #### Richard Mlynarik thinks that
|
|
8747 there should be a strict separation between input and output streams,
|
|
8748 and he's probably right.
|
|
8749
|
|
8750 @var{mode} is a string, one of
|
|
8751
|
|
8752 @table @code
|
|
8753 @item "r"
|
|
8754 Open for reading.
|
|
8755 @item "w"
|
|
8756 Open for writing.
|
|
8757 @item "rc"
|
|
8758 Open for reading, but ``read'' never returns partial MULE characters.
|
|
8759 @item "wc"
|
|
8760 Open for writing, but never writes partial MULE characters.
|
|
8761 @end table
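The four mode strings map onto three yes/no properties. The following sketch shows one way to decode them; @code{toy_parse_mode} and @code{struct toy_lstream_mode} are hypothetical helpers written for this example, not part of the lstream interface.

```c
#include <assert.h>
#include <stddef.h>

struct toy_lstream_mode {
    int readable;     /* opened with "r" or "rc" */
    int writable;     /* opened with "w" or "wc" */
    int whole_chars;  /* "c" suffix: never split a MULE character */
};

/* Parse one of "r", "w", "rc", "wc".
   Returns 0 and fills *out on success; returns -1 and leaves *out
   untouched for any other string. */
int toy_parse_mode(const char *mode, struct toy_lstream_mode *out)
{
    struct toy_lstream_mode m = { 0, 0, 0 };

    if (mode == NULL)
        return -1;
    if (mode[0] == 'r')
        m.readable = 1;
    else if (mode[0] == 'w')
        m.writable = 1;
    else
        return -1;

    if (mode[1] == 'c' && mode[2] == '\0')
        m.whole_chars = 1;
    else if (mode[1] != '\0')
        return -1;

    *out = m;
    return 0;
}
```

A string such as @code{"rw"} is rejected here because the table lists no combined read/write mode.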
|
|
8762
|
462
|
8763 @node Lstream Types
|
428
|
8764 @section Lstream Types
|
462
|
8765 @cindex lstream types
|
|
8766 @cindex types, lstream
|
428
|
8767
|
|
8768 @table @asis
|
|
8769 @item stdio
|
|
8770
|
|
8771 @item filedesc
|
|
8772
|
|
8773 @item lisp-string
|
|
8774
|
|
8775 @item fixed-buffer
|
|
8776
|
|
8777 @item resizing-buffer
|
|
8778
|
|
8779 @item dynarr
|
|
8780
|
|
8781 @item lisp-buffer
|
|
8782
|
|
8783 @item print
|
|
8784
|
|
8785 @item decoding
|
|
8786
|
|
8787 @item encoding
|
|
8788 @end table
|
|
8789
|
462
|
8790 @node Lstream Functions
|
428
|
8791 @section Lstream Functions
|
462
|
8792 @cindex lstream functions
|
428
|
8793
|
442
|
8794 @deftypefun {Lstream *} Lstream_new (Lstream_implementation *@var{imp}, const char *@var{mode})
|
428
|
8795 Allocate and return a new Lstream. This function is not really meant to
|
|
8796 be called directly; rather, each stream type should provide its own
|
|
8797 stream creation function, which creates the stream and does any other
|
|
8798 necessary creation stuff (e.g. opening a file).
|
|
8799 @end deftypefun
|
|
8800
|
|
8801 @deftypefun void Lstream_set_buffering (Lstream *@var{lstr}, Lstream_buffering @var{buffering}, int @var{buffering_size})
|
|
8802 Change the buffering of a stream. See @file{lstream.h}. By default the
|
|
8803 buffering is @code{LSTREAM_BLOCK_BUFFERED}.
|
|
8804 @end deftypefun
|
|
8805
|
|
8806 @deftypefun int Lstream_flush (Lstream *@var{lstr})
|
|
8807 Flush out any pending unwritten data in the stream. Clear any buffered
|
|
8808 input data. Returns 0 on success, -1 on error.
|
|
8809 @end deftypefun
|
|
8810
|
|
8811 @deftypefn Macro int Lstream_putc (Lstream *@var{stream}, int @var{c})
|
|
8812 Write out one byte to the stream. This is a macro and so it is very
|
|
8813 efficient. The @var{c} argument is only evaluated once but the @var{stream}
|
|
8814 argument is evaluated more than once. Returns 0 on success, -1 on
|
|
8815 error.
|
|
8816 @end deftypefn
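The warning about multiple evaluation matters when the @var{stream} argument has side effects. The toy macro below is written in the same spirit but is not the real @code{Lstream_putc} definition; every @code{toy_} and @code{TOY_} name is an assumption made for this example.

```c
#include <assert.h>

struct toy_stream {
    unsigned char buf[16];
    int len;
};

/* `stream' appears several times in the expansion, `c' only once.
   Yields 0 on success, -1 if the toy buffer is full. */
#define TOY_PUTC(stream, c)                                          \
    ((stream)->len < (int) sizeof ((stream)->buf)                    \
     ? ((stream)->buf[(stream)->len++] = (unsigned char) (c), 0)     \
     : -1)

/* A stream expression with a visible side effect: it counts its calls. */
int toy_eval_count;
struct toy_stream toy_the_stream;

struct toy_stream *toy_get_stream(void)
{
    toy_eval_count++;
    return &toy_the_stream;
}
```

Writing @code{TOY_PUTC (toy_get_stream (), 'x')} calls @code{toy_get_stream} more than once for a single byte, so a macro of this kind should be handed a plain variable rather than an expression with side effects.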
|
|
8817
|
|
8818 @deftypefn Macro int Lstream_getc (Lstream *@var{stream})
|
|
8819 Read one byte from the stream. This is a macro and so it is very
|
|
8820 efficient. The @var{stream} argument is evaluated more than once. Return
|
|
8821 value is -1 for EOF or error.
|
|
8822 @end deftypefn
|
|
8823
|
|
8824 @deftypefn Macro void Lstream_ungetc (Lstream *@var{stream}, int @var{c})
|
|
8825 Push one byte back onto the input queue. This will be the next byte
|
|
8826 read from the stream. Any number of bytes can be pushed back and will
|
440
|
8827 be read in the reverse order they were pushed back---most recent
|
|
8828 first. (This is necessary for consistency---if there are a number of
|
428
|
8829 bytes that have been unread and a caller reads and then unreads a byte, that byte needs to be
|
|
8830 the first to be read again.) This is a macro and so it is very
|
|
8831 efficient. The @var{c} argument is only evaluated once but the @var{stream}
|
|
8832 argument is evaluated more than once.
|
|
8833 @end deftypefn
|
|
8834
|
|
8835 @deftypefun int Lstream_fputc (Lstream *@var{stream}, int @var{c})
|
|
8836 @deftypefunx int Lstream_fgetc (Lstream *@var{stream})
|
|
8837 @deftypefunx void Lstream_fungetc (Lstream *@var{stream}, int @var{c})
|
|
8838 Function equivalents of the above macros.
|
|
8839 @end deftypefun
|
|
8840
|
|
8841 @deftypefun ssize_t Lstream_read (Lstream *@var{stream}, void *@var{data}, size_t @var{size})
|
|
8842 Read @var{size} bytes of @var{data} from the stream. Return the number
|
|
8843 of bytes read. 0 means EOF. -1 means an error occurred and no bytes
|
|
8844 were read.
|
|
8845 @end deftypefun
|
|
8846
|
|
8847 @deftypefun ssize_t Lstream_write (Lstream *@var{stream}, void *@var{data}, size_t @var{size})
|
|
8848 Write @var{size} bytes of @var{data} to the stream. Return the number
|
|
8849 of bytes written. -1 means an error occurred and no bytes were written.
|
|
8850 @end deftypefun
|
|
8851
|
|
8852 @deftypefun void Lstream_unread (Lstream *@var{stream}, void *@var{data}, size_t @var{size})
|
|
8853 Push back @var{size} bytes of @var{data} onto the input queue. The next
|
|
8854 call to @code{Lstream_read()} with the same size will read the same
|
|
8855 bytes back. Note that this will be the case even if there is other
|
|
8856 pending unread data.
|
|
8857 @end deftypefun
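The pushback behavior described for @code{Lstream_ungetc()} and @code{Lstream_unread()} amounts to a last-in, first-out store. The following is a small model of that behavior only, under the assumption of a fixed-size store. The @code{toy_} functions are hypothetical and do not reflect how lstreams implement pushback internally.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_UNREAD_MAX = 64 };

/* Pushback store: the byte at buf[len - 1] is the next one to be read. */
struct toy_unread {
    unsigned char buf[TOY_UNREAD_MAX];
    size_t len;
};

/* Push `size' bytes back so that they are read again in their original
   order.  Returns 0 on success, -1 if the toy store is full. */
int toy_unread(struct toy_unread *u, const void *data, size_t size)
{
    const unsigned char *p = data;
    size_t i;

    if (size > TOY_UNREAD_MAX - u->len)
        return -1;
    for (i = size; i > 0; i--)        /* store the last byte first */
        u->buf[u->len++] = p[i - 1];
    return 0;
}

/* Single-byte pushback is the size-1 case. */
int toy_ungetc(struct toy_unread *u, int c)
{
    unsigned char b = (unsigned char) c;
    return toy_unread(u, &b, 1);
}

/* Read up to `size' pushed-back bytes; returns the number read. */
size_t toy_read(struct toy_unread *u, void *data, size_t size)
{
    unsigned char *p = data;
    size_t n = 0;

    while (n < size && u->len > 0)
        p[n++] = u->buf[--u->len];
    return n;
}
```

Unreading @code{"cd"} and then @code{"ab"} makes the next two-byte read return @code{"ab"}, even though @code{"cd"} was already pending.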
|
|
8858
|
|
8859 @deftypefun int Lstream_close (Lstream *@var{stream})
|
|
8860 Close the stream. All data will be flushed out.
|
|
8861 @end deftypefun
|
|
8862
|
|
8863 @deftypefun void Lstream_reopen (Lstream *@var{stream})
|
|
8864 Reopen a closed stream. This enables I/O on it again. This is not
|
|
8865 meant to be called except from a wrapper routine that reinitializes
|
440
|
8866 variables and such---the close routine may well have freed some
|
428
|
8867 necessary storage structures, for example.
|
|
8868 @end deftypefun
|
|
8869
|
|
8870 @deftypefun void Lstream_rewind (Lstream *@var{stream})
|
|
8871 Rewind the stream to the beginning.
|
|
8872 @end deftypefun
|
|
8873
|
462
|
8874 @node Lstream Methods
|
428
|
8875 @section Lstream Methods
|
462
|
8876 @cindex lstream methods
|
428
|
8877
|
|
8878 @deftypefn {Lstream Method} ssize_t reader (Lstream *@var{stream}, unsigned char *@var{data}, size_t @var{size})
|
|
8879 Read some data from the stream's end and store it into @var{data}, which
|
|
8880 can hold @var{size} bytes. Return the number of bytes read. A return
|
|
8881 value of 0 means no bytes can be read at this time. This may be because
|
|
8882 of an EOF, or because there is a granularity greater than one byte that
|
|
8883 the stream imposes on the returned data, and @var{size} is less than
|
|
8884 this granularity. (This will happen frequently for streams that need to
|
|
8885 return whole characters, because @code{Lstream_read()} calls the reader
|
|
8886 function repeatedly until it has the number of bytes it wants or until 0
|
|
8887 is returned.) The lstream functions do not treat a 0 return as EOF or
|
|
8888 do anything special; however, the calling function will interpret any 0
|
|
8889 it gets back as EOF. This will normally not happen unless the caller
|
|
8890 calls @code{Lstream_read()} with a very small size.
|
|
8891
|
|
8892 This function can be @code{NULL} if the stream is output-only.
|
|
8893 @end deftypefn
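The interaction between a reader method with a granularity and the loop that calls it can be sketched as follows. This is a model under stated assumptions: the source hands out only whole 2-byte units, @code{long} stands in for @code{ssize_t}, and the @code{toy_} names are hypothetical rather than taken from @file{lstream.c}.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy source that only hands out whole 2-byte units. */
struct toy_source {
    const unsigned char *data;
    size_t len;
    size_t pos;
};

/* Reader method: returns the number of bytes stored, or 0 when nothing
   can be returned, either at EOF or because `size' is below one unit. */
long toy_reader(struct toy_source *src, unsigned char *data, size_t size)
{
    size_t avail = src->len - src->pos;
    size_t n = size < avail ? size : avail;

    n -= n % 2;                       /* round down to whole units */
    memcpy(data, src->data + src->pos, n);
    src->pos += n;
    return (long) n;
}

/* Caller-side loop: keep calling the reader until `size' bytes have
   arrived or the reader returns 0 (or a negative error). */
long toy_read(struct toy_source *src, unsigned char *data, size_t size)
{
    size_t total = 0;

    while (total < size) {
        long got = toy_reader(src, data + total, size - total);
        if (got < 0)
            return total ? (long) total : -1;
        if (got == 0)
            break;
        total += (size_t) got;
    }
    return (long) total;
}
```

Asking this source for 5 bytes yields 4, and asking for 1 byte yields 0 although data remains, which is the small-size case the text warns about.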
|
|
8894
|
442
|
8895 @deftypefn {Lstream Method} ssize_t writer (Lstream *@var{stream}, const unsigned char *@var{data}, size_t @var{size})
|
428
|
8896 Send some data to the stream's end. Data to be sent is in @var{data}
|
|
8897 and is @var{size} bytes. Return the number of bytes sent. This
|
|
8898 function can send and return fewer bytes than is passed in; in that
|
|
8899 case, the function will just be called again until there is no data left
|
|
8900 or 0 is returned. A return value of 0 means that no more data can be
|
|
8901 currently stored, but there is no error; the data will be squirreled
|
|
8902 away until the writer can accept data. (This is useful, e.g., if you're
|
|
8903 dealing with a non-blocking file descriptor and are getting
|
|
8904 @code{EWOULDBLOCK} errors.) This function can be @code{NULL} if the
|
|
8905 stream is input-only.
|
|
8906 @end deftypefn
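The calling convention for a writer method can be modeled in the same way. The sketch assumes a sink that accepts at most a few bytes per call and has a fixed capacity; @code{long} stands in for @code{ssize_t}, and all @code{toy_} names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy sink: accepts at most `chunk' bytes per call, `cap' bytes in all
   (cap must not exceed sizeof buf). */
struct toy_sink {
    unsigned char buf[32];
    size_t len;
    size_t cap;
    size_t chunk;
};

/* Writer method: may accept fewer bytes than offered; 0 means
   "nothing more can be stored right now", not an error. */
long toy_writer(struct toy_sink *sink, const unsigned char *data, size_t size)
{
    size_t room = sink->cap - sink->len;
    size_t n = size < sink->chunk ? size : sink->chunk;

    if (n > room)
        n = room;
    memcpy(sink->buf + sink->len, data, n);
    sink->len += n;
    return (long) n;
}

/* Caller-side loop: call the writer until no data is left or it
   returns 0.  Returns the number of bytes accepted; the caller is
   expected to keep the remainder buffered for a later attempt. */
long toy_write_all(struct toy_sink *sink, const unsigned char *data, size_t size)
{
    size_t sent = 0;

    while (sent < size) {
        long n = toy_writer(sink, data + sent, size - sent);
        if (n < 0)
            return sent ? (long) sent : -1;
        if (n == 0)
            break;
        sent += (size_t) n;
    }
    return (long) sent;
}
```

A short count from @code{toy_write_all} corresponds to the case in the text where the unsent data is held back until the writer can accept it.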
|
|
8907
|
|
8908 @deftypefn {Lstream Method} int rewinder (Lstream *@var{stream})
|
|
8909 Rewind the stream. If this is @code{NULL}, the stream is not seekable.
|
|
8910 @end deftypefn
|
|
8911
|
|
8912 @deftypefn {Lstream Method} int seekable_p (Lstream *@var{stream})
|
440
|
8913 Indicate whether this stream is seekable---i.e. it can be rewound.
|
428
|
8914 This method is ignored if the stream does not have a rewind method. If
|
|
8915 this method is not present, the result is determined by whether a rewind
|
|
8916 method is present.
|
|
8917 @end deftypefn
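The rule relating @code{rewinder} and @code{seekable_p} can be written as a three-way decision. The sketch below models only that rule with a hypothetical method table; the struct layout and the @code{toy_} names are assumptions, not the real @code{Lstream_implementation}.

```c
#include <assert.h>
#include <stddef.h>

struct toy_stream;

struct toy_methods {
    int (*rewinder)(struct toy_stream *);    /* NULL: cannot rewind */
    int (*seekable_p)(struct toy_stream *);  /* NULL: defer to rewinder */
};

struct toy_stream {
    const struct toy_methods *imp;
};

/* Nonzero if the stream counts as seekable under the rule above. */
int toy_is_seekable(struct toy_stream *s)
{
    if (s->imp->rewinder == NULL)
        return 0;                       /* seekable_p is ignored */
    if (s->imp->seekable_p == NULL)
        return 1;                       /* the rewinder alone decides */
    return s->imp->seekable_p(s) != 0;
}

/* Example methods for trying the rule out. */
int toy_rewind_ok(struct toy_stream *s)       { (void) s; return 0; }
int toy_never_seekable(struct toy_stream *s)  { (void) s; return 0; }
int toy_always_seekable(struct toy_stream *s) { (void) s; return 1; }
```

Note that a stream with a @code{seekable_p} method but no rewinder is reported as not seekable, whatever the method would have answered.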
|
|
8918
|
|
8919 @deftypefn {Lstream Method} int flusher (Lstream *@var{stream})
|
|
8920 Perform any additional operations necessary to flush the data in this
|
|
8921 stream.
|
|
8922 @end deftypefn
|
|
8923
|
|
8924 @deftypefn {Lstream Method} int pseudo_closer (Lstream *@var{stream})
|
|
8925 @end deftypefn
|
|
8926
|
|
8927 @deftypefn {Lstream Method} int closer (Lstream *@var{stream})
|
|
8928 Perform any additional operations necessary to close this stream down.
|
|
8929 May be @code{NULL}. This function is called when @code{Lstream_close()}
|
|
8930 is called or when the stream is garbage-collected. When this function
|
|
8931 is called, all pending data in the stream will already have been written
|
|
8932 out.
|
|
8933 @end deftypefn
|
|
8934
|
|
8935 @deftypefn {Lstream Method} Lisp_Object marker (Lisp_Object @var{lstream}, void (*@var{markfun}) (Lisp_Object))
|
|
8936 Mark this object for garbage collection. Same semantics as a standard
|
|
8937 @code{Lisp_Object} marker. This function can be @code{NULL}.
|
|
8938 @end deftypefn
|
|
8939
|
|
8940 @node Consoles; Devices; Frames; Windows, The Redisplay Mechanism, Lstreams, Top
|
|
8941 @chapter Consoles; Devices; Frames; Windows
|
462
|
8942 @cindex consoles; devices; frames; windows
|
|
8943 @cindex devices; frames; windows, consoles;
|
|
8944 @cindex frames; windows, consoles; devices;
|
|
8945 @cindex windows, consoles; devices; frames;
|
428
|
8946
|
|
8947 @menu
|
|
8948 * Introduction to Consoles; Devices; Frames; Windows::
|
|
8949 * Point::
|
|
8950 * Window Hierarchy::
|
|
8951 * The Window Object::
|
|
8952 @end menu
|
|
8953
|
462
|
8954 @node Introduction to Consoles; Devices; Frames; Windows
|
428
|
8955 @section Introduction to Consoles; Devices; Frames; Windows
|
462
|
8956 @cindex consoles; devices; frames; windows, introduction to
|
|
8957 @cindex devices; frames; windows, introduction to consoles;
|
|
8958 @cindex frames; windows, introduction to consoles; devices;
|
|
8959 @cindex windows, introduction to consoles; devices; frames;
|
428
|
8960
|
|
8961 A window-system window that you see on the screen is called a
|
|
8962 @dfn{frame} in Emacs terminology. Each frame is subdivided into one or
|
|
8963 more non-overlapping panes, called (confusingly) @dfn{windows}. Each
|
|
8964 window displays the text of a buffer in it. (See above on Buffers.) Note
|
|
8965 that buffers and windows are independent entities: Two or more windows
|
|
8966 can be displaying the same buffer (potentially in different locations),
|
|
8967 and a buffer can be displayed in no windows.
|
|
8968
|
|
8969 A single display screen that contains one or more frames is called
|
|
8970 a @dfn{display}. Under most circumstances, there is only one display.
|
|
8971 However, more than one display can exist, for example if you have
|
|
8972 a @dfn{multi-headed} console, i.e. one with a single keyboard but
|
|
8973 multiple displays. (Typically in such a situation, the various
|
|
8974 displays act like one large display, in that the mouse is only
|
|
8975 in one of them at a time, and moving the mouse off of one moves
|
|
8976 it into another.) In some cases, the different displays will
|
|
8977 have different characteristics, e.g. one color and one mono.
|
|
8978
|
|
8979 XEmacs can display frames on multiple displays. It can even deal
|
|
8980 simultaneously with frames on multiple keyboards (called @dfn{consoles} in
|
|
8981 XEmacs terminology). Here is one case where this might be useful: You
|
|
8982 are using XEmacs on your workstation at work, and leave it running.
|
|
8983 Then you go home and dial in on a TTY line, and you can use the
|
|
8984 already-running XEmacs process to display another frame on your local
|
|
8985 TTY.
|
|
8986
|
|
8987 Thus, there is a hierarchy console -> display -> frame -> window.
|
|
8988 There is a separate Lisp object type for each of these four concepts.
|
|
8989 Furthermore, there is logically a @dfn{selected console},
|
|
8990 @dfn{selected display}, @dfn{selected frame}, and @dfn{selected window}.
|
|
8991 Each of these objects is distinguished in various ways, such as being the
|
|
8992 default object for various functions that act on objects of that type.
|
442
|
8993 Note that every containing object remembers the ``selected'' object
|
428
|
8994 among the objects that it contains: e.g. not only is there a selected
|
|
8995 window, but every frame remembers the last window in it that was
|
|
8996 selected, and changing the selected frame causes the remembered window
|
|
8997 within it to become the selected window. Similar relationships apply
|
|
8998 for consoles to devices and devices to frames.
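The way a container remembers its selected member can be illustrated with just the frame and window levels. This is a simplified model with hypothetical @code{toy_} structures; the real objects carry far more state, and the same pattern would repeat one level up for each containing object.

```c
#include <assert.h>
#include <stddef.h>

struct toy_frame;

struct toy_window {
    struct toy_frame *frame;             /* the frame this window is on */
};

struct toy_frame {
    struct toy_window *selected_window;  /* last window selected here */
};

struct toy_frame  *toy_selected_frame;
struct toy_window *toy_selected_window;

/* Selecting a window also selects its frame, and the frame remembers it. */
void toy_select_window(struct toy_window *w)
{
    w->frame->selected_window = w;
    toy_selected_frame = w->frame;
    toy_selected_window = w;
}

/* Selecting a frame brings back the window it remembered. */
void toy_select_frame(struct toy_frame *f)
{
    toy_selected_frame = f;
    toy_selected_window = f->selected_window;
}
```

Switching away from a frame and back again therefore returns to the window that was selected in it before, without the caller having to track that window.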
|
|
8999
|
462
|
9000 @node Point
|
428
|
9001 @section Point
|
462
|
9002 @cindex point
|
428
|
9003
|
|
9004 Recall that every buffer has a current insertion position, called
|
|
9005 @dfn{point}. Now, two or more windows may be displaying the same buffer,
|
|
9006 and the text cursor in the two windows (i.e. @code{point}) can be in
|
|
9007 two different places. You may ask, how can that be, since each
|
|
9008 buffer has only one value of @code{point}? The answer is that each window
|
|
9009 also has a value of @code{point} that is squirreled away in it. There
|
|
9010 is only one selected window, and the value of ``point'' in that buffer
|
|
9011 corresponds to that window. When the selected window is changed
|
|
9012 from one window to another displaying the same buffer, the old
|
|
9013 value of @code{point} is stored into the old window's ``point'' and the
|
|
9014 value of @code{point} from the new window is retrieved and made the
|
|
9015 value of @code{point} in the buffer. This means that @code{window-point}
|
|
9016 for the selected window is potentially inaccurate, and if you
|
|
9017 want to retrieve the correct value of @code{point} for a window,
|
|
9018 you must special-case on the selected window and retrieve the
|
|
9019 buffer's point instead. This is related to why @code{save-window-excursion}
|
|
9020 does not save the selected window's value of @code{point}.
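The bookkeeping described above can be modeled in a few lines. The sketch uses plain integers for positions and hypothetical @code{toy_} names; it shows only the swap on selection and the special case needed when asking for a window's point.

```c
#include <assert.h>
#include <stddef.h>

struct toy_buffer {
    int point;                  /* the buffer's own value of point */
};

struct toy_window {
    struct toy_buffer *buffer;
    int pointm;                 /* stashed point; stale while selected */
};

struct toy_window *toy_selected;

/* Stash the buffer's point in the old window, then load the new
   window's stashed point into its buffer. */
void toy_select_window(struct toy_window *w)
{
    if (toy_selected != NULL)
        toy_selected->pointm = toy_selected->buffer->point;
    toy_selected = w;
    w->buffer->point = w->pointm;
}

/* Correct point for any window: special-case the selected one. */
int toy_window_point(const struct toy_window *w)
{
    return w == toy_selected ? w->buffer->point : w->pointm;
}
```

While a window is selected, its @code{pointm} in this model lags behind the buffer's point, which is why @code{toy_window_point} consults the buffer in that case.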
|
|
9021
|
462
|
9022 @node Window Hierarchy
|
428
|
9023 @section Window Hierarchy
|
|
9024 @cindex window hierarchy
|
|
9025 @cindex hierarchy of windows
|
|
9026
|
|
9027 If a frame contains multiple windows (panes), they are always created
|
|
9028 by splitting an existing window along the horizontal or vertical axis.
|
|
9029 Terminology is a bit confusing here: to @dfn{split a window
|
|
9030 horizontally} means to create two side-by-side windows, i.e. to make a
|
|
9031 @emph{vertical} cut in a window. Likewise, to @dfn{split a window
|
|
9032 vertically} means to create two windows, one above the other, by making
|
|
9033 a @emph{horizontal} cut.
|
|
9034
|
|
9035 If you split a window and then split again along the same axis, you
|
|
9036 will end up with a number of panes all arranged along the same axis.
|
|
9037 The precise way in which the splits were made should not be important,
|
|
9038 and this is reflected internally. Internally, all windows are arranged
|
|
9039 in a tree, consisting of two types of windows, @dfn{combination} windows
|
|
9040 (which have children, and are covered completely by those children) and
|
|
9041 @dfn{leaf} windows, which have no children and are visible. Every
|
|
9042 combination window has two or more children, all arranged along the same
|
|
9043 axis. There are (logically) two subtypes of windows, depending on
|
|
9044 whether their children are horizontally or vertically arrayed. There is
|
|
9045 always one root window, which is either a leaf window (if the frame
|
|
9046 contains only one window) or a combination window (if the frame contains
|
|
9047 more than one window). In the latter case, the root window will have
|
|
9048 two or more children, either horizontally or vertically arrayed, and
|
|
9049 each of those children will be either a leaf window or another
|
|
9050 combination window.
|
|
9051
|
|
9052 Here are some rules:
|
|
9053
|
|
9054 @enumerate
|
|
9055 @item
|
|
9056 Horizontal combination windows can never have children that are
|
|
9057 horizontal combination windows; same for vertical.
|
|
9058
|
|
9059 @item
|
|
9060 Only leaf windows can be split (obviously) and this splitting does one
|
|
9061 of two things: (a) turns the leaf window into a combination window and
|
|
9062 creates two new leaf children, or (b) turns the leaf window into one of
|
|
9063 the two new leaves and creates the other leaf. Rule (1) dictates which
|
|
9064 of these two outcomes happens.
|
|
9065
|
|
9066 @item
|
|
9067 Every combination window must have at least two children.
|
|
9068
|
|
9069 @item
|
|
9070 Leaf windows can never become combination windows. They can be deleted,
|
|
9071 however. If this results in a violation of (3), the parent combination
|
|
9072 window also gets deleted.
|
|
9073
|
|
9074 @item
|
|
9075 All functions that accept windows must be prepared to accept combination
|
|
9076 windows, and do something sane (e.g. signal an error if so).
|
|
9077 Combination windows @emph{do} escape to the Lisp level.
|
|
9078
|
|
9079 @item
|
|
9080 All windows have three fields governing their contents:
|
|
9081 these are @dfn{hchild} (a list of horizontally-arrayed children),
|
|
9082 @dfn{vchild} (a list of vertically-arrayed children), and @dfn{buffer}
|
|
9083 (the buffer contained in a leaf window). Exactly one of
|
444
|
9084 these will be non-@code{nil}. Remember that @dfn{horizontally-arrayed}
|
428
|
9085 means ``side-by-side'' and @dfn{vertically-arrayed} means
|
|
9086 ``one above the other''.
|
|
9087
|
|
9088 @item
|
|
9089 Leaf windows also have markers in their @code{start} (the
|
|
9090 first buffer position displayed in the window) and @code{pointm}
|
440
|
9091 (the window's stashed value of @code{point}---see above) fields,
|
444
|
9092 while combination windows have @code{nil} in these fields.
|
428
|
9093
|
|
9094 @item
|
|
9095 The list of children for a window is threaded through the
|
|
9096 @code{next} and @code{prev} fields of each child window.
|
|
9097
|
|
9098 @item
|
|
9099 @strong{Deleted windows can be undeleted}. This happens as a result of
|
|
9100 restoring a window configuration, and is unlike frames, displays, and
|
|
9101 consoles, which, once deleted, can never be restored. Deleting a window
|
|
9102 does nothing except set a special @code{dead} bit to 1 and clear out the
|
|
9103 @code{next}, @code{prev}, @code{hchild}, and @code{vchild} fields, for
|
|
9104 GC purposes.
|
|
9105
|
|
9106 @item
|
440
|
9107 Most frames actually have two top-level windows---one for the
|
428
|
9108 minibuffer and one (the @dfn{root}) for everything else. The modeline
|
|
9109 (if present) separates these two. The @code{next} field of the root
|
|
9110 points to the minibuffer, and the @code{prev} field of the minibuffer
|
|
9111 points to the root. The other @code{next} and @code{prev} fields are
|
|
9112 @code{nil}, and the frame points to both of these windows.
|
|
9113 Minibuffer-less frames have no minibuffer window, and the @code{next}
|
|
9114 and @code{prev} of the root window are @code{nil}. Minibuffer-only
|
|
9115 frames have no root window, and the @code{next} of the minibuffer window
|
|
9116 is @code{nil} but the @code{prev} points to itself. (#### This is an
|
|
9117 artifact that should be fixed.)
|
|
9118 @end enumerate
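Several of the rules above are structural invariants that can be checked mechanically. The following sketch checks rules (1), (3), (6) and the sibling threading of rule (8) on a toy tree. The struct is a stand-in with a string in place of a buffer object, and @code{toy_check} is a hypothetical helper, not an XEmacs function.

```c
#include <assert.h>
#include <stddef.h>

struct toy_window {
    struct toy_window *hchild;  /* first of the side-by-side children */
    struct toy_window *vchild;  /* first of the stacked children */
    const char *buffer;         /* stands in for a leaf's buffer */
    struct toy_window *next, *prev;
};

/* Returns the number of leaf windows below `w' (counting `w' itself if
   it is a leaf), or -1 if one of the checked rules is violated. */
int toy_check(const struct toy_window *w)
{
    int kinds = (w->hchild != NULL) + (w->vchild != NULL) + (w->buffer != NULL);
    const struct toy_window *c;
    int children = 0, leaves = 0;

    if (kinds != 1)
        return -1;                               /* rule 6 */
    if (w->buffer != NULL)
        return 1;                                /* leaf window */

    for (c = w->hchild ? w->hchild : w->vchild; c != NULL; c = c->next) {
        int n;

        if (w->hchild && c->hchild)
            return -1;                           /* rule 1, horizontal */
        if (w->vchild && c->vchild)
            return -1;                           /* rule 1, vertical */
        if (c->next != NULL && c->next->prev != c)
            return -1;                           /* rule 8 threading */
        n = toy_check(c);
        if (n < 0)
            return -1;
        leaves += n;
        children++;
    }
    return children >= 2 ? leaves : -1;          /* rule 3 */
}
```

For example, a horizontal root holding one leaf and one vertical combination of two leaves passes the check with a count of three.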
|
|
9119
|
462
|
9120 @node The Window Object
|
428
|
9121 @section The Window Object
|
462
|
9122 @cindex window object, the
|
|
9123 @cindex object, the window
|
428
|
9124
|
|
9125 Windows have the following accessible fields:
|
|
9126
|
|
9127 @table @code
|
|
9128 @item frame
|
|
9129 The frame that this window is on.
|
|
9130
|
|
9131 @item mini_p
|
|
9132 Non-@code{nil} if this window is a minibuffer window.
|
|
9133
|
|
9134 @item buffer
|
|
9135 The buffer that the window is displaying. This may change often during
|
|
9136 the life of the window.
|
|
9137
|
|
9138 @item dedicated
|
|
9139 Non-@code{nil} if this window is dedicated to its buffer.
|
|
9140
|
|
9141 @item pointm
|
|
9142 @cindex window point internals
|
|
9143 This is the value of point in the current buffer when this window is
|
|
9144 selected; when it is not selected, it retains its previous value.
|
|
9145
|
|
9146 @item start
|
|
9147 The position in the buffer that is the first character to be displayed
|
|
9148 in the window.
|
|
9149
|
|
9150 @item force_start
|
|
9151 If this flag is non-@code{nil}, it says that the window has been
|
|
9152 scrolled explicitly by the Lisp program. This affects what the next
|
|
9153 redisplay does if point is off the screen: instead of scrolling the
|
|
9154 window to show the text around point, it moves point to a location that
|
|
9155 is on the screen.
|
|
9156
|
|
9157 @item last_modified
|
|
9158 The @code{modified} field of the window's buffer, as of the last time
|
|
9159 a redisplay completed in this window.
|
|
9160
|
|
9161 @item last_point
|
|
9162 The buffer's value of point, as of the last time
|
|
9163 a redisplay completed in this window.
|
|
9164
|
|
9165 @item left
|
|
9166 This is the left-hand edge of the window, measured in columns. (The
|
|
9167 leftmost column on the screen is @w{column 0}.)
|
|
9168
|
|
9169 @item top
|
|
9170 This is the top edge of the window, measured in lines. (The top line on
|
|
9171 the screen is @w{line 0}.)
|
|
9172
|
|
9173 @item height
|
|
9174 The height of the window, measured in lines.
|
|
9175
|
|
9176 @item width
|
|
9177 The width of the window, measured in columns.
|
|
9178
|
|
9179 @item next
|
|
9180 This is the window that is the next in the chain of siblings. It is
|
|
9181 @code{nil} in a window that is the rightmost or bottommost of a group of
|
|
9182 siblings.
|
|
9183
|
|
9184 @item prev
|
|
9185 This is the window that is the previous in the chain of siblings. It is
|
|
9186 @code{nil} in a window that is the leftmost or topmost of a group of
|
|
9187 siblings.
|
|
9188
|
|
9189 @item parent
|
|
9190 Internally, XEmacs arranges windows in a tree; each group of siblings has
|
|
9191 a parent window whose area includes all the siblings. This field points
|
|
9192 to a window's parent.
|
|
9193
|
|
9194 Parent windows do not display buffers, and play little role in display
|
|
9195 except to shape their child windows. Emacs Lisp programs usually have
|
|
9196 no access to the parent windows; they operate on the windows at the
|
|
9197 leaves of the tree, which actually display buffers.
|
|
9198
|
|
9199 @item hscroll
|
|
9200 This is the number of columns that the display in the window is scrolled
|
|
9201 horizontally to the left. Normally, this is 0.
|
|
9202
|
|
9203 @item use_time
|
|
9204 This is the last time that the window was selected. The function
|
|
9205 @code{get-lru-window} uses this field.
|
|
9206
|
|
9207 @item display_table
|
|
9208 The window's display table, or @code{nil} if none is specified for it.
|
|
9209
|
|
9210 @item update_mode_line
|
|
9211 Non-@code{nil} means this window's mode line needs to be updated.
|
|
9212
|
|
9213 @item base_line_number
|
|
9214 The line number of a certain position in the buffer, or @code{nil}.
|
|
9215 This is used for displaying the line number of point in the mode line.
|
|
9216
|
|
9217 @item base_line_pos
|
|
9218 The position in the buffer for which the line number is known, or
|
|
9219 @code{nil} meaning none is known.
|
|
9220
|
|
9221 @item region_showing
|
|
9222 If the region (or part of it) is highlighted in this window, this field
|
|
9223 holds the mark position that made one end of that region. Otherwise,
|
|
9224 this field is @code{nil}.
|
|
9225 @end table
|
|
9226
|
|
9227 @node The Redisplay Mechanism, Extents, Consoles; Devices; Frames; Windows, Top
|
|
9228 @chapter The Redisplay Mechanism
|
462
|
9229 @cindex redisplay mechanism, the
|
428
|
9230
|
|
9231 The redisplay mechanism is one of the most complicated sections of
|
|
9232 XEmacs, especially from a conceptual standpoint. This is doubly so
|
|
9233 because, unlike for the basic aspects of the Lisp interpreter, the
|
|
9234 computer science theories of how to efficiently handle redisplay are not
|
|
9235 well-developed.
|
|
9236
|
|
9237 When working with the redisplay mechanism, remember the Golden Rules
|
|
9238 of Redisplay:
|
|
9239
|
|
9240 @enumerate
|
|
9241 @item
|
|
9242 It Is Better To Be Correct Than Fast.
|
|
9243 @item
|
|
9244 Thou Shalt Not Run Elisp From Within Redisplay.
|
|
9245 @item
|
|
9246 It Is Better To Be Fast Than Not To Be.
|
|
9247 @end enumerate
|
|
9248
|
|
9249 @menu
|
|
9250 * Critical Redisplay Sections::
|
|
9251 * Line Start Cache::
|
|
9252 * Redisplay Piece by Piece::
|
|
9253 @end menu
|
|
9254
|
462
|
9255 @node Critical Redisplay Sections
|
428
|
9256 @section Critical Redisplay Sections
|
462
|
9257 @cindex redisplay sections, critical
|
428
|
9258 @cindex critical redisplay sections
|
|
9259
|
|
9260 Within this section, we are defenseless and assume that the
|
|
9261 following cannot happen:
|
|
9262
|
|
9263 @enumerate
|
|
9264 @item
|
|
9265 garbage collection
|
|
9266 @item
|
|
9267 Lisp code evaluation
|
|
9268 @item
|
|
9269 frame size changes
|
|
9270 @end enumerate
|
|
9271
|
|
9272 We ensure (3) by calling @code{hold_frame_size_changes()}, which
|
|
9273 will cause any pending frame size changes to get put on hold
|
|
9274 till after the end of the critical section. (1) follows
|
|
9275 automatically if (2) is met. #### Unfortunately, there are
|
|
9276 some places where Lisp code can be called within this section.
|
|
9277 We need to remove them.
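
The deferral performed by @code{hold_frame_size_changes()} can be pictured with the following minimal sketch. It is not the XEmacs implementation: the structure and every @code{sketch_} name are hypothetical, and we assume a single frame and a simple nesting counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified frame; NOT the real XEmacs struct frame. */
struct sketch_frame {
  int width, height;                  /* size currently in effect */
  int pending_width, pending_height;  /* last size requested while held */
  bool size_change_pending;           /* a request arrived while held */
};

static int hold_depth = 0;            /* > 0 while in a critical section */

/* Enter the critical section: later size requests are only recorded. */
void sketch_hold_size_changes(void)
{
  hold_depth++;
}

/* Request a new size; applied at once unless changes are being held. */
void sketch_request_size(struct sketch_frame *f, int w, int h)
{
  if (hold_depth > 0) {
    f->pending_width = w;
    f->pending_height = h;
    f->size_change_pending = true;
  } else {
    f->width = w;
    f->height = h;
  }
}

/* Leave the critical section and apply whatever was deferred. */
void sketch_unhold_size_changes(struct sketch_frame *f)
{
  assert(hold_depth > 0);
  if (--hold_depth == 0 && f->size_change_pending) {
    f->width = f->pending_width;
    f->height = f->pending_height;
    f->size_change_pending = false;
  }
}
```

Only the most recent request is remembered in this sketch, on the assumption that intermediate sizes do not matter once the critical section has ended.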
|
|
9278
|
|
9279 If @code{Fsignal()} is called during this critical section, we
|
|
9280 will @code{abort()}.
|
|
9281
|
|
9282 If garbage collection is called during this critical section,
|
|
9283 we simply return. #### We should abort instead.
|
|
9284
|
|
9285 #### If a frame-size change does occur we should probably
|
|
9286 actually be preempting redisplay.
|
|
9287
|
462
|
9288 @node Line Start Cache
|
428
|
9289 @section Line Start Cache
|
|
9290 @cindex line start cache
|
|
9291
|
|
9292 The traditional scrolling code in Emacs breaks in a variable height
|
|
9293 world. It depends on the key assumption that the number of lines that
|
|
9294 can be displayed at any given time is fixed. This led to a complete
|
|
9295 separation of the scrolling code from the redisplay code. In order to
|
|
9296 fully support variable height lines, the scrolling code must actually be
|
|
9297 tightly integrated with redisplay. Only redisplay can determine how
|
|
9298 many lines will be displayed on a screen for any given starting point.
|
|
9299
|
|
9300 What is ideally wanted is a complete list of the starting buffer
|
|
9301 position for every possible display line of a buffer along with the
|
|
9302 height of that display line. Maintaining such a full list would be very
|
|
9303 expensive. We settle for having it include information for all areas
|
|
9304 which we happen to generate anyhow (i.e. the region currently being
|
|
9305 displayed) and for those areas we need to work with.
|
|
9306
|
|
9307 In order to ensure that the cache accurately represents what redisplay
|
|
9308 would actually show, it is necessary to invalidate it in many
|
|
9309 situations. If the buffer changes, the starting positions may no longer
|
|
9310 be correct. If a face or an extent has changed then the line heights
|
|
9311 may have altered. These events happen frequently enough that the cache
|
|
9312 can end up being constantly disabled. With this potentially constant
|
|
9313 invalidation, when is the cache ever useful?
|
|
9314
|
|
9315 Even if the cache is invalidated before every single usage, it is
|
|
9316 necessary. Scrolling often requires knowledge about display lines which
|
|
9317 are actually above or below the visible region. The cache provides a
|
|
9318 convenient light-weight method of storing this information for multiple
|
|
9319 display regions. This knowledge is necessary for the scrolling code to
|
|
9320 always obey the First Golden Rule of Redisplay.
|
|
9321
|
|
9322 If the cache already contains all of the information that the scrolling
|
|
9323 routines happen to need so that it doesn't have to go generate it, then
|
|
9324 we are able to obey the Third Golden Rule of Redisplay. The first thing
|
|
9325 we do to help out the cache is to always add the displayed region. This
|
|
9326 region had to be generated anyway, so the cache ends up getting the
|
|
9327 information basically for free. In those cases where a user is simply
|
|
9328 scrolling around viewing a buffer there is a high probability that this
|
|
9329 is sufficient to always provide the needed information. The second
|
|
9330 thing we can do is be smart about invalidating the cache.
|
|
9331
|
440
|
9332 TODO---Be smart about invalidating the cache. Potential places:
|
428
|
9333
|
|
9334 @itemize @bullet
|
|
9335 @item
|
|
9336 Insertions at end-of-line which don't cause line-wraps do not alter the
|
|
9337 starting positions of any display lines. These types of buffer
|
|
9338 modifications should not invalidate the cache. This is actually a large
|
|
9339 optimization for redisplay speed as well.
|
|
9340 @item
|
|
9341 Buffer modifications frequently only affect the display of lines at and
|
|
9342 below where they occur. In these situations we should only invalidate
|
|
9343 the part of the cache starting at where the modification occurs.
|
|
9344 @end itemize
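
The second idea in the list above, invalidating only from the point of modification onwards, might look roughly like this. This is a sketch of a possible approach, not existing XEmacs code: the entry layout and the @code{sketch_} names are assumptions, and we assume the cache is an array sorted by increasing starting position.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache entry describing one display line. */
struct sketch_line_start {
  long start;   /* buffer position where the display line begins */
  int height;   /* pixel height of the display line */
};

/* CACHE holds COUNT entries sorted by increasing start.  Return the
   number of leading entries that stay valid after a modification at
   MODPOS: the entry for the line containing MODPOS and every entry
   after it are dropped. */
size_t sketch_invalidate_from(const struct sketch_line_start *cache,
                              size_t count, long modpos)
{
  size_t lo = 0, hi = count;

  assert(cache != NULL || count == 0);

  /* Binary search for the first entry whose start is > modpos. */
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (cache[mid].start <= modpos)
      lo = mid + 1;
    else
      hi = mid;
  }

  /* LO entries start at or before modpos.  The last of them is the
     line containing modpos, so it has to go as well. */
  return lo > 0 ? lo - 1 : 0;
}
```

The caller would truncate the cache to the returned length and let redisplay regenerate the remainder on demand.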
|
|
9345
|
|
9346 In case you're wondering, the Second Golden Rule of Redisplay is not
|
|
9347 applicable.
|
|
9348
|
462
|
9349 @node Redisplay Piece by Piece
|
428
|
9350 @section Redisplay Piece by Piece
|
462
|
9351 @cindex redisplay piece by piece
|
428
|
9352
|
|
9353 As you can begin to see, redisplay is complex and also not well
|
|
9354 documented. Chuck no longer works on XEmacs so this section is my take
|
|
9355 on the workings of redisplay.
|
|
9356
|
|
9357 Redisplay happens in three phases:
|
|
9358
|
|
9359 @enumerate
|
|
9360 @item
|
|
9361 Determine desired display in area that needs redisplay.
|
|
9362 Implemented by @code{redisplay.c}.
|
|
9363 @item
|
|
9364 Compare desired display with current display.
|
|
9365 Implemented by @code{redisplay-output.c}.
|
|
9366 @item
|
|
9367 Output changes. Implemented by @code{redisplay-output.c},
|
|
9368 @code{redisplay-x.c}, @code{redisplay-msw.c} and @code{redisplay-tty.c}.
|
|
9369 @end enumerate
|
|
9370
|
442
|
9371 Steps 1 and 2 are device-independent and relatively complex. Step 3 is
|
428
|
9372 mostly device-dependent.
|
|
9373
|
|
9374 Determining the desired display
|
|
9375
|
|
9376 Display attributes are stored in @code{display_line} structures. Each
|
|
9377 @code{display_line} consists of a set of @code{display_block}'s and each
|
|
9378 @code{display_block} contains a number of @code{rune}'s. Generally
|
|
9379 dynarr's of @code{display_line}'s are held by each window representing
|
|
9380 the current display and the desired display.
|
|
9381
|
442
|
9382 The @code{display_line} structures are tightly tied to buffers, which
|
428
|
9383 presents a problem for redisplay as this connection is bogus for the
|
|
9384 modeline. Hence the @code{display_line} generation routines are
|
|
9385 duplicated for generating the modeline. This means that the modeline
|
|
9386 display code has many bugs that the standard redisplay code does not.
|
|
9387
|
|
9388 The guts of @code{display_line} generation are in
|
|
9389 @code{create_text_block}, which creates a single display line for the
|
|
9390 desired locale. This incrementally parses the characters on the current
|
442
|
9391 line and generates redisplay structures for each.
|
428
|
9392
|
|
9393 Gutter redisplay is different. Because the data to display is stored in
|
|
9394 a string, we cannot use @code{create_text_block}. Instead we use
|
|
9395 @code{create_text_string_block} which performs the same function as
|
|
9396 @code{create_text_block} but for strings. Many of the complexities of
|
|
9397 @code{create_text_block} to do with cursor handling and selective
|
|
9398 display have been removed.
|
|
9399
|
|
9400 @node Extents, Faces, The Redisplay Mechanism, Top
|
|
9401 @chapter Extents
|
462
|
9402 @cindex extents
|
428
|
9403
|
|
9404 @menu
|
|
9405 * Introduction to Extents:: Extents are ranges over text, with properties.
|
|
9406 * Extent Ordering:: How extents are ordered internally.
|
|
9407 * Format of the Extent Info:: The extent information in a buffer or string.
|
|
9408 * Zero-Length Extents:: A weird special case.
|
442
|
9409 * Mathematics of Extent Ordering:: A rigorous foundation.
|
428
|
9410 * Extent Fragments:: Cached information useful for redisplay.
|
|
9411 @end menu
|
|
9412
|
462
|
9413 @node Introduction to Extents
|
428
|
9414 @section Introduction to Extents
|
462
|
9415 @cindex extents, introduction to
|
428
|
9416
|
|
9417 Extents are regions over a buffer, with a start and an end position
|
|
9418 denoting the region of the buffer included in the extent. In
|
|
9419 addition, either end can be closed or open, meaning that the endpoint
|
|
9420 is or is not logically included in the extent. Insertion of a character
|
|
9421 at a closed endpoint causes the character to go inside the extent;
|
|
9422 insertion at an open endpoint causes the character to go outside.
|
|
9423
|
|
9424 Extent endpoints are stored using memory indices (see @file{insdel.c}),
|
|
9425 to minimize the amount of adjusting that needs to be done when
|
|
9426 characters are inserted or deleted.
|
|
9427
|
|
9428 (Formerly, extent endpoints at the gap could be either before or
|
|
9429 after the gap, depending on the open/closedness of the endpoint.
|
|
9430 The intent of this was to make it so that insertions would
|
|
9431 automatically go inside or out of extents as necessary with no
|
|
9432 further work needing to be done. It didn't work out that way,
|
|
9433 however, and just ended up complexifying and buggifying all the
|
|
9434 rest of the code.)
|
|
9435
|
462
|
9436 @node Extent Ordering
|
428
|
9437 @section Extent Ordering
|
462
|
9438 @cindex extent ordering
|
428
|
9439
|
|
9440 Extents are compared using memory indices. There are two orderings
|
|
9441 for extents and both orders are kept current at all times. The normal
|
|
9442 or @dfn{display} order is as follows:
|
|
9443
|
|
9444 @example
|
|
9445 Extent A is ``less than'' extent B,
|
|
9446 that is, earlier in the display order,
|
|
9447 if: A-start < B-start,
|
|
9448 or if: A-start = B-start, and A-end > B-end
|
|
9449 @end example
|
|
9450
|
|
9451 So if two extents begin at the same position, the larger of them is the
|
|
9452 earlier one in the display order (@code{EXTENT_LESS} is true).
|
|
9453
|
|
9454 For the e-order, the same thing holds:
|
|
9455
|
|
9456 @example
|
|
9457 Extent A is ``less than'' extent B in e-order,
|
|
9458 that is, earlier in the e-order,
|
|
9459 if: A-end < B-end,
|
|
9460 or if: A-end = B-end, and A-start > B-start
|
|
9461 @end example
|
|
9462
|
|
9463 So if two extents end at the same position, the smaller of them is the
|
|
9464 earlier one in the e-order (@code{EXTENT_E_LESS} is true).
|
|
9465
|
|
9466 The display order and the e-order are complementary orders: any
|
|
9467 theorem about the display order also applies to the e-order if you swap
|
|
9468 all occurrences of ``display order'' and ``e-order'', ``less than'' and
|
|
9469 ``greater than'', and ``extent start'' and ``extent end''.
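
The two orderings can be written down directly as comparison functions. The sketch below is illustrative only: the two-field structure and the @code{sketch_} names are hypothetical stand-ins and are not the real @code{EXTENT_LESS} and @code{EXTENT_E_LESS} macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical minimal extent: just its two memory indices. */
struct sketch_extent { long start, end; };

/* Display order: start ascending, ties broken by end descending. */
bool sketch_extent_less(const struct sketch_extent *a,
                        const struct sketch_extent *b)
{
  return a->start < b->start
      || (a->start == b->start && a->end > b->end);
}

/* E-order: end ascending, ties broken by start descending. */
bool sketch_extent_e_less(const struct sketch_extent *a,
                          const struct sketch_extent *b)
{
  return a->end < b->end
      || (a->end == b->end && a->start > b->start);
}

/* qsort adapters built on the two predicates. */
int sketch_cmp_display(const void *pa, const void *pb)
{
  const struct sketch_extent *a = pa, *b = pb;
  return sketch_extent_less(a, b) ? -1 : sketch_extent_less(b, a) ? 1 : 0;
}

int sketch_cmp_e(const void *pa, const void *pb)
{
  const struct sketch_extent *a = pa, *b = pb;
  return sketch_extent_e_less(a, b) ? -1 : sketch_extent_e_less(b, a) ? 1 : 0;
}

/* Keep both orders current: sort one copy of the extents each way. */
void sketch_sort_both(struct sketch_extent *by_display,
                      struct sketch_extent *by_e, size_t n)
{
  assert(by_display != NULL && by_e != NULL);
  qsort(by_display, n, sizeof *by_display, sketch_cmp_display);
  qsort(by_e, n, sizeof *by_e, sketch_cmp_e);
}
```

Note how the e-order predicate is obtained from the display-order one by swapping the roles of @code{start} and @code{end}, which is the complementarity described above.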
|
|
9470
|
462
|
9471 @node Format of the Extent Info
|
428
|
9472 @section Format of the Extent Info
|
462
|
9473 @cindex extent info, format of the
|
428
|
9474
|
|
9475 An extent-info structure consists of a list of the buffer or string's
|
|
9476 extents and a @dfn{stack of extents} that lists all of the extents over
|
|
9477 a particular position. The stack-of-extents info is used for
|
440
|
9478 optimization purposes---it basically caches some info that might
|
428
|
9479 be expensive to compute. Certain otherwise hard computations are easy
|
|
9480 given the stack of extents over a particular position, and if the
|
|
9481 stack of extents over a nearby position is known (because it was
|
|
9482 calculated at some prior point in time), it's easy to move the stack
|
|
9483 of extents to the proper position.
|
|
9484
|
|
9485 Given that the stack of extents is an optimization, and given that
|
|
9486 it requires memory, a string's stack of extents is wiped out each
|
|
9487 time a garbage collection occurs. Therefore, any time you retrieve
|
|
9488 the stack of extents, it might not be there. If you need it to
|
|
9489 be there, use the @code{_force} version.
|
|
9490
|
|
9491 Similarly, a string may or may not have an extent_info structure.
|
|
9492 (Generally it won't if there haven't been any extents added to the
|
|
9493 string.) So use the @code{_force} version if you need the extent_info
|
|
9494 structure to be there.
|
|
9495
|
|
9496 A list of extents is maintained as a double gap array: one gap array
|
|
9497 is ordered by start index (the @dfn{display order}) and the other is
|
|
9498 ordered by end index (the @dfn{e-order}). Note that positions in an
|
|
9499 extent list should logically be conceived of as referring @emph{to} a
|
|
9500 particular extent (as is the norm in programs) rather than sitting
|
|
9501 between two extents. Note also that callers of these functions should
|
|
9502 not be aware of the fact that the extent list is implemented as an
|
|
9503 array, except for the fact that positions are integers (this should be
|
|
9504 generalized to handle integers and linked lists equally well).
|
|
9505
|
462
|
9506 @node Zero-Length Extents
|
428
|
9507 @section Zero-Length Extents
|
462
|
9508 @cindex zero-length extents
|
|
9509 @cindex extents, zero-length
|
428
|
9510
|
|
9511 Extents can be zero-length, and will end up that way if their endpoints
|
444
|
9512 are explicitly set that way or if their detachable property is @code{nil}
|
428
|
9513 and all the text in the extent is deleted. (The exception is open-open
|
|
9514 zero-length extents, which are barred from existing because there is
|
|
9515 no sensible way to define their properties. Deletion of the text in
|
|
9516 an open-open extent causes it to be converted into a closed-open
|
|
9517 extent.) Zero-length extents are primarily used to represent
|
|
9518 annotations, and behave as follows:
|
|
9519
|
|
9520 @enumerate
|
|
9521 @item
|
|
9522 Insertion at the position of a zero-length extent expands the extent
|
|
9523 if both endpoints are closed; goes after the extent if it is closed-open;
|
|
9524 and goes before the extent if it is open-closed.
|
|
9525
|
|
9526 @item
|
|
9527 Deletion of a character on a side of a zero-length extent whose
|
|
9528 corresponding endpoint is closed causes the extent to be detached if
|
|
9529 it is detachable; if the extent is not detachable or the corresponding
|
|
9530 endpoint is open, the extent remains in the buffer, moving as necessary.
|
|
9531 @end enumerate
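
The insertion rule in item 1 above amounts to a small decision function. The following is a sketch with hypothetical names, not code taken from @file{extents.c}.

```c
#include <assert.h>
#include <stdbool.h>

/* What happens to text inserted at a zero-length extent's position. */
enum sketch_insert_effect {
  SKETCH_EXPANDS_EXTENT,     /* closed-closed: text ends up inside */
  SKETCH_TEXT_AFTER_EXTENT,  /* closed-open: text goes after the extent */
  SKETCH_TEXT_BEFORE_EXTENT  /* open-closed: text goes before the extent */
};

enum sketch_insert_effect
sketch_zero_length_insert(bool start_closed, bool end_closed)
{
  /* Open-open zero-length extents are barred from existing. */
  assert(start_closed || end_closed);

  if (start_closed && end_closed)
    return SKETCH_EXPANDS_EXTENT;
  return start_closed ? SKETCH_TEXT_AFTER_EXTENT
                      : SKETCH_TEXT_BEFORE_EXTENT;
}
```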
|
|
9532
|
|
9533 Note that closed-open, non-detachable zero-length extents behave
|
|
9534 exactly like markers and that open-closed, non-detachable zero-length
|
|
9535 extents behave like the ``point-type'' marker in Mule.
|
|
9536
|
462
|
9537 @node Mathematics of Extent Ordering
|
428
|
9538 @section Mathematics of Extent Ordering
|
462
|
9539 @cindex mathematics of extent ordering
|
428
|
9540 @cindex extent mathematics
|
|
9541 @cindex extent ordering
|
|
9542
|
|
9543 @cindex display order of extents
|
|
9544 @cindex extents, display order
|
|
9545 The extents in a buffer are ordered by ``display order'' because that
|
|
9546 is the order that the redisplay mechanism needs to process them in.
|
|
9547 The e-order is an auxiliary ordering used to facilitate operations
|
|
9548 over extents. The operations that can be performed on the ordered
|
|
9549 list of extents in a buffer are:
|
|
9550
|
|
9551 @enumerate
|
|
9552 @item
|
|
9553 Locate where an extent would go if inserted into the list.
|
|
9554 @item
|
|
9555 Insert an extent into the list.
|
|
9556 @item
|
|
9557 Remove an extent from the list.
|
|
9558 @item
|
|
9559 Map over all the extents that overlap a range.
|
|
9560 @end enumerate
|
|
9561
|
|
9562 (4) requires being able to determine the first and last extents
|
|
9563 that overlap a range.
|
|
9564
|
|
9565 NOTE: @dfn{overlap} is used as follows:
|
|
9566
|
|
9567 @itemize @bullet
|
|
9568 @item
|
|
9569 two ranges overlap if they have at least one point in common.
|
|
9570 Whether the endpoints are open or closed makes a difference here.
|
|
9571 @item
|
|
9572 a point overlaps a range if the point is contained within the
|
|
9573 range; this is equivalent to treating a point @math{P} as the range
|
|
9574 @math{[P, P]}.
|
|
9575 @item
|
|
9576 In the case of an @emph{extent} overlapping a point or range, the extent
|
|
9577 is normally treated as having closed endpoints. This applies
|
|
9578 consistently in the discussion of stacks of extents and such below.
|
|
9579 Note that this definition of overlap is not necessarily consistent with
|
|
9580 the extents that @code{map-extents} maps over, since @code{map-extents}
|
|
9581 sometimes pays attention to whether the endpoints of an extent are open
|
|
9582 or closed. But for our purposes, it greatly simplifies things to treat
|
|
9583 all extents as having closed endpoints.
|
|
9584 @end itemize
|
|
9585
|
|
9586 First, define @math{>}, @math{<}, @math{<=}, etc. as applied to extents
|
|
9587 to mean comparison according to the display order. Comparison between
|
|
9588 an extent @math{E} and an index @math{I} means comparison between
|
|
9589 @math{E} and the range @math{[I, I]}.
|
|
9590
|
|
9591 Also define @math{e>}, @math{e<}, @math{e<=}, etc. to mean comparison
|
|
9592 according to the e-order.
|
|
9593
|
|
9594 For any range @math{R}, define @math{R(0)} to be the starting index of
|
|
9595 the range and @math{R(1)} to be the ending index of the range.
|
|
9596
|
|
9597 For any extent @math{E}, define @math{E(next)} to be the extent directly
|
|
9598 following @math{E}, and @math{E(prev)} to be the extent directly
|
|
9599 preceding @math{E}. Assume @math{E(next)} and @math{E(prev)} can be
|
|
9600 determined from @math{E} in constant time. (This is because we store
|
|
9601 the extent list as a doubly linked list.)
|
|
9602
|
|
9603 Similarly, define @math{E(e-next)} and @math{E(e-prev)} to be the
|
|
9604 extents directly following and preceding @math{E} in the e-order.
|
|
9605
|
|
9606 Now:
|
|
9607
|
|
9608 Let @math{R} be a range.
|
|
9609 Let @math{F} be the first extent overlapping @math{R}.
|
|
9610 Let @math{L} be the last extent overlapping @math{R}.
|
|
9611
|
|
9612 Theorem 1: @math{R(1)} lies between @math{L} and @math{L(next)},
|
|
9613 i.e. @math{L <= R(1) < L(next)}.
|
|
9614
|
|
9615 This follows easily from the definition of display order. The
|
|
9616 basic reason that this theorem applies is that the display order
|
|
9617 sorts by increasing starting index.
|
|
9618
|
|
9619 Therefore, we can determine @math{L} just by looking at where we would
|
|
9620 insert @math{R(1)} into the list, and if we know @math{F} and are moving
|
|
9621 forward over extents, we can easily determine when we've hit @math{L} by
|
|
9622 comparing the extent we're at to @math{R(1)}.
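
As an illustration of operation (4), the sketch below maps over the extents overlapping a range, treating all endpoints as closed. It relies only on the fact that the display order sorts by increasing starting index, so the walk can stop at the first extent that starts after @math{R(1)}. For simplicity it walks from the head of the list instead of first locating @math{F}; the array representation and all names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical extent with closed endpoints: [start, end]. */
struct sketch_span { long start, end; };

/* EXTENTS holds COUNT extents sorted in display order.  Store the
   indices of those overlapping the closed range [r0, r1] into OUT,
   which must have room for COUNT entries, and return how many were
   stored. */
size_t sketch_map_overlapping(const struct sketch_span *extents,
                              size_t count, long r0, long r1,
                              size_t *out)
{
  size_t found = 0;

  assert(r0 <= r1);

  for (size_t i = 0; i < count; i++) {
    if (extents[i].start > r1)
      break;            /* sorted by start: no later extent can overlap */
    if (extents[i].end >= r0)
      out[found++] = i; /* start <= r1 and end >= r0: overlaps */
  }
  return found;
}
```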
|
|
9623
|
|
9624 @example
|
|
9625 Theorem 2: @math{F(e-prev) e< [1, R(0)] e<= F}.
|
|
9626 @end example
|
|
9627
|
|
9628 This is the analog of Theorem 1, and applies because the e-order
|
|
9629 sorts by increasing ending index.
|
|
9630
|
|
9631 Therefore, @math{F} can be found in the same amount of time as
|
|
9632 operation (1), i.e. the time that it takes to locate where an extent
|
|
9633 would go if inserted into the e-order list.
|
|
9634
|
|
9635 If the lists were stored as balanced binary trees, then operation (1)
|
|
9636 would take logarithmic time, which is usually quite fast. However,
|
|
9637 currently they're stored as simple doubly-linked lists, and instead we
|
|
9638 do some caching to try to speed things up.
|
|
9639
|
|
9640 Define a @dfn{stack of extents} (or @dfn{SOE}) as the set of extents
|
|
9641 (ordered in the display order) that overlap an index @math{I}, together
|
|
9642 with the SOE's @dfn{previous} extent, which is an extent that precedes
|
|
9643 @math{I} in the e-order. (Hopefully there will not be very many extents
|
|
9644 between @math{I} and the previous extent.)
|
|
9645
|
|
9646 Now:
|
|
9647
|
|
9648 Let @math{I} be an index, let @math{S} be the stack of extents on
|
|
9649 @math{I}, let @math{F} be the first extent in @math{S}, and let @math{P}
|
|
9650 be @math{S}'s previous extent.
|
|
9651
|
|
9652 Theorem 3: The first extent in @math{S} is the first extent that overlaps
|
|
9653 any range @math{[I, J]}.
|
|
9654
|
|
9655 Proof: Any extent that overlaps @math{[I, J]} but does not include
|
|
9656 @math{I} must have a start index @math{> I}, and thus be greater than
|
|
9657 any extent in @math{S}.
|
|
9658
|
|
9659 Therefore, finding the first extent that overlaps a range @math{R} is
|
|
9660 the same as finding the first extent that overlaps @math{R(0)}.
|
|
9661
|
|
9662 Theorem 4: Let @math{I2} be an index such that @math{I2 > I}, and let
|
|
9663 @math{F2} be the first extent that overlaps @math{I2}. Then, either
|
|
9664 @math{F2} is in @math{S} or @math{F2} is greater than any extent in
|
|
9665 @math{S}.
|
|
9666
|
|
9667 Proof: If @math{F2} does not include @math{I} then its start index is
|
|
9668 greater than @math{I} and thus it is greater than any extent in
|
|
9669 @math{S}, including @math{F}. Otherwise, @math{F2} includes @math{I}
|
|
9670 and thus is in @math{S}, and thus @math{F2 >= F}.
|
|
9671
|
462
|
9672 @node Extent Fragments
|
428
|
9673 @section Extent Fragments
|
462
|
9674 @cindex extent fragments
|
|
9675 @cindex fragments, extent
|
428
|
9676
|
|
9677 Imagine that the buffer is divided up into contiguous, non-overlapping
|
|
9678 @dfn{runs} of text such that no extent starts or ends within a run
|
|
9679 (extents that abut the run don't count).
|
|
9680
|
|
9681 An extent fragment is a structure that holds data about the run that
|
|
9682 contains a particular buffer position (if the buffer position is at the
|
440
|
9683 junction of two runs, the run after the position is used)---the
|
428
|
9684 beginning and end of the run, a list of all of the extents in that run,
|
|
9685 the @dfn{merged face} that results from merging all of the faces
|
|
9686 corresponding to those extents, the begin and end glyphs at the
|
|
9687 beginning of the run, etc. This is the information that redisplay needs
|
|
9688 in order to display this run.
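
To make the notion of a run concrete, here is a sketch that finds the run containing a given position by brute force. It is not how the extent-fragment code works (that code is incremental and uses the stack of extents); the structures and names are hypothetical, and we assume every extent endpoint is a run boundary.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_ext { long start, end; };  /* hypothetical extent */
struct sketch_run { long start, end; };  /* run covers [start, end) */

/* Return the run containing POS in a buffer spanning
   [buf_start, buf_end).  If POS sits on a boundary between two runs,
   the run after POS is returned. */
struct sketch_run sketch_run_at(const struct sketch_ext *extents,
                                size_t count, long buf_start,
                                long buf_end, long pos)
{
  struct sketch_run run = { buf_start, buf_end };

  assert(buf_start <= pos && pos < buf_end);

  for (size_t i = 0; i < count; i++) {
    const long ends[2] = { extents[i].start, extents[i].end };
    for (int k = 0; k < 2; k++) {
      if (ends[k] <= pos && ends[k] > run.start)
        run.start = ends[k];  /* nearest boundary at or before pos */
      if (ends[k] > pos && ends[k] < run.end)
        run.end = ends[k];    /* nearest boundary after pos */
    }
  }
  return run;
}
```

Moving linearly through the buffer then means asking for the run at the previous run's @code{end}, which is why a fast incremental update matters in the real code.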
|
|
9689
|
|
9690 Extent fragments have to be very quick to update to a new buffer
|
|
9691 position when moving linearly through the buffer. They rely on the
|
|
9692 stack-of-extents code, which does the heavy-duty algorithmic work of
|
|
9693 determining which extents overlap a particular position.
|
|
9694
|
|
9695 @node Faces, Glyphs, Extents, Top
|
|
9696 @chapter Faces
|
462
|
9697 @cindex faces
|
428
|
9698
|
|
9699 Not yet documented.
|
|
9700
|
|
9701 @node Glyphs, Specifiers, Faces, Top
|
|
9702 @chapter Glyphs
|
462
|
9703 @cindex glyphs
|
428
|
9704
|
|
9705 Glyphs are graphical elements that can be displayed in XEmacs buffers or
|
|
9706 gutters. We use the term graphical element here in the broadest possible
|
446
|
9707 sense since glyphs can be as mundane as text or as arcane as a native
|
428
|
9708 tab widget.
|
|
9709
|
|
9710 In XEmacs, glyphs represent the uninstantiated state of graphical
|
|
9711 elements, i.e. they hold all the information necessary to produce an
|
446
|
9712 image on-screen but the image need not exist at this stage, and multiple
|
|
9713 screen images can be instantiated from a single glyph.
|
428
|
9714
|
|
9715 Glyphs are lazily instantiated by calling one of the glyph
|
|
9716 functions. This usually occurs within redisplay when
|
|
9717 @code{Fglyph_height} is called. Instantiation causes an image-instance
|
454
|
9718 to be created and cached. This cache is on a per-device basis for all glyphs
|
|
9719 except widget-glyphs, and on a per-window basis for widget-glyphs. The
|
428
|
9720 caching is done by @code{image_instantiate} and is necessary because it
|
|
9721 is generally possible to display an image-instance in multiple
|
|
9722 domains. For instance if we create a Pixmap, we can actually display
|
|
9723 this on multiple windows - even though we only need a single Pixmap
|
|
9724 instance to do this. If caching wasn't done then it would be necessary
|
442
|
9725 to create image-instances for every displayable occurrence of a glyph -
|
428
|
9726 and every usage - and this would be extremely memory and cpu intensive.
|
|
9727
|
|
9728 Widget-glyphs (a.k.a.@: native widgets) are not cached in this way. This is
|
|
9729 because widget-glyph image-instances on screen are toolkit windows, and
|
|
9730 thus cannot be reused in multiple XEmacs domains. Thus widget-glyphs are
|
446
|
9731 cached on an XEmacs window basis.
|
428
|
9732
|
|
9733 Any action on a glyph first consults the cache before actually
|
|
9734 instantiating a widget.
|
|
9735
|
454
|
9736 @section Glyph Instantiation
|
462
|
9737 @cindex glyph instantiation
|
|
9738 @cindex instantiation, glyph
|
454
|
9739
|
|
9740 Glyph instantiation is a hairy topic and requires some explanation. The
|
|
9741 guts of glyph instantiation are contained within
|
|
9742 @code{image_instantiate}. A glyph contains an image which is a
|
|
9743 specifier. When a glyph function - for instance @code{Fglyph_height} -
|
|
9744 asks for a property of the glyph that can only be determined from its
|
|
9745 instantiated state, then the glyph image is instantiated and an image
|
|
9746 instance created. The instantiation process is governed by the specifier
|
|
9747 code and goes through a series of steps:
|
|
9748
|
|
9749 @itemize @bullet
|
|
9750 @item
|
|
9751 Validation. Instantiation of image instances happens dynamically - often
|
|
9752 within the guts of redisplay. Thus it is often not feasible to catch
|
|
9753 instantiator errors at instantiation time. Instead the instantiator is
|
|
9754 validated at the time it is added to the image specifier. This function
|
|
9755 is defined by @code{image_validate} and at a simple level validates
|
|
9756 keyword-value pairs.
|
|
9757 @item
|
|
9758 Duplication. The specifier code by default takes a copy of the
|
|
9759 instantiator. This is reasonable for most specifiers but in the case of
|
|
9760 widget-glyphs can be problematic, since some of the properties in the
|
|
9761 instantiator - for instance callbacks - could cause infinite recursion
|
|
9762 in the copying process. Thus the image code defines a function -
|
|
9763 @code{image_copy_instantiator} - which will selectively copy values.
|
|
9764 This is controlled by the way that a keyword is defined either using
|
|
9765 @code{IIFORMAT_VALID_KEYWORD} or
|
|
9766 @code{IIFORMAT_VALID_NONCOPY_KEYWORD}. Note that the image caching and
|
|
9767 redisplay code relies on instantiator copying to ensure that current and
|
|
9768 new instantiators are actually different rather than referring to the
|
|
9769 same thing.
|
|
9770 @item
|
|
9771 Normalization. Once the instantiator has been copied it must be
|
|
9772 converted into a form that is viable at instantiation time. This can
|
|
9773 involve no changes at all, but typically involves things like converting
|
|
9774 file names to the actual data. This function is defined by
|
|
9775 @code{image_going_to_add} and @code{normalize_image_instantiator}.
|
|
9776 @item
|
|
9777 Instantiation. When an image instance is actually required for display
|
|
9778 it is instantiated using @code{image_instantiate}. This involves calling
|
|
9779 instantiate methods that are specific to the type of image being
|
|
9780 instantiated.
|
|
9781 @end itemize
|
|
9782
|
|
9783 The final instantiation phase also involves a number of steps. In order
|
|
9784 to understand these we need to describe a number of concepts.
|
|
9785
|
|
9786 An image is instantiated in a @dfn{domain}, where a domain can be any
|
|
9787 one of a device, frame, window or image-instance. The domain gives the
|
|
9788 image-instance context and identity, and properties that affect the
|
|
9789 appearance of the image-instance may be different for the same glyph
|
|
9790 instantiated in different domains. An example is the face used to
|
|
9791 display the image-instance.
|
|
9792
|
|
9793 Although an image is instantiated in a particular domain the
|
|
9794 instantiation domain is not necessarily the domain in which the
|
|
9795 image-instance is cached. For example a pixmap can be instantiated in a
|
|
9796 window but actually be cached on a per-device basis. The domain in which
|
|
9797 the image-instance is actually cached is called the
|
|
9798 @dfn{governing-domain}. A governing-domain is currently either a device
|
|
9799 or a window. Widget-glyphs and text-glyphs have a window as a
|
|
9800 governing-domain, all other image-instances have a device as the
|
|
9801 governing-domain. The governing domain for an image-instance is
|
|
9802 determined using the @code{governing_domain} image-instance method.
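
The rule for choosing the governing-domain can be summarized in a few lines. The sketch below only restates the rule given above; the enumerations and the function name are hypothetical and do not correspond to the actual image-instance method tables.

```c
#include <assert.h>

/* Hypothetical subset of image-instance types. */
enum sketch_image_type {
  SKETCH_TEXT,
  SKETCH_PIXMAP,
  SKETCH_WIDGET,
  SKETCH_IMAGE_TYPE_COUNT
};

/* The two kinds of governing-domain currently possible. */
enum sketch_domain_kind { SKETCH_DEVICE, SKETCH_WINDOW };

/* Widget-glyphs and text-glyphs are cached per window; every other
   image-instance is cached per device. */
enum sketch_domain_kind
sketch_governing_domain(enum sketch_image_type type)
{
  assert(type < SKETCH_IMAGE_TYPE_COUNT);

  switch (type) {
  case SKETCH_WIDGET:
  case SKETCH_TEXT:
    return SKETCH_WINDOW;
  default:
    return SKETCH_DEVICE;
  }
}
```

A cache lookup would then be keyed on the governing-domain and not on the domain in which instantiation was requested, so a pixmap requested in two windows of one device shares a single image-instance.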
|
|
9803
|
|
9804 @section Widget-Glyphs
|
462
|
9805 @cindex widget-glyphs
|
454
|
9806
|
440
|
9807 @section Widget-Glyphs in the MS-Windows Environment
|
462
|
9808 @cindex widget-glyphs in the MS-Windows environment
|
|
9809 @cindex MS-Windows environment, widget-glyphs in the
|
428
|
9810
|
|
9811 To Do
|
|
9812
|
|
9813 @section Widget-Glyphs in the X Environment
|
462
|
9814 @cindex widget-glyphs in the X environment
|
|
9815 @cindex X environment, widget-glyphs in the
|
428
|
9816
|
446
|
9817 Widget-glyphs under X make heavy use of lwlib (@pxref{Lucid Widget
|
|
9818 Library}) for manipulating the native toolkit objects. This is primarily
|
|
9819 so that different toolkits can be supported for widget-glyphs, just as
|
|
9820 they are supported for features such as menubars etc.
|
428
|
9821
|
454
|
9822 Lwlib is extremely poorly documented and quite hairy so here is my
|
|
9823 understanding of what goes on.
|
|
9824
|
|
9825 Lwlib maintains a set of widget_instances which mirror the hierarchical
|
|
9826 state of Xt widgets. I think this is so that widgets can be updated and
|
|
9827 manipulated generically by the lwlib library. For instance
|
|
9828 update_one_widget_instance can cope with multiple types of widget and
|
|
9829 multiple types of toolkit. Each element in the widget hierarchy is updated
|
|
9830 from its corresponding widget_instance by walking the widget_instance
|
|
9831 tree recursively.
|
|
9832
|
|
9833 This has desirable properties such as lw_modify_all_widgets which is
|
|
9834 called from @file{glyphs-x.c} and updates all the properties of a widget
|
|
9835 without having to know what the widget is or what toolkit it is from.
|
|
9836 Unfortunately this also has hairy properties such as making the lwlib
|
|
9837 code quite complex. And of course lwlib has to know at some level what
|
|
9838 the widget is and how to set its properties.
|
|
9839
|
428
|
9840 @node Specifiers, Menus, Glyphs, Top
|
|
9841 @chapter Specifiers
|
462
|
9842 @cindex specifiers
|
428
|
9843
|
|
9844 Not yet documented.
|
|
9845
|
|
9846 @node Menus, Subprocesses, Specifiers, Top
|
|
9847 @chapter Menus
|
462
|
9848 @cindex menus
|
428
|
9849
|
|
9850 A menu is set by setting the value of the variable
|
|
9851 @code{current-menubar} (which may be buffer-local) and then calling
|
|
9852 @code{set-menubar-dirty-flag} to signal a change. This will cause the
|
|
9853 menu to be redrawn at the next redisplay. The format of the data in
|
|
9854 @code{current-menubar} is described in @file{menubar.c}.
|
|
9855
|
|
9856 Internally the data in @code{current-menubar} is parsed into a tree of
|
|
9857 @code{widget_value}'s (defined in @file{lwlib.h}); this is accomplished
|
|
9858 by the recursive function @code{menu_item_descriptor_to_widget_value()},
|
|
9859 called by @code{compute_menubar_data()}. Such a tree is deallocated
|
|
9860 using @code{free_widget_value()}.
|
|
9861
|
|
9862 @code{update_screen_menubars()} is one of the external entry points.
|
|
9863 This checks to see, for each screen, if that screen's menubar needs to
|
|
9864 be updated. This is the case if
|
|
9865
|
|
9866 @enumerate
|
|
9867 @item
|
|
9868 @code{set-menubar-dirty-flag} was called since the last redisplay. (This
|
|
9869 function sets the C variable @code{menubar_has_changed}.)
|
|
9870 @item
|
|
9871 The buffer displayed in the screen has changed.
|
|
9872 @item
|
|
9873 The screen has no menubar currently displayed.
|
|
9874 @end enumerate
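
The three conditions above can be sketched as a single predicate.  This is
only an illustration: @code{struct screen_sketch} and
@code{screen_menubar_needs_update} are hypothetical names invented here,
and the real per-screen data in XEmacs is organized differently.  Only
@code{menubar_has_changed} comes from the text.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for per-screen state.  The real
   XEmacs structures are richer and are not named like this. */
struct screen_sketch {
    const void *displayed_buffer; /* buffer the menubar was last built for */
    const void *current_buffer;   /* buffer now displayed in the screen */
    const void *menubar_widget;   /* NULL if no menubar is displayed */
};

/* Stands in for the C variable set by set-menubar-dirty-flag. */
static int menubar_has_changed = 0;

/* Returns true if any of the three listed conditions holds. */
static bool
screen_menubar_needs_update(const struct screen_sketch *s)
{
    if (menubar_has_changed)
        return true;                 /* condition 1: dirty flag was set */
    if (s->displayed_buffer != s->current_buffer)
        return true;                 /* condition 2: buffer has changed */
    if (s->menubar_widget == NULL)
        return true;                 /* condition 3: no menubar shown */
    return false;
}
```

A caller would loop over all screens and rebuild the menubar only for
those where the predicate returns true.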
|
|
9875
|
|
9876 @code{set_screen_menubar()} is called for each such screen. This
|
|
9877 function calls @code{compute_menubar_data()} to create the tree of
|
|
9878 @code{widget_value}s, then calls @code{lw_create_widget()},
|
|
9879 @code{lw_modify_all_widgets()}, and/or @code{lw_destroy_all_widgets()}
|
|
9880 to create the X-Toolkit widget associated with the menu.
|
|
9881
|
|
9882 @code{update_psheets()}, the other external entry point, actually
|
|
9883 changes the menus being displayed. It uses the widgets fixed by
|
|
9884 @code{update_screen_menubars()} and calls various X functions to ensure
|
|
9885 that the menus are displayed properly.
|
|
9886
|
|
9887 The menubar widget is set up so that @code{pre_activate_callback()} is
|
|
9888 called when the menu is first selected (i.e. mouse button goes down),
|
|
9889 and @code{menubar_selection_callback()} is called when an item is
|
|
9890 selected. @code{pre_activate_callback()} calls the function in
|
|
9891 @code{activate-menubar-hook}, which can change the menubar (this is described
|
|
9892 in @file{menubar.c}). If the menubar is changed,
|
|
9893 @code{set_screen_menubars()} is called.
|
|
9894 @code{menubar_selection_callback()} enqueues a menu event, putting in it
|
|
9895 a function to call (either @code{eval} or @code{call-interactively}) and
|
|
9896 its argument, which is the callback function or form given in the menu's
|
|
9897 description.
|
|
9898
|
446
|
9899 @node Subprocesses, Interface to the X Window System, Menus, Top
|
428
|
9900 @chapter Subprocesses
|
462
|
9901 @cindex subprocesses
|
428
|
9902
|
|
9903 The fields of a process are:
|
|
9904
|
|
9905 @table @code
|
|
9906 @item name
|
|
9907 A string, the name of the process.
|
|
9908
|
|
9909 @item command
|
|
9910 A list containing the command arguments that were used to start this
|
|
9911 process.
|
|
9912
|
|
9913 @item filter
|
|
9914 A function used to accept output from the process instead of a buffer,
|
|
9915 or @code{nil}.
|
|
9916
|
|
9917 @item sentinel
|
|
9918 A function called whenever the process changes status, or @code{nil}.
|
|
9919
|
|
9920 @item buffer
|
|
9921 The associated buffer of the process.
|
|
9922
|
|
9923 @item pid
|
|
9924 An integer, the Unix process @sc{id}.
|
|
9925
|
|
9926 @item childp
|
|
9927 A flag, non-@code{nil} if this is really a child process.
|
|
9928 It is @code{nil} for a network connection.
|
|
9929
|
|
9930 @item mark
|
|
9931 A marker indicating the position of the end of the last output from this
|
|
9932 process inserted into the buffer. This is often but not always the end
|
|
9933 of the buffer.
|
|
9934
|
|
9935 @item kill_without_query
|
|
9936 If this is non-@code{nil}, killing XEmacs while this process is still
|
|
9937 running does not ask for confirmation about killing the process.
|
|
9938
|
|
9939 @item raw_status_low
|
|
9940 @itemx raw_status_high
|
|
9941 These two fields record 16 bits each of the process status returned by
|
|
9942 the @code{wait} system call.
|
|
9943
|
|
9944 @item status
|
|
9945 The process status, as @code{process-status} should return it.
|
|
9946
|
|
9947 @item tick
|
|
9948 @itemx update_tick
|
|
9949 If these two fields are not equal, a change in the status of the process
|
|
9950 needs to be reported, either by running the sentinel or by inserting a
|
|
9951 message in the process buffer.
|
|
9952
|
|
9953 @item pty_flag
|
|
9954 Non-@code{nil} if communication with the subprocess uses a @sc{pty};
|
|
9955 @code{nil} if it uses a pipe.
|
|
9956
|
|
9957 @item infd
|
|
9958 The file descriptor for input from the process.
|
|
9959
|
|
9960 @item outfd
|
|
9961 The file descriptor for output to the process.
|
|
9962
|
|
9963 @item subtty
|
|
9964 The file descriptor for the terminal that the subprocess is using. (On
|
|
9965 some systems, there is no need to record this, so the value is
|
|
9966 @code{-1}.)
|
|
9967
|
|
9968 @item tty_name
|
|
9969 The name of the terminal that the subprocess is using,
|
|
9970 or @code{nil} if it is using pipes.
|
|
9971 @end table
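
As a rough sketch of how two of these field pairs can be used, the code
below splits a status word into 16-bit halves and compares the two tick
counters.  @code{struct process_sketch} and the helper names are
hypothetical; the real fields are Lisp objects in the process structure,
and the assumption that @code{raw_status_low} holds the low-order half is
inferred from the field names only.

```c
#include <stdbool.h>

/* Hypothetical plain-C mirror of four of the fields described above. */
struct process_sketch {
    unsigned int raw_status_low;  /* assumed: low 16 bits of the status */
    unsigned int raw_status_high; /* assumed: high 16 bits of the status */
    int tick;
    int update_tick;
};

/* Split a status word into two 16-bit halves. */
static void
store_raw_status(struct process_sketch *p, unsigned int status)
{
    p->raw_status_low = status & 0xffffu;
    p->raw_status_high = (status >> 16) & 0xffffu;
}

/* Recombine the two halves into the original status word. */
static unsigned int
load_raw_status(const struct process_sketch *p)
{
    return (p->raw_status_high << 16) | (p->raw_status_low & 0xffffu);
}

/* A status change still has to be reported while the ticks differ. */
static bool
status_change_pending(const struct process_sketch *p)
{
    return p->tick != p->update_tick;
}
```

Once the change has been reported (by the sentinel or by a message in the
process buffer), setting @code{update_tick} equal to @code{tick} would
clear the pending state in this sketch.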
|
|
9972
|
446
|
9973 @node Interface to the X Window System, Index, Subprocesses, Top
|
|
9974 @chapter Interface to the X Window System
|
462
|
9975 @cindex X Window System, interface to the
|
446
|
9976
|
|
9977 Mostly undocumented.
|
|
9978
|
|
9979 @menu
|
|
9980 * Lucid Widget Library:: An interface to various widget sets.
|
|
9981 @end menu
|
|
9982
|
462
|
9983 @node Lucid Widget Library
|
446
|
9984 @section Lucid Widget Library
|
462
|
9985 @cindex Lucid Widget Library
|
|
9986 @cindex widget library, Lucid
|
|
9987 @cindex library, Lucid Widget
|
446
|
9988
|
|
9989 Lwlib is extremely poorly documented and quite hairy. The author(s)
|
|
9990 blame that on X, Xt, and Motif, with some justice, but also sufficient
|
|
9991 hypocrisy to avoid drawing the obvious conclusion about their own work.
|
|
9992
|
|
9993 The Lucid Widget Library is composed of two more or less independent
|
|
9994 pieces. The first, as the name suggests, is a set of widgets. These
|
|
9995 widgets are intended to resemble and improve on widgets provided in the
|
|
9996 Motif toolkit but not in the Athena widgets, including menubars and
|
|
9997 scrollbars. Recent additions by Andy Piper integrate some ``modern''
|
|
9998 widgets by Edward Falk, including checkboxes, radio buttons, progress
|
|
9999 gauges, and index tab controls (aka notebooks).
|
|
10000
|
|
10001 The second piece of the Lucid widget library is a generic interface to
|
|
10002 several toolkits for X (including Xt, the Athena widget set, and Motif,
|
|
10003 as well as the Lucid widgets themselves) so that core XEmacs code need
|
|
10004 not know which widget set has been used to build the graphical user
|
|
10005 interface.
|
|
10006
|
|
10007 @menu
|
|
10008 * Generic Widget Interface:: The lwlib generic widget interface.
|
|
10009 * Scrollbars::
|
|
10010 * Menubars::
|
|
10011 * Checkboxes and Radio Buttons::
|
|
10012 * Progress Bars::
|
|
10013 * Tab Controls::
|
|
10014 @end menu
|
|
10015
|
462
|
10016 @node Generic Widget Interface
|
446
|
10017 @subsection Generic Widget Interface
|
462
|
10018 @cindex widget interface, generic
|
446
|
10019
|
|
10020 In general in any toolkit a widget may be a composite object. In Xt,
|
|
10021 all widgets have an X window that they manage, but typically a complex
|
|
10022 widget will have widget children, each of which manages a subwindow of
|
|
10023 the parent widget's X window. These children may themselves be
|
|
10024 composite widgets. Thus a widget is actually a tree or hierarchy of
|
|
10025 widgets.
|
|
10026
|
|
10027 For each toolkit widget, lwlib maintains a tree of @code{widget_value}s
|
|
10028 which mirror the hierarchical state of Xt widgets (including Motif,
|
|
10029 Athena, 3D Athena, and Falk's widget sets). Each @code{widget_value}
|
|
10030 has a @code{contents} member, which points to the head of a linked list of
|
|
10031 its children. The linked list of siblings is chained through the
|
|
10032 @code{next} member of @code{widget_value}.
|
|
10033
|
|
10034 @example
|
|
10035 +-----------+
|
|
10036 | composite |
|
|
10037 +-----------+
|
|
10038 |
|
|
10039 | contents
|
|
10040 V
|
|
10041 +-------+ next +-------+ next +-------+
|
|
10042 | child |----->| child |----->| child |
|
|
10043 +-------+ +-------+ +-------+
|
|
10044 |
|
|
10045 | contents
|
|
10046 V
|
|
10047 +-------------+ next +-------------+
|
|
10048 | grand child |----->| grand child |
|
|
10049 +-------------+ +-------------+
|
|
10050
|
|
10051 The @code{widget_value} hierarchy of a composite widget with two simple
|
|
10052 children and one composite child.
|
|
10053 @end example
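
The first-child/next-sibling layout drawn above can be walked with a short
recursive function.  The following is a minimal sketch:
@code{struct wv_sketch} is a hypothetical stand-in that keeps only the
@code{contents} and @code{next} links named in the text, not the actual
@code{widget_value} declaration from @file{lwlib.h}.

```c
#include <stddef.h>

/* Hypothetical reduction of widget_value to its two link members. */
struct wv_sketch {
    const char *name;
    struct wv_sketch *contents; /* head of the list of children */
    struct wv_sketch *next;     /* next sibling in the parent's list */
};

/* Count a node, all of its following siblings, and all descendants. */
static int
count_widget_values(const struct wv_sketch *wv)
{
    int n = 0;
    for (; wv != NULL; wv = wv->next)
        n += 1 + count_widget_values(wv->contents);
    return n;
}
```

The same loop-over-siblings, recurse-into-children shape applies to any
per-node operation, such as freeing the tree or copying properties.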
|
|
10054
|
|
10055 The @code{widget_instance} structure maintains the inverse view of the
|
|
10056 tree. As for the @code{widget_value}, siblings are chained through the
|
|
10057 @code{next} member. However, rather than naming children, the
|
|
10058 @code{widget_instance} tree links to parents.
|
|
10059
|
|
10060 @example
|
|
10061 +-----------+
|
|
10062 | composite |
|
|
10063 +-----------+
|
|
10064 A
|
|
10065 | parent
|
|
10066 |
|
|
10067 +-------+ next +-------+ next +-------+
|
|
10068 | child |----->| child |----->| child |
|
|
10069 +-------+ +-------+ +-------+
|
|
10070 A
|
|
10071 | parent
|
|
10072 |
|
|
10073 +-------------+ next +-------------+
|
|
10074 | grand child |----->| grand child |
|
|
10075 +-------------+ +-------------+
|
|
10076
|
|
10077 The @code{widget_instance} hierarchy of a composite widget with two simple
|
|
10078 children and one composite child.
|
|
10079 @end example
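
With parent links, the natural traversal runs upward from a node to the
root.  The sketch below assumes a simplified, hypothetical
@code{struct wi_sketch} holding only @code{parent} and @code{next}
pointers; the real @code{widget_instance} in @file{lwlib.h} carries
toolkit-specific data and may represent its parent differently.

```c
#include <stddef.h>

/* Hypothetical reduction of widget_instance to its two link members. */
struct wi_sketch {
    const char *name;
    struct wi_sketch *parent; /* NULL at the root of the tree */
    struct wi_sketch *next;   /* next sibling */
};

/* Number of parent links between WI and the root. */
static int
instance_depth(const struct wi_sketch *wi)
{
    int depth = 0;
    while (wi != NULL && wi->parent != NULL) {
        wi = wi->parent;
        depth++;
    }
    return depth;
}

/* Follow parent links until the root is reached. */
static const struct wi_sketch *
instance_root(const struct wi_sketch *wi)
{
    while (wi != NULL && wi->parent != NULL)
        wi = wi->parent;
    return wi;
}
```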
|
|
10080
|
|
10081 This permits widgets derived from different toolkits to be updated and
|
|
10082 manipulated generically by the lwlib library. For instance
|
|
10083 @code{update_one_widget_instance} can cope with multiple types of widget
|
|
10084 and multiple types of toolkit. Each element in the widget hierarchy is
|
|
10085 updated from its corresponding @code{widget_value} by walking the
|
|
10086 @code{widget_value} tree. This has desirable properties. For example,
|
|
10087 @code{lw_modify_all_widgets} is called from @file{glyphs-x.c} and
|
|
10088 updates all the properties of a widget without having to know what the
|
|
10089 widget is or what toolkit it is from. Unfortunately this also has its
|
|
10090 hairy properties; the lwlib code is quite complex. And of course lwlib has
|
|
10091 to know at some level what the widget is and how to set its properties.
|
|
10092
|
|
10093 The @code{widget_instance} structure also contains a pointer to the root
|
|
10094 of its tree. Widget instances are further confi
|
|
10095
|
|
10096
|
462
|
10097 @node Scrollbars
|
446
|
10098 @subsection Scrollbars
|
462
|
10099 @cindex scrollbars
|
|
10100
|
|
10101 @node Menubars
|
446
|
10102 @subsection Menubars
|
462
|
10103 @cindex menubars
|
|
10104
|
|
10105 @node Checkboxes and Radio Buttons
|
446
|
10106 @subsection Checkboxes and Radio Buttons
|
462
|
10107 @cindex checkboxes and radio buttons
|
|
10108 @cindex radio buttons, checkboxes and
|
|
10109 @cindex buttons, checkboxes and radio
|
|
10110
|
|
10111 @node Progress Bars
|
446
|
10112 @subsection Progress Bars
|
462
|
10113 @cindex progress bars
|
|
10114 @cindex bars, progress
|
|
10115
|
|
10116 @node Tab Controls
|
446
|
10117 @subsection Tab Controls
|
462
|
10118 @cindex tab controls
|
428
|
10119
|
|
10120 @include index.texi
|
|
10121
|
|
10122 @c Print the tables of contents
|
|
10123 @summarycontents
|
|
10124 @contents
|
|
10125 @c That's all
|
|
10126
|
|
10127 @bye
|