comparison man/internals/internals.texi @ 424:11054d720c21 r21-2-20

Import from CVS: tag r21-2-20
author cvs
date Mon, 13 Aug 2007 11:26:11 +0200
parents da8ed4261e83
children
comparison
equal deleted inserted replaced
423:28d9c139be4c 424:11054d720c21
61 @setchapternewpage odd 61 @setchapternewpage odd
62 @finalout 62 @finalout
63 63
64 @titlepage 64 @titlepage
65 @title XEmacs Internals Manual 65 @title XEmacs Internals Manual
66 @subtitle Version 1.2, October 1998 66 @subtitle Version 1.3, August 1999
67 67
68 @author Ben Wing 68 @author Ben Wing
69 @author Martin Buchholz 69 @author Martin Buchholz
70 @author Hrvoje Niksic 70 @author Hrvoje Niksic
71 @author Matthias Neubauer
71 @page 72 @page
72 @vskip 0pt plus 1fill 73 @vskip 0pt plus 1fill
73 74
74 @noindent 75 @noindent
75 Copyright @copyright{} 1992 - 1996 Ben Wing. @* 76 Copyright @copyright{} 1992 - 1996 Ben Wing. @*
76 Copyright @copyright{} 1996, 1997 Sun Microsystems, Inc. @* 77 Copyright @copyright{} 1996, 1997 Sun Microsystems, Inc. @*
77 Copyright @copyright{} 1994 - 1998 Free Software Foundation. @* 78 Copyright @copyright{} 1994 - 1998 Free Software Foundation. @*
78 Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois. 79 Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois.
79 80
80 @sp 2 81 @sp 2
81 Version 1.2 @* 82 Version 1.3 @*
82 October 1998.@* 83 August 1999.@*
83 84
84 Permission is granted to make and distribute verbatim copies of this 85 Permission is granted to make and distribute verbatim copies of this
85 manual provided the copyright notice and this permission notice are 86 manual provided the copyright notice and this permission notice are
86 preserved on all copies. 87 preserved on all copies.
87 88
125 * The Lisp Reader and Compiler:: 126 * The Lisp Reader and Compiler::
126 * Lstreams:: 127 * Lstreams::
127 * Consoles; Devices; Frames; Windows:: 128 * Consoles; Devices; Frames; Windows::
128 * The Redisplay Mechanism:: 129 * The Redisplay Mechanism::
129 * Extents:: 130 * Extents::
130 * Faces and Glyphs:: 131 * Faces::
132 * Glyphs::
131 * Specifiers:: 133 * Specifiers::
132 * Menus:: 134 * Menus::
133 * Subprocesses:: 135 * Subprocesses::
134 * Interface to X Windows:: 136 * Interface to X Windows::
135 * Index:: Index including concepts, functions, variables, 137 * Index:: Index including concepts, functions, variables,
172 Allocation of Objects in XEmacs Lisp 174 Allocation of Objects in XEmacs Lisp
173 175
174 * Introduction to Allocation:: 176 * Introduction to Allocation::
175 * Garbage Collection:: 177 * Garbage Collection::
176 * GCPROing:: 178 * GCPROing::
179 * Garbage Collection - Step by Step::
177 * Integers and Characters:: 180 * Integers and Characters::
178 * Allocation from Frob Blocks:: 181 * Allocation from Frob Blocks::
179 * lrecords:: 182 * lrecords::
180 * Low-level allocation:: 183 * Low-level allocation::
181 * Pure Space:: 184 * Pure Space::
258 * Format of the Extent Info:: The extent information in a buffer or string. 261 * Format of the Extent Info:: The extent information in a buffer or string.
259 * Zero-Length Extents:: A weird special case. 262 * Zero-Length Extents:: A weird special case.
260 * Mathematics of Extent Ordering:: A rigorous foundation. 263 * Mathematics of Extent Ordering:: A rigorous foundation.
261 * Extent Fragments:: Cached information useful for redisplay. 264 * Extent Fragments:: Cached information useful for redisplay.
262 265
263 Faces and Glyphs 266 Faces
267
268 Glyphs
264 269
265 Specifiers 270 Specifiers
266 271
267 Menus 272 Menus
268 273
4368 4373
4369 @menu 4374 @menu
4370 * Introduction to Allocation:: 4375 * Introduction to Allocation::
4371 * Garbage Collection:: 4376 * Garbage Collection::
4372 * GCPROing:: 4377 * GCPROing::
4378 * Garbage Collection - Step by Step::
4373 * Integers and Characters:: 4379 * Integers and Characters::
4374 * Allocation from Frob Blocks:: 4380 * Allocation from Frob Blocks::
4375 * lrecords:: 4381 * lrecords::
4376 * Low-level allocation:: 4382 * Low-level allocation::
4377 * Pure Space:: 4383 * Pure Space::
4711 stack. That involves looking through all of stack memory and treating 4717 stack. That involves looking through all of stack memory and treating
4712 anything that looks like a reference to an object as a reference. This 4718 anything that looks like a reference to an object as a reference. This
4713 will result in a few objects not getting collected when they should, but 4719 will result in a few objects not getting collected when they should, but
4714 it obviates the need for @code{GCPRO}ing, and allows garbage collection 4720 it obviates the need for @code{GCPRO}ing, and allows garbage collection
4715 to happen at any point at all, such as during object allocation. 4721 to happen at any point at all, such as during object allocation.
4722
4723 @node Garbage Collection - Step by Step
4724 @section Garbage Collection - Step by Step
4725 @cindex garbage collection step by step
4726
4727 @menu
4728 * Invocation::
4729 * garbage_collect_1::
4730 * mark_object::
4731 * gc_sweep::
4732 * sweep_lcrecords_1::
4733 * compact_string_chars::
4734 * sweep_strings::
4735 * sweep_bit_vectors_1::
4736 @end menu
4737
4738 @node Invocation
4739 @subsection Invocation
4740 @cindex garbage collection, invocation
4741
4742 The first thing that anyone should know about garbage collection is:
4743 when and how the garbage collector is invoked. One might think that this
4744 could happen every time new memory is allocated, e.g. new objects are
4745 created, but this is @emph{not} the case. Instead, we have the following
4746 situation:
4747
4748 The entry point of any process of garbage collection is an invocation
4749 of the function @code{garbage_collect_1} in file @code{alloc.c}. The
4750 invocation can occur @emph{explicitly} by calling the function
4751 @code{Fgarbage_collect} (in addition this function provides information
4752 about the freed memory), or can occur @emph{implicitly} in four different
4753 situations:
4754 @enumerate
4755 @item
4756 In function @code{main_1} in file @code{emacs.c}. This function is called
4757 at each startup of xemacs. The garbage collection is invoked after all
4758 initial creations are completed, but only if a special internal error
4759 checking-constant @code{ERROR_CHECK_GC} is defined.
4760 @item
4761 In function @code{disksave_object_finalization} in file
4762 @code{alloc.c}. The only purpose of this function is to clear the
4763 objects from memory which need not be stored with xemacs when we dump out
4764 an executable. This is only done by @code{Fdump_emacs} or by
4765 @code{Fdump_emacs_data} respectively (both in @code{emacs.c}). The
4766 actual clearing is accomplished by making these objects unreachable and
4767 starting a garbage collection. The function is only used while building
4768 xemacs.
4769 @item
4770 In function @code{Feval / eval} in file @code{eval.c}. Each time the
4771 well known and often used function eval is called to evaluate a form,
4772 one of the first things that could happen, is a potential call of
4773 @code{garbage_collect_1}. There exist three global variables,
4774 @code{consing_since_gc} (counts the created cons-cells since the last
4775 garbage collection), @code{gc_cons_threshold} (a specified threshold
4776 after which a garbage collection occurs) and @code{always_gc}. If
4777 @code{always_gc} is set or if the threshold is exceeded, the garbage
4778 collection will start.
4779 @item
4780 In function @code{Ffuncall / funcall} in file @code{eval.c}. This
4781 function evaluates calls of elisp functions and works according to
4782 @code{Feval}.
4783 @end enumerate
4784
4785 The upshot is that garbage collection can basically occur everywhere
4786 @code{Feval}, respectively @code{Ffuncall}, is used - either directly or
4787 through another function. Since calls to these two functions are
4788 hidden in various other functions, many calls to
4789 @code{garabge_collect_1} are not obviously foreseeable, and therefore
4790 unexpected. Instances where they are used that are worth remembering are
4791 various elisp commands, as for example @code{or},
4792 @code{and}, @code{if}, @code{cond}, @code{while}, @code{setq}, etc.,
4793 miscellaneous @code{gui_item_...} functions, everything related to
4794 @code{eval} (@code{Feval_buffer}, @code{call0}, ...) and inside
4795 @code{Fsignal}. The latter is used to handle signals, as for example the
4796 ones raised by every @code{QUITE}-macro triggered after pressing Ctrl-g.
4797
4798 @node garbage_collect_1
4799 @subsection @code{garbage_collect_1}
4800 @cindex @code{garbage_collect_1}
4801
4802 We can now describe exactly what happens after the invocation takes
4803 place.
4804 @enumerate
4805 @item
4806 There are several cases in which the garbage collector is left immediately:
4807 when we are already garbage collecting (@code{gc_in_progress}), when
4808 the garbage collection is somehow forbidden
4809 (@code{gc_currently_forbidden}), when we are currently displaying something
4810 (@code{in_display}) or when we are preparing for the armageddon of the
4811 whole system (@code{preparing_for_armageddon}).
4812 @item
4813 Next the correct frame in which to put
4814 all the output occurring during garbage collecting is determined. In
4815 order to be able to restore the old display's state after displaying the
4816 message, some data about the current cursor position has to be
4817 saved. The variables @code{pre_gc_curser} and @code{cursor_changed} take
4818 care of that.
4819 @item
4820 The state of @code{gc_currently_forbidden} must be restored after
4821 the garbage collection, no matter what happens during the process. We
4822 accomplish this by @code{record_unwind_protect}ing the suitable function
4823 @code{restore_gc_inhibit} together with the current value of
4824 @code{gc_currently_forbidden}.
4825 @item
4826 If we are concurrently running an interactive xemacs session, the next step
4827 is simply to show the garbage collector's cursor/message.
4828 @item
4829 The following steps are the intrinsic steps of the garbage collector,
4830 therefore @code{gc_in_progress} is set.
4831 @item
4832 For debugging purposes, it is possible to copy the current C stack
4833 frame. However, this seems to be a currently unused feature.
4834 @item
4835 Before actually starting to go over all live objects, references to
4836 objects that are no longer used are pruned. We only have to do this for events
4837 (@code{clear_event_resource}) and for specifiers
4838 (@code{cleanup_specifiers}).
4839 @item
4840 Now the mark phase begins and marks all accessible elements. In order to
4841 start from
4842 all slots that serve as roots of accessibility, the function
4843 @code{mark_object} is called for each root individually to go out from
4844 there to mark all reachable objects. All roots that are traversed are
4845 shown in their processed order:
4846 @itemize @bullet
4847 @item
4848 all constant symbols and static variables that are registered via
4849 @code{staticpro}@ in the array @code{staticvec}.
4850 @xref{Adding Global Lisp Variables}.
4851 @item
4852 all Lisp objects that are created in C functions and that must be
4853 protected from freeing them. They are registered in the global
4854 list @code{gcprolist}.
4855 @xref{GCPROing}.
4856 @item
4857 all local variables (i.e. their name fields @code{symbol} and old
4858 values @code{old_values}) that are bound during the evaluation by the Lisp
4859 engine. They are stored in @code{specbinding} structs pushed on a stack
4860 called @code{specpdl}.
4861 @xref{Dynamic Binding; The specbinding Stack; Unwind-Protects}.
4862 @item
4863 all catch blocks that the Lisp engine encounters during the evaluation
4864 cause the creation of structs @code{catchtag} inserted in the list
4865 @code{catchlist}. Their tag (@code{tag}) and value (@code{val} fields
4866 are freshly created objects and therefore have to be marked.
4867 @xref{Catch and Throw}.
4868 @item
4869 every function application pushes new structs @code{backtrace}
4870 on the call stack of the Lisp engine (@code{backtrace_list}). The unique
4871 parts that have to be marked are the fields for each function
4872 (@code{function}) and all their arguments (@code{args}).
4873 @xref{Evaluation}.
4874 @item
4875 all objects that are used by the redisplay engine that must not be freed
4876 are marked by a special function called @code{mark_redisplay} (in
4877 @code{redisplay.c}).
4878 @item
4879 all objects created for profiling purposes are allocated by C functions
4880 instead of using the lisp allocation mechanisms. In order to receive the
4881 right ones during the sweep phase, they also have to be marked
4882 manually. That is done by the function @code{mark_profiling_info}
4883 @end itemize
4884 @item
4885 Hash tables in Xemacs belong to a kind of special objects that
4886 make use of a concept often called 'weak pointers'.
4887 To make a long story short, these kind of pointers are not followed
4888 during the estimation of the live objects during garbage collection.
4889 Any object referenced only by weak pointers is collected
4890 anyway, and the reference to it is cleared. In hash tables there are
4891 different usage patterns of them, manifesting in different types of hash
4892 tables, namely 'non-weak', 'weak', 'key-weak' and 'value-weak'
4893 (internally also 'key-car-weak' and 'value-car-weak') hash tables, each
4894 clearing entries depending on different conditions. More information can
4895 be found in the documentation to the function @code{make-hash-table}.
4896
4897 Because there are complicated dependency rules about when and what to
4898 mark while processing weak hash tables, the standard @code{marker}
4899 method is only active if it is marking non-weak hash tables. As soon as
4900 a weak component is in the table, the hash table entries are ignored
4901 while marking. Instead their marking is done each separately by the
4902 function @code{finish_marking_weak_hash_tables}. This function iterates
4903 over each hash table entry @code{hentries} for each weak hash table in
4904 @code{Vall_weak_hash_tables}. Depending on the type of a table, the
4905 appropriate action is performed.
4906 If a table is acting as @code{HASH_TABLE_KEY_WEAK}, and a key already marked,
4907 everything reachable from the @code{value} component is marked. If it is
4908 acting as a @code{HASH_TABLE_VALUE_WEAK} and the value component is
4909 already marked, the marking starts beginning only from the
4910 @code{key} component.
4911 If it is a @code{HASH_TABLE_KEY_CAR_WEAK} and the car
4912 of the key entry is already marked, we mark both the @code{key} and
4913 @code{value} components.
4914 Finally, if the table is of the type @code{HASH_TABLE_VALUE_CAR_WEAK}
4915 and the car of the value components is already marked, again both the
4916 @code{key} and the @code{value} components get marked.
4917
4918 Again, there are lists with comparable properties called weak
4919 lists. There exist different peculiarities of their types called
4920 @code{simple}, @code{assoc}, @code{key-assoc} and
4921 @code{value-assoc}. You can find further details about them in the
4922 description to the function @code{make-weak-list}. The scheme of their
4923 marking is similar: all weak lists are listed in @code{Qall_weak_lists},
4924 therefore we iterate over them. The marking is advanced until we hit an
4925 already marked pair. Then we know that during a former run all
4926 the rest has been marked completely. Again, depending on the special
4927 type of the weak list, our jobs differ. If it is a @code{WEAK_LIST_SIMPLE}
4928 and the elem is marked, we mark the @code{cons} part. If it is a
4929 @code{WEAK_LIST_ASSOC} and not a pair or a pair with both marked car and
4930 cdr, we mark the @code{cons} and the @code{elem}. If it is a
4931 @code{WEAK_LIST_KEY_ASSOC} and not a pair or a pair with a marked car of
4932 the elem, we mark the @code{cons} and the @code{elem}. Finally, if it is
4933 a @code{WEAK_LIST_VALUE_ASSOC} and not a pair or a pair with a marked
4934 cdr of the elem, we mark both the @code{cons} and the @code{elem}.
4935
4936 Since, by marking objects in reach from weak hash tables and weak lists,
4937 other objects could get marked, this perhaps implies further marking of
4938 other weak objects, both finishing functions are redone as long as
4939 yet unmarked objects get freshly marked.
4940
4941 @item
4942 After completing the special marking for the weak hash tables and for the weak
4943 lists, all entries that point to objects that are going to be swept in
4944 the further process are useless, and therefore have to be removed from
4945 the table or the list.
4946
4947 The function @code{prune_weak_hash_tables} does the job for weak hash
4948 tables. Totally unmarked hash tables are removed from the list
4949 @code{Vall_weak_hash_tables}. The other ones are treated more carefully
4950 by scanning over all entries and removing one as soon as one of
4951 the components @code{key} and @code{value} is unmarked.
4952
4953 The same idea applies to the weak lists. It is accomplished by
4954 @code{prune_weak_lists}: An unmarked list is pruned from
4955 @code{Vall_weak_lists} immediately. A marked list is treated more
4956 carefully by going over it and removing just the unmarked pairs.
4957
4958 @item
4959 The function @code{prune_specifiers} checks all listed specifiers held
4960 in @code{Vall_speficiers} and removes the ones from the lists that are
4961 unmarked.
4962
4963 @item
4964 All syntax tables are stored in a list called
4965 @code{Vall_syntax_tables}. The function @code{prune_syntax_tables} walks
4966 through it and unlinks the tables that are unmarked.
4967
4968 @item
4969 Next, we will attack the complete sweeping - the function
4970 @code{gc_sweep} which holds the predominance.
4971 @item
4972 First, all the variables with respect to garbage collection are
4973 reset. @code{consing_since_gc} - the counter of the created cells since
4974 the last garbage collection - is set back to 0, and
4975 @code{gc_in_progress} is not @code{true} anymore.
4976 @item
4977 In case the session is interactive, the displayed cursor and message are
4978 removed again.
4979 @item
4980 The state of @code{gc_inhibit} is restored to the former value by
4981 unwinding the stack.
4982 @item
4983 A small memory reserve is always held back that can be reached by
4984 @code{breathing_space}. If nothing more is left, we create a new reserve
4985 and exit.
4986 @end enumerate
4987
4988 @node mark_object
4989 @subsection @code{mark_object}
4990 @cindex @code{mark_object}
4991
4992 The first thing that is checked while marking an object is whether the
4993 object is a real Lisp object @code{Lisp_Type_Record} or just an integer
4994 or a character. Integers and characters are the only two types that are
4995 stored directly - without another level of indirection, and therefore they
4996 donīt have to be marked and collected.
4997 @xref{How Lisp Objects Are Represented in C}.
4998
4999 The second case is the one we have to handle. It is the one when we are
5000 dealing with a pointer to a Lisp object. But, there exist also three
5001 possibilities, that prevent us from doing anything while marking: The
5002 object is read only which prevents it from being garbage collected,
5003 i.e. marked (@code{C_READONLY_RECORD_HEADER}). The object in question is
5004 already marked, and need not be marked for the second time (checked by
5005 @code{MARKED_RECORD_HEADER_P}). If it is a special, unmarkable object
5006 (@code{UNMARKABLE_RECORD_HEADER_P}, apparently, these are objects that
5007 sit in some CONST space, and can therefore not be marked, see
5008 @code{this_one_is_unmarkable} in @code{alloc.c}).
5009
5010 Now, the actual marking is feasible. We do so by once using the macro
5011 @code{MARK_RECORD_HEADER} to mark the object itself (actually the
5012 special flag in the lrecord header), and calling its special marker
5013 "method" @code{marker} if available. The marker method marks every
5014 other object that is in reach from our current object. Note, that these
5015 marker methods should not call @code{mark_object} recursively, but
5016 instead should return the next object from where further marking has to
5017 be performed.
5018
5019 In case another object was returned, as mentioned before, we reiterate
5020 the whole @code{mark_object} process beginning with this next object.
5021
5022 @node gc_sweep
5023 @subsection @code{gc_sweep}
5024 @cindex @code{gc_sweep}
5025
5026 The job of this function is to free all unmarked records from memory. As
5027 we know, there are different types of objects implemented and managed, and
5028 consequently different ways to free them from memory.
5029 @xref{Introduction to Allocation}.
5030
5031 We start with all objects stored through @code{lcrecords}. All
5032 bulkier objects are allocated and handled using that scheme of
5033 @code{lcrecords}. Each object is @code{malloc}ed separately
5034 instead of placing it in one of the contiguous frob blocks. All types
5035 that are currently stored
5036 using @code{lcrecords}īs @code{alloc_lcrecord} and
5037 @code{make_lcrecord_list} are the types: vectors, buffers,
5038 char-table, char-table-entry, console, weak-list, database, device,
5039 ldap, hash-table, command-builder, extent-auxiliary, extent-info, face,
5040 coding-system, frame, image-instance, glyph, popup-data, gui-item,
5041 keymap, charset, color_instance, font_instance, opaque, opaque-list,
5042 process, range-table, specifier, symbol-value-buffer-local,
5043 symbol-value-lisp-magic, symbol-value-varalias, toolbar-button,
5044 tooltalk-message, tooltalk-pattern, window, and window-configuration. We
5045 take care of them in the fist place
5046 in order to be able to handle and to finalize items stored in them more
5047 easily. The function @code{sweep_lcrecords_1} as described below is
5048 doing the whole job for us.
5049 For a description about the internals: @xref{lrecords}.
5050
5051 Our next candidates are the other objects that behave quite differently
5052 than everything else: the strings. They consists of two parts, a
5053 fixed-size portion (@code{struct Lisp_string}) holding the string's
5054 length, its property list and a pointer to the second part, and the
5055 actual string data, which is stored in string-chars blocks comparable to
5056 frob blocks. In this block, the data is not only freed, but also a
5057 compression of holes is made, i.e. all strings are relocated together.
5058 @xref{String}. This compacting phase is performed by the function
5059 @code{compact_string_chars}, the actual sweeping by the function
5060 @code{sweep_strings} is described below.
5061
5062 After that, the other types are swept step by step using functions
5063 @code{sweep_conses}, @code{sweep_bit_vectors_1},
5064 @code{sweep_compiled_functions}, @code{sweep_floats},
5065 @code{sweep_symbols}, @code{sweep_extents}, @code{sweep_markers} and
5066 @code{sweep_extents}. They are the fixed-size types cons, floats,
5067 compiled-functions, symbol, marker, extent, and event stored in
5068 so-called "frob blocks", and therefore we can basically do the same on
5069 every type objects, using the same macros, especially defined only to
5070 handle everything with respect to fixed-size blocks. The only fixed-size
5071 type that is not handled here are the fixed-size portion of strings,
5072 because we took special care of them earlier.
5073
5074 The only big exceptions are bit vectors stored differently and
5075 therefore treated differently by the function @code{sweep_bit_vectors_1}
5076 described later.
5077
5078 At first, we need some brief information about how
5079 these fixed-size types are managed in general, in order to understand
5080 how the sweeping is done. They have all a fixed size, and are therefore
5081 stored in big blocks of memory - allocated at once - that can hold a
5082 certain amount of objects of one type. The macro
5083 @code{DECLARE_FIXED_TYPE_ALLOC} creates the suitable structures for
5084 every type. More precisely, we have the block struct
5085 (holding a pointer to the previous block @code{prev} and the
5086 objects in @code{block[]}), a pointer to current block
5087 (@code{current_..._block)}) and its last index
5088 (@code{current_..._block_index}), and a pointer to the free list that
5089 will be created. Also a macro @code{FIXED_TYPE_FROM_BLOCK} plus some
5090 related macros exists that are used to obtain a new object, either from
5091 the free list @code{ALLOCATE_FIXED_TYPE_1} if there is an unused object
5092 of that type stored or by allocating a completely new block using
5093 @code{ALLOCATE_FIXED_TYPE_FROM_BLOCK}.
5094
5095 The rest works as follows: all of them define a
5096 macro @code{UNMARK_...} that is used to unmark the object. They define a
5097 macro @code{ADDITIONAL_FREE_...} that defines additional work that has
5098 to be done when converting an object from in use to not in use (so far,
5099 only markers use it in order to unchain them). Then, they all call
5100 the macro @code{SWEEP_FIXED_TYPE_BLOCK} instantiated with their type name
5101 and their struct name.
5102
5103 This call in particular does the following: we go over all blocks
5104 starting with the current moving towards the oldest.
5105 For each block, we look at every object in it. If the object already
5106 freed (checked with @code{FREE_STRUCT_P} using the first pointer of the
5107 object), or if it is
5108 set to read only (@code{C_READONLY_RECORD_HEADER_P}, nothing must be
5109 done. If it is unmarked (checked with @code{MARKED_RECORD_HEADER_P}), it
5110 is put in the free list and set free (using the macro
5111 @code{FREE_FIXED_TYPE}, otherwise it stays in the block, but is unmarked
5112 (by @code{UNMARK_...}). While going through one block, we note if the
5113 whole block is empty. If so, the whole block is freed (using
5114 @code{xfree}) and the free list state is set to the state it had before
5115 handling this block.
5116
5117 @node sweep_lcrecords_1
5118 @subsection @code{sweep_lcrecords_1}
5119 @cindex @code{sweep_lcrecords_1}
5120
5121 After nullifying the complete lcrecord statistics, we go over all
5122 lcrecords two separate times. They are all chained together in a list with
5123 a head called @code{all_lcrecords}.
5124
5125 The first loop calls for each object its @code{finalizer} method, but only
5126 in the case that it is not read only
5127 (@code{C_READONLY_RECORD_HEADER_P)}, it is not already marked
5128 (@code{MARKED_RECORD_HEADER_P}), it is not already in a free list (list of
5129 freed objects, field @code{free}) and finally it owns a finalizer
5130 method.
5131
5132 The second loop actually frees the appropriate objects again by iterating
5133 through the whole list. In case an object is read only or marked, it
5134 has to persist, otherwise it is manually freed by calling
5135 @code{xfree}. During this loop, the lcrecord statistics are kept up to
5136 date by calling @code{tick_lcrecord_stats} with the right arguments,
5137
5138 @node compact_string_chars
5139 @subsection @code{compact_string_chars}
5140 @cindex @code{compact_string_chars}
5141
5142 The purpose of this function is to compact all the data parts of the
5143 strings that are held in so-called @code{string_chars_block}, i.e. the
5144 strings that do not exceed a certain maximal length.
5145
5146 The procedure with which this is done is as follows. We are keeping two
5147 positions in the @code{string_chars_block}s using two pointer/integer
5148 pairs, namely @code{from_sb}/@code{from_pos} and
5149 @code{to_sb}/@code{to_pos}. They stand for the actual positions, from
5150 where to where, to copy the actually handled string.
5151
5152 While going over all chained @code{string_char_block}s and their held
5153 strings, staring at @code{first_string_chars_block}, both pointers
5154 are advanced and eventually a string is copied from @code{from_sb} to
5155 @code{to_sb}, depending on the status of the pointed at strings.
5156
5157 More precisely, we can distinguish between the following actions.
5158 @itemize @bullet
5159 @item
5160 The string at @code{from_sb}'s position could be marked as free, which
5161 is indicated by an invalid pointer to the pointer that should point back
5162 to the fixed size string object, and which is checked by
5163 @code{FREE_STRUCT_P}. In this case, the @code{from_sb}/@code{from_pos}
5164 is advanced to the next string, and nothing has to be copied.
5165 @item
5166 Also, if a string object itself is unmarked, nothing has to be
5167 copied. We likewise advance the @code{from_sb}/@code{from_pos}
5168 pair as described above.
5169 @item
5170 In all other cases, we have a marked string at hand. The string data
5171 must be moved from the from-position to the to-position. In case
5172 there is not enough space in the actual @code{to_sb}-block, we advance
5173 this pointer to the beginning of the next block before copying. In case the
5174 from and to positions are different, we perform the
5175 actual copying using the library function @code{memmove}.
5176 @end itemize
5177
5178 After compacting, the pointer to the current
5179 @code{string_chars_block}, sitting in @code{current_string_chars_block},
5180 is reset on the last block to which we moved a string,
5181 i.e. @code{to_block}, and all remaining blocks (we know that they just
5182 carry garbage) are explicitly @code{xfree}d.
5183
5184 @node sweep_strings
5185 @subsection @code{sweep_strings}
5186 @cindex @code{sweep_strings}
5187
5188 The sweeping for the fixed sized string objects is essentially exactly
5189 the same as it is for all other fixed size types. As before, the freeing
5190 into the suitable free list is done by using the macro
5191 @code{SWEEP_FIXED_SIZE_BLOCK} after defining the right macros
5192 @code{UNMARK_string} and @code{ADDITIONAL_FREE_string}. These two
5193 definitions are a little bit special compared to the ones used
5194 for the other fixed size types.
5195
5196 @code{UNMARK_string} is defined the same way except some additional code
5197 used for updating the bookkeeping information.
5198
5199 For strings, @code{ADDITIONAL_FREE_string} has to do something in
5200 addition: in case, the string was not allocated in a
5201 @code{string_chars_block} because it exceeded the maximal length, and
5202 therefore it was @code{malloc}ed separately, we know also @code{xfree}
5203 it explicitly.
5204
5205 @node sweep_bit_vectors_1
5206 @subsection @code{sweep_bit_vectors_1}
5207 @cindex @code{sweep_bit_vectors_1}
5208
5209 Bit vectors are also one of the rare types that are @code{malloc}ed
5210 individually. Consequently, while sweeping, all further needless
5211 bit vectors must be freed by hand. This is done, as one might imagine,
5212 the expected way: since they are all registered in a list called
5213 @code{all_bit_vectors}, all elements of that list are traversed,
5214 all unmarked bit vectors are unlinked by calling @code{xfree} and all of
5215 them become unmarked.
5216 In addition, the bookkeeping information used for garbage
5217 collector's output purposes is updated.
4716 5218
4717 @node Integers and Characters 5219 @node Integers and Characters
4718 @section Integers and Characters 5220 @section Integers and Characters
4719 5221
4720 Integer and character Lisp objects are created from integers using the 5222 Integer and character Lisp objects are created from integers using the
7400 @end enumerate 7902 @end enumerate
7401 7903
7402 @menu 7904 @menu
7403 * Critical Redisplay Sections:: 7905 * Critical Redisplay Sections::
7404 * Line Start Cache:: 7906 * Line Start Cache::
7907 * Redisplay Piece by Piece::
7405 @end menu 7908 @end menu
7406 7909
7407 @node Critical Redisplay Sections 7910 @node Critical Redisplay Sections
7408 @section Critical Redisplay Sections 7911 @section Critical Redisplay Sections
7409 @cindex critical redisplay sections 7912 @cindex critical redisplay sections
7495 @end itemize 7998 @end itemize
7496 7999
7497 In case you're wondering, the Second Golden Rule of Redisplay is not 8000 In case you're wondering, the Second Golden Rule of Redisplay is not
7498 applicable. 8001 applicable.
7499 8002
7500 @node Extents, Faces and Glyphs, The Redisplay Mechanism, Top 8003 @node Redisplay Piece by Piece
8004 @section Redisplay Piece by Piece
8005 @cindex Redisplay Piece by Piece
8006
8007 As you can begin to see redisplay is complex and also not well
8008 documented. Chuck no longer works on XEmacs so this section is my take
8009 on the workings of redisplay.
8010
8011 Redisplay happens in three phases:
8012
8013 @enumerate
8014 @item
8015 Determine desired display in area that needs redisplay.
8016 Implemented by @code{redisplay.c}
8017 @item
8018 Compare desired display with current display
8019 Implemented by @code{redisplay-output.c}
8020 @item
8021 Output changes Implemented by @code{redisplay-output.c},
8022 @code{redisplay-x.c}, @code{redisplay-msw.c} and @code{redisplay-tty.c}
8023 @end enumerate
8024
8025 Steps 1 and 2 are device-independant and relatively complex. Step 3 is
8026 mostly device-dependent.
8027
8028 Determining the desired display
8029
8030 Display attributes are stored in @code{display_line} structures. Each
8031 @code{display_line} consists of a set of @code{display_block}'s and each
8032 @code{display_block} contains a number of @code{rune}'s. Generally
8033 dynarr's of @code{display_line}'s are held by each window representing
8034 the current display and the desired display.
8035
8036 The @code{display_line} structures are tighly tied to buffers which
8037 presents a problem for redisplay as this connection is bogus for the
8038 modeline. Hence the @code{display_line} generation routines are
8039 duplicated for generating the modeline. This means that the modeline
8040 display code has many bugs that the standard redisplay code does not.
8041
8042 The guts of @code{display_line} generation are in
8043 @code{create_text_block}, which creates a single display line for the
8044 desired locale. This incrementally parses the characters on the current
8045 line and generates redisplay structures for each.
8046
8047 Gutter redisplay is different. Because the data to display is stored in
8048 a string we cannot use @code{create_text_block}. Instead we use
8049 @code{create_text_string_block} which performs the same function as
8050 @code{create_text_block} but for strings. Many of the complexities of
8051 @code{create_text_block} to do with cursor handling and selective
8052 display have been removed.
8053
8054 @node Extents, Faces, The Redisplay Mechanism, Top
7501 @chapter Extents 8055 @chapter Extents
7502 8056
7503 @menu 8057 @menu
7504 * Introduction to Extents:: Extents are ranges over text, with properties. 8058 * Introduction to Extents:: Extents are ranges over text, with properties.
7505 * Extent Ordering:: How extents are ordered internally. 8059 * Extent Ordering:: How extents are ordered internally.
7783 Extent fragments have to be very quick to update to a new buffer 8337 Extent fragments have to be very quick to update to a new buffer
7784 position when moving linearly through the buffer. They rely on the 8338 position when moving linearly through the buffer. They rely on the
7785 stack-of-extents code, which does the heavy-duty algorithmic work of 8339 stack-of-extents code, which does the heavy-duty algorithmic work of
7786 determining which extents overly a particular position. 8340 determining which extents overly a particular position.
7787 8341
7788 @node Faces and Glyphs, Specifiers, Extents, Top 8342 @node Faces, Glyphs, Extents, Top
7789 @chapter Faces and Glyphs 8343 @chapter Faces
7790 8344
7791 Not yet documented. 8345 Not yet documented.
7792 8346
7793 @node Specifiers, Menus, Faces and Glyphs, Top 8347 @node Glyphs, Specifiers, Faces, Top
8348 @chapter Glyphs
8349
8350 Glyphs are graphical elements that can be displayed in XEmacs buffers or
8351 gutters. We use the term graphical element here in the broadest possible
8352 sense since glyphs can be as mundane as text to as arcane as a native
8353 tab widget.
8354
8355 In XEmacs, glyphs represent the uninstantiated state of graphical
8356 elements, i.e. they hold all the information necessary to produce an
8357 image on-screen but the image does not exist at this stage.
8358
8359 Glyphs are lazily instantiated by calling one of the glyph
8360 functions. This usually occurs within redisplay when
8361 @code{Fglyph_height} is called. Instantiation causes an image-instance
8362 to be created and cached. This cache is on a device basis for all glyphs
8363 except glyph-widgets, and on a window basis for glyph widgets. The
8364 caching is done by @code{image_instantiate} and is necessary because it
8365 is generally possible to display an image-instance in multiple
8366 domains. For instance if we create a Pixmap, we can actually display
8367 this on multiple windows - even though we only need a single Pixmap
8368 instance to do this. If caching wasn't done then it would be necessary
8369 to create image-instances for every displayable occurrance of a glyph -
8370 and every usage - and this would be extremely memory and cpu intensive.
8371
8372 Widget-glyphs (a.k.a native widgets) are not cached in this way. This is
8373 because widget-glyph image-instances on screen are toolkit windows, and
8374 thus cannot be reused in multiple XEmacs domains. Thus widget-glyphs are
8375 cached on a window basis.
8376
8377 Any action on a glyph first consults the cache before actually
8378 instantiating a widget.
8379
8380 @section Widget-Glyphs in the MS-WIndows Environment
8381
8382 To Do
8383
8384 @section Widget-Glyphs in the X Environment
8385
8386 Widget-glyphs under X make heavy use of lwlib for manipulating the
8387 native toolkit objects. This is primarily so that different toolkits can
8388 be supported for widget-glyphs, just as they are supported for features
8389 such as menubars etc.
8390
8391 Lwlib is extremely poorly documented and quite hairy so here is my
8392 understanding of what goes on.
8393
8394 Lwlib maintains a set of widget_instances which mirror the hierarchical
8395 state of Xt widgets. I think this is so that widgets can be updated and
8396 manipulated generically by the lwlib library. For instance
8397 update_one_widget_instance can cope with multiple types of widget and
8398 multiple types of toolkit. Each element in the widget hierarchy is updated
8399 from its corresponding widget_instance by walking the widget_instance
8400 tree recursively.
8401
8402 This has desirable properties such as lw_modify_all_widgets which is
8403 called from glyphs-x.c and updates all the properties of a widget
8404 without having to know what the widget is or what toolkit it is from.
8405 Unfortunately this also has hairy properrties such as making the lwlib
8406 code quite complex. And of course lwlib has to know at some level what
8407 the widget is and how to set its properties.
8408
8409 @node Specifiers, Menus, Glyphs, Top
7794 @chapter Specifiers 8410 @chapter Specifiers
7795 8411
7796 Not yet documented. 8412 Not yet documented.
7797 8413
7798 @node Menus, Subprocesses, Specifiers, Top 8414 @node Menus, Subprocesses, Specifiers, Top