Mercurial > hg > xemacs-beta
comparison man/internals/internals.texi @ 424:11054d720c21 r21-2-20
Import from CVS: tag r21-2-20
author | cvs |
---|---|
date | Mon, 13 Aug 2007 11:26:11 +0200 |
parents | da8ed4261e83 |
children |
comparison
equal
deleted
inserted
replaced
423:28d9c139be4c | 424:11054d720c21 |
---|---|
61 @setchapternewpage odd | 61 @setchapternewpage odd |
62 @finalout | 62 @finalout |
63 | 63 |
64 @titlepage | 64 @titlepage |
65 @title XEmacs Internals Manual | 65 @title XEmacs Internals Manual |
66 @subtitle Version 1.2, October 1998 | 66 @subtitle Version 1.3, August 1999 |
67 | 67 |
68 @author Ben Wing | 68 @author Ben Wing |
69 @author Martin Buchholz | 69 @author Martin Buchholz |
70 @author Hrvoje Niksic | 70 @author Hrvoje Niksic |
71 @author Matthias Neubauer | |
71 @page | 72 @page |
72 @vskip 0pt plus 1fill | 73 @vskip 0pt plus 1fill |
73 | 74 |
74 @noindent | 75 @noindent |
75 Copyright @copyright{} 1992 - 1996 Ben Wing. @* | 76 Copyright @copyright{} 1992 - 1996 Ben Wing. @* |
76 Copyright @copyright{} 1996, 1997 Sun Microsystems, Inc. @* | 77 Copyright @copyright{} 1996, 1997 Sun Microsystems, Inc. @* |
77 Copyright @copyright{} 1994 - 1998 Free Software Foundation. @* | 78 Copyright @copyright{} 1994 - 1998 Free Software Foundation. @* |
78 Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois. | 79 Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois. |
79 | 80 |
80 @sp 2 | 81 @sp 2 |
81 Version 1.2 @* | 82 Version 1.3 @* |
82 October 1998.@* | 83 August 1999.@* |
83 | 84 |
84 Permission is granted to make and distribute verbatim copies of this | 85 Permission is granted to make and distribute verbatim copies of this |
85 manual provided the copyright notice and this permission notice are | 86 manual provided the copyright notice and this permission notice are |
86 preserved on all copies. | 87 preserved on all copies. |
87 | 88 |
125 * The Lisp Reader and Compiler:: | 126 * The Lisp Reader and Compiler:: |
126 * Lstreams:: | 127 * Lstreams:: |
127 * Consoles; Devices; Frames; Windows:: | 128 * Consoles; Devices; Frames; Windows:: |
128 * The Redisplay Mechanism:: | 129 * The Redisplay Mechanism:: |
129 * Extents:: | 130 * Extents:: |
130 * Faces and Glyphs:: | 131 * Faces:: |
132 * Glyphs:: | |
131 * Specifiers:: | 133 * Specifiers:: |
132 * Menus:: | 134 * Menus:: |
133 * Subprocesses:: | 135 * Subprocesses:: |
134 * Interface to X Windows:: | 136 * Interface to X Windows:: |
135 * Index:: Index including concepts, functions, variables, | 137 * Index:: Index including concepts, functions, variables, |
172 Allocation of Objects in XEmacs Lisp | 174 Allocation of Objects in XEmacs Lisp |
173 | 175 |
174 * Introduction to Allocation:: | 176 * Introduction to Allocation:: |
175 * Garbage Collection:: | 177 * Garbage Collection:: |
176 * GCPROing:: | 178 * GCPROing:: |
179 * Garbage Collection - Step by Step:: | |
177 * Integers and Characters:: | 180 * Integers and Characters:: |
178 * Allocation from Frob Blocks:: | 181 * Allocation from Frob Blocks:: |
179 * lrecords:: | 182 * lrecords:: |
180 * Low-level allocation:: | 183 * Low-level allocation:: |
181 * Pure Space:: | 184 * Pure Space:: |
258 * Format of the Extent Info:: The extent information in a buffer or string. | 261 * Format of the Extent Info:: The extent information in a buffer or string. |
259 * Zero-Length Extents:: A weird special case. | 262 * Zero-Length Extents:: A weird special case. |
260 * Mathematics of Extent Ordering:: A rigorous foundation. | 263 * Mathematics of Extent Ordering:: A rigorous foundation. |
261 * Extent Fragments:: Cached information useful for redisplay. | 264 * Extent Fragments:: Cached information useful for redisplay. |
262 | 265 |
263 Faces and Glyphs | 266 Faces |
267 | |
268 Glyphs | |
264 | 269 |
265 Specifiers | 270 Specifiers |
266 | 271 |
267 Menus | 272 Menus |
268 | 273 |
4368 | 4373 |
4369 @menu | 4374 @menu |
4370 * Introduction to Allocation:: | 4375 * Introduction to Allocation:: |
4371 * Garbage Collection:: | 4376 * Garbage Collection:: |
4372 * GCPROing:: | 4377 * GCPROing:: |
4378 * Garbage Collection - Step by Step:: | |
4373 * Integers and Characters:: | 4379 * Integers and Characters:: |
4374 * Allocation from Frob Blocks:: | 4380 * Allocation from Frob Blocks:: |
4375 * lrecords:: | 4381 * lrecords:: |
4376 * Low-level allocation:: | 4382 * Low-level allocation:: |
4377 * Pure Space:: | 4383 * Pure Space:: |
4711 stack. That involves looking through all of stack memory and treating | 4717 stack. That involves looking through all of stack memory and treating |
4712 anything that looks like a reference to an object as a reference. This | 4718 anything that looks like a reference to an object as a reference. This |
4713 will result in a few objects not getting collected when they should, but | 4719 will result in a few objects not getting collected when they should, but |
4714 it obviates the need for @code{GCPRO}ing, and allows garbage collection | 4720 it obviates the need for @code{GCPRO}ing, and allows garbage collection |
4715 to happen at any point at all, such as during object allocation. | 4721 to happen at any point at all, such as during object allocation. |
4722 | |
4723 @node Garbage Collection - Step by Step | |
4724 @section Garbage Collection - Step by Step | |
4725 @cindex garbage collection step by step | |
4726 | |
4727 @menu | |
4728 * Invocation:: | |
4729 * garbage_collect_1:: | |
4730 * mark_object:: | |
4731 * gc_sweep:: | |
4732 * sweep_lcrecords_1:: | |
4733 * compact_string_chars:: | |
4734 * sweep_strings:: | |
4735 * sweep_bit_vectors_1:: | |
4736 @end menu | |
4737 | |
4738 @node Invocation | |
4739 @subsection Invocation | |
4740 @cindex garbage collection, invocation | |
4741 | |
4742 The first thing that anyone should know about garbage collection is: | |
4743 when and how the garbage collector is invoked. One might think that this | |
4744 could happen every time new memory is allocated, e.g. new objects are | |
4745 created, but this is @emph{not} the case. Instead, we have the following | |
4746 situation: | |
4747 | |
4748 The entry point of any process of garbage collection is an invocation | |
4749 of the function @code{garbage_collect_1} in file @code{alloc.c}. The | |
4750 invocation can occur @emph{explicitly} by calling the function | |
4751 @code{Fgarbage_collect} (in addition this function provides information | |
4752 about the freed memory), or can occur @emph{implicitly} in four different | |
4753 situations: | |
4754 @enumerate | |
4755 @item | |
4756 In function @code{main_1} in file @code{emacs.c}. This function is called | |
4757 at each startup of xemacs. The garbage collection is invoked after all | |
4758 initial creations are completed, but only if a special internal error | |
4759 checking-constant @code{ERROR_CHECK_GC} is defined. | |
4760 @item | |
4761 In function @code{disksave_object_finalization} in file | |
4762 @code{alloc.c}. The only purpose of this function is to clear the | |
4763 objects from memory which need not be stored with xemacs when we dump out | |
4764 an executable. This is only done by @code{Fdump_emacs} or by | |
4765 @code{Fdump_emacs_data} respectively (both in @code{emacs.c}). The | |
4766 actual clearing is accomplished by making these objects unreachable and | |
4767 starting a garbage collection. The function is only used while building | |
4768 xemacs. | |
4769 @item | |
4770 In function @code{Feval / eval} in file @code{eval.c}. Each time the | |
4771 well known and often used function eval is called to evaluate a form, | |
4772 one of the first things that could happen, is a potential call of | |
4773 @code{garbage_collect_1}. There exist three global variables, | |
4774 @code{consing_since_gc} (counts the created cons-cells since the last | |
4775 garbage collection), @code{gc_cons_threshold} (a specified threshold | |
4776 after which a garbage collection occurs) and @code{always_gc}. If | |
4777 @code{always_gc} is set or if the threshold is exceeded, the garbage | |
4778 collection will start. | |
4779 @item | |
4780 In function @code{Ffuncall / funcall} in file @code{eval.c}. This | |
4781 function evaluates calls of elisp functions and works according to | |
4782 @code{Feval}. | |
4783 @end enumerate | |
4784 | |
4785 The upshot is that garbage collection can basically occur everywhere | |
4786 @code{Feval}, respectively @code{Ffuncall}, is used - either directly or | |
4787 through another function. Since calls to these two functions are | |
4788 hidden in various other functions, many calls to | |
4789 @code{garabge_collect_1} are not obviously foreseeable, and therefore | |
4790 unexpected. Instances where they are used that are worth remembering are | |
4791 various elisp commands, as for example @code{or}, | |
4792 @code{and}, @code{if}, @code{cond}, @code{while}, @code{setq}, etc., | |
4793 miscellaneous @code{gui_item_...} functions, everything related to | |
4794 @code{eval} (@code{Feval_buffer}, @code{call0}, ...) and inside | |
4795 @code{Fsignal}. The latter is used to handle signals, as for example the | |
4796 ones raised by every @code{QUITE}-macro triggered after pressing Ctrl-g. | |
4797 | |
4798 @node garbage_collect_1 | |
4799 @subsection @code{garbage_collect_1} | |
4800 @cindex @code{garbage_collect_1} | |
4801 | |
4802 We can now describe exactly what happens after the invocation takes | |
4803 place. | |
4804 @enumerate | |
4805 @item | |
4806 There are several cases in which the garbage collector is left immediately: | |
4807 when we are already garbage collecting (@code{gc_in_progress}), when | |
4808 the garbage collection is somehow forbidden | |
4809 (@code{gc_currently_forbidden}), when we are currently displaying something | |
4810 (@code{in_display}) or when we are preparing for the armageddon of the | |
4811 whole system (@code{preparing_for_armageddon}). | |
4812 @item | |
4813 Next the correct frame in which to put | |
4814 all the output occurring during garbage collecting is determined. In | |
4815 order to be able to restore the old display's state after displaying the | |
4816 message, some data about the current cursor position has to be | |
4817 saved. The variables @code{pre_gc_curser} and @code{cursor_changed} take | |
4818 care of that. | |
4819 @item | |
4820 The state of @code{gc_currently_forbidden} must be restored after | |
4821 the garbage collection, no matter what happens during the process. We | |
4822 accomplish this by @code{record_unwind_protect}ing the suitable function | |
4823 @code{restore_gc_inhibit} together with the current value of | |
4824 @code{gc_currently_forbidden}. | |
4825 @item | |
4826 If we are concurrently running an interactive xemacs session, the next step | |
4827 is simply to show the garbage collector's cursor/message. | |
4828 @item | |
4829 The following steps are the intrinsic steps of the garbage collector, | |
4830 therefore @code{gc_in_progress} is set. | |
4831 @item | |
4832 For debugging purposes, it is possible to copy the current C stack | |
4833 frame. However, this seems to be a currently unused feature. | |
4834 @item | |
4835 Before actually starting to go over all live objects, references to | |
4836 objects that are no longer used are pruned. We only have to do this for events | |
4837 (@code{clear_event_resource}) and for specifiers | |
4838 (@code{cleanup_specifiers}). | |
4839 @item | |
4840 Now the mark phase begins and marks all accessible elements. In order to | |
4841 start from | |
4842 all slots that serve as roots of accessibility, the function | |
4843 @code{mark_object} is called for each root individually to go out from | |
4844 there to mark all reachable objects. All roots that are traversed are | |
4845 shown in their processed order: | |
4846 @itemize @bullet | |
4847 @item | |
4848 all constant symbols and static variables that are registered via | |
4849 @code{staticpro}@ in the array @code{staticvec}. | |
4850 @xref{Adding Global Lisp Variables}. | |
4851 @item | |
4852 all Lisp objects that are created in C functions and that must be | |
4853 protected from freeing them. They are registered in the global | |
4854 list @code{gcprolist}. | |
4855 @xref{GCPROing}. | |
4856 @item | |
4857 all local variables (i.e. their name fields @code{symbol} and old | |
4858 values @code{old_values}) that are bound during the evaluation by the Lisp | |
4859 engine. They are stored in @code{specbinding} structs pushed on a stack | |
4860 called @code{specpdl}. | |
4861 @xref{Dynamic Binding; The specbinding Stack; Unwind-Protects}. | |
4862 @item | |
4863 all catch blocks that the Lisp engine encounters during the evaluation | |
4864 cause the creation of structs @code{catchtag} inserted in the list | |
4865 @code{catchlist}. Their tag (@code{tag}) and value (@code{val} fields | |
4866 are freshly created objects and therefore have to be marked. | |
4867 @xref{Catch and Throw}. | |
4868 @item | |
4869 every function application pushes new structs @code{backtrace} | |
4870 on the call stack of the Lisp engine (@code{backtrace_list}). The unique | |
4871 parts that have to be marked are the fields for each function | |
4872 (@code{function}) and all their arguments (@code{args}). | |
4873 @xref{Evaluation}. | |
4874 @item | |
4875 all objects that are used by the redisplay engine that must not be freed | |
4876 are marked by a special function called @code{mark_redisplay} (in | |
4877 @code{redisplay.c}). | |
4878 @item | |
4879 all objects created for profiling purposes are allocated by C functions | |
4880 instead of using the lisp allocation mechanisms. In order to receive the | |
4881 right ones during the sweep phase, they also have to be marked | |
4882 manually. That is done by the function @code{mark_profiling_info} | |
4883 @end itemize | |
4884 @item | |
4885 Hash tables in Xemacs belong to a kind of special objects that | |
4886 make use of a concept often called 'weak pointers'. | |
4887 To make a long story short, these kind of pointers are not followed | |
4888 during the estimation of the live objects during garbage collection. | |
4889 Any object referenced only by weak pointers is collected | |
4890 anyway, and the reference to it is cleared. In hash tables there are | |
4891 different usage patterns of them, manifesting in different types of hash | |
4892 tables, namely 'non-weak', 'weak', 'key-weak' and 'value-weak' | |
4893 (internally also 'key-car-weak' and 'value-car-weak') hash tables, each | |
4894 clearing entries depending on different conditions. More information can | |
4895 be found in the documentation to the function @code{make-hash-table}. | |
4896 | |
4897 Because there are complicated dependency rules about when and what to | |
4898 mark while processing weak hash tables, the standard @code{marker} | |
4899 method is only active if it is marking non-weak hash tables. As soon as | |
4900 a weak component is in the table, the hash table entries are ignored | |
4901 while marking. Instead their marking is done each separately by the | |
4902 function @code{finish_marking_weak_hash_tables}. This function iterates | |
4903 over each hash table entry @code{hentries} for each weak hash table in | |
4904 @code{Vall_weak_hash_tables}. Depending on the type of a table, the | |
4905 appropriate action is performed. | |
4906 If a table is acting as @code{HASH_TABLE_KEY_WEAK}, and a key already marked, | |
4907 everything reachable from the @code{value} component is marked. If it is | |
4908 acting as a @code{HASH_TABLE_VALUE_WEAK} and the value component is | |
4909 already marked, the marking starts beginning only from the | |
4910 @code{key} component. | |
4911 If it is a @code{HASH_TABLE_KEY_CAR_WEAK} and the car | |
4912 of the key entry is already marked, we mark both the @code{key} and | |
4913 @code{value} components. | |
4914 Finally, if the table is of the type @code{HASH_TABLE_VALUE_CAR_WEAK} | |
4915 and the car of the value components is already marked, again both the | |
4916 @code{key} and the @code{value} components get marked. | |
4917 | |
4918 Again, there are lists with comparable properties called weak | |
4919 lists. There exist different peculiarities of their types called | |
4920 @code{simple}, @code{assoc}, @code{key-assoc} and | |
4921 @code{value-assoc}. You can find further details about them in the | |
4922 description to the function @code{make-weak-list}. The scheme of their | |
4923 marking is similar: all weak lists are listed in @code{Qall_weak_lists}, | |
4924 therefore we iterate over them. The marking is advanced until we hit an | |
4925 already marked pair. Then we know that during a former run all | |
4926 the rest has been marked completely. Again, depending on the special | |
4927 type of the weak list, our jobs differ. If it is a @code{WEAK_LIST_SIMPLE} | |
4928 and the elem is marked, we mark the @code{cons} part. If it is a | |
4929 @code{WEAK_LIST_ASSOC} and not a pair or a pair with both marked car and | |
4930 cdr, we mark the @code{cons} and the @code{elem}. If it is a | |
4931 @code{WEAK_LIST_KEY_ASSOC} and not a pair or a pair with a marked car of | |
4932 the elem, we mark the @code{cons} and the @code{elem}. Finally, if it is | |
4933 a @code{WEAK_LIST_VALUE_ASSOC} and not a pair or a pair with a marked | |
4934 cdr of the elem, we mark both the @code{cons} and the @code{elem}. | |
4935 | |
4936 Since, by marking objects in reach from weak hash tables and weak lists, | |
4937 other objects could get marked, this perhaps implies further marking of | |
4938 other weak objects, both finishing functions are redone as long as | |
4939 yet unmarked objects get freshly marked. | |
4940 | |
4941 @item | |
4942 After completing the special marking for the weak hash tables and for the weak | |
4943 lists, all entries that point to objects that are going to be swept in | |
4944 the further process are useless, and therefore have to be removed from | |
4945 the table or the list. | |
4946 | |
4947 The function @code{prune_weak_hash_tables} does the job for weak hash | |
4948 tables. Totally unmarked hash tables are removed from the list | |
4949 @code{Vall_weak_hash_tables}. The other ones are treated more carefully | |
4950 by scanning over all entries and removing one as soon as one of | |
4951 the components @code{key} and @code{value} is unmarked. | |
4952 | |
4953 The same idea applies to the weak lists. It is accomplished by | |
4954 @code{prune_weak_lists}: An unmarked list is pruned from | |
4955 @code{Vall_weak_lists} immediately. A marked list is treated more | |
4956 carefully by going over it and removing just the unmarked pairs. | |
4957 | |
4958 @item | |
4959 The function @code{prune_specifiers} checks all listed specifiers held | |
4960 in @code{Vall_speficiers} and removes the ones from the lists that are | |
4961 unmarked. | |
4962 | |
4963 @item | |
4964 All syntax tables are stored in a list called | |
4965 @code{Vall_syntax_tables}. The function @code{prune_syntax_tables} walks | |
4966 through it and unlinks the tables that are unmarked. | |
4967 | |
4968 @item | |
4969 Next, we will attack the complete sweeping - the function | |
4970 @code{gc_sweep} which holds the predominance. | |
4971 @item | |
4972 First, all the variables with respect to garbage collection are | |
4973 reset. @code{consing_since_gc} - the counter of the created cells since | |
4974 the last garbage collection - is set back to 0, and | |
4975 @code{gc_in_progress} is not @code{true} anymore. | |
4976 @item | |
4977 In case the session is interactive, the displayed cursor and message are | |
4978 removed again. | |
4979 @item | |
4980 The state of @code{gc_inhibit} is restored to the former value by | |
4981 unwinding the stack. | |
4982 @item | |
4983 A small memory reserve is always held back that can be reached by | |
4984 @code{breathing_space}. If nothing more is left, we create a new reserve | |
4985 and exit. | |
4986 @end enumerate | |
4987 | |
4988 @node mark_object | |
4989 @subsection @code{mark_object} | |
4990 @cindex @code{mark_object} | |
4991 | |
4992 The first thing that is checked while marking an object is whether the | |
4993 object is a real Lisp object @code{Lisp_Type_Record} or just an integer | |
4994 or a character. Integers and characters are the only two types that are | |
4995 stored directly - without another level of indirection, and therefore they | |
4996 donīt have to be marked and collected. | |
4997 @xref{How Lisp Objects Are Represented in C}. | |
4998 | |
4999 The second case is the one we have to handle. It is the one when we are | |
5000 dealing with a pointer to a Lisp object. But, there exist also three | |
5001 possibilities, that prevent us from doing anything while marking: The | |
5002 object is read only which prevents it from being garbage collected, | |
5003 i.e. marked (@code{C_READONLY_RECORD_HEADER}). The object in question is | |
5004 already marked, and need not be marked for the second time (checked by | |
5005 @code{MARKED_RECORD_HEADER_P}). If it is a special, unmarkable object | |
5006 (@code{UNMARKABLE_RECORD_HEADER_P}, apparently, these are objects that | |
5007 sit in some CONST space, and can therefore not be marked, see | |
5008 @code{this_one_is_unmarkable} in @code{alloc.c}). | |
5009 | |
5010 Now, the actual marking is feasible. We do so by once using the macro | |
5011 @code{MARK_RECORD_HEADER} to mark the object itself (actually the | |
5012 special flag in the lrecord header), and calling its special marker | |
5013 "method" @code{marker} if available. The marker method marks every | |
5014 other object that is in reach from our current object. Note, that these | |
5015 marker methods should not call @code{mark_object} recursively, but | |
5016 instead should return the next object from where further marking has to | |
5017 be performed. | |
5018 | |
5019 In case another object was returned, as mentioned before, we reiterate | |
5020 the whole @code{mark_object} process beginning with this next object. | |
5021 | |
5022 @node gc_sweep | |
5023 @subsection @code{gc_sweep} | |
5024 @cindex @code{gc_sweep} | |
5025 | |
5026 The job of this function is to free all unmarked records from memory. As | |
5027 we know, there are different types of objects implemented and managed, and | |
5028 consequently different ways to free them from memory. | |
5029 @xref{Introduction to Allocation}. | |
5030 | |
5031 We start with all objects stored through @code{lcrecords}. All | |
5032 bulkier objects are allocated and handled using that scheme of | |
5033 @code{lcrecords}. Each object is @code{malloc}ed separately | |
5034 instead of placing it in one of the contiguous frob blocks. All types | |
5035 that are currently stored | |
5036 using @code{lcrecords}īs @code{alloc_lcrecord} and | |
5037 @code{make_lcrecord_list} are the types: vectors, buffers, | |
5038 char-table, char-table-entry, console, weak-list, database, device, | |
5039 ldap, hash-table, command-builder, extent-auxiliary, extent-info, face, | |
5040 coding-system, frame, image-instance, glyph, popup-data, gui-item, | |
5041 keymap, charset, color_instance, font_instance, opaque, opaque-list, | |
5042 process, range-table, specifier, symbol-value-buffer-local, | |
5043 symbol-value-lisp-magic, symbol-value-varalias, toolbar-button, | |
5044 tooltalk-message, tooltalk-pattern, window, and window-configuration. We | |
5045 take care of them in the fist place | |
5046 in order to be able to handle and to finalize items stored in them more | |
5047 easily. The function @code{sweep_lcrecords_1} as described below is | |
5048 doing the whole job for us. | |
5049 For a description about the internals: @xref{lrecords}. | |
5050 | |
5051 Our next candidates are the other objects that behave quite differently | |
5052 than everything else: the strings. They consists of two parts, a | |
5053 fixed-size portion (@code{struct Lisp_string}) holding the string's | |
5054 length, its property list and a pointer to the second part, and the | |
5055 actual string data, which is stored in string-chars blocks comparable to | |
5056 frob blocks. In this block, the data is not only freed, but also a | |
5057 compression of holes is made, i.e. all strings are relocated together. | |
5058 @xref{String}. This compacting phase is performed by the function | |
5059 @code{compact_string_chars}, the actual sweeping by the function | |
5060 @code{sweep_strings} is described below. | |
5061 | |
5062 After that, the other types are swept step by step using functions | |
5063 @code{sweep_conses}, @code{sweep_bit_vectors_1}, | |
5064 @code{sweep_compiled_functions}, @code{sweep_floats}, | |
5065 @code{sweep_symbols}, @code{sweep_extents}, @code{sweep_markers} and | |
5066 @code{sweep_extents}. They are the fixed-size types cons, floats, | |
5067 compiled-functions, symbol, marker, extent, and event stored in | |
5068 so-called "frob blocks", and therefore we can basically do the same on | |
5069 every type objects, using the same macros, especially defined only to | |
5070 handle everything with respect to fixed-size blocks. The only fixed-size | |
5071 type that is not handled here are the fixed-size portion of strings, | |
5072 because we took special care of them earlier. | |
5073 | |
5074 The only big exceptions are bit vectors stored differently and | |
5075 therefore treated differently by the function @code{sweep_bit_vectors_1} | |
5076 described later. | |
5077 | |
5078 At first, we need some brief information about how | |
5079 these fixed-size types are managed in general, in order to understand | |
5080 how the sweeping is done. They have all a fixed size, and are therefore | |
5081 stored in big blocks of memory - allocated at once - that can hold a | |
5082 certain amount of objects of one type. The macro | |
5083 @code{DECLARE_FIXED_TYPE_ALLOC} creates the suitable structures for | |
5084 every type. More precisely, we have the block struct | |
5085 (holding a pointer to the previous block @code{prev} and the | |
5086 objects in @code{block[]}), a pointer to current block | |
5087 (@code{current_..._block)}) and its last index | |
5088 (@code{current_..._block_index}), and a pointer to the free list that | |
5089 will be created. Also a macro @code{FIXED_TYPE_FROM_BLOCK} plus some | |
5090 related macros exists that are used to obtain a new object, either from | |
5091 the free list @code{ALLOCATE_FIXED_TYPE_1} if there is an unused object | |
5092 of that type stored or by allocating a completely new block using | |
5093 @code{ALLOCATE_FIXED_TYPE_FROM_BLOCK}. | |
5094 | |
5095 The rest works as follows: all of them define a | |
5096 macro @code{UNMARK_...} that is used to unmark the object. They define a | |
5097 macro @code{ADDITIONAL_FREE_...} that defines additional work that has | |
5098 to be done when converting an object from in use to not in use (so far, | |
5099 only markers use it in order to unchain them). Then, they all call | |
5100 the macro @code{SWEEP_FIXED_TYPE_BLOCK} instantiated with their type name | |
5101 and their struct name. | |
5102 | |
5103 This call in particular does the following: we go over all blocks | |
5104 starting with the current moving towards the oldest. | |
5105 For each block, we look at every object in it. If the object already | |
5106 freed (checked with @code{FREE_STRUCT_P} using the first pointer of the | |
5107 object), or if it is | |
5108 set to read only (@code{C_READONLY_RECORD_HEADER_P}, nothing must be | |
5109 done. If it is unmarked (checked with @code{MARKED_RECORD_HEADER_P}), it | |
5110 is put in the free list and set free (using the macro | |
5111 @code{FREE_FIXED_TYPE}, otherwise it stays in the block, but is unmarked | |
5112 (by @code{UNMARK_...}). While going through one block, we note if the | |
5113 whole block is empty. If so, the whole block is freed (using | |
5114 @code{xfree}) and the free list state is set to the state it had before | |
5115 handling this block. | |
5116 | |
5117 @node sweep_lcrecords_1 | |
5118 @subsection @code{sweep_lcrecords_1} | |
5119 @cindex @code{sweep_lcrecords_1} | |
5120 | |
5121 After nullifying the complete lcrecord statistics, we go over all | |
5122 lcrecords two separate times. They are all chained together in a list with | |
5123 a head called @code{all_lcrecords}. | |
5124 | |
5125 The first loop calls for each object its @code{finalizer} method, but only | |
5126 in the case that it is not read only | |
5127 (@code{C_READONLY_RECORD_HEADER_P)}, it is not already marked | |
5128 (@code{MARKED_RECORD_HEADER_P}), it is not already in a free list (list of | |
5129 freed objects, field @code{free}) and finally it owns a finalizer | |
5130 method. | |
5131 | |
5132 The second loop actually frees the appropriate objects again by iterating | |
5133 through the whole list. In case an object is read only or marked, it | |
5134 has to persist, otherwise it is manually freed by calling | |
5135 @code{xfree}. During this loop, the lcrecord statistics are kept up to | |
5136 date by calling @code{tick_lcrecord_stats} with the right arguments, | |
5137 | |
5138 @node compact_string_chars | |
5139 @subsection @code{compact_string_chars} | |
5140 @cindex @code{compact_string_chars} | |
5141 | |
5142 The purpose of this function is to compact all the data parts of the | |
5143 strings that are held in so-called @code{string_chars_block}, i.e. the | |
5144 strings that do not exceed a certain maximal length. | |
5145 | |
5146 The procedure with which this is done is as follows. We are keeping two | |
5147 positions in the @code{string_chars_block}s using two pointer/integer | |
5148 pairs, namely @code{from_sb}/@code{from_pos} and | |
5149 @code{to_sb}/@code{to_pos}. They stand for the actual positions, from | |
5150 where to where, to copy the actually handled string. | |
5151 | |
5152 While going over all chained @code{string_char_block}s and their held | |
5153 strings, staring at @code{first_string_chars_block}, both pointers | |
5154 are advanced and eventually a string is copied from @code{from_sb} to | |
5155 @code{to_sb}, depending on the status of the pointed at strings. | |
5156 | |
5157 More precisely, we can distinguish between the following actions. | |
5158 @itemize @bullet | |
5159 @item | |
5160 The string at @code{from_sb}'s position could be marked as free, which | |
5161 is indicated by an invalid pointer to the pointer that should point back | |
5162 to the fixed size string object, and which is checked by | |
5163 @code{FREE_STRUCT_P}. In this case, the @code{from_sb}/@code{from_pos} | |
5164 is advanced to the next string, and nothing has to be copied. | |
5165 @item | |
5166 Also, if a string object itself is unmarked, nothing has to be | |
5167 copied. We likewise advance the @code{from_sb}/@code{from_pos} | |
5168 pair as described above. | |
5169 @item | |
5170 In all other cases, we have a marked string at hand. The string data | |
5171 must be moved from the from-position to the to-position. In case | |
5172 there is not enough space in the actual @code{to_sb}-block, we advance | |
5173 this pointer to the beginning of the next block before copying. In case the | |
5174 from and to positions are different, we perform the | |
5175 actual copying using the library function @code{memmove}. | |
5176 @end itemize | |
5177 | |
5178 After compacting, the pointer to the current | |
5179 @code{string_chars_block}, sitting in @code{current_string_chars_block}, | |
5180 is reset on the last block to which we moved a string, | |
5181 i.e. @code{to_block}, and all remaining blocks (we know that they just | |
5182 carry garbage) are explicitly @code{xfree}d. | |
5183 | |
5184 @node sweep_strings | |
5185 @subsection @code{sweep_strings} | |
5186 @cindex @code{sweep_strings} | |
5187 | |
5188 The sweeping for the fixed sized string objects is essentially exactly | |
5189 the same as it is for all other fixed size types. As before, the freeing | |
5190 into the suitable free list is done by using the macro | |
5191 @code{SWEEP_FIXED_SIZE_BLOCK} after defining the right macros | |
5192 @code{UNMARK_string} and @code{ADDITIONAL_FREE_string}. These two | |
5193 definitions are a little bit special compared to the ones used | |
5194 for the other fixed size types. | |
5195 | |
5196 @code{UNMARK_string} is defined the same way except some additional code | |
5197 used for updating the bookkeeping information. | |
5198 | |
5199 For strings, @code{ADDITIONAL_FREE_string} has to do something in | |
5200 addition: in case, the string was not allocated in a | |
5201 @code{string_chars_block} because it exceeded the maximal length, and | |
5202 therefore it was @code{malloc}ed separately, we know also @code{xfree} | |
5203 it explicitly. | |
5204 | |
5205 @node sweep_bit_vectors_1 | |
5206 @subsection @code{sweep_bit_vectors_1} | |
5207 @cindex @code{sweep_bit_vectors_1} | |
5208 | |
5209 Bit vectors are also one of the rare types that are @code{malloc}ed | |
5210 individually. Consequently, while sweeping, all further needless | |
5211 bit vectors must be freed by hand. This is done, as one might imagine, | |
5212 the expected way: since they are all registered in a list called | |
5213 @code{all_bit_vectors}, all elements of that list are traversed, | |
5214 all unmarked bit vectors are unlinked by calling @code{xfree} and all of | |
5215 them become unmarked. | |
5216 In addition, the bookkeeping information used for garbage | |
5217 collector's output purposes is updated. | |
4716 | 5218 |
4717 @node Integers and Characters | 5219 @node Integers and Characters |
4718 @section Integers and Characters | 5220 @section Integers and Characters |
4719 | 5221 |
4720 Integer and character Lisp objects are created from integers using the | 5222 Integer and character Lisp objects are created from integers using the |
7400 @end enumerate | 7902 @end enumerate |
7401 | 7903 |
7402 @menu | 7904 @menu |
7403 * Critical Redisplay Sections:: | 7905 * Critical Redisplay Sections:: |
7404 * Line Start Cache:: | 7906 * Line Start Cache:: |
7907 * Redisplay Piece by Piece:: | |
7405 @end menu | 7908 @end menu |
7406 | 7909 |
7407 @node Critical Redisplay Sections | 7910 @node Critical Redisplay Sections |
7408 @section Critical Redisplay Sections | 7911 @section Critical Redisplay Sections |
7409 @cindex critical redisplay sections | 7912 @cindex critical redisplay sections |
7495 @end itemize | 7998 @end itemize |
7496 | 7999 |
7497 In case you're wondering, the Second Golden Rule of Redisplay is not | 8000 In case you're wondering, the Second Golden Rule of Redisplay is not |
7498 applicable. | 8001 applicable. |
7499 | 8002 |
7500 @node Extents, Faces and Glyphs, The Redisplay Mechanism, Top | 8003 @node Redisplay Piece by Piece |
8004 @section Redisplay Piece by Piece | |
8005 @cindex Redisplay Piece by Piece | |
8006 | |
8007 As you can begin to see redisplay is complex and also not well | |
8008 documented. Chuck no longer works on XEmacs so this section is my take | |
8009 on the workings of redisplay. | |
8010 | |
8011 Redisplay happens in three phases: | |
8012 | |
8013 @enumerate | |
8014 @item | |
8015 Determine desired display in area that needs redisplay. | |
8016 Implemented by @code{redisplay.c} | |
8017 @item | |
8018 Compare desired display with current display | |
8019 Implemented by @code{redisplay-output.c} | |
8020 @item | |
8021 Output changes Implemented by @code{redisplay-output.c}, | |
8022 @code{redisplay-x.c}, @code{redisplay-msw.c} and @code{redisplay-tty.c} | |
8023 @end enumerate | |
8024 | |
8025 Steps 1 and 2 are device-independant and relatively complex. Step 3 is | |
8026 mostly device-dependent. | |
8027 | |
8028 Determining the desired display | |
8029 | |
8030 Display attributes are stored in @code{display_line} structures. Each | |
8031 @code{display_line} consists of a set of @code{display_block}'s and each | |
8032 @code{display_block} contains a number of @code{rune}'s. Generally | |
8033 dynarr's of @code{display_line}'s are held by each window representing | |
8034 the current display and the desired display. | |
8035 | |
8036 The @code{display_line} structures are tighly tied to buffers which | |
8037 presents a problem for redisplay as this connection is bogus for the | |
8038 modeline. Hence the @code{display_line} generation routines are | |
8039 duplicated for generating the modeline. This means that the modeline | |
8040 display code has many bugs that the standard redisplay code does not. | |
8041 | |
8042 The guts of @code{display_line} generation are in | |
8043 @code{create_text_block}, which creates a single display line for the | |
8044 desired locale. This incrementally parses the characters on the current | |
8045 line and generates redisplay structures for each. | |
8046 | |
8047 Gutter redisplay is different. Because the data to display is stored in | |
8048 a string we cannot use @code{create_text_block}. Instead we use | |
8049 @code{create_text_string_block} which performs the same function as | |
8050 @code{create_text_block} but for strings. Many of the complexities of | |
8051 @code{create_text_block} to do with cursor handling and selective | |
8052 display have been removed. | |
8053 | |
8054 @node Extents, Faces, The Redisplay Mechanism, Top | |
7501 @chapter Extents | 8055 @chapter Extents |
7502 | 8056 |
7503 @menu | 8057 @menu |
7504 * Introduction to Extents:: Extents are ranges over text, with properties. | 8058 * Introduction to Extents:: Extents are ranges over text, with properties. |
7505 * Extent Ordering:: How extents are ordered internally. | 8059 * Extent Ordering:: How extents are ordered internally. |
7783 Extent fragments have to be very quick to update to a new buffer | 8337 Extent fragments have to be very quick to update to a new buffer |
7784 position when moving linearly through the buffer. They rely on the | 8338 position when moving linearly through the buffer. They rely on the |
7785 stack-of-extents code, which does the heavy-duty algorithmic work of | 8339 stack-of-extents code, which does the heavy-duty algorithmic work of |
7786 determining which extents overly a particular position. | 8340 determining which extents overly a particular position. |
7787 | 8341 |
7788 @node Faces and Glyphs, Specifiers, Extents, Top | 8342 @node Faces, Glyphs, Extents, Top |
7789 @chapter Faces and Glyphs | 8343 @chapter Faces |
7790 | 8344 |
7791 Not yet documented. | 8345 Not yet documented. |
7792 | 8346 |
7793 @node Specifiers, Menus, Faces and Glyphs, Top | 8347 @node Glyphs, Specifiers, Faces, Top |
8348 @chapter Glyphs | |
8349 | |
8350 Glyphs are graphical elements that can be displayed in XEmacs buffers or | |
8351 gutters. We use the term graphical element here in the broadest possible | |
8352 sense since glyphs can be as mundane as text to as arcane as a native | |
8353 tab widget. | |
8354 | |
8355 In XEmacs, glyphs represent the uninstantiated state of graphical | |
8356 elements, i.e. they hold all the information necessary to produce an | |
8357 image on-screen but the image does not exist at this stage. | |
8358 | |
8359 Glyphs are lazily instantiated by calling one of the glyph | |
8360 functions. This usually occurs within redisplay when | |
8361 @code{Fglyph_height} is called. Instantiation causes an image-instance | |
8362 to be created and cached. This cache is on a device basis for all glyphs | |
8363 except glyph-widgets, and on a window basis for glyph widgets. The | |
8364 caching is done by @code{image_instantiate} and is necessary because it | |
8365 is generally possible to display an image-instance in multiple | |
8366 domains. For instance if we create a Pixmap, we can actually display | |
8367 this on multiple windows - even though we only need a single Pixmap | |
8368 instance to do this. If caching wasn't done then it would be necessary | |
8369 to create image-instances for every displayable occurrance of a glyph - | |
8370 and every usage - and this would be extremely memory and cpu intensive. | |
8371 | |
8372 Widget-glyphs (a.k.a native widgets) are not cached in this way. This is | |
8373 because widget-glyph image-instances on screen are toolkit windows, and | |
8374 thus cannot be reused in multiple XEmacs domains. Thus widget-glyphs are | |
8375 cached on a window basis. | |
8376 | |
8377 Any action on a glyph first consults the cache before actually | |
8378 instantiating a widget. | |
8379 | |
8380 @section Widget-Glyphs in the MS-WIndows Environment | |
8381 | |
8382 To Do | |
8383 | |
8384 @section Widget-Glyphs in the X Environment | |
8385 | |
8386 Widget-glyphs under X make heavy use of lwlib for manipulating the | |
8387 native toolkit objects. This is primarily so that different toolkits can | |
8388 be supported for widget-glyphs, just as they are supported for features | |
8389 such as menubars etc. | |
8390 | |
8391 Lwlib is extremely poorly documented and quite hairy so here is my | |
8392 understanding of what goes on. | |
8393 | |
8394 Lwlib maintains a set of widget_instances which mirror the hierarchical | |
8395 state of Xt widgets. I think this is so that widgets can be updated and | |
8396 manipulated generically by the lwlib library. For instance | |
8397 update_one_widget_instance can cope with multiple types of widget and | |
8398 multiple types of toolkit. Each element in the widget hierarchy is updated | |
8399 from its corresponding widget_instance by walking the widget_instance | |
8400 tree recursively. | |
8401 | |
8402 This has desirable properties such as lw_modify_all_widgets which is | |
8403 called from glyphs-x.c and updates all the properties of a widget | |
8404 without having to know what the widget is or what toolkit it is from. | |
8405 Unfortunately this also has hairy properrties such as making the lwlib | |
8406 code quite complex. And of course lwlib has to know at some level what | |
8407 the widget is and how to set its properties. | |
8408 | |
8409 @node Specifiers, Menus, Glyphs, Top | |
7794 @chapter Specifiers | 8410 @chapter Specifiers |
7795 | 8411 |
7796 Not yet documented. | 8412 Not yet documented. |
7797 | 8413 |
7798 @node Menus, Subprocesses, Specifiers, Top | 8414 @node Menus, Subprocesses, Specifiers, Top |