Mercurial > hg > xemacs-beta
comparison man/internals/internals.texi @ 355:182f72e8cd0d r21-1-7
Import from CVS: tag r21-1-7
author | cvs |
---|---|
date | Mon, 13 Aug 2007 10:56:21 +0200 |
parents | 7c94d56991e1 |
children | 8e84bee8ddd0 |
comparison
equal
deleted
inserted
replaced
354:3729bef672e0 | 355:182f72e8cd0d |
---|---|
171 Allocation of Objects in XEmacs Lisp | 171 Allocation of Objects in XEmacs Lisp |
172 | 172 |
173 * Introduction to Allocation:: | 173 * Introduction to Allocation:: |
174 * Garbage Collection:: | 174 * Garbage Collection:: |
175 * GCPROing:: | 175 * GCPROing:: |
176 * Garbage Collection - Step by Step:: | |
176 * Integers and Characters:: | 177 * Integers and Characters:: |
177 * Allocation from Frob Blocks:: | 178 * Allocation from Frob Blocks:: |
178 * lrecords:: | 179 * lrecords:: |
179 * Low-level allocation:: | 180 * Low-level allocation:: |
180 * Pure Space:: | 181 * Pure Space:: |
4172 | 4173 |
4173 @menu | 4174 @menu |
4174 * Introduction to Allocation:: | 4175 * Introduction to Allocation:: |
4175 * Garbage Collection:: | 4176 * Garbage Collection:: |
4176 * GCPROing:: | 4177 * GCPROing:: |
4178 * Garbage Collection - Step by Step:: | |
4177 * Integers and Characters:: | 4179 * Integers and Characters:: |
4178 * Allocation from Frob Blocks:: | 4180 * Allocation from Frob Blocks:: |
4179 * lrecords:: | 4181 * lrecords:: |
4180 * Low-level allocation:: | 4182 * Low-level allocation:: |
4181 * Pure Space:: | 4183 * Pure Space:: |
4515 stack. That involves looking through all of stack memory and treating | 4517 stack. That involves looking through all of stack memory and treating |
4516 anything that looks like a reference to an object as a reference. This | 4518 anything that looks like a reference to an object as a reference. This |
4517 will result in a few objects not getting collected when they should, but | 4519 will result in a few objects not getting collected when they should, but |
4518 it obviates the need for @code{GCPRO}ing, and allows garbage collection | 4520 it obviates the need for @code{GCPRO}ing, and allows garbage collection |
4519 to happen at any point at all, such as during object allocation. | 4521 to happen at any point at all, such as during object allocation. |
4522 | |
4523 @node Garbage Collection - Step by Step | |
4524 @section Garbage Collection - Step by Step | |
4525 @cindex garbage collection step by step | |
4526 | |
4527 @menu | |
4528 * Invocation:: | |
4529 * garbage_collect_1:: | |
4530 * mark_object:: | |
4531 * gc_sweep:: | |
4532 * sweep_lcrecords_1:: | |
4533 * compact_string_chars:: | |
4534 * sweep_strings:: | |
4535 * sweep_bit_vectors_1:: | |
4536 @end menu | |
4537 | |
4538 @node Invocation | |
4539 @subsection Invocation | |
4540 @cindex garbage collection, invocation | |
4541 | |
4542 The first thing that anyone should know about garbage collection is: | |
4543 when and how the garbage collector is invoked. One might think that this | |
4544 could happen every time new memory is allocated, e.g. new objects are | |
4545 created, but this is @emph{not} the case. Instead, we have the following | |
4546 situation: | |
4547 | |
4548 The entry point of any process of garbage collection is an invocation | |
4549 of the function @code{garbage_collect_1} in file @code{alloc.c}. The | |
4550 invocation can occur @emph{explicitly} by calling the function | |
4551 @code{Fgarbage_collect} (in addition this function provides information | |
4552 about the freed memory), or can occur @emph{implicitly} in four different | |
4553 situations: | |
4554 @enumerate | |
4555 @item | |
4556 In function @code{main_1} in file @code{emacs.c}. This function is called | |
4557 at each startup of xemacs. The garbage collection is invoked after all | |
4558 initial creations are completed, but only if a special internal error | |
4559 checking-constant @code{ERROR_CHECK_GC} is defined. | |
4560 @item | |
4561 In function @code{disksave_object_finalization} in file | |
4562 @code{alloc.c}. The only purpose of this function is to clear the | |
4563 objects from memory which need not be stored with xemacs when we dump out | |
4564 an executable. This is only done by @code{Fdump_emacs} or by | |
4565 @code{Fdump_emacs_data} respectively (both in @code{emacs.c}). The | |
4566 actual clearing is accomplished by making these objects unreachable and | |
4567 starting a garbage collection. The function is only used while building | |
4568 xemacs. | |
4569 @item | |
4570 In function @code{Feval / eval} in file @code{eval.c}. Each time the | |
4571 well known and often used function eval is called to evaluate a form, | |
4572 one of the first things that could happen, is a potential call of | |
4573 @code{garbage_collect_1}. There exist three global variables, | |
4574 @code{consing_since_gc} (counts the created cons-cells since the last | |
4575 garbage collection), @code{gc_cons_threshold} (a specified threshold | |
4576 after which a garbage collection occurs) and @code{always_gc}. If | |
4577 @code{always_gc} is set or if the threshold is exceeded, the garbage | |
4578 collection will start. | |
4579 @item | |
4580 In function @code{Ffuncall / funcall} in file @code{eval.c}. This | |
4581 function evaluates calls of elisp functions and works according to | |
4582 @code{Feval}. | |
4583 @end enumerate | |
4584 | |
4585 The upshot is that garbage collection can basically occur everywhere | |
4586 @code{Feval}, respectively @code{Ffuncall}, is used - either directly or | |
4587 through another function. Since calls to these two functions are | |
4588 hidden in various other functions, many calls to | |
4589 @code{garabge_collect_1} are not obviously foreseeable, and therefore | |
4590 unexpected. Instances where they are used that are worth remembering are | |
4591 various elisp commands, as for example @code{or}, | |
4592 @code{and}, @code{if}, @code{cond}, @code{while}, @code{setq}, etc., | |
4593 miscellaneous @code{gui_item_...} functions, everything related to | |
4594 @code{eval} (@code{Feval_buffer}, @code{call0}, ...) and inside | |
4595 @code{Fsignal}. The latter is used to handle signals, as for example the | |
4596 ones raised by every @code{QUITE}-macro triggered after pressing Ctrl-g. | |
4597 | |
4598 @node garbage_collect_1 | |
4599 @subsection @code{garbage_collect_1} | |
4600 @cindex @code{garbage_collect_1} | |
4601 | |
4602 We can now describe exactly what happens after the invocation takes | |
4603 place. | |
4604 @enumerate | |
4605 @item | |
4606 There are several cases in which the garbage collector is left immediately: | |
4607 when we are already garbage collecting (@code{gc_in_progress}), when | |
4608 the garbage collection is somehow forbidden | |
4609 (@code{gc_currently_forbidden}), when we are currently displaying something | |
4610 (@code{in_display}) or when we are preparing for the armageddon of the | |
4611 whole system (@code{preparing_for_armageddon}). | |
4612 @item | |
4613 Next the correct frame in which to put | |
4614 all the output occurring during garbage collecting is determined. In | |
4615 order to be able to restore the old display's state after displaying the | |
4616 message, some data about the current cursor position has to be | |
4617 saved. The variables @code{pre_gc_curser} and @code{cursor_changed} take | |
4618 care of that. | |
4619 @item | |
4620 The state of @code{gc_currently_forbidden} must be restored after | |
4621 the garbage collection, no matter what happens during the process. We | |
4622 accomplish this by @code{record_unwind_protect}ing the suitable function | |
4623 @code{restore_gc_inhibit} together with the current value of | |
4624 @code{gc_currently_forbidden}. | |
4625 @item | |
4626 If we are concurrently running an interactive xemacs session, the next step | |
4627 is simply to show the garbage collector's cursor/message. | |
4628 @item | |
4629 The following steps are the intrinsic steps of the garbage collector, | |
4630 therefore @code{gc_in_progress} is set. | |
4631 @item | |
4632 For debugging purposes, it is possible to copy the current C stack | |
4633 frame. However, this seems to be a currently unused feature. | |
4634 @item | |
4635 Before actually starting to go over all live objects, references to | |
4636 objects that are no longer used are pruned. We only have to do this for events | |
4637 (@code{clear_event_resource}) and for specifiers | |
4638 (@code{cleanup_specifiers}). | |
4639 @item | |
4640 Now the mark phase begins and marks all accessible elements. In order to | |
4641 start from | |
4642 all slots that serve as roots of accessibility, the function | |
4643 @code{mark_object} is called for each root individually to go out from | |
4644 there to mark all reachable objects. All roots that are traversed are | |
4645 shown in their processed order: | |
4646 @itemize @bullet | |
4647 @item | |
4648 all constant symbols and static variables that are registered via | |
4649 @code{staticpro}@ in the array @code{staticvec}. | |
4650 @xref{Adding Global Lisp Variables}. | |
4651 @item | |
4652 all Lisp objects that are created in C functions and that must be | |
4653 protected from freeing them. They are registered in the global | |
4654 list @code{gcprolist}. | |
4655 @xref{GCPROing}. | |
4656 @item | |
4657 all local variables (i.e. their name fields @code{symbol} and old | |
4658 values @code{old_values}) that are bound during the evaluation by the Lisp | |
4659 engine. They are stored in @code{specbinding} structs pushed on a stack | |
4660 called @code{specpdl}. | |
4661 @xref{Dynamic Binding; The specbinding Stack; Unwind-Protects}. | |
4662 @item | |
4663 all catch blocks that the Lisp engine encounters during the evaluation | |
4664 cause the creation of structs @code{catchtag} inserted in the list | |
4665 @code{catchlist}. Their tag (@code{tag}) and value (@code{val} fields | |
4666 are freshly created objects and therefore have to be marked. | |
4667 @xref{Catch and Throw}. | |
4668 @item | |
4669 every function application pushes new structs @code{backtrace} | |
4670 on the call stack of the Lisp engine (@code{backtrace_list}). The unique | |
4671 parts that have to be marked are the fields for each function | |
4672 (@code{function}) and all their arguments (@code{args}). | |
4673 @xref{Evaluation}. | |
4674 @item | |
4675 all objects that are used by the redisplay engine that must not be freed | |
4676 are marked by a special function called @code{mark_redisplay} (in | |
4677 @code{redisplay.c}). | |
4678 @item | |
4679 all objects created for profiling purposes are allocated by C functions | |
4680 instead of using the lisp allocation mechanisms. In order to receive the | |
4681 right ones during the sweep phase, they also have to be marked | |
4682 manually. That is done by the function @code{mark_profiling_info} | |
4683 @end itemize | |
4684 @item | |
4685 Hash tables in Xemacs belong to a kind of special objects that | |
4686 make use of a concept often called 'weak pointers'. | |
4687 To make a long story short, these kind of pointers are not followed | |
4688 during the estimation of the live objects during garbage collection. | |
4689 Any object referenced only by weak pointers is collected | |
4690 anyway, and the reference to it is cleared. In hash tables there are | |
4691 different usage patterns of them, manifesting in different types of hash | |
4692 tables, namely 'non-weak', 'weak', 'key-weak' and 'value-weak' | |
4693 (internally also 'key-car-weak' and 'value-car-weak') hash tables, each | |
4694 clearing entries depending on different conditions. More information can | |
4695 be found in the documentation to the function @code{make-hash-table}. | |
4696 | |
4697 Because there are complicated dependency rules about when and what to | |
4698 mark while processing weak hash tables, the standard @code{marker} | |
4699 method is only active if it is marking non-weak hash tables. As soon as | |
4700 a weak component is in the table, the hash table entries are ignored | |
4701 while marking. Instead their marking is done each separately by the | |
4702 function @code{finish_marking_weak_hash_tables}. This function iterates | |
4703 over each hash table entry @code{hentries} for each weak hash table in | |
4704 @code{Vall_weak_hash_tables}. Depending on the type of a table, the | |
4705 appropriate action is performed. | |
4706 If a table is acting as @code{HASH_TABLE_KEY_WEAK}, and a key already marked, | |
4707 everything reachable from the @code{value} component is marked. If it is | |
4708 acting as a @code{HASH_TABLE_VALUE_WEAK} and the value component is | |
4709 already marked, the marking starts beginning only from the | |
4710 @code{key} component. | |
4711 If it is a @code{HASH_TABLE_KEY_CAR_WEAK} and the car | |
4712 of the key entry is already marked, we mark both the @code{key} and | |
4713 @code{value} components. | |
4714 Finally, if the table is of the type @code{HASH_TABLE_VALUE_CAR_WEAK} | |
4715 and the car of the value components is already marked, again both the | |
4716 @code{key} and the @code{value} components get marked. | |
4717 | |
4718 Again, there are lists with comparable properties called weak | |
4719 lists. There exist different peculiarities of their types called | |
4720 @code{simple}, @code{assoc}, @code{key-assoc} and | |
4721 @code{value-assoc}. You can find further details about them in the | |
4722 description to the function @code{make-weak-list}. The scheme of their | |
4723 marking is similar: all weak lists are listed in @code{Qall_weak_lists}, | |
4724 therefore we iterate over them. The marking is advanced until we hit an | |
4725 already marked pair. Then we know that during a former run all | |
4726 the rest has been marked completely. Again, depending on the special | |
4727 type of the weak list, our jobs differ. If it is a @code{WEAK_LIST_SIMPLE} | |
4728 and the elem is marked, we mark the @code{cons} part. If it is a | |
4729 @code{WEAK_LIST_ASSOC} and not a pair or a pair with both marked car and | |
4730 cdr, we mark the @code{cons} and the @code{elem}. If it is a | |
4731 @code{WEAK_LIST_KEY_ASSOC} and not a pair or a pair with a marked car of | |
4732 the elem, we mark the @code{cons} and the @code{elem}. Finally, if it is | |
4733 a @code{WEAK_LIST_VALUE_ASSOC} and not a pair or a pair with a marked | |
4734 cdr of the elem, we mark both the @code{cons} and the @code{elem}. | |
4735 | |
4736 Since, by marking objects in reach from weak hash tables and weak lists, | |
4737 other objects could get marked, this perhaps implies further marking of | |
4738 other weak objects, both finishing functions are redone as long as | |
4739 yet unmarked objects get freshly marked. | |
4740 | |
4741 @item | |
4742 After completing the special marking for the weak hash tables and for the weak | |
4743 lists, all entries that point to objects that are going to be swept in | |
4744 the further process are useless, and therefore have to be removed from | |
4745 the table or the list. | |
4746 | |
4747 The function @code{prune_weak_hash_tables} does the job for weak hash | |
4748 tables. Totally unmarked hash tables are removed from the list | |
4749 @code{Vall_weak_hash_tables}. The other ones are treated more carefully | |
4750 by scanning over all entries and removing one as soon as one of | |
4751 the components @code{key} and @code{value} is unmarked. | |
4752 | |
4753 The same idea applies to the weak lists. It is accomplished by | |
4754 @code{prune_weak_lists}: An unmarked list is pruned from | |
4755 @code{Vall_weak_lists} immediately. A marked list is treated more | |
4756 carefully by going over it and removing just the unmarked pairs. | |
4757 | |
4758 @item | |
4759 The function @code{prune_specifiers} checks all listed specifiers held | |
4760 in @code{Vall_speficiers} and removes the ones from the lists that are | |
4761 unmarked. | |
4762 | |
4763 @item | |
4764 All syntax tables are stored in a list called | |
4765 @code{Vall_syntax_tables}. The function @code{prune_syntax_tables} walks | |
4766 through it and unlinks the tables that are unmarked. | |
4767 | |
4768 @item | |
4769 Next, we will attack the complete sweeping - the function | |
4770 @code{gc_sweep} which holds the predominance. | |
4771 @item | |
4772 First, all the variables with respect to garbage collection are | |
4773 reset. @code{consing_since_gc} - the counter of the created cells since | |
4774 the last garbage collection - is set back to 0, and | |
4775 @code{gc_in_progress} is not @code{true} anymore. | |
4776 @item | |
4777 In case the session is interactive, the displayed cursor and message are | |
4778 removed again. | |
4779 @item | |
4780 The state of @code{gc_inhibit} is restored to the former value by | |
4781 unwinding the stack. | |
4782 @item | |
4783 A small memory reserve is always held back that can be reached by | |
4784 @code{breathing_space}. If nothing more is left, we create a new reserve | |
4785 and exit. | |
4786 @end enumerate | |
4787 | |
4788 @node mark_object | |
4789 @subsection @code{mark_object} | |
4790 @cindex @code{mark_object} | |
4791 | |
4792 The first thing that is checked while marking an object is whether the | |
4793 object is a real Lisp object @code{Lisp_Type_Record} or just an integer | |
4794 or a character. Integers and characters are the only two types that are | |
4795 stored directly - without another level of indirection, and therefore they | |
4796 donīt have to be marked and collected. | |
4797 @xref{How Lisp Objects Are Represented in C}. | |
4798 | |
4799 The second case is the one we have to handle. It is the one when we are | |
4800 dealing with a pointer to a Lisp object. But, there exist also three | |
4801 possibilities, that prevent us from doing anything while marking: The | |
4802 object is read only which prevents it from being garbage collected, | |
4803 i.e. marked (@code{C_READONLY_RECORD_HEADER}). The object in question is | |
4804 already marked, and need not be marked for the second time (checked by | |
4805 @code{MARKED_RECORD_HEADER_P}). If it is a special, unmarkable object | |
4806 (@code{UNMARKABLE_RECORD_HEADER_P}, apparently, these are objects that | |
4807 sit in some CONST space, and can therefore not be marked, see | |
4808 @code{this_one_is_unmarkable} in @code{alloc.c}). | |
4809 | |
4810 Now, the actual marking is feasible. We do so by once using the macro | |
4811 @code{MARK_RECORD_HEADER} to mark the object itself (actually the | |
4812 special flag in the lrecord header), and calling its special marker | |
4813 "method" @code{marker} if available. The marker method marks every | |
4814 other object that is in reach from our current object. Note, that these | |
4815 marker methods should not call @code{mark_object} recursively, but | |
4816 instead should return the next object from where further marking has to | |
4817 be performed. | |
4818 | |
4819 In case another object was returned, as mentioned before, we reiterate | |
4820 the whole @code{mark_object} process beginning with this next object. | |
4821 | |
4822 @node gc_sweep | |
4823 @subsection @code{gc_sweep} | |
4824 @cindex @code{gc_sweep} | |
4825 | |
4826 The job of this function is to free all unmarked records from memory. As | |
4827 we know, there are different types of objects implemented and managed, and | |
4828 consequently different ways to free them from memory. | |
4829 @xref{Introduction to Allocation}. | |
4830 | |
4831 We start with all objects stored through @code{lcrecords}. All | |
4832 bulkier objects are allocated and handled using that scheme of | |
4833 @code{lcrecords}. Each object is @code{malloc}ed separately | |
4834 instead of placing it in one of the contiguous frob blocks. All types | |
4835 that are currently stored | |
4836 using @code{lcrecords}īs @code{alloc_lcrecord} and | |
4837 @code{make_lcrecord_list} are the types: vectors, buffers, | |
4838 char-table, char-table-entry, console, weak-list, database, device, | |
4839 ldap, hash-table, command-builder, extent-auxiliary, extent-info, face, | |
4840 coding-system, frame, image-instance, glyph, popup-data, gui-item, | |
4841 keymap, charset, color_instance, font_instance, opaque, opaque-list, | |
4842 process, range-table, specifier, symbol-value-buffer-local, | |
4843 symbol-value-lisp-magic, symbol-value-varalias, toolbar-button, | |
4844 tooltalk-message, tooltalk-pattern, window, and window-configuration. We | |
4845 take care of them in the fist place | |
4846 in order to be able to handle and to finalize items stored in them more | |
4847 easily. The function @code{sweep_lcrecords_1} as described below is | |
4848 doing the whole job for us. | |
4849 For a description about the internals: @xref{lrecords}. | |
4850 | |
4851 Our next candidates are the other objects that behave quite differently | |
4852 than everything else: the strings. They consists of two parts, a | |
4853 fixed-size portion (@code{struct Lisp_string}) holding the string's | |
4854 length, its property list and a pointer to the second part, and the | |
4855 actual string data, which is stored in string-chars blocks comparable to | |
4856 frob blocks. In this block, the data is not only freed, but also a | |
4857 compression of holes is made, i.e. all strings are relocated together. | |
4858 @xref{String}. This compacting phase is performed by the function | |
4859 @code{compact_string_chars}, the actual sweeping by the function | |
4860 @code{sweep_strings} is described below. | |
4861 | |
4862 After that, the other types are swept step by step using functions | |
4863 @code{sweep_conses}, @code{sweep_bit_vectors_1}, | |
4864 @code{sweep_compiled_functions}, @code{sweep_floats}, | |
4865 @code{sweep_symbols}, @code{sweep_extents}, @code{sweep_markers} and | |
4866 @code{sweep_extents}. They are the fixed-size types cons, floats, | |
4867 compiled-functions, symbol, marker, extent, and event stored in | |
4868 so-called "frob blocks", and therefore we can basically do the same on | |
4869 every type objects, using the same macros, especially defined only to | |
4870 handle everything with respect to fixed-size blocks. The only fixed-size | |
4871 type that is not handled here are the fixed-size portion of strings, | |
4872 because we took special care of them earlier. | |
4873 | |
4874 The only big exceptions are bit vectors stored differently and | |
4875 therefore treated differently by the function @code{sweep_bit_vectors_1} | |
4876 described later. | |
4877 | |
4878 At first, we need some brief information about how | |
4879 these fixed-size types are managed in general, in order to understand | |
4880 how the sweeping is done. They have all a fixed size, and are therefore | |
4881 stored in big blocks of memory - allocated at once - that can hold a | |
4882 certain amount of objects of one type. The macro | |
4883 @code{DECLARE_FIXED_TYPE_ALLOC} creates the suitable structures for | |
4884 every type. More precisely, we have the block struct | |
4885 (holding a pointer to the previous block @code{prev} and the | |
4886 objects in @code{block[]}), a pointer to current block | |
4887 (@code{current_..._block)}) and its last index | |
4888 (@code{current_..._block_index}), and a pointer to the free list that | |
4889 will be created. Also a macro @code{FIXED_TYPE_FROM_BLOCK} plus some | |
4890 related macros exists that are used to obtain a new object, either from | |
4891 the free list @code{ALLOCATE_FIXED_TYPE_1} if there is an unused object | |
4892 of that type stored or by allocating a completely new block using | |
4893 @code{ALLOCATE_FIXED_TYPE_FROM_BLOCK}. | |
4894 | |
4895 The rest works as follows: all of them define a | |
4896 macro @code{UNMARK_...} that is used to unmark the object. They define a | |
4897 macro @code{ADDITIONAL_FREE_...} that defines additional work that has | |
4898 to be done when converting an object from in use to not in use (so far, | |
4899 only markers use it in order to unchain them). Then, they all call | |
4900 the macro @code{SWEEP_FIXED_TYPE_BLOCK} instantiated with their type name | |
4901 and their struct name. | |
4902 | |
4903 This call in particular does the following: we go over all blocks | |
4904 starting with the current moving towards the oldest. | |
4905 For each block, we look at every object in it. If the object already | |
4906 freed (checked with @code{FREE_STRUCT_P} using the first pointer of the | |
4907 object), or if it is | |
4908 set to read only (@code{C_READONLY_RECORD_HEADER_P}, nothing must be | |
4909 done. If it is unmarked (checked with @code{MARKED_RECORD_HEADER_P}), it | |
4910 is put in the free list and set free (using the macro | |
4911 @code{FREE_FIXED_TYPE}, otherwise it stays in the block, but is unmarked | |
4912 (by @code{UNMARK_...}). While going through one block, we note if the | |
4913 whole block is empty. If so, the whole block is freed (using | |
4914 @code{xfree}) and the free list state is set to the state it had before | |
4915 handling this block. | |
4916 | |
4917 @node sweep_lcrecords_1 | |
4918 @subsection @code{sweep_lcrecords_1} | |
4919 @cindex @code{sweep_lcrecords_1} | |
4920 | |
4921 After nullifying the complete lcrecord statistics, we go over all | |
4922 lcrecords two separate times. They are all chained together in a list with | |
4923 a head called @code{all_lcrecords}. | |
4924 | |
4925 The first loop calls for each object its @code{finalizer} method, but only | |
4926 in the case that it is not read only | |
4927 (@code{C_READONLY_RECORD_HEADER_P)}, it is not already marked | |
4928 (@code{MARKED_RECORD_HEADER_P}), it is not already in a free list (list of | |
4929 freed objects, field @code{free}) and finally it owns a finalizer | |
4930 method. | |
4931 | |
4932 The second loop actually frees the appropriate objects again by iterating | |
4933 through the whole list. In case an object is read only or marked, it | |
4934 has to persist, otherwise it is manually freed by calling | |
4935 @code{xfree}. During this loop, the lcrecord statistics are kept up to | |
4936 date by calling @code{tick_lcrecord_stats} with the right arguments, | |
4937 | |
4938 @node compact_string_chars | |
4939 @subsection @code{compact_string_chars} | |
4940 @cindex @code{compact_string_chars} | |
4941 | |
4942 The purpose of this function is to compact all the data parts of the | |
4943 strings that are held in so-called @code{string_chars_block}, i.e. the | |
4944 strings that do not exceed a certain maximal length. | |
4945 | |
4946 The procedure with which this is done is as follows. We are keeping two | |
4947 positions in the @code{string_chars_block}s using two pointer/integer | |
4948 pairs, namely @code{from_sb}/@code{from_pos} and | |
4949 @code{to_sb}/@code{to_pos}. They stand for the actual positions, from | |
4950 where to where, to copy the actually handled string. | |
4951 | |
4952 While going over all chained @code{string_char_block}s and their held | |
4953 strings, staring at @code{first_string_chars_block}, both pointers | |
4954 are advanced and eventually a string is copied from @code{from_sb} to | |
4955 @code{to_sb}, depending on the status of the pointed at strings. | |
4956 | |
4957 More precisely, we can distinguish between the following actions. | |
4958 @itemize @bullet | |
4959 @item | |
4960 The string at @code{from_sb}'s position could be marked as free, which | |
4961 is indicated by an invalid pointer to the pointer that should point back | |
4962 to the fixed size string object, and which is checked by | |
4963 @code{FREE_STRUCT_P}. In this case, the @code{from_sb}/@code{from_pos} | |
4964 is advanced to the next string, and nothing has to be copied. | |
4965 @item | |
4966 Also, if a string object itself is unmarked, nothing has to be | |
4967 copied. We likewise advance the @code{from_sb}/@code{from_pos} | |
4968 pair as described above. | |
4969 @item | |
4970 In all other cases, we have a marked string at hand. The string data | |
4971 must be moved from the from-position to the to-position. In case | |
4972 there is not enough space in the actual @code{to_sb}-block, we advance | |
4973 this pointer to the beginning of the next block before copying. In case the | |
4974 from and to positions are different, we perform the | |
4975 actual copying using the library function @code{memmove}. | |
4976 @end itemize | |
4977 | |
4978 After compacting, the pointer to the current | |
4979 @code{string_chars_block}, sitting in @code{current_string_chars_block}, | |
4980 is reset on the last block to which we moved a string, | |
4981 i.e. @code{to_block}, and all remaining blocks (we know that they just | |
4982 carry garbage) are explicitly @code{xfree}d. | |
4983 | |
4984 @node sweep_strings | |
4985 @subsection @code{sweep_strings} | |
4986 @cindex @code{sweep_strings} | |
4987 | |
4988 The sweeping for the fixed sized string objects is essentially exactly | |
4989 the same as it is for all other fixed size types. As before, the freeing | |
4990 into the suitable free list is done by using the macro | |
4991 @code{SWEEP_FIXED_SIZE_BLOCK} after defining the right macros | |
4992 @code{UNMARK_string} and @code{ADDITIONAL_FREE_string}. These two | |
4993 definitions are a little bit special compared to the ones used | |
4994 for the other fixed size types. | |
4995 | |
4996 @code{UNMARK_string} is defined the same way except some additional code | |
4997 used for updating the bookkeeping information. | |
4998 | |
4999 For strings, @code{ADDITIONAL_FREE_string} has to do something in | |
5000 addition: in case, the string was not allocated in a | |
5001 @code{string_chars_block} because it exceeded the maximal length, and | |
5002 therefore it was @code{malloc}ed separately, we know also @code{xfree} | |
5003 it explicitly. | |
5004 | |
5005 @node sweep_bit_vectors_1 | |
5006 @subsection @code{sweep_bit_vectors_1} | |
5007 @cindex @code{sweep_bit_vectors_1} | |
5008 | |
5009 Bit vectors are also one of the rare types that are @code{malloc}ed | |
5010 individually. Consequently, while sweeping, all further needless | |
5011 bit vectors must be freed by hand. This is done, as one might imagine, | |
5012 the expected way: since they are all registered in a list called | |
5013 @code{all_bit_vectors}, all elements of that list are traversed, | |
5014 all unmarked bit vectors are unlinked by calling @code{xfree} and all of | |
5015 them become unmarked. | |
5016 In addition, the bookkeeping information used for garbage | |
5017 collector's output purposes is updated. | |
4520 | 5018 |
4521 @node Integers and Characters | 5019 @node Integers and Characters |
4522 @section Integers and Characters | 5020 @section Integers and Characters |
4523 | 5021 |
4524 Integer and character Lisp objects are created from integers using the | 5022 Integer and character Lisp objects are created from integers using the |