# HG changeset patch # User Marcus Crestani # Date 1278346659 -7200 # Node ID 2cc24c69446c68ed6f7915e0b8144553c871b1a6 # Parent 6466bc9ebf15c4feacba67c6ade3f4da905cbd65 Document the new allocator and the new garbage collector in gc.c and mc-alloc.c. diff -r 6466bc9ebf15 -r 2cc24c69446c src/ChangeLog --- a/src/ChangeLog Wed Jun 23 08:04:18 2010 -0400 +++ b/src/ChangeLog Mon Jul 05 18:17:39 2010 +0200 @@ -1,3 +1,9 @@ +2010-06-05 Marcus Crestani + + * gc.c: + * mc-alloc.c: + Document the new allocator and the new garbage collector. + 2010-06-21 Jeff Sparkes * console-x-impl.h (DEVICE_X_XFTDRAW): Define, instead of diff -r 6466bc9ebf15 -r 2cc24c69446c src/gc.c --- a/src/gc.c Wed Jun 23 08:04:18 2010 -0400 +++ b/src/gc.c Mon Jul 05 18:17:39 2010 +0200 @@ -21,6 +21,318 @@ /* Synched up with: Not in FSF. */ +/* + Garbage Collectors in XEmacs + + Currently, XEmacs comes with two garbage collectors: + + - The "old garbage collector": a simple mark and sweep collector, + its implementation is mainly spread out over gc.c and alloc.c. + It is used by the default configuration or if you configure + `--with-newgc=no'. + + - The "new garbage collector": an incremental mark and sweep collector, + its implementation is in gc.c. It is used if you configure + `--with-newgc'. It comes with a new allocator, see mc-alloc.c, and + with the KKCC mark algorith, see below. + + Additionally, the old garbage collectors comes with two mark algorithms: + + - The "recursive mark algorithm" marks live objects by recursively + calling mark_* functions on live objects. It is the default mark + algorithm of the old garbage collector. + + - The "KKCC mark algorithm" uses an explicit stack that to keep + track of the current progress of traversal and uses memory layout + descriptions (that are also used by the portable dumper) instead + of the mark_* functions. The old garbage collector uses it if + you configure `--with-kkcc'. It is the default and only mark + algorithm of the new garbage collector. + + + The New Incremental Garbage Collector + + An incremental garbage collector keeps garbage collection pause + times short by interleaving small amounts of collection work with + program execution, it does that by instrumenting write barrier + algorithms that essentially allow interrupting the mark phase. + + + Write Barrier + + A write barrier is the most important prerequisite for fancy + garbage collection techniques. We implement a "Virtual Dirty Bit + (short: vdb) Write Barrier" that makes uses of the operating + system's memory-protection mechanisms: The write barrier + write-protects memory pages containing heap objects. If the + mutator tries to modify these objects by writing into the + write-protected page, the operating system generates a fault. The + write barrier catches this fault, reads out the error-causing + address and can thus identify the updated object and page. + + Not all environments and operating systems provide the mechanism to + write-protect memory, catch resulting write faults, and read out + the faulting address. But luckily, most of today's operating + systems provide the features needed for the write-barrier + implementation. Currently, XEmacs includes write-barrier + implementations for the following platforms: + + - POSIX-compliant platforms like up-to-date UNIX, Linux, Solaris, + etc. use the system call `mprotect' for memory protection, + `sigaction' for signal handling and get the faulting address from + `struct siginfo'. See file vdb-posix.c. + + - Mach-based systems like Mac OS X use "Mach Exception Handlers". + See file vdb-mach.c. + + - Windows systems like native Windows and Cygwin use Microsoft's + so-called "Structured Exception Handling". See file vdb-win32.c. + + The configure script determines which write barrier implementation + to use for a system. If no write barrier implementation is working + on that system, a fall-back "fake" implementation is used: This + implementation simply turns of the incremental write barrier at + runtime and does not allow any incremental collection (see + vdb-fake.c). The garbage collector then acts like a traditional + mark-and-sweep garbage collector. Generally, the incremental + garbage collector can be turned of at runtime by the user or by + applications, see below. + + + Memory Protection and Object Layout + + Implementations of a memory-protection mechanism may restrict the + size and the alignment of the memory region to be on page-size + boundaries. All objects subject to be covered by the write barrier + have to be allocated on logical memory pages, so that they meet the + requirement to be write-protected. The new allocator mc-alloc is + aware of a system page size---it allocates all Lisp objects on + logical memory pages and is therefore defaulted to on when the new + garbage collector is enabled. + + Unfortunately, the Lisp object layout that works with the old + collector leads to holes in the write barrier: Not all data + structures containing pointers to Lisp objects are allocated on the + Lisp heap. Some Lisp objects do not carry all their information in + the object itself. External parts are kept in separately allocated + memory blocks that are not managed by the new Lisp allocator. + Examples for these objects are hash tables and dynamic arrays, two + objects that can dynamically grow and shrink. The separate memory + blocks are not guaranteed to reside on page boundaries, and thus + cannot be watched by the write barrier. + + Moreover, the separate parts can contain live pointers to other Lisp + objects. These pointers are not covered by the write barrier and + modifications by the client during garbage collection do escape. In + this case, the client changes the connectivity of the reachability + graph behind the collector's back, which eventually leads to + erroneous collection of live objects. To solve this problem, I + transformed the separately allocated parts to fully qualified Lisp + objects that are managed by the allocator and thus are covered by + the write barrier. This also removes a lot of special allocation + and removal code for the out-sourced parts. Generally, allocating + all data structures that contain pointers to Lisp objects on one + heap makes the whole memory layout more consistent. + + + Debugging + + The virtual-dirty-bit write barrier provokes signals on purpose, + namely SIGSEGV and SIGBUS. When debugging XEmacs with this write + barrier running, the debugger always breaks whenever a signal + occurs. This behavior is generally desired: A debugger has to break + on signals, to allow the user to examine the cause of the + signal---especially for illegal memory access, which is a common + programming error. But the debugger should not break for signals + caused by the write barrier. Therefore, most debuggers provide the + ability to turn of their fault handling for specific signals. The + configure script generates the debugger's settings .gdbinit and + .dbxrc, adding code to turn of signal handling for SIGSEGV and + SIGBUS, if the new garbage collector is used. + + But what happens if a bug in XEmacs causes an illegal memory access? + To maintain basic debugging abilities, we use another signal: First, + the write-barrier signal handler has to determine if the current + error situation is caused by the write-barrier memory protection or + not. Therefore, the signal handler checks if the faulting address + has been write-protected before. If it has not, the fault is caused + by a bug; the debugger has to break in this situation. To achieve + this, the signal handler raises SIGABRT to abort the program. Since + SIGABRT is not masked out by the debugger, XEmacs aborts and allows + the user to examine the problem. + + + Incremental Garbage Collection + + The new garbage collector is still a mark-and-sweep collector, but + now the mark phase no longer runs in one atomic action, it is + interleaved with program execution. The incremental garbage + collector needs an explicit mark stack to store the state of the + incremental traversal: the KKCC mark algorithm is a prerequisite and + is enabled by default when the new garbage collector is on. + + Garbage collection is invoked as before: After `gc-cons-threshold' + bytes have been allocated since the last garbage collection (or + after `gc-cons-percentage' percentage of the total amount of memory + used for Lisp data has been allocated since the last garbage + collection) a collection starts. After some initialization, the + marking begins. + + The variable `gc-incremental-traversal-threshold' contains how many + steps of incremental work have to be executed in one incremental + traversal cycle. After that many steps have been made, the mark + phase is interrupted and the client resumes. Now, the Lisp memory + is write-protected and the write barrier records modified objects. + Incremental traversal is resumed after + `gc-cons-incremental-threshold' bytes have been allocated since the + interruption of garbage collection. Then, the objects recorded by + the write-barrier have to be re-examined by the traversal, i.e. they + are re-pushed onto the mark stack and processed again. Once the + mark stack is empty, the traversal is done. + + A full incremental collection is slightly slower than a full garbage + collection before: There is an overhead for storing pointers into + objects when the write barrier is running, and an overhead for + repeated traversal of modified objects. However, the new + incremental garbage collector reduces client pause times to + one-third, so even when a garbage collection is running, XEmacs + stays reactive. + + + Tricolor Marking: White, Black, and Grey Mark Bits + + Garbage collection traverses the graph of reachable objects and + colors them. The objects subject to garbage collection are white at + the beginning. By the end of the collection, those that will be + retained are colored black. When there are no reachable objects left + to blacken, the traversal of live data structures is finished. In + traditional mark-and-sweep collectors, this black and white coloring + is sufficient. + + In an incremental collector, the intermediate state of the traversal + is im- portant because of ongoing mutator activity: the mutator + cannot be allowed to change things in such way that the collector + will fail to find all reachable objects. To understand and prevent + such interactions between the mutator and the collector, it is + useful to introduce a third color, grey. + + Grey objects have been reached by the traversal, but its descendants + may not have been. White objects are changed to grey when they are + reached by the traversal. Grey objects mark the current state of the + traversal: traversal pro- ceeds by processing the grey objects. The + KKCC mark stack holds all the currently grey-colored objects. + Processing a grey object means following its outgoing pointers, and + coloring it black afterwards. + + Intuitively, the traversal proceeds in a wavefront of grey objects + that separates the unreached objects, which are colored white, from + the already processed black objects. + + The allocator takes care of storing the mark bits: The mark bits are + kept in a tree like structure, for details see mc-alloc.c. + + + Internal States of the Incremental Garbage Collector + + To keep track of its current state, the collector holds it's current + phase in the global `gc_state' variable. A collector phase is one + of the following: + + NONE No incremental or full collection is currently running. + + INIT_GC The collector prepares for a new collection, e.g. sets some + global variables. + + PUSH_ROOT_SET The collector pushes the root set on the mark stack + to start the traversal of live objects. + + MARK The traversal of live objects colors the reachable objects + white, grey, or black, according to their lifeness. The mark + phase can be interrupted by the incremental collection algorithm: + Before the client (i.e. the non collector part of XEmacs) resumes, + the write barrier has to be installed so that the collector knows + what objects get modified during the collector's pause. + Installing a write barrier means protecting pages that only + contain black objects and recording write access to these objects. + Pages with white or grey objects do not need to be protected, + since these pages are due to marking anyways when the collector + resumes. Once the collector resumes, it has to re-scan all + objects that have been modified during the collector pause and + have been caught by the write barrier. The mark phase is done when + there are no more grey objects on the heap, i.e. the KKCC mark stack + is empty. + + REPUSH_ROOT_SET After the mark phase is done, the collector has to + traverse the root set pointers again, since modifications to the + objects in the root set can not all be covered by the write barrier + (e.g. root set objects that are on the call stack). Therefore, the + collector has to traverse the root set again without interruption. + + FINISH_MARK After the mark phase is finished, some objects with + special liveness semantics have to be treated separately, e.g. + ephemerons and the various flavors of weak objects. + + FINALIZE The collector registers all objects that have finalizers + for finalization. Finalizations happens asynchronously sometimes + after the collection has finished. + + SWEEP The allocator scans the entire heap and frees all white marked + objects. The freed memory is recycled and can be re-used for future + allocations. The sweep phase is carried out atomically. + + FINISH_GC The collector cleans up after the garbage collection by + resetting some global variables. + + + Lisp Interface + + The new garbage collector can be accessed directly from Emacs Lisp. + Basically, two functions invoke the garbage collector: + + (gc-full) starts a full garbage collection. If an incremental + garbage collection is already running, it is finished without + further interruption. This function guarantees that unused + objects have been freed when it returns. + + (gc-incremental) starts an incremental garbage collection. If an + incremental garbage collection is already running, the next cycle + of incremental traversal is started. The garbage collection is + finished if the traversal completes. Note that this function does + not necessarily free any memory. It only guarantees that the + traversal of the heap makes progress. + + The old garbage collector uses the function (garbage-collect) to + invoke a garbage collection. This function is still in use by some + applications that explicitly want to invoke a garbage collection. + Since these applications may expect that unused memory has really + been freed when (garbage-collect) returns, it maps to (gc-full). + + The new garbage collector is highly customizable during runtime; it + can even be switched back to the traditional mark-and-sweep garbage + collector: The variable allow-incremental-gc controls whether + garbage collections may be interrupted or if they have to be carried + out in one atomic action. Setting allow-incremental-gc to nil + prevents incremental garbage collection, and the garbage collector + then only does full collects, even if (gc-incremental) is called. + Non-nil allows incremental garbage collection. + + This way applications can freely decide what garbage collection + algorithm is best for the upcoming memory usage. How frequently a + garbage collection occurs and how much traversal work is done in one + incremental cycle can also be modified during runtime. See + + M-x customize RET alloc RET + + for an overview of all settings. + + + More Information + + More details can be found in + http://crestani.de/xemacs/pdf/thesis-newgc.pdf . + +*/ + #include #include "lisp.h" @@ -50,8 +362,14 @@ #include "vdb.h" +/* Number of bytes of consing since gc before a full gc should happen. */ #define GC_CONS_THRESHOLD 2000000 + +/* Number of bytes of consing since gc before another cycle of the gc + should happen in incremental mode. */ #define GC_CONS_INCREMENTAL_THRESHOLD 200000 + +/* Number of elements marked in one cycle of incremental GC. */ #define GC_INCREMENTAL_TRAVERSAL_THRESHOLD 100000 /* Number of bytes of consing done since the last GC. */ diff -r 6466bc9ebf15 -r 2cc24c69446c src/mc-alloc.c --- a/src/mc-alloc.c Wed Jun 23 08:04:18 2010 -0400 +++ b/src/mc-alloc.c Mon Jul 05 18:17:39 2010 +0200 @@ -21,6 +21,227 @@ /* Synched up with: Not in FSF. */ +/* + The New Allocator + + The ideas and algorithms are based on the allocator of the + Boehm-Demers-Weiser conservative garbage collector. See + http://www.hpl.hp.com/personal/Hans_ Boehm/gc/index.html. + + The new allocator is enabled when the new garbage collector + is enabled (with `--with-newgc'). The implementation of + the new garbage collector is in gc.c. + + The new allocator takes care of: + - allocating objects in a write-barrier-friendly way + - manage object's mark bits + + Three-Level Allocation + + The new allocator efficiently manages the allocation of Lisp + objects by minimizing the number of times malloc() and free() are + called. The allocation process has three layers of abstraction: + + 1. It allocates memory in very large chunks called heap sections. + + 2. The heap sections are subdivided into pages. The page size is + determined by the constant PAGE_SIZE. It holds the size of a page + in bytes. + + 3. One page consists of one or more cells. Each cell represents + a memory location for an object. The cells on one page all have + the same size, thus every page only contains equal-sized + objects. + + If an object is bigger than page size, it is allocated on a + multi-page. Then there is only one cell on a multi-page (the cell + covers the full multi-page). Is an object smaller than 1/2 PAGE_SIZE, + a page contains several objects and several cells. There + is only one cell on a page for object sizes from 1/2 PAGE_SIZE to + PAGE_SIZE (whereas multi-pages always contain 2 only one + cell). Only in layer one malloc() and free() are called. + + + Size Classes and Page Lists + + Meta-information about every page and multi-page is kept in a page + header. The page header contains some bookkeeping information like + number of used and free cells, and pointers to other page + headers. The page headers are linked in a page list. + + Every page list builds a size class. A size class contains all + pages (linked via page headers) for objects of the same size. The + new allocator does not group objects based on their type, it groups + objects based on their sizes. + + Here is an example: A cons contains a lrecord_header, a car and cdr + field. Altogether it uses 12 bytes of memory (on 32 bits + machines). All conses are allocated on pages with a cell size of 12 + bytes. All theses pages are kept together in a page list, which + represents the size class for 12 bytes objects. But this size class + is not exclusively for conses only. Other objects, which are also + 12 bytes big (e.g. weak-boxes), are allocated in the same size + class and on the same pages. + + The number of size classes is customizable, so is the size step + between successive size classes. + + + Used and Unused Heap + + The memory which is managed by the allocator can be divided in two + logical parts: + + The used heap contains pages, on which objects are allocated. These + pages are com- pletely or partially occupied. In the used heap, it + is important to quickly find a free spot for a new + object. Therefore the size classes of the used heap are defined by + the size of the cells on the pages. The size classes should match + common object sizes, to avoid wasting memory. + + The unused heap only contains completely empty pages. They have + never been used or have been freed completely again. In the unused + heap, the size of consecutive memory tips the scales. A page is the + smallest entity which is asked for. Therefore, the size classes of + the unused heap are defined by the number of consecutive pages. + + The parameters for the different size classes can be adjusted + independently, see `configurable values' below. + + + The Allocator's Data Structures + + The struct `mc_allocator_globals holds' all the data structures + that the new allocator uses (lists of used and unused pages, mark + bits, etc.). + + + Mapping of Heap Pointers to Page Headers + + For caching benefits, the page headers and mark bits are stored + separately from their associated page. During garbage collection + (i.e. for marking and freeing objects) it is important to identify + the page header which is responsible for a given Lisp object. + + To do this task quickly, I added a two level search tree: the upper + 10 bits of the heap pointer are the index of the first level. This + entry of the first level links to the second level, where the next + 10 bits of the heap pointer are used to identify the page + header. The remaining bits point to the object relative to the + page. + + On architectures with more than 32 bits pointers, a hash value of + the upper bits is used to index into the first level. + + + Mark Bits + + For caching purposes, the mark bits are no longer kept within the + objects, they are kept in a separate bit field. + + Every page header has a field for the mark bits of the objects on + the page. If there are less cells on the page than there fit bits + in the integral data type EMACS_INT, the mark bits are stored + directly in this EMACS_INT. + + Otherwise, the mark bits are written in a separate space, with the + page header pointing to this space. This happens to pages with + rather small objects: many cells fit on a page, thus many mark bits + are needed. + + + Allocate Memory + + Use + void *mc_alloc (size_t size) + to request memory from the allocator. This returns a pointer to a + newly allocated block of memory of given size. + + This is how the new allocator allocates memory: + 1. Determine the size class of the object. + 2. Is there already a page in this size class and is there a free + cell on this page? + * YES + 3. Unlink free cell from free list, return address of free cell. + DONE. + * NO + 3. Is there a page in the unused heap? + * YES + 4. Move unused page to used heap. + 5. Initialize page header, free list, and mark bits. + 6. Unlink first cell from free list, return address of cell. + DONE. + * NO + 4. Expand the heap, add new memory to unused heap + [go back to 3. and proceed with the YES case]. + + The allocator puts partially filled pages to the front of the page + list, completely filled ones to the end. That guarantees a fast + terminating search for free cells. Are there two successive full + pages at the front of the page list, the complete size class is + full, a new page has to be added. + + + Expand Heap + + To expand the heap, a big chunk of contiguous memory is allocated + using malloc(). These pieces are called heap sections. How big a new + heap section is (and thus the growth of the heap) is adjustable: See + MIN_HEAP_INCREASE, MAX_HEAP_INCREASE, and HEAP_GROWTH_DIVISOR below. + + + Free Memory + + One optimization in XEmacs is that locally used Lisp objects are + freed manually (the memory is not wasted till the next garbage + collection). Therefore the new allocator provides this function: + void mc_free (void *ptr) + That frees the object pointed to by ptr. + + This function is also used internally during sweep phase of the + garbage collection. This is how it works in detail: + + 1. Use pointer to identify page header + (use lookup mechanism described above). + 2. Mark cell as free and hook it into free list. + 3. Is the page completely empty? + * YES + 4. Unlink page from page list. + 5. Remove page header, free list, and mark bits. + 6. Move page to unused heap. + * NO + 4. Move page to front of size class (to speed up allocation + of objects). + + If the last object of a page is freed, the empty page is returned to + the unused heap. The allocator tries to coalesce adjacent pages, to + gain a big piece of contiguous memory. The resulting chunk is hooked + into the according size class of the unused heap. If this created a + complete heap section, the heap section is returned to the operating + system by using free(). + + + Allocator and Garbage Collector + + The new allocator simplifies the interface to the Garbage Collector: + * mark live objects: MARK_[WHITE|GREY|BLACK] (ptr) + * sweep heap: EMACS_INT mc_sweep (void) + * run finalizers: EMACS_INT mc_finalize (void) + + + Allocator and Dumper + + The new allocator provides special finalization for the portable + dumper (to save disk space): EMACS_INT mc_finalize_for_disksave (void) + + + More Information + + More details can be found in + http://crestani.de/xemacs/pdf/mc-alloc.pdf . + +*/ + #include #include "lisp.h"