Mercurial > hg > xemacs-beta
diff src/mc-alloc.c @ 5238:2cc24c69446c
Document the new allocator and the new garbage collector in gc.c and mc-alloc.c.
author | Marcus Crestani <crestani@informatik.uni-tuebingen.de> |
---|---|
date | Mon, 05 Jul 2010 18:17:39 +0200 |
parents | 9b8c2168d231 |
children | b249c479f9e1 308d34e9f07d |
line wrap: on
line diff
--- a/src/mc-alloc.c Wed Jun 23 08:04:18 2010 -0400 +++ b/src/mc-alloc.c Mon Jul 05 18:17:39 2010 +0200 @@ -21,6 +21,227 @@ /* Synched up with: Not in FSF. */ +/* + The New Allocator + + The ideas and algorithms are based on the allocator of the + Boehm-Demers-Weiser conservative garbage collector. See + http://www.hpl.hp.com/personal/Hans_ Boehm/gc/index.html. + + The new allocator is enabled when the new garbage collector + is enabled (with `--with-newgc'). The implementation of + the new garbage collector is in gc.c. + + The new allocator takes care of: + - allocating objects in a write-barrier-friendly way + - manage object's mark bits + + Three-Level Allocation + + The new allocator efficiently manages the allocation of Lisp + objects by minimizing the number of times malloc() and free() are + called. The allocation process has three layers of abstraction: + + 1. It allocates memory in very large chunks called heap sections. + + 2. The heap sections are subdivided into pages. The page size is + determined by the constant PAGE_SIZE. It holds the size of a page + in bytes. + + 3. One page consists of one or more cells. Each cell represents + a memory location for an object. The cells on one page all have + the same size, thus every page only contains equal-sized + objects. + + If an object is bigger than page size, it is allocated on a + multi-page. Then there is only one cell on a multi-page (the cell + covers the full multi-page). Is an object smaller than 1/2 PAGE_SIZE, + a page contains several objects and several cells. There + is only one cell on a page for object sizes from 1/2 PAGE_SIZE to + PAGE_SIZE (whereas multi-pages always contain 2 only one + cell). Only in layer one malloc() and free() are called. + + + Size Classes and Page Lists + + Meta-information about every page and multi-page is kept in a page + header. The page header contains some bookkeeping information like + number of used and free cells, and pointers to other page + headers. The page headers are linked in a page list. + + Every page list builds a size class. A size class contains all + pages (linked via page headers) for objects of the same size. The + new allocator does not group objects based on their type, it groups + objects based on their sizes. + + Here is an example: A cons contains a lrecord_header, a car and cdr + field. Altogether it uses 12 bytes of memory (on 32 bits + machines). All conses are allocated on pages with a cell size of 12 + bytes. All theses pages are kept together in a page list, which + represents the size class for 12 bytes objects. But this size class + is not exclusively for conses only. Other objects, which are also + 12 bytes big (e.g. weak-boxes), are allocated in the same size + class and on the same pages. + + The number of size classes is customizable, so is the size step + between successive size classes. + + + Used and Unused Heap + + The memory which is managed by the allocator can be divided in two + logical parts: + + The used heap contains pages, on which objects are allocated. These + pages are com- pletely or partially occupied. In the used heap, it + is important to quickly find a free spot for a new + object. Therefore the size classes of the used heap are defined by + the size of the cells on the pages. The size classes should match + common object sizes, to avoid wasting memory. + + The unused heap only contains completely empty pages. They have + never been used or have been freed completely again. In the unused + heap, the size of consecutive memory tips the scales. A page is the + smallest entity which is asked for. Therefore, the size classes of + the unused heap are defined by the number of consecutive pages. + + The parameters for the different size classes can be adjusted + independently, see `configurable values' below. + + + The Allocator's Data Structures + + The struct `mc_allocator_globals holds' all the data structures + that the new allocator uses (lists of used and unused pages, mark + bits, etc.). + + + Mapping of Heap Pointers to Page Headers + + For caching benefits, the page headers and mark bits are stored + separately from their associated page. During garbage collection + (i.e. for marking and freeing objects) it is important to identify + the page header which is responsible for a given Lisp object. + + To do this task quickly, I added a two level search tree: the upper + 10 bits of the heap pointer are the index of the first level. This + entry of the first level links to the second level, where the next + 10 bits of the heap pointer are used to identify the page + header. The remaining bits point to the object relative to the + page. + + On architectures with more than 32 bits pointers, a hash value of + the upper bits is used to index into the first level. + + + Mark Bits + + For caching purposes, the mark bits are no longer kept within the + objects, they are kept in a separate bit field. + + Every page header has a field for the mark bits of the objects on + the page. If there are less cells on the page than there fit bits + in the integral data type EMACS_INT, the mark bits are stored + directly in this EMACS_INT. + + Otherwise, the mark bits are written in a separate space, with the + page header pointing to this space. This happens to pages with + rather small objects: many cells fit on a page, thus many mark bits + are needed. + + + Allocate Memory + + Use + void *mc_alloc (size_t size) + to request memory from the allocator. This returns a pointer to a + newly allocated block of memory of given size. + + This is how the new allocator allocates memory: + 1. Determine the size class of the object. + 2. Is there already a page in this size class and is there a free + cell on this page? + * YES + 3. Unlink free cell from free list, return address of free cell. + DONE. + * NO + 3. Is there a page in the unused heap? + * YES + 4. Move unused page to used heap. + 5. Initialize page header, free list, and mark bits. + 6. Unlink first cell from free list, return address of cell. + DONE. + * NO + 4. Expand the heap, add new memory to unused heap + [go back to 3. and proceed with the YES case]. + + The allocator puts partially filled pages to the front of the page + list, completely filled ones to the end. That guarantees a fast + terminating search for free cells. Are there two successive full + pages at the front of the page list, the complete size class is + full, a new page has to be added. + + + Expand Heap + + To expand the heap, a big chunk of contiguous memory is allocated + using malloc(). These pieces are called heap sections. How big a new + heap section is (and thus the growth of the heap) is adjustable: See + MIN_HEAP_INCREASE, MAX_HEAP_INCREASE, and HEAP_GROWTH_DIVISOR below. + + + Free Memory + + One optimization in XEmacs is that locally used Lisp objects are + freed manually (the memory is not wasted till the next garbage + collection). Therefore the new allocator provides this function: + void mc_free (void *ptr) + That frees the object pointed to by ptr. + + This function is also used internally during sweep phase of the + garbage collection. This is how it works in detail: + + 1. Use pointer to identify page header + (use lookup mechanism described above). + 2. Mark cell as free and hook it into free list. + 3. Is the page completely empty? + * YES + 4. Unlink page from page list. + 5. Remove page header, free list, and mark bits. + 6. Move page to unused heap. + * NO + 4. Move page to front of size class (to speed up allocation + of objects). + + If the last object of a page is freed, the empty page is returned to + the unused heap. The allocator tries to coalesce adjacent pages, to + gain a big piece of contiguous memory. The resulting chunk is hooked + into the according size class of the unused heap. If this created a + complete heap section, the heap section is returned to the operating + system by using free(). + + + Allocator and Garbage Collector + + The new allocator simplifies the interface to the Garbage Collector: + * mark live objects: MARK_[WHITE|GREY|BLACK] (ptr) + * sweep heap: EMACS_INT mc_sweep (void) + * run finalizers: EMACS_INT mc_finalize (void) + + + Allocator and Dumper + + The new allocator provides special finalization for the portable + dumper (to save disk space): EMACS_INT mc_finalize_for_disksave (void) + + + More Information + + More details can be found in + http://crestani.de/xemacs/pdf/mc-alloc.pdf . + +*/ + #include <config.h> #include "lisp.h"