comparison src/mc-alloc.c @ 5238:2cc24c69446c

Document the new allocator and the new garbage collector in gc.c and mc-alloc.c.
author Marcus Crestani <crestani@informatik.uni-tuebingen.de>
date Mon, 05 Jul 2010 18:17:39 +0200
parents 9b8c2168d231
children b249c479f9e1 308d34e9f07d
comparison
equal deleted inserted replaced
5237:6466bc9ebf15 5238:2cc24c69446c
18 along with XEmacs; see the file COPYING. If not, write to 18 along with XEmacs; see the file COPYING. If not, write to
19 the Free Software Foundation, Inc., 59 Temple Place - Suite 330, 19 the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
20 Boston, MA 02111-1307, USA. */ 20 Boston, MA 02111-1307, USA. */
21 21
22 /* Synched up with: Not in FSF. */ 22 /* Synched up with: Not in FSF. */
23
24 /*
25 The New Allocator
26
27 The ideas and algorithms are based on the allocator of the
28 Boehm-Demers-Weiser conservative garbage collector. See
29 http://www.hpl.hp.com/personal/Hans_ Boehm/gc/index.html.
30
31 The new allocator is enabled when the new garbage collector
32 is enabled (with `--with-newgc'). The implementation of
33 the new garbage collector is in gc.c.
34
35 The new allocator takes care of:
36 - allocating objects in a write-barrier-friendly way
37 - manage object's mark bits
38
39 Three-Level Allocation
40
41 The new allocator efficiently manages the allocation of Lisp
42 objects by minimizing the number of times malloc() and free() are
43 called. The allocation process has three layers of abstraction:
44
45 1. It allocates memory in very large chunks called heap sections.
46
47 2. The heap sections are subdivided into pages. The page size is
48 determined by the constant PAGE_SIZE. It holds the size of a page
49 in bytes.
50
51 3. One page consists of one or more cells. Each cell represents
52 a memory location for an object. The cells on one page all have
53 the same size, thus every page only contains equal-sized
54 objects.
55
56 If an object is bigger than page size, it is allocated on a
57 multi-page. Then there is only one cell on a multi-page (the cell
58 covers the full multi-page). Is an object smaller than 1/2 PAGE_SIZE,
59 a page contains several objects and several cells. There
60 is only one cell on a page for object sizes from 1/2 PAGE_SIZE to
61 PAGE_SIZE (whereas multi-pages always contain 2 only one
62 cell). Only in layer one malloc() and free() are called.
63
64
65 Size Classes and Page Lists
66
67 Meta-information about every page and multi-page is kept in a page
68 header. The page header contains some bookkeeping information like
69 number of used and free cells, and pointers to other page
70 headers. The page headers are linked in a page list.
71
72 Every page list builds a size class. A size class contains all
73 pages (linked via page headers) for objects of the same size. The
74 new allocator does not group objects based on their type, it groups
75 objects based on their sizes.
76
77 Here is an example: A cons contains a lrecord_header, a car and cdr
78 field. Altogether it uses 12 bytes of memory (on 32 bits
79 machines). All conses are allocated on pages with a cell size of 12
80 bytes. All theses pages are kept together in a page list, which
81 represents the size class for 12 bytes objects. But this size class
82 is not exclusively for conses only. Other objects, which are also
83 12 bytes big (e.g. weak-boxes), are allocated in the same size
84 class and on the same pages.
85
86 The number of size classes is customizable, so is the size step
87 between successive size classes.
88
89
90 Used and Unused Heap
91
92 The memory which is managed by the allocator can be divided in two
93 logical parts:
94
95 The used heap contains pages, on which objects are allocated. These
96 pages are com- pletely or partially occupied. In the used heap, it
97 is important to quickly find a free spot for a new
98 object. Therefore the size classes of the used heap are defined by
99 the size of the cells on the pages. The size classes should match
100 common object sizes, to avoid wasting memory.
101
102 The unused heap only contains completely empty pages. They have
103 never been used or have been freed completely again. In the unused
104 heap, the size of consecutive memory tips the scales. A page is the
105 smallest entity which is asked for. Therefore, the size classes of
106 the unused heap are defined by the number of consecutive pages.
107
108 The parameters for the different size classes can be adjusted
109 independently, see `configurable values' below.
110
111
112 The Allocator's Data Structures
113
114 The struct `mc_allocator_globals holds' all the data structures
115 that the new allocator uses (lists of used and unused pages, mark
116 bits, etc.).
117
118
119 Mapping of Heap Pointers to Page Headers
120
121 For caching benefits, the page headers and mark bits are stored
122 separately from their associated page. During garbage collection
123 (i.e. for marking and freeing objects) it is important to identify
124 the page header which is responsible for a given Lisp object.
125
126 To do this task quickly, I added a two level search tree: the upper
127 10 bits of the heap pointer are the index of the first level. This
128 entry of the first level links to the second level, where the next
129 10 bits of the heap pointer are used to identify the page
130 header. The remaining bits point to the object relative to the
131 page.
132
133 On architectures with more than 32 bits pointers, a hash value of
134 the upper bits is used to index into the first level.
135
136
137 Mark Bits
138
139 For caching purposes, the mark bits are no longer kept within the
140 objects, they are kept in a separate bit field.
141
142 Every page header has a field for the mark bits of the objects on
143 the page. If there are less cells on the page than there fit bits
144 in the integral data type EMACS_INT, the mark bits are stored
145 directly in this EMACS_INT.
146
147 Otherwise, the mark bits are written in a separate space, with the
148 page header pointing to this space. This happens to pages with
149 rather small objects: many cells fit on a page, thus many mark bits
150 are needed.
151
152
153 Allocate Memory
154
155 Use
156 void *mc_alloc (size_t size)
157 to request memory from the allocator. This returns a pointer to a
158 newly allocated block of memory of given size.
159
160 This is how the new allocator allocates memory:
161 1. Determine the size class of the object.
162 2. Is there already a page in this size class and is there a free
163 cell on this page?
164 * YES
165 3. Unlink free cell from free list, return address of free cell.
166 DONE.
167 * NO
168 3. Is there a page in the unused heap?
169 * YES
170 4. Move unused page to used heap.
171 5. Initialize page header, free list, and mark bits.
172 6. Unlink first cell from free list, return address of cell.
173 DONE.
174 * NO
175 4. Expand the heap, add new memory to unused heap
176 [go back to 3. and proceed with the YES case].
177
178 The allocator puts partially filled pages to the front of the page
179 list, completely filled ones to the end. That guarantees a fast
180 terminating search for free cells. Are there two successive full
181 pages at the front of the page list, the complete size class is
182 full, a new page has to be added.
183
184
185 Expand Heap
186
187 To expand the heap, a big chunk of contiguous memory is allocated
188 using malloc(). These pieces are called heap sections. How big a new
189 heap section is (and thus the growth of the heap) is adjustable: See
190 MIN_HEAP_INCREASE, MAX_HEAP_INCREASE, and HEAP_GROWTH_DIVISOR below.
191
192
193 Free Memory
194
195 One optimization in XEmacs is that locally used Lisp objects are
196 freed manually (the memory is not wasted till the next garbage
197 collection). Therefore the new allocator provides this function:
198 void mc_free (void *ptr)
199 That frees the object pointed to by ptr.
200
201 This function is also used internally during sweep phase of the
202 garbage collection. This is how it works in detail:
203
204 1. Use pointer to identify page header
205 (use lookup mechanism described above).
206 2. Mark cell as free and hook it into free list.
207 3. Is the page completely empty?
208 * YES
209 4. Unlink page from page list.
210 5. Remove page header, free list, and mark bits.
211 6. Move page to unused heap.
212 * NO
213 4. Move page to front of size class (to speed up allocation
214 of objects).
215
216 If the last object of a page is freed, the empty page is returned to
217 the unused heap. The allocator tries to coalesce adjacent pages, to
218 gain a big piece of contiguous memory. The resulting chunk is hooked
219 into the according size class of the unused heap. If this created a
220 complete heap section, the heap section is returned to the operating
221 system by using free().
222
223
224 Allocator and Garbage Collector
225
226 The new allocator simplifies the interface to the Garbage Collector:
227 * mark live objects: MARK_[WHITE|GREY|BLACK] (ptr)
228 * sweep heap: EMACS_INT mc_sweep (void)
229 * run finalizers: EMACS_INT mc_finalize (void)
230
231
232 Allocator and Dumper
233
234 The new allocator provides special finalization for the portable
235 dumper (to save disk space): EMACS_INT mc_finalize_for_disksave (void)
236
237
238 More Information
239
240 More details can be found in
241 http://crestani.de/xemacs/pdf/mc-alloc.pdf .
242
243 */
23 244
24 #include <config.h> 245 #include <config.h>
25 246
26 #include "lisp.h" 247 #include "lisp.h"
27 #include "mc-alloc.h" 248 #include "mc-alloc.h"