Mercurial > hg > xemacs-beta
comparison src/mc-alloc.c @ 5238:2cc24c69446c
Document the new allocator and the new garbage collector in gc.c and mc-alloc.c.
author | Marcus Crestani <crestani@informatik.uni-tuebingen.de> |
---|---|
date | Mon, 05 Jul 2010 18:17:39 +0200 |
parents | 9b8c2168d231 |
children | b249c479f9e1 308d34e9f07d |
comparison
equal
deleted
inserted
replaced
5237:6466bc9ebf15 | 5238:2cc24c69446c |
---|---|
18 along with XEmacs; see the file COPYING. If not, write to | 18 along with XEmacs; see the file COPYING. If not, write to |
19 the Free Software Foundation, Inc., 59 Temple Place - Suite 330, | 19 the Free Software Foundation, Inc., 59 Temple Place - Suite 330, |
20 Boston, MA 02111-1307, USA. */ | 20 Boston, MA 02111-1307, USA. */ |
21 | 21 |
22 /* Synched up with: Not in FSF. */ | 22 /* Synched up with: Not in FSF. */ |
23 | |
24 /* | |
25 The New Allocator | |
26 | |
27 The ideas and algorithms are based on the allocator of the | |
28 Boehm-Demers-Weiser conservative garbage collector. See | |
29 http://www.hpl.hp.com/personal/Hans_ Boehm/gc/index.html. | |
30 | |
31 The new allocator is enabled when the new garbage collector | |
32 is enabled (with `--with-newgc'). The implementation of | |
33 the new garbage collector is in gc.c. | |
34 | |
35 The new allocator takes care of: | |
36 - allocating objects in a write-barrier-friendly way | |
37 - manage object's mark bits | |
38 | |
39 Three-Level Allocation | |
40 | |
41 The new allocator efficiently manages the allocation of Lisp | |
42 objects by minimizing the number of times malloc() and free() are | |
43 called. The allocation process has three layers of abstraction: | |
44 | |
45 1. It allocates memory in very large chunks called heap sections. | |
46 | |
47 2. The heap sections are subdivided into pages. The page size is | |
48 determined by the constant PAGE_SIZE. It holds the size of a page | |
49 in bytes. | |
50 | |
51 3. One page consists of one or more cells. Each cell represents | |
52 a memory location for an object. The cells on one page all have | |
53 the same size, thus every page only contains equal-sized | |
54 objects. | |
55 | |
56 If an object is bigger than page size, it is allocated on a | |
57 multi-page. Then there is only one cell on a multi-page (the cell | |
58 covers the full multi-page). Is an object smaller than 1/2 PAGE_SIZE, | |
59 a page contains several objects and several cells. There | |
60 is only one cell on a page for object sizes from 1/2 PAGE_SIZE to | |
61 PAGE_SIZE (whereas multi-pages always contain 2 only one | |
62 cell). Only in layer one malloc() and free() are called. | |
63 | |
64 | |
65 Size Classes and Page Lists | |
66 | |
67 Meta-information about every page and multi-page is kept in a page | |
68 header. The page header contains some bookkeeping information like | |
69 number of used and free cells, and pointers to other page | |
70 headers. The page headers are linked in a page list. | |
71 | |
72 Every page list builds a size class. A size class contains all | |
73 pages (linked via page headers) for objects of the same size. The | |
74 new allocator does not group objects based on their type, it groups | |
75 objects based on their sizes. | |
76 | |
77 Here is an example: A cons contains a lrecord_header, a car and cdr | |
78 field. Altogether it uses 12 bytes of memory (on 32 bits | |
79 machines). All conses are allocated on pages with a cell size of 12 | |
80 bytes. All theses pages are kept together in a page list, which | |
81 represents the size class for 12 bytes objects. But this size class | |
82 is not exclusively for conses only. Other objects, which are also | |
83 12 bytes big (e.g. weak-boxes), are allocated in the same size | |
84 class and on the same pages. | |
85 | |
86 The number of size classes is customizable, so is the size step | |
87 between successive size classes. | |
88 | |
89 | |
90 Used and Unused Heap | |
91 | |
92 The memory which is managed by the allocator can be divided in two | |
93 logical parts: | |
94 | |
95 The used heap contains pages, on which objects are allocated. These | |
96 pages are com- pletely or partially occupied. In the used heap, it | |
97 is important to quickly find a free spot for a new | |
98 object. Therefore the size classes of the used heap are defined by | |
99 the size of the cells on the pages. The size classes should match | |
100 common object sizes, to avoid wasting memory. | |
101 | |
102 The unused heap only contains completely empty pages. They have | |
103 never been used or have been freed completely again. In the unused | |
104 heap, the size of consecutive memory tips the scales. A page is the | |
105 smallest entity which is asked for. Therefore, the size classes of | |
106 the unused heap are defined by the number of consecutive pages. | |
107 | |
108 The parameters for the different size classes can be adjusted | |
109 independently, see `configurable values' below. | |
110 | |
111 | |
112 The Allocator's Data Structures | |
113 | |
114 The struct `mc_allocator_globals holds' all the data structures | |
115 that the new allocator uses (lists of used and unused pages, mark | |
116 bits, etc.). | |
117 | |
118 | |
119 Mapping of Heap Pointers to Page Headers | |
120 | |
121 For caching benefits, the page headers and mark bits are stored | |
122 separately from their associated page. During garbage collection | |
123 (i.e. for marking and freeing objects) it is important to identify | |
124 the page header which is responsible for a given Lisp object. | |
125 | |
126 To do this task quickly, I added a two level search tree: the upper | |
127 10 bits of the heap pointer are the index of the first level. This | |
128 entry of the first level links to the second level, where the next | |
129 10 bits of the heap pointer are used to identify the page | |
130 header. The remaining bits point to the object relative to the | |
131 page. | |
132 | |
133 On architectures with more than 32 bits pointers, a hash value of | |
134 the upper bits is used to index into the first level. | |
135 | |
136 | |
137 Mark Bits | |
138 | |
139 For caching purposes, the mark bits are no longer kept within the | |
140 objects, they are kept in a separate bit field. | |
141 | |
142 Every page header has a field for the mark bits of the objects on | |
143 the page. If there are less cells on the page than there fit bits | |
144 in the integral data type EMACS_INT, the mark bits are stored | |
145 directly in this EMACS_INT. | |
146 | |
147 Otherwise, the mark bits are written in a separate space, with the | |
148 page header pointing to this space. This happens to pages with | |
149 rather small objects: many cells fit on a page, thus many mark bits | |
150 are needed. | |
151 | |
152 | |
153 Allocate Memory | |
154 | |
155 Use | |
156 void *mc_alloc (size_t size) | |
157 to request memory from the allocator. This returns a pointer to a | |
158 newly allocated block of memory of given size. | |
159 | |
160 This is how the new allocator allocates memory: | |
161 1. Determine the size class of the object. | |
162 2. Is there already a page in this size class and is there a free | |
163 cell on this page? | |
164 * YES | |
165 3. Unlink free cell from free list, return address of free cell. | |
166 DONE. | |
167 * NO | |
168 3. Is there a page in the unused heap? | |
169 * YES | |
170 4. Move unused page to used heap. | |
171 5. Initialize page header, free list, and mark bits. | |
172 6. Unlink first cell from free list, return address of cell. | |
173 DONE. | |
174 * NO | |
175 4. Expand the heap, add new memory to unused heap | |
176 [go back to 3. and proceed with the YES case]. | |
177 | |
178 The allocator puts partially filled pages to the front of the page | |
179 list, completely filled ones to the end. That guarantees a fast | |
180 terminating search for free cells. Are there two successive full | |
181 pages at the front of the page list, the complete size class is | |
182 full, a new page has to be added. | |
183 | |
184 | |
185 Expand Heap | |
186 | |
187 To expand the heap, a big chunk of contiguous memory is allocated | |
188 using malloc(). These pieces are called heap sections. How big a new | |
189 heap section is (and thus the growth of the heap) is adjustable: See | |
190 MIN_HEAP_INCREASE, MAX_HEAP_INCREASE, and HEAP_GROWTH_DIVISOR below. | |
191 | |
192 | |
193 Free Memory | |
194 | |
195 One optimization in XEmacs is that locally used Lisp objects are | |
196 freed manually (the memory is not wasted till the next garbage | |
197 collection). Therefore the new allocator provides this function: | |
198 void mc_free (void *ptr) | |
199 That frees the object pointed to by ptr. | |
200 | |
201 This function is also used internally during sweep phase of the | |
202 garbage collection. This is how it works in detail: | |
203 | |
204 1. Use pointer to identify page header | |
205 (use lookup mechanism described above). | |
206 2. Mark cell as free and hook it into free list. | |
207 3. Is the page completely empty? | |
208 * YES | |
209 4. Unlink page from page list. | |
210 5. Remove page header, free list, and mark bits. | |
211 6. Move page to unused heap. | |
212 * NO | |
213 4. Move page to front of size class (to speed up allocation | |
214 of objects). | |
215 | |
216 If the last object of a page is freed, the empty page is returned to | |
217 the unused heap. The allocator tries to coalesce adjacent pages, to | |
218 gain a big piece of contiguous memory. The resulting chunk is hooked | |
219 into the according size class of the unused heap. If this created a | |
220 complete heap section, the heap section is returned to the operating | |
221 system by using free(). | |
222 | |
223 | |
224 Allocator and Garbage Collector | |
225 | |
226 The new allocator simplifies the interface to the Garbage Collector: | |
227 * mark live objects: MARK_[WHITE|GREY|BLACK] (ptr) | |
228 * sweep heap: EMACS_INT mc_sweep (void) | |
229 * run finalizers: EMACS_INT mc_finalize (void) | |
230 | |
231 | |
232 Allocator and Dumper | |
233 | |
234 The new allocator provides special finalization for the portable | |
235 dumper (to save disk space): EMACS_INT mc_finalize_for_disksave (void) | |
236 | |
237 | |
238 More Information | |
239 | |
240 More details can be found in | |
241 http://crestani.de/xemacs/pdf/mc-alloc.pdf . | |
242 | |
243 */ | |
23 | 244 |
24 #include <config.h> | 245 #include <config.h> |
25 | 246 |
26 #include "lisp.h" | 247 #include "lisp.h" |
27 #include "mc-alloc.h" | 248 #include "mc-alloc.h" |