comparison man/internals/internals.texi @ 371:cc15677e0335 r21-2b1
Import from CVS: tag r21-2b1
author | cvs |
---|---|
date | Mon, 13 Aug 2007 11:03:08 +0200 |
parents | a4f53d9b3154 |
children | 6240c7796c7a |
370:bd866891f083 | 371:cc15677e0335 |
---|---|
3 @setfilename ../../info/internals.info | 3 @setfilename ../../info/internals.info |
4 @settitle XEmacs Internals Manual | 4 @settitle XEmacs Internals Manual |
5 @c %**end of header | 5 @c %**end of header |
6 | 6 |
7 @ifinfo | 7 @ifinfo |
8 @dircategory XEmacs Editor | |
9 @direntry | |
10 * Internals: (internals). XEmacs Internals Manual. | |
11 @end direntry | |
12 | 8 |
13 Copyright @copyright{} 1992 - 1996 Ben Wing. | 9 Copyright @copyright{} 1992 - 1996 Ben Wing. |
14 Copyright @copyright{} 1996, 1997 Sun Microsystems. | 10 Copyright @copyright{} 1996, 1997 Sun Microsystems. |
15 Copyright @copyright{} 1994, 1995 Free Software Foundation. | 11 Copyright @copyright{} 1994, 1995 Free Software Foundation. |
16 Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois. | 12 Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois. |
171 Allocation of Objects in XEmacs Lisp | 167 Allocation of Objects in XEmacs Lisp |
172 | 168 |
173 * Introduction to Allocation:: | 169 * Introduction to Allocation:: |
174 * Garbage Collection:: | 170 * Garbage Collection:: |
175 * GCPROing:: | 171 * GCPROing:: |
176 * Garbage Collection - Step by Step:: | |
177 * Integers and Characters:: | 172 * Integers and Characters:: |
178 * Allocation from Frob Blocks:: | 173 * Allocation from Frob Blocks:: |
179 * lrecords:: | 174 * lrecords:: |
180 * Low-level allocation:: | 175 * Low-level allocation:: |
181 * Pure Space:: | 176 * Pure Space:: |
601 @itemize @bullet | 596 @itemize @bullet |
602 @item | 597 @item |
603 version 20.1 released September 17, 1997. | 598 version 20.1 released September 17, 1997. |
604 @item | 599 @item |
605 version 20.2 released September 20, 1997. | 600 version 20.2 released September 20, 1997. |
606 @item | |
607 version 20.3 released August 19, 1998. | |
608 @end itemize | 601 @end itemize |
609 | 602 |
610 @node XEmacs | 603 @node XEmacs |
611 @section XEmacs | 604 @section XEmacs |
612 @cindex XEmacs | 605 @cindex XEmacs |
1424 | 1417 |
1425 @example | 1418 @example |
1426 1.983e-4 | 1419 1.983e-4 |
1427 @end example | 1420 @end example |
1428 | 1421 |
1429 converts to a float whose value is 1.983e-4, or .0001983. | 1422 converts to a float whose value is 1983.23e-4, or .0001983. |
1430 | 1423 |
1431 @example | 1424 @example |
1432 ?b | 1425 ?b |
1433 @end example | 1426 @end example |
1434 | 1427 |
1659 | 1652 |
1660 @menu | 1653 @menu |
1661 * General Coding Rules:: | 1654 * General Coding Rules:: |
1662 * Writing Lisp Primitives:: | 1655 * Writing Lisp Primitives:: |
1663 * Adding Global Lisp Variables:: | 1656 * Adding Global Lisp Variables:: |
1664 * Coding for Mule:: | |
1665 * Techniques for XEmacs Developers:: | 1657 * Techniques for XEmacs Developers:: |
1666 @end menu | 1658 @end menu |
1667 | 1659 |
1668 @node General Coding Rules | 1660 @node General Coding Rules |
1669 @section General Coding Rules | 1661 @section General Coding Rules |
1688 the same directory as the C sources) and @file{lisp.h}. @file{config.h} | 1680 the same directory as the C sources) and @file{lisp.h}. @file{config.h} |
1689 should always be included before any other header files (including | 1681 should always be included before any other header files (including |
1690 system header files) to ensure that certain tricks played by various | 1682 system header files) to ensure that certain tricks played by various |
1691 @file{s/} and @file{m/} files work out correctly. | 1683 @file{s/} and @file{m/} files work out correctly. |
1692 | 1684 |
1693 When including header files, always use angle brackets, not double | |
1694 quotes, except when the file to be included is in the same directory as | |
1695 the including file. If either file is a generated file, then that is | |
1696 not likely to be the case. In order to understand why we have this | |
1697 rule, imagine what happens when you do a build in the source directory | |
1698 using @samp{./configure} and another build in another directory using | |
1699 @samp{../work/configure}. There will be two different @file{config.h} | |
1700 files. Which one will be used if you @samp{#include "config.h"}? | |
1701 | |
1702 @strong{All global and static variables that are to be modifiable must | 1685 @strong{All global and static variables that are to be modifiable must |
1703 be declared uninitialized.} This means that you may not use the ``declare | 1686 be declared uninitialized.} This means that you may not use the ``declare |
1704 with initializer'' form for these variables, such as @code{int | 1687 with initializer'' form for these variables, such as @code{int |
1705 some_variable = 0;}. The reason for this has to do with some kludges | 1688 some_variable = 0;}. The reason for this has to do with some kludges |
1706 done during the dumping process: If possible, the initialized data | 1689 done during the dumping process: If possible, the initialized data |
1769 | 1752 |
1770 while (!NILP (args)) | 1753 while (!NILP (args)) |
1771 @{ | 1754 @{ |
1772 val = Feval (XCAR (args)); | 1755 val = Feval (XCAR (args)); |
1773 if (!NILP (val)) | 1756 if (!NILP (val)) |
1774 break; | 1757 break; |
1775 args = XCDR (args); | 1758 args = XCDR (args); |
1776 @} | 1759 @} |
1777 | 1760 |
1778 UNGCPRO; | 1761 UNGCPRO; |
1779 return val; | 1762 return val; |
2038 C variable in the @code{vars_of_*()} function. Otherwise, the | 2021 C variable in the @code{vars_of_*()} function. Otherwise, the |
2039 garbage-collection mechanism won't know that the object in this variable | 2022 garbage-collection mechanism won't know that the object in this variable |
2040 is in use, and will happily collect it and reuse its storage for another | 2023 is in use, and will happily collect it and reuse its storage for another |
2041 Lisp object, and you will be the one who's unhappy when you can't figure | 2024 Lisp object, and you will be the one who's unhappy when you can't figure |
2042 out how your variable got overwritten. | 2025 out how your variable got overwritten. |
2043 | |
2044 @node Coding for Mule | |
2045 @section Coding for Mule | |
2046 @cindex Coding for Mule | |
2047 | |
2048 Although Mule support is not compiled by default in XEmacs, many people | |
2049 are using it, and we consider it crucial that new code works correctly | |
2050 with multibyte characters. This is not hard; it is only a matter of | |
2051 following several simple user-interface guidelines. Even if you never | |
2052 compile with Mule, with a little practice you will find it quite easy | |
2053 to code Mule-correctly. | |
2054 | |
2055 Note that these guidelines are not necessarily tied to the current Mule | |
2056 implementation; they are also a good idea to follow on the grounds of | |
2057 code generalization for future I18N work. | |
2058 | |
2059 @menu | |
2060 * Character-Related Data Types:: | |
2061 * Working With Character and Byte Positions:: | |
2062 * Conversion of External Data:: | |
2063 * General Guidelines for Writing Mule-Aware Code:: | |
2064 * An Example of Mule-Aware Code:: | |
2065 @end menu | |
2066 | |
2067 @node Character-Related Data Types | |
2068 @subsection Character-Related Data Types | |
2069 | |
2070 First, we will list the basic character-related datatypes used by | |
2071 XEmacs. Note that the separate @code{typedef}s are not required for the | |
2072 code to work (all of them boil down to @code{unsigned char} or | |
2073 @code{int}), but they improve clarity of code a great deal, because one | |
2074 glance at the declaration can tell the intended use of the variable. | |
2075 | |
2076 @table @code | |
2077 @item Emchar | |
2078 @cindex Emchar | |
2079 An @code{Emchar} holds a single Emacs character. | |
2080 | |
2081 Obviously, the equality between characters and bytes is lost in the Mule | |
2082 world. Characters can be represented by one or more bytes in the | |
2083 buffer, and @code{Emchar} is the C type large enough to hold any | |
2084 character. | |
2085 | |
2086 Without Mule support, an @code{Emchar} is equivalent to an | |
2087 @code{unsigned char}. | |
2088 | |
2089 @item Bufbyte | |
2090 @cindex Bufbyte | |
2091 The data representing the text in a buffer or string is logically a set | |
2092 of @code{Bufbyte}s. | |
2093 | |
2094 XEmacs does not work with character formats all the time; when reading | |
2095 characters from the outside, it decodes them to an internal format, and | |
2096 likewise encodes them when writing. @code{Bufbyte} (in fact | |
2097 @code{unsigned char}) is the basic unit of XEmacs internal buffers and | |
2098 strings format. | |
2099 | |
2100 One character can correspond to one or more @code{Bufbyte}s. In the | |
2101 current implementation, an ASCII character is represented by the same | |
2102 @code{Bufbyte}, and extended characters are represented by a sequence of | |
2103 @code{Bufbyte}s. | |
2104 | |
2105 Without Mule support, a @code{Bufbyte} is equivalent to an | |
2106 @code{Emchar}. | |
2107 | |
2108 @item Bufpos | |
2109 @itemx Charcount | |
2110 A @code{Bufpos} represents a character position in a buffer or string. | |
2111 A @code{Charcount} represents a number (count) of characters. | |
2112 Logically, subtracting two @code{Bufpos} values yields a | |
2113 @code{Charcount} value. Although all of these are @code{typedef}ed to | |
2114 @code{int}, we use them in preference to @code{int} to make it clear | |
2115 what sort of position is being used. | |
2116 | |
2117 @code{Bufpos} and @code{Charcount} values are the only ones that are | |
2118 ever visible to Lisp. | |
2119 | |
2120 @item Bytind | |
2121 @itemx Bytecount | |
2122 A @code{Bytind} represents a byte position in a buffer or string. A | |
2123 @code{Bytecount} represents the distance between two positions in bytes. | |
2124 The relationship between @code{Bytind} and @code{Bytecount} is the same | |
2125 as the relationship between @code{Bufpos} and @code{Charcount}. | |
2126 | |
2127 @item Extbyte | |
2128 @itemx Extcount | |
2129 When dealing with the outside world, XEmacs works with @code{Extbyte}s, | |
2130 which are equivalent to @code{unsigned char}. Obviously, an | |
2131 @code{Extcount} is the distance between two @code{Extbyte}s. Extbytes | |
2132 and Extcounts are not all that frequent in XEmacs code. | |
2133 @end table | |
2134 | |
2135 @node Working With Character and Byte Positions | |
2136 @subsection Working With Character and Byte Positions | |
2137 | |
2138 Now that we have defined the basic character-related types, we can look | |
2139 at the macros and functions designed for work with them and for | |
2140 conversion between them. Most of these macros are defined in | |
2141 @file{buffer.h}, and we don't discuss all of them here, but only the | |
2142 most important ones. Examining the existing code is the best way to | |
2143 learn about them. | |
2144 | |
2145 @table @code | |
2146 @item MAX_EMCHAR_LEN | |
2147 This preprocessor constant is the maximum number of buffer bytes per | |
2148 Emacs character, i.e. the byte length of an @code{Emchar}. It is useful | |
2149 when allocating temporary strings to hold a known number of characters. | |
2150 For instance: | |
2151 | |
2152 @example | |
2153 @group | |
2154 @{ | |
2155 Charcount cclen; | |
2156 ... | |
2157 @{ | |
2158 /* Allocate place for @var{cclen} characters. */ | |
2159 Bufbyte *tmp_buf = (Bufbyte *)alloca (cclen * MAX_EMCHAR_LEN); | |
2160 ... | |
2161 @end group | |
2162 @end example | |
2163 | |
2164 If you followed the previous section, you can guess that, logically, | |
2165 multiplying a @code{Charcount} value with @code{MAX_EMCHAR_LEN} produces | |
2166 a @code{Bytecount} value. | |
2167 | |
2168 In the current Mule implementation, @code{MAX_EMCHAR_LEN} equals 4. | |
2169 Without Mule, it is 1. | |
2170 | |
2171 @item charptr_emchar | |
2172 @itemx set_charptr_emchar | |
2173 The @code{charptr_emchar} macro takes a @code{Bufbyte} pointer and returns | |
2174 the underlying @code{Emchar}. If it were a function, its prototype | |
2175 would be: | |
2176 | |
2177 @example | |
2178 Emchar charptr_emchar (Bufbyte *p); | |
2179 @end example | |
2180 | |
2181 @code{set_charptr_emchar} stores an @code{Emchar} to the specified byte | |
2182 position. It returns the number of bytes stored: | |
2183 | |
2184 @example | |
2185 Bytecount set_charptr_emchar (Bufbyte *p, Emchar c); | |
2186 @end example | |
2187 | |
2188 It is important to note that @code{set_charptr_emchar} is safe only for | |
2189 appending a character at the end of a buffer, not for overwriting a | |
2190 character in the middle. This is because the width of characters | |
2191 varies, and @code{set_charptr_emchar} cannot resize the string if it | |
2192 writes, say, a two-byte character where a single-byte character used to | |
2193 reside. | |
2194 | |
2195 A typical use of @code{set_charptr_emchar} can be demonstrated by this | |
2196 example, which copies characters from buffer @var{buf} to a temporary | |
2197 string of Bufbytes. | |
2198 | |
2199 @example | |
2200 @group | |
2201 @{ | |
2202 Bufpos pos; | |
2203 for (pos = beg; pos < end; pos++) | |
2204 @{ | |
2205 Emchar c = BUF_FETCH_CHAR (buf, pos); | |
2206 p += set_charptr_emchar (p, c); | |
2207 @} | |
2208 @} | |
2209 @end group | |
2210 @end example | |
2211 | |
2212 Note how @code{set_charptr_emchar} is used to store the @code{Emchar} | |
2213 and advance the pointer, at the same time. | |
2214 | |
2215 @item INC_CHARPTR | |
2216 @itemx DEC_CHARPTR | |
2217 These two macros increment and decrement a @code{Bufbyte} pointer, | |
2218 respectively. The pointer needs to be correctly positioned at the | |
2219 beginning of a valid character position. | |
2220 | |
2221 Without Mule support, @code{INC_CHARPTR (p)} and @code{DEC_CHARPTR (p)} | |
2222 simply expand to @code{p++} and @code{p--}, respectively. | |
2223 | |
2224 @item bytecount_to_charcount | |
2225 Given a pointer to a text string and a length in bytes, return the | |
2226 equivalent length in characters. | |
2227 | |
2228 @example | |
2229 Charcount bytecount_to_charcount (Bufbyte *p, Bytecount bc); | |
2230 @end example | |
2231 | |
2232 @item charcount_to_bytecount | |
2233 Given a pointer to a text string and a length in characters, return the | |
2234 equivalent length in bytes. | |
2235 | |
2236 @example | |
2237 Bytecount charcount_to_bytecount (Bufbyte *p, Charcount cc); | |
2238 @end example | |
2239 | |
2240 @item charptr_n_addr | |
2241 Return a pointer to the beginning of the character offset @var{cc} (in | |
2242 characters) from @var{p}. | |
2243 | |
2244 @example | |
2245 Bufbyte *charptr_n_addr (Bufbyte *p, Charcount cc); | |
2246 @end example | |
2247 @end table | |
2248 | |
2249 @node Conversion of External Data | |
2250 @subsection Conversion of External Data | |
2251 | |
2252 When an external function, such as a C library function, returns a | |
2253 @code{char} pointer, you should never treat it as @code{Bufbyte}. This | |
2254 is because these returned strings may contain 8bit characters which can | |
2255 be misinterpreted by XEmacs, and cause a crash. Instead, you should use | |
2256 a conversion macro. Many different conversion macros are defined in | |
2257 @file{buffer.h}, so I will try to order them logically, by direction and | |
2258 by format. | |
2259 | |
2260 Thus the basic conversion macros are @code{GET_CHARPTR_INT_DATA_ALLOCA} | |
2261 and @code{GET_CHARPTR_EXT_DATA_ALLOCA}. The former is used to convert | |
2262 external data to internal format, and the latter is used to convert the | |
2263 other way around. The arguments each of these receives are @var{ptr} | |
2264 (pointer to the text in external format), @var{len} (length of the text in | |
2265 bytes), @var{fmt} (format of the external text), @var{ptr_out} (lvalue | |
2266 to which new text should be copied), and @var{len_out} (lvalue which | |
2267 will be assigned the length of the internal text in bytes). The | |
2268 resulting text is stored to a stack-allocated buffer. If the text | |
2269 doesn't need changing, these macros will do nothing, except for setting | |
2270 @var{len_out}. | |
2271 | |
2272 Currently meaningful formats are @code{FORMAT_BINARY}, | |
2273 @code{FORMAT_FILENAME}, @code{FORMAT_OS}, and @code{FORMAT_CTEXT}. | |
2274 | |
2275 The two macros above take many arguments, which makes them unwieldy. For | |
2276 this reason, several convenience macros are defined with obvious | |
2277 functionality, but accepting fewer arguments: | |
2278 | |
2279 @table @code | |
2280 @item GET_C_CHARPTR_EXT_DATA_ALLOCA | |
2281 @itemx GET_C_CHARPTR_INT_DATA_ALLOCA | |
2282 These two macros work on ``C char pointers'', which are zero-terminated, | |
2283 and thus do not need @var{len} or @var{len_out} parameters. | |
2284 | |
2285 @item GET_STRING_EXT_DATA_ALLOCA | |
2286 @itemx GET_C_STRING_EXT_DATA_ALLOCA | |
2287 These two macros work on Lisp strings, thus also not needing a @var{len} | |
2288 parameter. However, @code{GET_STRING_EXT_DATA_ALLOCA} still provides a | |
2289 @var{len_out} parameter. Note that for Lisp strings only one conversion | |
2290 direction makes sense. | |
2291 | |
2292 @item GET_C_CHARPTR_EXT_BINARY_DATA_ALLOCA | |
2293 @itemx GET_C_CHARPTR_EXT_FILENAME_DATA_ALLOCA | |
2294 @itemx GET_C_CHARPTR_EXT_CTEXT_DATA_ALLOCA | |
2295 @itemx ... | |
2296 These macros are a combination of the above, but with the @var{fmt} | |
2297 argument encoded into the name of the macro. | |
2298 @end table | |
2299 | |
2300 @node General Guidelines for Writing Mule-Aware Code | |
2301 @subsection General Guidelines for Writing Mule-Aware Code | |
2302 | |
2303 This section contains some general guidance on how to write Mule-aware | |
2304 code, as well as some pitfalls you should avoid. | |
2305 | |
2306 @table @emph | |
2307 @item Never use @code{char} and @code{char *}. | |
2308 In XEmacs, the use of @code{char} and @code{char *} is almost always a | |
2309 mistake. If you want to manipulate an Emacs character from ``C'', use | |
2310 @code{Emchar}. If you want to examine a specific octet in the internal | |
2311 format, use @code{Bufbyte}. If you want a Lisp-visible character, use a | |
2312 @code{Lisp_Object} and @code{make_char}. If you want a pointer to move | |
2313 through the internal text, use @code{Bufbyte *}. Also note that you | |
2314 almost certainly do not need @code{Emchar *}. | |
2315 | |
2316 @item Be careful not to confuse @code{Charcount}, @code{Bytecount}, and @code{Bufpos}. | |
2317 The whole point of using different types is to avoid confusion about the | |
2318 use of certain variables. Lest this effect be nullified, you need to be | |
2319 careful about using the right types. | |
2320 | |
2321 @item Always convert external data | |
2322 It is extremely important to always convert external data, because | |
2323 XEmacs can crash if unexpected 8bit sequences are copied to its internal | |
2324 buffers literally. | |
2325 | |
2326 This means that when a system function, such as @code{readdir}, returns | |
2327 a string, you need to convert it using one of the conversion macros | |
2328 described in the previous section, before passing it further to Lisp. | |
2329 In the case of @code{readdir}, you would use the | |
2330 @code{GET_C_CHARPTR_INT_FILENAME_DATA_ALLOCA} macro. | |
2331 | |
2332 Also note that many internal functions, such as @code{make_string}, | |
2333 accept Bufbytes, which removes the need for them to convert the data | |
2334 they receive. This increases efficiency because that way external data | |
2335 needs to be decoded only once, when it is read. After that, it is | |
2336 passed around in internal format. | |
2337 @end table | |
2338 | |
2339 @node An Example of Mule-Aware Code | |
2340 @subsection An Example of Mule-Aware Code | |
2341 | |
2342 As an example of Mule-aware code, we will analyze the | |
2343 @code{string} function, which conses up a Lisp string from the character | |
2344 arguments it receives. Here is the definition, pasted from | |
2345 @code{alloc.c}: | |
2346 | |
2347 @example | |
2348 @group | |
2349 DEFUN ("string", Fstring, 0, MANY, 0, /* | |
2350 Concatenate all the argument characters and make the result a string. | |
2351 */ | |
2352 (int nargs, Lisp_Object *args)) | |
2353 @{ | |
2354 Bufbyte *storage = alloca_array (Bufbyte, nargs * MAX_EMCHAR_LEN); | |
2355 Bufbyte *p = storage; | |
2356 | |
2357 for (; nargs; nargs--, args++) | |
2358 @{ | |
2359 Lisp_Object lisp_char = *args; | |
2360 CHECK_CHAR_COERCE_INT (lisp_char); | |
2361 p += set_charptr_emchar (p, XCHAR (lisp_char)); | |
2362 @} | |
2363 return make_string (storage, p - storage); | |
2364 @} | |
2365 @end group | |
2366 @end example | |
2367 | |
2368 Now we can analyze the source line by line. | |
2369 | |
2370 Obviously, the string will contain as many characters as there are | |
2371 arguments to the function. This is why we allocate @code{MAX_EMCHAR_LEN} * @var{nargs} | |
2372 bytes on the stack, i.e. the worst-case number of bytes for @var{nargs} | |
2373 @code{Emchar}s to fit in the string. | |
2374 | |
2375 Then, the loop checks that each element is a character, converting | |
2376 integers in the process. Like many other functions in XEmacs, this | |
2377 function silently accepts integers where characters are expected, for | |
2378 historical and compatibility reasons. Unless you know what you are | |
2379 doing, @code{CHECK_CHAR} will also suffice. @code{XCHAR (lisp_char)} | |
2380 extracts the @code{Emchar} from the @code{Lisp_Object}, and | |
2381 @code{set_charptr_emchar} stores it to storage, increasing @code{p} in | |
2382 the process. | |
2383 | |
2384 Other instructive examples of correct coding under Mule can be found all | |
2385 over XEmacs code. For starters, I recommend | |
2386 @code{Fnormalize_menu_item_name} in @file{menubar.c}. After you have | |
2387 understood this section of the manual and studied the examples, you can | |
2388 proceed to write new Mule-aware code. | |
2389 | 2026 |
2390 @node Techniques for XEmacs Developers | 2027 @node Techniques for XEmacs Developers |
2391 @section Techniques for XEmacs Developers | 2028 @section Techniques for XEmacs Developers |
2392 | 2029 |
2393 To make a quantified XEmacs, do: @code{make quantmacs}. | 2030 To make a quantified XEmacs, do: @code{make quantmacs}. |
4182 | 3819 |
4183 @menu | 3820 @menu |
4184 * Introduction to Allocation:: | 3821 * Introduction to Allocation:: |
4185 * Garbage Collection:: | 3822 * Garbage Collection:: |
4186 * GCPROing:: | 3823 * GCPROing:: |
4187 * Garbage Collection - Step by Step:: | |
4188 * Integers and Characters:: | 3824 * Integers and Characters:: |
4189 * Allocation from Frob Blocks:: | 3825 * Allocation from Frob Blocks:: |
4190 * lrecords:: | 3826 * lrecords:: |
4191 * Low-level allocation:: | 3827 * Low-level allocation:: |
4192 * Pure Space:: | 3828 * Pure Space:: |
4526 stack. That involves looking through all of stack memory and treating | 4162 stack. That involves looking through all of stack memory and treating |
4527 anything that looks like a reference to an object as a reference. This | 4163 anything that looks like a reference to an object as a reference. This |
4528 will result in a few objects not getting collected when they should, but | 4164 will result in a few objects not getting collected when they should, but |
4529 it obviates the need for @code{GCPRO}ing, and allows garbage collection | 4165 it obviates the need for @code{GCPRO}ing, and allows garbage collection |
4530 to happen at any point at all, such as during object allocation. | 4166 to happen at any point at all, such as during object allocation. |
4531 | |
4532 @node Garbage Collection - Step by Step | |
4533 @section Garbage Collection - Step by Step | |
4534 @cindex garbage collection step by step | |
4535 | |
4536 @menu | |
4537 * Invocation:: | |
4538 * garbage_collect_1:: | |
4539 * mark_object:: | |
4540 * gc_sweep:: | |
4541 * sweep_lcrecords_1:: | |
4542 * compact_string_chars:: | |
4543 * sweep_strings:: | |
4544 * sweep_bit_vectors_1:: | |
4545 @end menu | |
4546 | |
4547 @node Invocation | |
4548 @subsection Invocation | |
4549 @cindex garbage collection, invocation | |
4550 | |
4551 The first thing that anyone should know about garbage collection is: | |
4552 when and how the garbage collector is invoked. One might think that this | |
4553 could happen every time new memory is allocated, e.g. new objects are | |
4554 created, but this is @emph{not} the case. Instead, we have the following | |
4555 situation: | |
4556 | |
4557 The entry point of any process of garbage collection is an invocation | |
4558 of the function @code{garbage_collect_1} in file @code{alloc.c}. The | |
4559 invocation can occur @emph{explicitly} by calling the function | |
4560 @code{Fgarbage_collect} (in addition this function provides information | |
4561 about the freed memory), or can occur @emph{implicitly} in four different | |
4562 situations: | |
4563 @enumerate | |
4564 @item | |
4565 In function @code{main_1} in file @code{emacs.c}. This function is called | |
4566 at each startup of xemacs. The garbage collection is invoked after all | |
4567 initial creations are completed, but only if a special internal error | |
4568 checking-constant @code{ERROR_CHECK_GC} is defined. | |
4569 @item | |
4570 In function @code{disksave_object_finalization} in file | |
4571 @code{alloc.c}. The only purpose of this function is to clear the | |
4572 objects from memory which need not be stored with xemacs when we dump out | |
4573 an executable. This is only done by @code{Fdump_emacs} or by | |
4574 @code{Fdump_emacs_data} respectively (both in @code{emacs.c}). The | |
4575 actual clearing is accomplished by making these objects unreachable and | |
4576 starting a garbage collection. The function is only used while building | |
4577 xemacs. | |
4578 @item | |
4579 In function @code{Feval / eval} in file @code{eval.c}. Each time the | |
4580 well known and often used function eval is called to evaluate a form, | |
4581 one of the first things that could happen, is a potential call of | |
4582 @code{garbage_collect_1}. There exist three global variables, | |
4583 @code{consing_since_gc} (counts the created cons-cells since the last | |
4584 garbage collection), @code{gc_cons_threshold} (a specified threshold | |
4585 after which a garbage collection occurs) and @code{always_gc}. If | |
4586 @code{always_gc} is set or if the threshold is exceeded, the garbage | |
4587 collection will start. | |
4588 @item | |
4589 In function @code{Ffuncall / funcall} in file @code{eval.c}. This | |
4590 function evaluates calls of elisp functions and works analogously to | |
4591 @code{Feval}. | |
4592 @end enumerate | |
4593 | |
4594 The upshot is that garbage collection can basically occur everywhere | |
4595 @code{Feval} or @code{Ffuncall} is used, either directly or | |
4596 through another function. Since calls to these two functions are | |
4597 hidden in various other functions, many calls to | |
4598 @code{garbage_collect_1} are not obviously foreseeable, and therefore | |
4599 unexpected. Instances where they are used that are worth remembering are | |
4600 various elisp commands, as for example @code{or}, | |
4601 @code{and}, @code{if}, @code{cond}, @code{while}, @code{setq}, etc., | |
4602 miscellaneous @code{gui_item_...} functions, everything related to | |
4603 @code{eval} (@code{Feval_buffer}, @code{call0}, ...) and inside | |
4604 @code{Fsignal}. The latter is used to handle signals, as for example the | |
4605 ones raised by every @code{QUIT}-macro triggered after pressing Ctrl-g. | |
4606 | |
4607 @node garbage_collect_1 | |
4608 @subsection @code{garbage_collect_1} | |
4609 @cindex @code{garbage_collect_1} | |
4610 | |
4611 We can now describe exactly what happens after the invocation takes | |
4612 place. | |
4613 @enumerate | |
4614 @item | |
4615 There are several cases in which the garbage collector is left immediately: | |
4616 when we are already garbage collecting (@code{gc_in_progress}), when | |
4617 the garbage collection is somehow forbidden | |
4618 (@code{gc_currently_forbidden}), when we are currently displaying something | |
4619 (@code{in_display}) or when we are preparing for the armageddon of the | |
4620 whole system (@code{preparing_for_armageddon}). | |
4621 @item | |
4622 Next the correct frame in which to put | |
4623 all the output occurring during garbage collecting is determined. In | |
4624 order to be able to restore the old display's state after displaying the | |
4625 message, some data about the current cursor position has to be | |
4626 saved. The variables @code{pre_gc_cursor} and @code{cursor_changed} take | |
4627 care of that. | |
4628 @item | |
4629 The state of @code{gc_currently_forbidden} must be restored after | |
4630 the garbage collection, no matter what happens during the process. We | |
4631 accomplish this by @code{record_unwind_protect}ing the suitable function | |
4632 @code{restore_gc_inhibit} together with the current value of | |
4633 @code{gc_currently_forbidden}. | |
4634 @item | |
4635 If we are currently running an interactive xemacs session, the next step | |
4636 is simply to show the garbage collector's cursor/message. | |
4637 @item | |
4638 The following steps are the intrinsic steps of the garbage collector, | |
4639 therefore @code{gc_in_progress} is set. | |
4640 @item | |
4641 For debugging purposes, it is possible to copy the current C stack | |
4642 frame. However, this seems to be a currently unused feature. | |
4643 @item | |
4644 Before actually starting to go over all live objects, references to | |
4645 objects that are no longer used are pruned. We only have to do this for events | |
4646 (@code{clear_event_resource}) and for specifiers | |
4647 (@code{cleanup_specifiers}). | |
4648 @item | |
4649 Now the mark phase begins and marks all accessible elements. In order to | |
4650 start from | |
4651 all slots that serve as roots of accessibility, the function | |
4652 @code{mark_object} is called for each root individually to go out from | |
4653 there to mark all reachable objects. All roots that are traversed are | |
4654 shown in their processed order: | |
4655 @itemize @bullet | |
4656 @item | |
4657 all constant symbols and static variables that are registered via | |
4658 @code{staticpro}@ in the array @code{staticvec}. | |
4659 @xref{Adding Global Lisp Variables}. | |
4660 @item | |
4661 all Lisp objects that are created in C functions and that must be | |
4662 protected from being freed. They are registered in the global | |
4663 list @code{gcprolist}. | |
4664 @xref{GCPROing}. | |
4665 @item | |
4666 all local variables (i.e. their name fields @code{symbol} and old | |
4667 values @code{old_values}) that are bound during the evaluation by the Lisp | |
4668 engine. They are stored in @code{specbinding} structs pushed on a stack | |
4669 called @code{specpdl}. | |
4670 @xref{Dynamic Binding; The specbinding Stack; Unwind-Protects}. | |
4671 @item | |
4672 all catch blocks that the Lisp engine encounters during the evaluation | |
4673 cause the creation of structs @code{catchtag} inserted in the list | |
4674 @code{catchlist}. Their tag (@code{tag}) and value (@code{val}) fields | |
4675 are freshly created objects and therefore have to be marked. | |
4676 @xref{Catch and Throw}. | |
4677 @item | |
4678 every function application pushes new structs @code{backtrace} | |
4679 on the call stack of the Lisp engine (@code{backtrace_list}). The unique | |
4680 parts that have to be marked are the fields for each function | |
4681 (@code{function}) and all their arguments (@code{args}). | |
4682 @xref{Evaluation}. | |
4683 @item | |
4684 all objects that are used by the redisplay engine that must not be freed | |
4685 are marked by a special function called @code{mark_redisplay} (in | |
4686 @code{redisplay.c}). | |
4687 @item | |
4688 all objects created for profiling purposes are allocated by C functions | |
4689 instead of using the lisp allocation mechanisms. In order to receive the | |
4690 right ones during the sweep phase, they also have to be marked | |
4691 manually. That is done by the function @code{mark_profiling_info}. | |
4692 @end itemize | |
4693 @item | |
4694 Hash tables in XEmacs belong to a special kind of object that | |
4695 makes use of a concept often called 'weak pointers'. | |
4696 To make a long story short, this kind of pointer is not followed | |
4697 when the live objects are determined during garbage collection. | |
4698 Any object referenced only by weak pointers is collected | |
4699 anyway, and the reference to it is cleared. In hash tables there are | |
4700 different usage patterns of them, manifesting in different types of hash | |
4701 tables, namely 'non-weak', 'weak', 'key-weak' and 'value-weak' | |
4702 (internally also 'key-car-weak' and 'value-car-weak') hash tables, each | |
4703 clearing entries depending on different conditions. More information can | |
4704 be found in the documentation to the function @code{make-hash-table}. | |
4705 | |
4706 Because there are complicated dependency rules about when and what to | |
4707 mark while processing weak hash tables, the standard @code{marker} | |
4708 method is only active if it is marking non-weak hash tables. As soon as | |
4709 a weak component is in the table, the hash table entries are ignored | |
4710 while marking. Instead, they are marked separately by the | |
4711 function @code{finish_marking_weak_hash_tables}. This function iterates | |
4712 over each hash table entry @code{hentries} for each weak hash table in | |
4713 @code{Vall_weak_hash_tables}. Depending on the type of a table, the | |
4714 appropriate action is performed. | |
4715 If a table is acting as @code{HASH_TABLE_KEY_WEAK}, and a key is already marked, | |
4716 everything reachable from the @code{value} component is marked. If it is | |
4717 acting as a @code{HASH_TABLE_VALUE_WEAK} and the value component is | |
4718 already marked, the marking proceeds only from the | |
4719 @code{key} component. | |
4720 If it is a @code{HASH_TABLE_KEY_CAR_WEAK} and the car | |
4721 of the key entry is already marked, we mark both the @code{key} and | |
4722 @code{value} components. | |
4723 Finally, if the table is of the type @code{HASH_TABLE_VALUE_CAR_WEAK} | |
4724 and the car of the value component is already marked, again both the | |
4725 @code{key} and the @code{value} components get marked. | |
4726 | |
4727 There are also lists with comparable properties, called weak | |
4728 lists. They come in different types, called | |
4729 @code{simple}, @code{assoc}, @code{key-assoc} and | |
4730 @code{value-assoc}. You can find further details about them in the | |
4731 description to the function @code{make-weak-list}. The scheme of their | |
4732 marking is similar: all weak lists are listed in @code{Vall_weak_lists}, | |
4733 therefore we iterate over them. The marking is advanced until we hit an | |
4734 already marked pair. Then we know that during a former run all | |
4735 the rest has been marked completely. Again, depending on the special | |
4736 type of the weak list, our jobs differ. If it is a @code{WEAK_LIST_SIMPLE} | |
4737 and the elem is marked, we mark the @code{cons} part. If it is a | |
4738 @code{WEAK_LIST_ASSOC} and not a pair or a pair with both marked car and | |
4739 cdr, we mark the @code{cons} and the @code{elem}. If it is a | |
4740 @code{WEAK_LIST_KEY_ASSOC} and not a pair or a pair with a marked car of | |
4741 the elem, we mark the @code{cons} and the @code{elem}. Finally, if it is | |
4742 a @code{WEAK_LIST_VALUE_ASSOC} and not a pair or a pair with a marked | |
4743 cdr of the elem, we mark both the @code{cons} and the @code{elem}. | |
4744 | |
4745 Marking the objects reachable from weak hash tables and weak lists may | |
4746 mark further objects, which in turn may require further marking of | |
4747 other weak objects. Both finishing functions are therefore rerun as | |
4748 long as they mark previously unmarked objects. | |
4749 | |
4750 @item | |
4751 After completing the special marking for the weak hash tables and for the weak | |
4752 lists, all entries that point to objects that are going to be swept in | |
4753 the further process are useless, and therefore have to be removed from | |
4754 the table or the list. | |
4755 | |
4756 The function @code{prune_weak_hash_tables} does the job for weak hash | |
4757 tables. Totally unmarked hash tables are removed from the list | |
4758 @code{Vall_weak_hash_tables}. The other ones are treated more carefully | |
4759 by scanning over all entries and removing one as soon as one of | |
4760 the components @code{key} and @code{value} is unmarked. | |
4761 | |
4762 The same idea applies to the weak lists. It is accomplished by | |
4763 @code{prune_weak_lists}: An unmarked list is pruned from | |
4764 @code{Vall_weak_lists} immediately. A marked list is treated more | |
4765 carefully by going over it and removing just the unmarked pairs. | |
4766 | |
4767 @item | |
4768 The function @code{prune_specifiers} checks all listed specifiers held | |
4769 in @code{Vall_specifiers} and removes the ones from the lists that are | |
4770 unmarked. | |
4771 | |
4772 @item | |
4773 All syntax tables are stored in a list called | |
4774 @code{Vall_syntax_tables}. The function @code{prune_syntax_tables} walks | |
4775 through it and unlinks the tables that are unmarked. | |
4776 | |
4777 @item | |
4778 Next comes the actual sweeping, performed by the function | |
4779 @code{gc_sweep}, which does the bulk of the work. | |
4780 @item | |
4781 First, all the variables with respect to garbage collection are | |
4782 reset. @code{consing_since_gc} - the counter of the created cells since | |
4783 the last garbage collection - is set back to 0, and | |
4784 @code{gc_in_progress} is not @code{true} anymore. | |
4785 @item | |
4786 In case the session is interactive, the displayed cursor and message are | |
4787 removed again. | |
4788 @item | |
4789 The state of @code{gc_inhibit} is restored to the former value by | |
4790 unwinding the stack. | |
4791 @item | |
4792 A small memory reserve is always held back that can be reached by | |
4793 @code{breathing_space}. If nothing more is left, we create a new reserve | |
4794 and exit. | |
4795 @end enumerate | |
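
The repeated "finish marking" step for weak tables described above can be illustrated with a small, self-contained C sketch. It is not taken from @code{alloc.c}: the types @code{struct obj} and @code{struct entry} and all function names are hypothetical, and only the key-weak case is modelled. It shows why a single pass is not enough: marking one value can make another key live.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy object: one mark bit and one outgoing strong reference. */
struct obj {
    int marked;
    struct obj *ref;
};

/* One entry of a toy key-weak table. */
struct entry {
    struct obj *key;
    struct obj *value;
};

static int demo_passes;  /* number of passes the demo needed */

/* Mark o and everything reachable through ref; stops at marked objects. */
static void mark_from(struct obj *o)
{
    while (o != NULL && !o->marked) {
        o->marked = 1;
        o = o->ref;
    }
}

/* One pass over a key-weak table: a marked key keeps its value alive.
   Returns nonzero if anything new was marked. */
static int finish_marking_pass(struct entry *e, size_t n)
{
    int did_mark = 0;
    for (size_t i = 0; i < n; i++) {
        if (e[i].key->marked && !e[i].value->marked) {
            mark_from(e[i].value);
            did_mark = 1;
        }
    }
    return did_mark;
}

/* Drop entries whose key or value stayed unmarked; returns the new count. */
static size_t prune_entries(struct entry *e, size_t n)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++)
        if (e[i].key->marked && e[i].value->marked)
            e[kept++] = e[i];
    return kept;
}

/* Demo: k1 is a root and v1 refers to k2, so the entry (k2, v2) only
   becomes live in a second pass. (k3, v3) is never reached and is pruned. */
static size_t demo_weak_table(void)
{
    struct obj k1 = {0, NULL}, v1 = {0, NULL}, k2 = {0, NULL};
    struct obj v2 = {0, NULL}, k3 = {0, NULL}, v3 = {0, NULL};
    struct entry table[] = { {&k2, &v2}, {&k1, &v1}, {&k3, &v3} };

    v1.ref = &k2;
    mark_from(&k1);              /* ordinary root marking */
    demo_passes = 0;
    do {
        demo_passes++;
    } while (finish_marking_pass(table, 3));
    return prune_entries(table, 3);
}
```

In this toy setup the loop stops on the first pass that marks nothing new, and pruning afterwards removes only the entry that was never reached.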
4796 | |
4797 @node mark_object | |
4798 @subsection @code{mark_object} | |
4799 @cindex @code{mark_object} | |
4800 | |
4801 The first thing that is checked while marking an object is whether the | |
4802 object is a real Lisp object @code{Lisp_Type_Record} or just an integer | |
4803 or a character. Integers and characters are the only two types that are | |
4804 stored directly - without another level of indirection, and therefore they | |
4805 don't have to be marked and collected. | |
4806 @xref{How Lisp Objects Are Represented in C}. | |
4807 | |
4808 The second case is the one we have to handle: we are | |
4809 dealing with a pointer to a Lisp object. Even then, there are three | |
4810 possibilities that prevent us from doing anything while marking. The | |
4811 object may be read only, which prevents it from being garbage collected, | |
4812 i.e. marked (@code{C_READONLY_RECORD_HEADER}). The object in question may | |
4813 already be marked, and need not be marked a second time (checked by | |
4814 @code{MARKED_RECORD_HEADER_P}). Or it may be a special, unmarkable object | |
4815 (@code{UNMARKABLE_RECORD_HEADER_P}; apparently, these are objects that | |
4816 sit in some CONST space and therefore cannot be marked, see | |
4817 @code{this_one_is_unmarkable} in @code{alloc.c}). | |
4818 | |
4819 Now, the actual marking is feasible. We do so by using the macro | |
4820 @code{MARK_RECORD_HEADER} to mark the object itself (actually the | |
4821 special flag in the lrecord header), and then calling its special marker | |
4822 "method" @code{marker} if available. The marker method marks every | |
4823 other object that is in reach from our current object. Note that these | |
4824 marker methods should not call @code{mark_object} recursively, but | |
4825 instead should return the next object from where further marking has to | |
4826 be performed. | |
4827 | |
4828 In case another object was returned, as mentioned before, we reiterate | |
4829 the whole @code{mark_object} process beginning with this next object. | |
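
The control flow described in this subsection can be sketched in C as follows. This is a simplified illustration only: @code{struct obj}, its fields and the function names are hypothetical stand-ins for the real lrecord header and its macros, and the unmarkable case is left out. The point is the loop: the marker hands back the next object, so a long chain is walked iteratively instead of recursively.

```c
#include <assert.h>
#include <stddef.h>

struct obj;
typedef struct obj *(*marker_fn)(struct obj *);

/* Hypothetical, simplified object; not the real lrecord layout. */
struct obj {
    int is_immediate;   /* integer or character: nothing to mark */
    int readonly;       /* stands in for the read-only check */
    int marked;         /* stands in for the mark flag in the header */
    marker_fn marker;   /* optional; returns the next object or NULL */
    struct obj *next;   /* the single reference this toy object holds */
};

/* Marker for the toy object: do not recurse, return what to mark next. */
static struct obj *mark_node(struct obj *o)
{
    return o->next;
}

static void mark_object_sketch(struct obj *o)
{
    while (o != NULL) {
        if (o->is_immediate)
            return;                 /* stored directly, never collected */
        if (o->readonly || o->marked)
            return;                 /* nothing to do for these */
        o->marked = 1;              /* mark the object itself */
        if (o->marker == NULL)
            return;                 /* no further references */
        o = o->marker(o);           /* continue with the returned object */
    }
}

/* Demo: a -> b -> c -> a is a cycle; r is read only and stays unmarked. */
static int demo_mark_chain(void)
{
    struct obj c = {0, 0, 0, mark_node, NULL};
    struct obj b = {0, 0, 0, mark_node, &c};
    struct obj a = {0, 0, 0, mark_node, &b};
    struct obj r = {0, 1, 0, mark_node, NULL};

    c.next = &a;                    /* close the cycle */
    mark_object_sketch(&a);         /* terminates when it meets a again */
    mark_object_sketch(&r);
    return a.marked + b.marked + c.marked + r.marked;
}
```

Because an already marked object ends the loop, cycles are harmless, and the C stack does not grow with the length of the chain.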
4830 | |
4831 @node gc_sweep | |
4832 @subsection @code{gc_sweep} | |
4833 @cindex @code{gc_sweep} | |
4834 | |
4835 The job of this function is to free all unmarked records from memory. As | |
4836 we know, there are different types of objects implemented and managed, and | |
4837 consequently different ways to free them from memory. | |
4838 @xref{Introduction to Allocation}. | |
4839 | |
4840 We start with all objects stored through @code{lcrecords}. All | |
4841 bulkier objects are allocated and handled using that scheme of | |
4842 @code{lcrecords}. Each object is @code{malloc}ed separately | |
4843 instead of placing it in one of the contiguous frob blocks. The types | |
4844 that are currently stored | |
4845 using @code{lcrecords}'s @code{alloc_lcrecord} and | |
4846 @code{make_lcrecord_list} are: vectors, buffers, | |
4847 char-table, char-table-entry, console, weak-list, database, device, | |
4848 ldap, hash-table, command-builder, extent-auxiliary, extent-info, face, | |
4849 coding-system, frame, image-instance, glyph, popup-data, gui-item, | |
4850 keymap, charset, color_instance, font_instance, opaque, opaque-list, | |
4851 process, range-table, specifier, symbol-value-buffer-local, | |
4852 symbol-value-lisp-magic, symbol-value-varalias, toolbar-button, | |
4853 tooltalk-message, tooltalk-pattern, window, and window-configuration. We | |
4854 take care of them in the first place | |
4855 in order to be able to handle and to finalize items stored in them more | |
4856 easily. The function @code{sweep_lcrecords_1} as described below is | |
4857 doing the whole job for us. | |
4858 For a description about the internals: @xref{lrecords}. | |
4859 | |
4860 Our next candidates are the other objects that behave quite differently | |
4861 than everything else: the strings. They consist of two parts, a | |
4862 fixed-size portion (@code{struct Lisp_string}) holding the string's | |
4863 length, its property list and a pointer to the second part, and the | |
4864 actual string data, which is stored in string-chars blocks comparable to | |
4865 frob blocks. In this block, the data is not only freed, but also a | |
4866 compression of holes is made, i.e. all strings are relocated together. | |
4867 @xref{String}. This compacting phase is performed by the function | |
4868 @code{compact_string_chars}; the actual sweeping, done by the function | |
4869 @code{sweep_strings}, is described below. | |
4870 | |
4871 After that, the other types are swept step by step using functions | |
4872 @code{sweep_conses}, @code{sweep_bit_vectors_1}, | |
4873 @code{sweep_compiled_functions}, @code{sweep_floats}, | |
4874 @code{sweep_symbols}, @code{sweep_extents}, @code{sweep_markers} and | |
4875 @code{sweep_events}. They are the fixed-size types cons, floats, | |
4876 compiled-functions, symbol, marker, extent, and event stored in | |
4877 so-called "frob blocks", and therefore we can basically do the same on | |
4878 every type of object, using the same macros, defined especially to | |
4879 handle everything with respect to fixed-size blocks. The only fixed-size | |
4880 type that is not handled here is the fixed-size portion of strings, | |
4881 because we took special care of them earlier. | |
4882 | |
4883 The only big exception is bit vectors, which are stored differently and | |
4884 therefore treated differently by the function @code{sweep_bit_vectors_1} | |
4885 described later. | |
4886 | |
4887 At first, we need some brief information about how | |
4888 these fixed-size types are managed in general, in order to understand | |
4889 how the sweeping is done. They all have a fixed size, and are therefore | |
4890 stored in big blocks of memory - allocated at once - that can hold a | |
4891 certain amount of objects of one type. The macro | |
4892 @code{DECLARE_FIXED_TYPE_ALLOC} creates the suitable structures for | |
4893 every type. More precisely, we have the block struct | |
4894 (holding a pointer to the previous block @code{prev} and the | |
4895 objects in @code{block[]}), a pointer to the current block | |
4896 (@code{current_..._block}) and its last index | |
4897 (@code{current_..._block_index}), and a pointer to the free list that | |
4898 will be created. Also, a macro @code{FIXED_TYPE_FROM_BLOCK} plus some | |
4899 related macros exist that are used to obtain a new object, either from | |
4900 the free list @code{ALLOCATE_FIXED_TYPE_1} if there is an unused object | |
4901 of that type stored or by allocating a completely new block using | |
4902 @code{ALLOCATE_FIXED_TYPE_FROM_BLOCK}. | |
4903 | |
4904 The rest works as follows: all of them define a | |
4905 macro @code{UNMARK_...} that is used to unmark the object. They define a | |
4906 macro @code{ADDITIONAL_FREE_...} that defines additional work that has | |
4907 to be done when converting an object from in use to not in use (so far, | |
4908 only markers use it in order to unchain them). Then, they all call | |
4909 the macro @code{SWEEP_FIXED_TYPE_BLOCK} instantiated with their type name | |
4910 and their struct name. | |
4911 | |
4912 This call in particular does the following: we go over all blocks | |
4913 starting with the current moving towards the oldest. | |
4914 For each block, we look at every object in it. If the object is already | |
4915 freed (checked with @code{FREE_STRUCT_P} using the first pointer of the | |
4916 object), or if it is | |
4917 set to read only (@code{C_READONLY_RECORD_HEADER_P}), nothing must be | |
4918 done. If it is unmarked (checked with @code{MARKED_RECORD_HEADER_P}), it | |
4919 is put in the free list and set free (using the macro | |
4920 @code{FREE_FIXED_TYPE}); otherwise it stays in the block, but is unmarked | |
4921 (by @code{UNMARK_...}). While going through one block, we note if the | |
4922 whole block is empty. If so, the whole block is freed (using | |
4923 @code{xfree}) and the free list state is set to the state it had before | |
4924 handling this block. | |
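
The block-by-block sweep can be sketched in plain C as shown below. This is an illustration under stated assumptions, not the @code{SWEEP_FIXED_TYPE_BLOCK} macro itself: @code{struct cell}, @code{struct block}, @code{struct pool} and @code{sweep_pool} are hypothetical names, and a block holds only four objects. One deliberate simplification: the sketch rebuilds the free list from scratch and re-threads cells that were already free, so that freeing a whole block can never leave stale entries on the list.

```c
#include <assert.h>
#include <stdlib.h>

#define OBJS_PER_BLOCK 4

/* Hypothetical fixed-size object with the three flags the sweep tests. */
struct cell {
    int free;                /* already on a free list */
    int readonly;
    int marked;
    struct cell *next_free;
};

struct block {
    struct block *prev;      /* next older block */
    struct cell cells[OBJS_PER_BLOCK];
};

struct pool {
    struct block *current;   /* newest block */
    int current_index;       /* cells handed out from the newest block */
    struct cell *free_list;
};

static void sweep_pool(struct pool *p)
{
    struct block **link = &p->current;
    struct block *b = p->current;
    int limit = p->current_index;

    p->free_list = NULL;                     /* rebuilt during the sweep */
    while (b != NULL) {
        struct cell *saved_free = p->free_list;
        struct block *prev = b->prev;
        int empty = 1;

        for (int i = 0; i < limit; i++) {
            struct cell *c = &b->cells[i];
            if (c->free || (!c->readonly && !c->marked)) {
                c->free = 1;                 /* garbage: onto the free list */
                c->next_free = p->free_list;
                p->free_list = c;
            } else {
                empty = 0;                   /* survivor */
                if (!c->readonly)
                    c->marked = 0;           /* unmark for the next GC */
            }
        }
        if (empty && !(b == p->current && prev == NULL)) {
            if (b == p->current)
                p->current_index = OBJS_PER_BLOCK; /* new current is full */
            *link = prev;                    /* unlink the empty block */
            free(b);
            p->free_list = saved_free;       /* forget cells of that block */
        } else {
            link = &b->prev;
        }
        b = prev;
        limit = OBJS_PER_BLOCK;              /* older blocks are full */
    }
}

/* Demo: the newest block has one survivor, one garbage cell and one
   read-only cell; the older block is all garbage and gets freed. */
static int demo_sweep(void)
{
    struct block *old = calloc(1, sizeof *old);
    struct block *cur = calloc(1, sizeof *cur);
    struct pool p;
    int free_len = 0, ok;

    if (old == NULL || cur == NULL) {
        free(old);
        free(cur);
        return -1;
    }
    cur->prev = old;
    cur->cells[0].marked = 1;
    cur->cells[2].readonly = 1;
    p.current = cur;
    p.current_index = 3;
    p.free_list = NULL;

    sweep_pool(&p);
    for (struct cell *c = p.free_list; c != NULL; c = c->next_free)
        free_len++;
    ok = p.current == cur && cur->prev == NULL
         && cur->cells[0].marked == 0 && cur->cells[1].free;
    free(cur);
    return ok ? free_len : -1;
}
```

Restoring the saved free-list head after freeing a block is what keeps the list from pointing into released memory; the sole remaining block is kept even when it is empty.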
4925 | |
4926 @node sweep_lcrecords_1 | |
4927 @subsection @code{sweep_lcrecords_1} | |
4928 @cindex @code{sweep_lcrecords_1} | |
4929 | |
4930 After nullifying the complete lcrecord statistics, we go over all | |
4931 lcrecords two separate times. They are all chained together in a list with | |
4932 a head called @code{all_lcrecords}. | |
4933 | |
4934 The first loop calls for each object its @code{finalizer} method, but only | |
4935 in the case that it is not read only | |
4936 (@code{C_READONLY_RECORD_HEADER_P}), it is not marked | |
4937 (@code{MARKED_RECORD_HEADER_P}), it is not already in a free list (list of | |
4938 freed objects, field @code{free}) and finally it owns a finalizer | |
4939 method. | |
4940 | |
4941 The second loop actually frees the appropriate objects again by iterating | |
4942 through the whole list. In case an object is read only or marked, it | |
4943 has to persist, otherwise it is manually freed by calling | |
4944 @code{xfree}. During this loop, the lcrecord statistics are kept up to | |
4945 date by calling @code{tick_lcrecord_stats} with the right arguments. | |
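
The two loops can be sketched as follows. The code is a minimal illustration with hypothetical names (@code{struct rec}, @code{sweep_records}); statistics are omitted, and clearing the mark of survivors is an assumption that the prose above does not spell out. Running every finalizer before freeing anything presumably ensures that no finalizer ever looks at a record that has already been released.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical individually malloc'ed record, chained in one list. */
struct rec {
    struct rec *next;
    int readonly;
    int marked;
    int on_free_list;
    void (*finalizer)(struct rec *);
};

static int finalized_count;

static void count_finalizer(struct rec *r)
{
    (void)r;
    finalized_count++;
}

static void sweep_records(struct rec **head)
{
    /* First loop: finalize everything that is about to die. */
    for (struct rec *r = *head; r != NULL; r = r->next)
        if (!r->readonly && !r->marked && !r->on_free_list && r->finalizer)
            r->finalizer(r);

    /* Second loop: unlink and free the dead records. */
    struct rec **link = head;
    while (*link != NULL) {
        struct rec *r = *link;
        if (r->readonly || r->marked) {
            r->marked = 0;      /* assumption: unmark for the next GC */
            link = &r->next;
        } else {
            *link = r->next;
            free(r);
        }
    }
}

/* Demo helper: push a new record onto the front of the list. */
static struct rec *push(struct rec *head, int marked,
                        void (*fin)(struct rec *))
{
    struct rec *r = calloc(1, sizeof *r);
    if (r == NULL)
        abort();
    r->marked = marked;
    r->finalizer = fin;
    r->next = head;
    return r;
}

/* Returns 10 * (number of survivors) + (number of finalizer calls). */
static int demo_sweep_records(void)
{
    struct rec *head = NULL;
    int live = 0;

    finalized_count = 0;
    head = push(head, 1, count_finalizer);  /* marked: survives */
    head = push(head, 0, count_finalizer);  /* garbage with finalizer */
    head = push(head, 0, NULL);             /* garbage without finalizer */
    sweep_records(&head);
    for (struct rec *r = head; r != NULL; r = r->next)
        live++;
    while (head != NULL) {
        struct rec *n = head->next;
        free(head);
        head = n;
    }
    return live * 10 + finalized_count;
}
```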
4946 | |
4947 @node compact_string_chars | |
4948 @subsection @code{compact_string_chars} | |
4949 @cindex @code{compact_string_chars} | |
4950 | |
4951 The purpose of this function is to compact all the data parts of the | |
4952 strings that are held in so-called @code{string_chars_block}, i.e. the | |
4953 strings that do not exceed a certain maximal length. | |
4954 | |
4955 The procedure with which this is done is as follows. We are keeping two | |
4956 positions in the @code{string_chars_block}s using two pointer/integer | |
4957 pairs, namely @code{from_sb}/@code{from_pos} and | |
4958 @code{to_sb}/@code{to_pos}. They stand for the current positions from | |
4959 which and to which the string currently being handled is copied. | |
4960 | |
4961 While going over all chained @code{string_chars_block}s and the strings | |
4962 they hold, starting at @code{first_string_chars_block}, both pointers | |
4963 are advanced and eventually a string is copied from @code{from_sb} to | |
4964 @code{to_sb}, depending on the status of the pointed-at strings. | |
4965 | |
4966 More precisely, we can distinguish between the following actions. | |
4967 @itemize @bullet | |
4968 @item | |
4969 The string at @code{from_sb}'s position could be marked as free, which | |
4970 is indicated by an invalid value in the pointer that should point back | |
4971 to the fixed size string object, and which is checked by | |
4972 @code{FREE_STRUCT_P}. In this case, the @code{from_sb}/@code{from_pos} | |
4973 is advanced to the next string, and nothing has to be copied. | |
4974 @item | |
4975 Also, if a string object itself is unmarked, nothing has to be | |
4976 copied. We likewise advance the @code{from_sb}/@code{from_pos} | |
4977 pair as described above. | |
4978 @item | |
4979 In all other cases, we have a marked string at hand. The string data | |
4980 must be moved from the from-position to the to-position. In case | |
4981 there is not enough space in the current @code{to_sb}-block, we advance | |
4982 this pointer to the beginning of the next block before copying. In case the | |
4983 from and to positions are different, we perform the | |
4984 actual copying using the library function @code{memmove}. | |
4985 @end itemize | |
4986 | |
4987 After compacting, the pointer to the current | |
4988 @code{string_chars_block}, sitting in @code{current_string_chars_block}, | |
4989 is reset to the last block to which we moved a string, | |
4990 i.e. @code{to_sb}, and all remaining blocks (we know that they just | |
4991 carry garbage) are explicitly @code{xfree}d. | |
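
The compaction scheme can be sketched in C as below. It is a toy model, not the real @code{compact_string_chars}: the names @code{struct str}, @code{struct entry_hdr}, @code{struct sblock} and @code{compact_sketch} are hypothetical, a block is only 64 bytes, and a null owner pointer stands in for the "free" check. Each entry is a small header followed by the string bytes; headers are read with @code{memcpy} to avoid alignment assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_BYTES 64

/* Hypothetical fixed-size string object pointing at its data. */
struct str {
    int marked;
    unsigned char *data;
};

/* Header stored in front of each string's bytes; owner == NULL means free. */
struct entry_hdr {
    struct str *owner;
    size_t len;
};

struct sblock {
    struct sblock *next;
    size_t used;
    unsigned char bytes[BLOCK_BYTES];
};

/* Slide all live strings towards the first block; returns the last block
   that still holds data. Blocks after it are freed. */
static struct sblock *compact_sketch(struct sblock *first)
{
    struct sblock *to_sb = first;
    size_t to_pos = 0;

    for (struct sblock *from_sb = first; from_sb; from_sb = from_sb->next) {
        size_t from_pos = 0, from_used = from_sb->used;
        while (from_pos < from_used) {
            struct entry_hdr h;
            memcpy(&h, from_sb->bytes + from_pos, sizeof h);
            size_t total = sizeof h + h.len;

            if (h.owner != NULL && h.owner->marked) {
                if (to_pos + total > BLOCK_BYTES) {  /* target block full */
                    to_sb->used = to_pos;
                    to_sb = to_sb->next;
                    to_pos = 0;
                }
                if (to_sb != from_sb || to_pos != from_pos)
                    memmove(to_sb->bytes + to_pos,
                            from_sb->bytes + from_pos, total);
                h.owner->data = to_sb->bytes + to_pos + sizeof h;
                to_pos += total;
            }
            from_pos += total;       /* free or unmarked: just skip */
        }
    }
    to_sb->used = to_pos;

    struct sblock *rest = to_sb->next;   /* only garbage from here on */
    to_sb->next = NULL;
    while (rest != NULL) {
        struct sblock *n = rest->next;
        free(rest);
        rest = n;
    }
    return to_sb;
}

/* Demo helper: append one string entry to a block. */
static void append(struct sblock *b, struct str *owner,
                   const char *text, size_t len)
{
    struct entry_hdr h = { owner, len };
    memcpy(b->bytes + b->used, &h, sizeof h);
    memcpy(b->bytes + b->used + sizeof h, text, len);
    owner->data = b->bytes + b->used + sizeof h;
    b->used += sizeof h + len;
}

/* Demo: A (live) and B (dead) sit in block 1, C (live) in block 2.
   After compaction C follows A in block 1 and block 2 is freed. */
static int demo_compact(void)
{
    struct sblock *b1 = calloc(1, sizeof *b1);
    struct sblock *b2 = calloc(1, sizeof *b2);
    struct str a = {1, NULL}, b = {0, NULL}, c = {1, NULL};
    size_t entry = sizeof(struct entry_hdr) + 8;

    if (b1 == NULL || b2 == NULL) {
        free(b1);
        free(b2);
        return 0;
    }
    b1->next = b2;
    append(b1, &a, "AAAAAAAA", 8);
    append(b1, &b, "BBBBBBBB", 8);
    append(b2, &c, "CCCCCCCC", 8);

    struct sblock *last = compact_sketch(b1);
    int ok = last == b1 && b1->next == NULL && b1->used == 2 * entry
             && memcmp(a.data, "AAAAAAAA", 8) == 0
             && memcmp(c.data, "CCCCCCCC", 8) == 0
             && c.data == b1->bytes + entry + sizeof(struct entry_hdr);
    free(b1);
    return ok;
}
```

The target position never overtakes the source position, which is why copying downwards with @code{memmove} is safe even when both lie in the same block.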
4992 | |
4993 @node sweep_strings | |
4994 @subsection @code{sweep_strings} | |
4995 @cindex @code{sweep_strings} | |
4996 | |
4997 The sweeping for the fixed sized string objects is essentially exactly | |
4998 the same as it is for all other fixed size types. As before, the freeing | |
4999 into the suitable free list is done by using the macro | |
5000 @code{SWEEP_FIXED_TYPE_BLOCK} after defining the right macros | |
5001 @code{UNMARK_string} and @code{ADDITIONAL_FREE_string}. These two | |
5002 definitions are a little bit special compared to the ones used | |
5003 for the other fixed size types. | |
5004 | |
5005 @code{UNMARK_string} is defined the same way except for some additional code | |
5006 used for updating the bookkeeping information. | |
5007 | |
5008 For strings, @code{ADDITIONAL_FREE_string} has to do something in | |
5009 addition: in case the string was not allocated in a | |
5010 @code{string_chars_block} because it exceeded the maximal length, and | |
5011 therefore was @code{malloc}ed separately, we now also @code{xfree} | |
5012 it explicitly. | |
5013 | |
5014 @node sweep_bit_vectors_1 | |
5015 @subsection @code{sweep_bit_vectors_1} | |
5016 @cindex @code{sweep_bit_vectors_1} | |
5017 | |
5018 Bit vectors are also one of the rare types that are @code{malloc}ed | |
5019 individually. Consequently, while sweeping, all bit vectors that are no | |
5020 longer needed must be freed by hand. This is done, as one might imagine, | |
5021 the expected way: since they are all registered in a list called | |
5022 @code{all_bit_vectors}, all elements of that list are traversed, | |
5023 all unmarked bit vectors are unlinked and freed by calling @code{xfree}, and | |
5024 all remaining ones become unmarked. | |
5025 In addition, the bookkeeping information used for garbage | |
5026 collector's output purposes is updated. | |
5027 | 4167 |
5028 @node Integers and Characters | 4168 @node Integers and Characters |
5029 @section Integers and Characters | 4169 @section Integers and Characters |
5030 | 4170 |
5031 Integer and character Lisp objects are created from integers using the | 4171 Integer and character Lisp objects are created from integers using the |
6558 Many are accessible indirectly in Lisp programs via Lisp primitives. | 5698 Many are accessible indirectly in Lisp programs via Lisp primitives. |
6559 | 5699 |
6560 @table @code | 5700 @table @code |
6561 @item name | 5701 @item name |
6562 The buffer name is a string that names the buffer. It is guaranteed to | 5702 The buffer name is a string that names the buffer. It is guaranteed to |
6563 be unique. @xref{Buffer Names,,, lispref, XEmacs Lisp Reference | 5703 be unique. @xref{Buffer Names,,, lispref, XEmacs Lisp Programmer's |
6564 Manual}. | 5704 Manual}. |
6565 | 5705 |
6566 @item save_modified | 5706 @item save_modified |
6567 This field contains the time when the buffer was last saved, as an | 5707 This field contains the time when the buffer was last saved, as an |
6568 integer. @xref{Buffer Modification,,, lispref, XEmacs Lisp Reference | 5708 integer. @xref{Buffer Modification,,, lispref, XEmacs Lisp Programmer's |
6569 Manual}. | 5709 Manual}. |
6570 | 5710 |
6571 @item modtime | 5711 @item modtime |
6572 This field contains the modification time of the visited file. It is | 5712 This field contains the modification time of the visited file. It is |
6573 set when the file is written or read. Every time the buffer is written | 5713 set when the file is written or read. Every time the buffer is written |
6574 to the file, this field is compared to the modification time of the | 5714 to the file, this field is compared to the modification time of the |
6575 file. @xref{Buffer Modification,,, lispref, XEmacs Lisp Reference | 5715 file. @xref{Buffer Modification,,, lispref, XEmacs Lisp Programmer's |
6576 Manual}. | 5716 Manual}. |
6577 | 5717 |
6578 @item auto_save_modified | 5718 @item auto_save_modified |
6579 This field contains the time when the buffer was last auto-saved. | 5719 This field contains the time when the buffer was last auto-saved. |
6580 | 5720 |
6582 This field contains the @code{window-start} position in the buffer as of | 5722 This field contains the @code{window-start} position in the buffer as of |
6583 the last time the buffer was displayed in a window. | 5723 the last time the buffer was displayed in a window. |
6584 | 5724 |
6585 @item undo_list | 5725 @item undo_list |
6586 This field points to the buffer's undo list. @xref{Undo,,, lispref, | 5726 This field points to the buffer's undo list. @xref{Undo,,, lispref, |
6587 XEmacs Lisp Reference Manual}. | 5727 XEmacs Lisp Programmer's Manual}. |
6588 | 5728 |
6589 @item syntax_table_v | 5729 @item syntax_table_v |
6590 This field contains the syntax table for the buffer. @xref{Syntax | 5730 This field contains the syntax table for the buffer. @xref{Syntax |
6591 Tables,,, lispref, XEmacs Lisp Reference Manual}. | 5731 Tables,,, lispref, XEmacs Lisp Programmer's Manual}. |
6592 | 5732 |
6593 @item downcase_table | 5733 @item downcase_table |
6594 This field contains the conversion table for converting text to lower | 5734 This field contains the conversion table for converting text to lower |
6595 case. @xref{Case Tables,,, lispref, XEmacs Lisp Reference Manual}. | 5735 case. @xref{Case Tables,,, lispref, XEmacs Lisp Programmer's Manual}. |
6596 | 5736 |
6597 @item upcase_table | 5737 @item upcase_table |
6598 This field contains the conversion table for converting text to upper | 5738 This field contains the conversion table for converting text to upper |
6599 case. @xref{Case Tables,,, lispref, XEmacs Lisp Reference Manual}. | 5739 case. @xref{Case Tables,,, lispref, XEmacs Lisp Programmer's Manual}. |
6600 | 5740 |
6601 @item case_canon_table | 5741 @item case_canon_table |
6602 This field contains the conversion table for canonicalizing text for | 5742 This field contains the conversion table for canonicalizing text for |
6603 case-folding search. @xref{Case Tables,,, lispref, XEmacs Lisp | 5743 case-folding search. @xref{Case Tables,,, lispref, XEmacs Lisp |
6604 Reference Manual}. | 5744 Programmer's Manual}. |
6605 | 5745 |
6606 @item case_eqv_table | 5746 @item case_eqv_table |
6607 This field contains the equivalence table for case-folding search. | 5747 This field contains the equivalence table for case-folding search. |
6608 @xref{Case Tables,,, lispref, XEmacs Lisp Reference Manual}. | 5748 @xref{Case Tables,,, lispref, XEmacs Lisp Programmer's Manual}. |
6609 | 5749 |
6610 @item display_table | 5750 @item display_table |
6611 This field contains the buffer's display table, or @code{nil} if it | 5751 This field contains the buffer's display table, or @code{nil} if it |
6612 doesn't have one. @xref{Display Tables,,, lispref, XEmacs Lisp | 5752 doesn't have one. @xref{Display Tables,,, lispref, XEmacs Lisp |
6613 Reference Manual}. | 5753 Programmer's Manual}. |
6614 | 5754 |
6615 @item markers | 5755 @item markers |
6616 This field contains the chain of all markers that currently point into | 5756 This field contains the chain of all markers that currently point into |
6617 the buffer. Deletion of text in the buffer, and motion of the buffer's | 5757 the buffer. Deletion of text in the buffer, and motion of the buffer's |
6618 gap, must check each of these markers and perhaps update it. | 5758 gap, must check each of these markers and perhaps update it. |
6619 @xref{Markers,,, lispref, XEmacs Lisp Reference Manual}. | 5759 @xref{Markers,,, lispref, XEmacs Lisp Programmer's Manual}. |
6620 | 5760 |
6621 @item backed_up | 5761 @item backed_up |
6622 This field is a flag that tells whether a backup file has been made for | 5762 This field is a flag that tells whether a backup file has been made for |
6623 the visited file of this buffer. | 5763 the visited file of this buffer. |
6624 | 5764 |
6625 @item mark | 5765 @item mark |
6626 This field contains the mark for the buffer. The mark is a marker, | 5766 This field contains the mark for the buffer. The mark is a marker, |
6627 hence it is also included on the list @code{markers}. @xref{The Mark,,, | 5767 hence it is also included on the list @code{markers}. @xref{The Mark,,, |
6628 lispref, XEmacs Lisp Reference Manual}. | 5768 lispref, XEmacs Lisp Programmer's Manual}. |
6629 | 5769 |
6630 @item mark_active | 5770 @item mark_active |
6631 This field is non-@code{nil} if the buffer's mark is active. | 5771 This field is non-@code{nil} if the buffer's mark is active. |
6632 | 5772 |
6633 @item local_var_alist | 5773 @item local_var_alist |
6634 This field contains the association list describing the variables local | 5774 This field contains the association list describing the variables local |
6635 in this buffer, and their values, with the exception of local variables | 5775 in this buffer, and their values, with the exception of local variables |
6636 that have special slots in the buffer object. (Those slots are omitted | 5776 that have special slots in the buffer object. (Those slots are omitted |
6637 from this table.) @xref{Buffer-Local Variables,,, lispref, XEmacs Lisp | 5777 from this table.) @xref{Buffer-Local Variables,,, lispref, XEmacs Lisp |
6638 Reference Manual}. | 5778 Programmer's Manual}. |
6639 | 5779 |
6640 @item modeline_format | 5780 @item modeline_format |
6641 This field contains a Lisp object which controls how to display the mode | 5781 This field contains a Lisp object which controls how to display the mode |
6642 line for this buffer. @xref{Modeline Format,,, lispref, XEmacs Lisp | 5782 line for this buffer. @xref{Modeline Format,,, lispref, XEmacs Lisp |
6643 Reference Manual}. | 5783 Programmer's Manual}. |
6644 | 5784 |
6645 @item base_buffer | 5785 @item base_buffer |
6646 This field holds the buffer's base buffer (if it is an indirect buffer), | 5786 This field holds the buffer's base buffer (if it is an indirect buffer), |
6647 or @code{nil}. | 5787 or @code{nil}. |
6648 @end table | 5788 @end table |
7017 this is the code executed to handle any stuff that needs to be done | 6157 this is the code executed to handle any stuff that needs to be done |
7018 (e.g. designating back to ASCII and left-to-right mode) after all | 6158 (e.g. designating back to ASCII and left-to-right mode) after all |
7019 other encoded/decoded data has been written out. This is not used for | 6159 other encoded/decoded data has been written out. This is not used for |
7020 charset CCL programs. | 6160 charset CCL programs. |
7021 | 6161 |
7022 REGISTER: 0..7 -- referred by RRR or rrr | 6162 REGISTER: 0..7 -- refered by RRR or rrr |
7023 | 6163 |
7024 OPERATOR BIT FIELD (27-bit): XXXXXXXXXXXXXXX RRR TTTTT | 6164 OPERATOR BIT FIELD (27-bit): XXXXXXXXXXXXXXX RRR TTTTT |
7025 TTTTT (5-bit): operator type | 6165 TTTTT (5-bit): operator type |
7026 RRR (3-bit): register number | 6166 RRR (3-bit): register number |
7027 XXXXXXXXXXXXXXXX (15-bit): | 6167 XXXXXXXXXXXXXXXX (15-bit): |
7403 There is a separate Lisp object type for each of these four concepts. | 6543 There is a separate Lisp object type for each of these four concepts. |
7404 Furthermore, there is logically a @dfn{selected console}, | 6544 Furthermore, there is logically a @dfn{selected console}, |
7405 @dfn{selected display}, @dfn{selected frame}, and @dfn{selected window}. | 6545 @dfn{selected display}, @dfn{selected frame}, and @dfn{selected window}. |
7406 Each of these objects is distinguished in various ways, such as being the | 6546 Each of these objects is distinguished in various ways, such as being the |
7407 default object for various functions that act on objects of that type. | 6547 default object for various functions that act on objects of that type. |
7408 Note that every containing object remembers the ``selected'' object | 6548 Note that every containing object rememembers the ``selected'' object |
7409 among the objects that it contains: e.g. not only is there a selected | 6549 among the objects that it contains: e.g. not only is there a selected |
7410 window, but every frame remembers the last window in it that was | 6550 window, but every frame remembers the last window in it that was |
7411 selected, and changing the selected frame causes the remembered window | 6551 selected, and changing the selected frame causes the remembered window |
7412 within it to become the selected window. Similar relationships apply | 6552 within it to become the selected window. Similar relationships apply |
7413 for consoles to devices and devices to frames. | 6553 for consoles to devices and devices to frames. |