Mercurial > hg > xemacs-beta
annotate man/lispref/hash-tables.texi @ 4921:17362f371cc2
add more byte-code assertions and better failure output
-------------------- ChangeLog entries follow: --------------------
src/ChangeLog addition:
2010-02-03 Ben Wing <ben@xemacs.org>
* alloc.c (Fmake_byte_code):
* bytecode.h:
* lisp.h:
* lread.c:
* lread.c (readevalloop):
* lread.c (Fread):
* lread.c (Fread_from_string):
* lread.c (read_list_conser):
* lread.c (read_list):
* lread.c (vars_of_lread):
* symbols.c:
* symbols.c (Fdefine_function):
Turn on the "compiled-function annotation hack". Implement it
properly by hooking into Fdefalias(). Note in the docstring to
`defalias' that we do this. Remove some old broken code and
change code that implemented the old kludgy way of hooking into
the Lisp reader into bracketed by `#ifdef
COMPILED_FUNCTION_ANNOTATION_HACK_OLD_WAY', which is not enabled.
Also enable byte-code metering when DEBUG_XEMACS -- this is a form
of profiling for computing histograms of which sequences of two
bytecodes are used most often.
* bytecode-ops.h:
* bytecode-ops.h (OPCODE):
New file. Extract out all the opcodes and declare them using
OPCODE(), a bit like frame slots and such. This way the file can
be included multiple times if necessary to iterate multiple times
over the byte opcodes.
* bytecode.c:
* bytecode.c (NUM_REMEMBERED_BYTE_OPS):
* bytecode.c (OPCODE):
* bytecode.c (assert_failed_with_remembered_ops):
* bytecode.c (READ_UINT_2):
* bytecode.c (READ_INT_1):
* bytecode.c (READ_INT_2):
* bytecode.c (PEEK_INT_1):
* bytecode.c (PEEK_INT_2):
* bytecode.c (JUMP_RELATIVE):
* bytecode.c (JUMP_NEXT):
* bytecode.c (PUSH):
* bytecode.c (POP_WITH_MULTIPLE_VALUES):
* bytecode.c (DISCARD):
* bytecode.c (UNUSED):
* bytecode.c (optimize_byte_code):
* bytecode.c (optimize_compiled_function):
* bytecode.c (Fbyte_code):
* bytecode.c (vars_of_bytecode):
* bytecode.c (init_opcode_table_multi_op):
* bytecode.c (reinit_vars_of_bytecode):
* emacs.c (main_1):
* eval.c (funcall_compiled_function):
* symsinit.h:
Any time we change either the instruction pointer or the stack
pointer, assert that we're going to move it to a valid location.
This should catch failures right when they occur rather than
sometime later. This requires that we pass in another couple of
parameters into some functions (only with error-checking enabled,
see below).
Also keep track, using a circular queue, of the last 100 byte
opcodes seen, and when we hit an assert failure during byte-code
execution, output the contents of the queue in a nice readable
fashion. This requires that bytecode-ops.h be included a second
time so that a table mapping opcodes to the name of their operation
can be constructed. This table is constructed in new function
reinit_vars_of_bytecode().
Everything in the last two paras happens only when
ERROR_CHECK_BYTE_CODE.
Add some longish comments describing how the arrays that hold the
stack and instructions, and the pointers used to access them, work.
* gc.c:
Import some code from my `latest-fix' workspace to mark the
staticpro's in order from lowest to highest, rather than highest to
lowest, so it's easier to debug when something goes wrong.
* lisp.h (abort_with_message): Renamed from abort_with_msg().
* symbols.c (defsymbol_massage_name_1):
* symbols.c (defsymbol_nodump):
* symbols.c (defsymbol):
* symbols.c (defkeyword):
* symeval.h (DEFVAR_SYMVAL_FWD_OBJECT):
Make the various calls to staticpro() instead call staticpro_1(),
passing in the name of the C var being staticpro'ed, so that it
shows up in staticpro_names. Otherwise staticpro_names just has
1000+ copies of the word `location'.
author | Ben Wing <ben@xemacs.org> |
---|---|
date | Wed, 03 Feb 2010 08:01:55 -0600 |
parents | e6dec75ded0e |
children | 71ee43b8a74d |
rev | line source |
---|---|
428 | 1 @c -*-texinfo-*- |
2 @c This is part of the XEmacs Lisp Reference Manual. | |
3 @c Copyright (C) 1996 Ben Wing. | |
4 @c See the file lispref.texi for copying conditions. | |
5 @setfilename ../../info/hash-tables.info | |
6 @node Hash Tables, Range Tables, Display, top | |
7 @chapter Hash Tables | |
8 @cindex hash table | |
9 | |
10 @defun hash-table-p object | |
11 This function returns @code{t} if @var{object} is a hash table, else @code{nil}. | |
12 @end defun | |
13 | |
14 @menu | |
15 * Introduction to Hash Tables:: Hash tables are fast data structures for | |
16 implementing simple tables (i.e. finite | |
17 mappings from keys to values). | |
18 * Working With Hash Tables:: Hash table functions. | |
19 * Weak Hash Tables:: Hash tables with special garbage-collection | |
20 behavior. | |
21 @end menu | |
22 | |
23 @node Introduction to Hash Tables | |
24 @section Introduction to Hash Tables | |
25 | |
26 A @dfn{hash table} is a data structure that provides mappings from | |
27 arbitrary Lisp objects called @dfn{keys} to other arbitrary Lisp objects | |
28 called @dfn{values}. A key/value pair is sometimes called an | |
29 @dfn{entry} in the hash table. There are many ways other than hash | |
30 tables of implementing the same sort of mapping, e.g. association lists | |
31 (@pxref{Association Lists}) and property lists (@pxref{Property Lists}), | |
32 but hash tables provide much faster lookup when there are many entries | |
33 in the mapping. Hash tables are an implementation of the abstract data | |
34 type @dfn{dictionary}, also known as @dfn{associative array}. | |
35 | |
36 Internally, hash tables are hashed using the @dfn{linear probing} hash | |
37 table implementation method. This method hashes each key to a | |
38 particular spot in the hash table, and then scans forward sequentially | |
39 until a blank entry is found. To look up a key, hash to the appropriate | |
40 spot, then search forward for the key until either a key is found or a | |
41 blank entry stops the search. This method is used in preference to | |
42 double hashing because of changes in recent hardware. The penalty for | |
43 non-sequential access to memory has been increasing, and this | |
44 compensates for the problem of clustering that linear probing entails. | |
45 | |
46 When hash tables are created, the user may (but is not required to) | |
47 specify initial properties that influence performance. | |
48 | |
49 Use the @code{:size} parameter to specify the number of entries that are | |
50 likely to be stored in the hash table, to avoid the overhead of resizing | |
51 the table. But if the pre-allocated space for the entries is never | |
52 used, it is simply wasted and makes XEmacs slower. Excess unused hash | |
53 table entries exact a small continuous performance penalty, since they | |
54 must be scanned at every garbage collection. If the number of entries | |
55 in the hash table is unknown, simply avoid using the @code{:size} | |
56 keyword. | |
57 | |
58 Use the @code{:rehash-size} and @code{:rehash-threshold} keywords to | |
59 adjust the algorithm for deciding when to rehash the hash table. For | |
60 temporary hash tables that are going to be very heavily used, use a | |
61 small rehash threshold, for example, 0.4 and a large rehash size, for | |
62 example 2.0. For permanent hash tables that will be infrequently used, | |
63 specify a large rehash threshold, for example 0.8. | |
64 | |
65 Hash tables can also be created by the lisp reader using structure | |
66 syntax, for example: | |
67 @example | |
4820
e6dec75ded0e
Use keywords, not ordinary symbols, in the structure syntax for hash tables.
Aidan Kehoe <kehoea@parhasard.net>
parents:
442
diff
changeset
|
68 #s(hash-table :size 20 :data (foo 1 bar 2)) |
428 | 69 @end example |
70 | |
4820
e6dec75ded0e
Use keywords, not ordinary symbols, in the structure syntax for hash tables.
Aidan Kehoe <kehoea@parhasard.net>
parents:
442
diff
changeset
|
71 The structure syntax accepts the same keywords as |
e6dec75ded0e
Use keywords, not ordinary symbols, in the structure syntax for hash tables.
Aidan Kehoe <kehoea@parhasard.net>
parents:
442
diff
changeset
|
72 @code{make-hash-table}, as well as the additional keyword @code{data}, |
e6dec75ded0e
Use keywords, not ordinary symbols, in the structure syntax for hash tables.
Aidan Kehoe <kehoea@parhasard.net>
parents:
442
diff
changeset
|
73 which specifies the initial hash table contents. Older versions of |
e6dec75ded0e
Use keywords, not ordinary symbols, in the structure syntax for hash tables.
Aidan Kehoe <kehoea@parhasard.net>
parents:
442
diff
changeset
|
74 XEmacs required that the keywords not have the initial ``:'' in the |
e6dec75ded0e
Use keywords, not ordinary symbols, in the structure syntax for hash tables.
Aidan Kehoe <kehoea@parhasard.net>
parents:
442
diff
changeset
|
75 structure syntax, and this version of XEmacs still supports that syntax, |
e6dec75ded0e
Use keywords, not ordinary symbols, in the structure syntax for hash tables.
Aidan Kehoe <kehoea@parhasard.net>
parents:
442
diff
changeset
|
76 but you cannot mix the two styles within one structure. |
428 | 77 |
78 @defun make-hash-table &key @code{test} @code{size} @code{rehash-size} @code{rehash-threshold} @code{weakness} | |
79 This function returns a new empty hash table object. | |
80 | |
81 Keyword @code{:test} can be @code{eq}, @code{eql} (default) or @code{equal}. | |
82 Comparison between keys is done using this function. | |
83 If speed is important, consider using @code{eq}. | |
84 When storing strings in the hash table, you will likely need to use @code{equal}. | |
85 | |
86 Keyword @code{:size} specifies the number of keys likely to be inserted. | |
87 This number of entries can be inserted without enlarging the hash table. | |
88 | |
89 Keyword @code{:rehash-size} must be a float greater than 1.0, and specifies | |
90 the factor by which to increase the size of the hash table when enlarging. | |
91 | |
92 Keyword @code{:rehash-threshold} must be a float between 0.0 and 1.0, | |
93 and specifies the load factor of the hash table which triggers enlarging. | |
94 | |
442 | 95 Non-standard keyword @code{:weakness} can be @code{nil} (default), |
96 @code{t}, @code{key-and-value}, @code{key}, @code{value} or | |
97 @code{key-or-value}. @code{t} is an alias for @code{key-and-value}. | |
428 | 98 |
442 | 99 A key-and-value-weak hash table, also known as a fully-weak or simply |
100 as a weak hash table, is one whose pointers do not count as GC | |
101 referents: for any key-value pair in the hash table, if the only | |
102 remaining pointer to either the key or the value is in a weak hash | |
103 table, then the pair will be removed from the hash table, and the key | |
104 and value collected. A non-weak hash table (or any other pointer) | |
105 would prevent the object from being collected. | |
428 | 106 |
107 A key-weak hash table is similar to a fully-weak hash table except that | |
108 a key-value pair will be removed only if the key remains unmarked | |
109 outside of weak hash tables. The pair will remain in the hash table if | |
110 the key is pointed to by something other than a weak hash table, even | |
111 if the value is not. | |
112 | |
113 A value-weak hash table is similar to a fully-weak hash table except | |
114 that a key-value pair will be removed only if the value remains | |
115 unmarked outside of weak hash tables. The pair will remain in the | |
116 hash table if the value is pointed to by something other than a weak | |
117 hash table, even if the key is not. | |
442 | 118 |
119 A key-or-value-weak hash table is similar to a fully-weak hash table except | |
120 that a key-value pair will be removed only if the value and the key remain | |
121 unmarked outside of weak hash tables. The pair will remain in the | |
122 hash table if the value or key are pointed to by something other than a weak | |
123 hash table, even if the other is not. | |
428 | 124 @end defun |
125 | |
126 @defun copy-hash-table hash-table | |
127 This function returns a new hash table which contains the same keys and | |
128 values as @var{hash-table}. The keys and values will not themselves be | |
129 copied. | |
130 @end defun | |
131 | |
132 @defun hash-table-count hash-table | |
133 This function returns the number of entries in @var{hash-table}. | |
134 @end defun | |
135 | |
136 @defun hash-table-test hash-table | |
137 This function returns the test function of @var{hash-table}. | |
138 This can be one of @code{eq}, @code{eql} or @code{equal}. | |
139 @end defun | |
140 | |
141 @defun hash-table-size hash-table | |
142 This function returns the current number of slots in @var{hash-table}, | |
143 whether occupied or not. | |
144 @end defun | |
145 | |
146 @defun hash-table-rehash-size hash-table | |
147 This function returns the current rehash size of @var{hash-table}. | |
148 This is a float greater than 1.0; the factor by which @var{hash-table} | |
149 is enlarged when the rehash threshold is exceeded. | |
150 @end defun | |
151 | |
152 @defun hash-table-rehash-threshold hash-table | |
153 This function returns the current rehash threshold of @var{hash-table}. | |
154 This is a float between 0.0 and 1.0; the maximum @dfn{load factor} of | |
155 @var{hash-table}, beyond which the @var{hash-table} is enlarged by rehashing. | |
156 @end defun | |
157 | |
158 @defun hash-table-weakness hash-table | |
159 This function returns the weakness of @var{hash-table}. | |
160 This can be one of @code{nil}, @code{t}, @code{key} or @code{value}. | |
161 @end defun | |
162 | |
163 @node Working With Hash Tables | |
164 @section Working With Hash Tables | |
165 | |
166 @defun puthash key value hash-table | |
167 This function hashes @var{key} to @var{value} in @var{hash-table}. | |
168 @end defun | |
169 | |
170 @defun gethash key hash-table &optional default | |
171 This function finds the hash value for @var{key} in @var{hash-table}. | |
172 If there is no entry for @var{key} in @var{hash-table}, @var{default} is | |
173 returned (which in turn defaults to @code{nil}). | |
174 @end defun | |
175 | |
176 @defun remhash key hash-table | |
177 This function removes the entry for @var{key} from @var{hash-table}. | |
178 Does nothing if there is no entry for @var{key} in @var{hash-table}. | |
179 @end defun | |
180 | |
181 @defun clrhash hash-table | |
182 This function removes all entries from @var{hash-table}, leaving it empty. | |
183 @end defun | |
184 | |
185 @defun maphash function hash-table | |
186 This function maps @var{function} over entries in @var{hash-table}, | |
187 calling it with two args, each key and value in the hash table. | |
188 | |
189 @var{function} may not modify @var{hash-table}, with the one exception | |
190 that @var{function} may remhash or puthash the entry currently being | |
191 processed by @var{function}. | |
192 @end defun | |
193 | |
194 | |
195 @node Weak Hash Tables | |
196 @section Weak Hash Tables | |
197 @cindex hash table, weak | |
198 @cindex weak hash table | |
199 | |
200 A @dfn{weak hash table} is a special variety of hash table whose | |
201 elements do not count as GC referents. For any key-value pair in such a | |
202 hash table, if either the key or value (or in some cases, if one | |
203 particular one of the two) has no references to it outside of weak hash | |
204 tables (and similar structures such as weak lists), the pair will be | |
205 removed from the table, and the key and value collected. A non-weak | |
206 hash table (or any other pointer) would prevent the objects from being | |
207 collected. | |
208 | |
209 Weak hash tables are useful for keeping track of information in a | |
210 non-obtrusive way, for example to implement caching. If the cache | |
211 contains objects such as buffers, markers, image instances, etc. that | |
212 will eventually disappear and get garbage-collected, using a weak hash | |
213 table ensures that these objects are collected normally rather than | |
214 remaining around forever, long past their actual period of use. | |
215 (Otherwise, you'd have to explicitly map over the hash table every so | |
216 often and remove unnecessary elements.) | |
217 | |
442 | 218 There are four types of weak hash tables: |
428 | 219 |
220 @table @asis | |
442 | 221 @item key-and-value-weak hash tables |
222 In these hash tables, also known as fully weak or simply as weak hash | |
223 tables, a pair disappears if either the key or the value is unreferenced | |
224 outside of the table. | |
428 | 225 @item key-weak hash tables |
226 In these hash tables, a pair disappears if the key is unreferenced outside | |
227 of the table, regardless of how the value is referenced. | |
228 @item value-weak hash tables | |
229 In these hash tables, a pair disappears if the value is unreferenced outside | |
230 of the table, regardless of how the key is referenced. | |
442 | 231 @item key-or-value-weak hash tables |
232 In these hash tables, a pair disappears if both the key and the value | |
233 are unreferenced outside of the table. | |
428 | 234 @end table |
235 | |
236 Also see @ref{Weak Lists}. | |
237 | |
238 Weak hash tables are created by specifying the @code{:weakness} keyword to | |
239 @code{make-hash-table}. |