428
|
1 @c -*-texinfo-*-
|
|
2 @c This is part of the XEmacs Lisp Reference Manual.
|
|
3 @c Copyright (C) 1996 Ben Wing.
|
|
4 @c See the file lispref.texi for copying conditions.
|
|
5 @setfilename ../../info/hash-tables.info
|
|
6 @node Hash Tables, Range Tables, Display, top
|
|
7 @chapter Hash Tables
|
|
8 @cindex hash table
|
|
9
|
|
10 @defun hash-table-p object
|
|
11 This function returns @code{t} if @var{object} is a hash table, else @code{nil}.
|
|
12 @end defun
|
|
13
|
|
14 @menu
|
|
15 * Introduction to Hash Tables:: Hash tables are fast data structures for
|
|
16 implementing simple tables (i.e. finite
|
|
17 mappings from keys to values).
|
|
18 * Working With Hash Tables:: Hash table functions.
|
|
19 * Weak Hash Tables:: Hash tables with special garbage-collection
|
|
20 behavior.
|
|
21 @end menu
|
|
22
|
|
23 @node Introduction to Hash Tables
|
|
24 @section Introduction to Hash Tables
|
|
25
|
|
26 A @dfn{hash table} is a data structure that provides mappings from
|
|
27 arbitrary Lisp objects called @dfn{keys} to other arbitrary Lisp objects
|
|
28 called @dfn{values}. A key/value pair is sometimes called an
|
|
29 @dfn{entry} in the hash table. There are many ways other than hash
|
|
30 tables of implementing the same sort of mapping, e.g. association lists
|
|
31 (@pxref{Association Lists}) and property lists (@pxref{Property Lists}),
|
|
32 but hash tables provide much faster lookup when there are many entries
|
|
33 in the mapping. Hash tables are an implementation of the abstract data
|
|
34 type @dfn{dictionary}, also known as @dfn{associative array}.
|
|
35
|
|
36 Internally, hash tables are hashed using the @dfn{linear probing} hash
|
|
37 table implementation method. This method hashes each key to a
|
|
38 particular spot in the hash table, and then scans forward sequentially
|
|
39 until a blank entry is found. To look up a key, hash to the appropriate
|
|
40 spot, then search forward for the key until either a key is found or a
|
|
41 blank entry stops the search. This method is used in preference to
|
|
42 double hashing because of changes in recent hardware. The penalty for
|
|
43 non-sequential access to memory has been increasing, and this
|
|
44 compensates for the problem of clustering that linear probing entails.
|
|
45
|
|
46 When hash tables are created, the user may (but is not required to)
|
|
47 specify initial properties that influence performance.
|
|
48
|
|
49 Use the @code{:size} parameter to specify the number of entries that are
|
|
50 likely to be stored in the hash table, to avoid the overhead of resizing
|
|
51 the table. But if the pre-allocated space for the entries is never
|
|
52 used, it is simply wasted and makes XEmacs slower. Excess unused hash
|
|
53 table entries exact a small continuous performance penalty, since they
|
|
54 must be scanned at every garbage collection. If the number of entries
|
|
55 in the hash table is unknown, simply avoid using the @code{:size}
|
|
56 keyword.
|
|
57
|
|
58 Use the @code{:rehash-size} and @code{:rehash-threshold} keywords to
|
|
59 adjust the algorithm for deciding when to rehash the hash table. For
|
|
60 temporary hash tables that are going to be very heavily used, use a
|
|
61 small rehash threshold, for example, 0.4 and a large rehash size, for
|
|
62 example 2.0. For permanent hash tables that will be infrequently used,
|
|
63 specify a large rehash threshold, for example 0.8.
|
|
64
|
|
65 Hash tables can also be created by the lisp reader using structure
|
|
66 syntax, for example:
|
|
67 @example
|
|
68 #s(hash-table size 20 data (foo 1 bar 2))
|
|
69 @end example
|
|
70
|
|
71 The structure syntax accepts the same keywords as @code{make-hash-table}
|
|
72 (without the @code{:} character), as well as the additional keyword
|
|
73 @code{data}, which specifies the initial hash table contents.
|
|
74
|
|
75 @defun make-hash-table &key @code{test} @code{size} @code{rehash-size} @code{rehash-threshold} @code{weakness}
|
|
76 This function returns a new empty hash table object.
|
|
77
|
|
78 Keyword @code{:test} can be @code{eq}, @code{eql} (default) or @code{equal}.
|
|
79 Comparison between keys is done using this function.
|
|
80 If speed is important, consider using @code{eq}.
|
|
81 When storing strings in the hash table, you will likely need to use @code{equal}.
|
|
82
|
|
83 Keyword @code{:size} specifies the number of keys likely to be inserted.
|
|
84 This number of entries can be inserted without enlarging the hash table.
|
|
85
|
|
86 Keyword @code{:rehash-size} must be a float greater than 1.0, and specifies
|
|
87 the factor by which to increase the size of the hash table when enlarging.
|
|
88
|
|
89 Keyword @code{:rehash-threshold} must be a float between 0.0 and 1.0,
|
|
90 and specifies the load factor of the hash table which triggers enlarging.
|
|
91
|
442
|
92 Non-standard keyword @code{:weakness} can be @code{nil} (default),
|
|
93 @code{t}, @code{key-and-value}, @code{key}, @code{value} or
|
|
94 @code{key-or-value}. @code{t} is an alias for @code{key-and-value}.
|
428
|
95
|
442
|
96 A key-and-value-weak hash table, also known as a fully-weak or simply
|
|
97 as a weak hash table, is one whose pointers do not count as GC
|
|
98 referents: for any key-value pair in the hash table, if the only
|
|
99 remaining pointer to either the key or the value is in a weak hash
|
|
100 table, then the pair will be removed from the hash table, and the key
|
|
101 and value collected. A non-weak hash table (or any other pointer)
|
|
102 would prevent the object from being collected.
|
428
|
103
|
|
104 A key-weak hash table is similar to a fully-weak hash table except that
|
|
105 a key-value pair will be removed only if the key remains unmarked
|
|
106 outside of weak hash tables. The pair will remain in the hash table if
|
|
107 the key is pointed to by something other than a weak hash table, even
|
|
108 if the value is not.
|
|
109
|
|
110 A value-weak hash table is similar to a fully-weak hash table except
|
|
111 that a key-value pair will be removed only if the value remains
|
|
112 unmarked outside of weak hash tables. The pair will remain in the
|
|
113 hash table if the value is pointed to by something other than a weak
|
|
114 hash table, even if the key is not.
|
442
|
115
|
|
116 A key-or-value-weak hash table is similar to a fully-weak hash table except
|
|
117 that a key-value pair will be removed only if the value and the key remain
|
|
118 unmarked outside of weak hash tables. The pair will remain in the
|
|
119 hash table if the value or key are pointed to by something other than a weak
|
|
120 hash table, even if the other is not.
|
428
|
121 @end defun
|
|
122
|
|
123 @defun copy-hash-table hash-table
|
|
124 This function returns a new hash table which contains the same keys and
|
|
125 values as @var{hash-table}. The keys and values will not themselves be
|
|
126 copied.
|
|
127 @end defun
|
|
128
|
|
129 @defun hash-table-count hash-table
|
|
130 This function returns the number of entries in @var{hash-table}.
|
|
131 @end defun
|
|
132
|
|
133 @defun hash-table-test hash-table
|
|
134 This function returns the test function of @var{hash-table}.
|
|
135 This can be one of @code{eq}, @code{eql} or @code{equal}.
|
|
136 @end defun
|
|
137
|
|
138 @defun hash-table-size hash-table
|
|
139 This function returns the current number of slots in @var{hash-table},
|
|
140 whether occupied or not.
|
|
141 @end defun
|
|
142
|
|
143 @defun hash-table-rehash-size hash-table
|
|
144 This function returns the current rehash size of @var{hash-table}.
|
|
145 This is a float greater than 1.0; the factor by which @var{hash-table}
|
|
146 is enlarged when the rehash threshold is exceeded.
|
|
147 @end defun
|
|
148
|
|
149 @defun hash-table-rehash-threshold hash-table
|
|
150 This function returns the current rehash threshold of @var{hash-table}.
|
|
151 This is a float between 0.0 and 1.0; the maximum @dfn{load factor} of
|
|
152 @var{hash-table}, beyond which the @var{hash-table} is enlarged by rehashing.
|
|
153 @end defun
|
|
154
|
|
155 @defun hash-table-weakness hash-table
|
|
156 This function returns the weakness of @var{hash-table}.
|
|
157 This can be one of @code{nil}, @code{t}, @code{key} or @code{value}.
|
|
158 @end defun
|
|
159
|
|
160 @node Working With Hash Tables
|
|
161 @section Working With Hash Tables
|
|
162
|
|
163 @defun puthash key value hash-table
|
|
164 This function hashes @var{key} to @var{value} in @var{hash-table}.
|
|
165 @end defun
|
|
166
|
|
167 @defun gethash key hash-table &optional default
|
|
168 This function finds the hash value for @var{key} in @var{hash-table}.
|
|
169 If there is no entry for @var{key} in @var{hash-table}, @var{default} is
|
|
170 returned (which in turn defaults to @code{nil}).
|
|
171 @end defun
|
|
172
|
|
173 @defun remhash key hash-table
|
|
174 This function removes the entry for @var{key} from @var{hash-table}.
|
|
175 Does nothing if there is no entry for @var{key} in @var{hash-table}.
|
|
176 @end defun
|
|
177
|
|
178 @defun clrhash hash-table
|
|
179 This function removes all entries from @var{hash-table}, leaving it empty.
|
|
180 @end defun
|
|
181
|
|
182 @defun maphash function hash-table
|
|
183 This function maps @var{function} over entries in @var{hash-table},
|
|
184 calling it with two args, each key and value in the hash table.
|
|
185
|
|
186 @var{function} may not modify @var{hash-table}, with the one exception
|
|
187 that @var{function} may remhash or puthash the entry currently being
|
|
188 processed by @var{function}.
|
|
189 @end defun
|
|
190
|
|
191
|
|
192 @node Weak Hash Tables
|
|
193 @section Weak Hash Tables
|
|
194 @cindex hash table, weak
|
|
195 @cindex weak hash table
|
|
196
|
|
197 A @dfn{weak hash table} is a special variety of hash table whose
|
|
198 elements do not count as GC referents. For any key-value pair in such a
|
|
199 hash table, if either the key or value (or in some cases, if one
|
|
200 particular one of the two) has no references to it outside of weak hash
|
|
201 tables (and similar structures such as weak lists), the pair will be
|
|
202 removed from the table, and the key and value collected. A non-weak
|
|
203 hash table (or any other pointer) would prevent the objects from being
|
|
204 collected.
|
|
205
|
|
206 Weak hash tables are useful for keeping track of information in a
|
|
207 non-obtrusive way, for example to implement caching. If the cache
|
|
208 contains objects such as buffers, markers, image instances, etc. that
|
|
209 will eventually disappear and get garbage-collected, using a weak hash
|
|
210 table ensures that these objects are collected normally rather than
|
|
211 remaining around forever, long past their actual period of use.
|
|
212 (Otherwise, you'd have to explicitly map over the hash table every so
|
|
213 often and remove unnecessary elements.)
|
|
214
|
442
|
215 There are four types of weak hash tables:
|
428
|
216
|
|
217 @table @asis
|
442
|
218 @item key-and-value-weak hash tables
|
|
219 In these hash tables, also known as fully weak or simply as weak hash
|
|
220 tables, a pair disappears if either the key or the value is unreferenced
|
|
221 outside of the table.
|
428
|
222 @item key-weak hash tables
|
|
223 In these hash tables, a pair disappears if the key is unreferenced outside
|
|
224 of the table, regardless of how the value is referenced.
|
|
225 @item value-weak hash tables
|
|
226 In these hash tables, a pair disappears if the value is unreferenced outside
|
|
227 of the table, regardless of how the key is referenced.
|
442
|
228 @item key-or-value-weak hash tables
|
|
229 In these hash tables, a pair disappears if both the key and the value
|
|
230 are unreferenced outside of the table.
|
428
|
231 @end table
|
|
232
|
|
233 Also see @ref{Weak Lists}.
|
|
234
|
|
235 Weak hash tables are created by specifying the @code{:weakness} keyword to
|
|
236 @code{make-hash-table}.
|