Mercurial > hg > xemacs-beta
view src/README.kkcc @ 4407:4ee73bbe4f8e
Always use boyer_moore in ASCII or Latin-1 buffers with ASCII search strings.
2007-12-26 Aidan Kehoe <kehoea@parhasard.net>
* casetab.c:
Extend and correct some case table documentation.
* search.c (search_buffer):
Correct a bug where only the first entry for a character in the
case equivalence table was examined in determining if the
Boyer-Moore search algorithm is appropriate.
If there are case mappings outside of the charset and row of the
characters specified in the search string, those case mappings can
be safely ignored (and Boyer-Moore search can be used) if we know
from the buffer statistics that the corresponding characters cannot
occur.
* search.c (boyer_moore):
Assert that we haven't been passed a string with varying
characters sets or rows within character sets. That's what
simple_search is for.
In the very rare event that a character in the search string has a
canonical case mapping that is not in the same character set and
row, don't try to search for the canonical character, search for
some other character that is in the the desired character set and
row. Assert that the case table isn't corrupt.
Do not search for any character case mappings that cannot possibly
occur in the buffer, given the buffer metadata about its
contents.
author | Aidan Kehoe <kehoea@parhasard.net> |
---|---|
date | Wed, 26 Dec 2007 17:30:16 +0100 |
parents | ac1be85b4a5f |
children | 3889ef128488 |
line wrap: on
line source
2002-07-17 Marcus Crestani <crestani@informatik.uni-tuebingen.de> Markus Kaltenbach <makalten@informatik.uni-tuebingen.de> Mike Sperber <mike@xemacs.org> updated 2003-07-29 New KKCC-GC mark algorithm: configure flag : --use-kkcc For better understanding, first a few words about the mark algorithm up to now: Every Lisp_Object has its own mark method, which calls mark_object with the stuff to be marked. Also, many Lisp_Objects have pdump descriptions memory_descriptions, which are used by the portable dumper. The dumper gets all the information it needs about the Lisp_Object from the descriptions. Also the garbage collector can use the information in the pdump descriptions, so we can get rid of the mark methods. That is what we have been doing. DUMPABLE FLAG ------------- First we added a dumpable flag to lrecord_implementation. It shows, if the object is dumpable and should be processed by the dumper. The dumpable flag is the third argument of a lrecord_implementation definition (DEFINE_LRECORD_IMPLEMENTATION). If it is set to 1, the dumper processes the descriptions and dumps the Object, if it is set to 0, the dumper does not care about it. KKCC MARKING ------------ All Lisp_Objects have memory_descriptions now, so we could get rid of the mark_object calls. The KKCC algorithm manages its own stack. Instead of calling mark_object, all the alive Lisp_Objects are pushed on the kkcc_gc_stack. Then these elements on the stack are processed according to their descriptions. TODO ---- - For weakness use weak datatypes instead of XD_FLAG_NO_KKCC. XD_FLAG_NO_KKCC occurs in: * elhash.c: htentry * extents.c: lispobject_gap_array, extent_list, extent_info * marker.c: marker Not everything has to be rewritten. See Ben's comment in lrecord.h. - Clean up special case marking (weak_hash_tables, weak_lists, ephemerons). - Stack optimization (have one stack during runtime instead of malloc/free it for every garbage collect) There are a few Lisp_Objects, where there occured differences and inexactness between the mark-method and the pdump description. All these Lisp_Objects get dumped (except image instances), so their descriptions have been written, before we started our work: * alloc.c: string description: size_, data_, and plist is described mark: only plist is marked, but flush_cached_extent_info is called. flush_cached_extent_info -> free_soe -> free_extent_list -> free_gap_array -> gap_array_delete_all_markers -> Add gap_array to the gap_array_marker_freelist * glyphs.c: image_instance description: device is not set to nil mark: mark method sets device to nil if dead See comment above the description.