view src/README.kkcc @ 4407:4ee73bbe4f8e

Always use boyer_moore in ASCII or Latin-1 buffers with ASCII search strings. 2007-12-26 Aidan Kehoe <kehoea@parhasard.net> * casetab.c: Extend and correct some case table documentation. * search.c (search_buffer): Correct a bug where only the first entry for a character in the case equivalence table was examined in determining if the Boyer-Moore search algorithm is appropriate. If there are case mappings outside of the charset and row of the characters specified in the search string, those case mappings can be safely ignored (and Boyer-Moore search can be used) if we know from the buffer statistics that the corresponding characters cannot occur. * search.c (boyer_moore): Assert that we haven't been passed a string with varying character sets or rows within character sets. That's what simple_search is for. In the very rare event that a character in the search string has a canonical case mapping that is not in the same character set and row, don't try to search for the canonical character, search for some other character that is in the desired character set and row. Assert that the case table isn't corrupt. Do not search for any character case mappings that cannot possibly occur in the buffer, given the buffer metadata about its contents.
author Aidan Kehoe <kehoea@parhasard.net>
date Wed, 26 Dec 2007 17:30:16 +0100
parents ac1be85b4a5f
children 3889ef128488

2002-07-17  Marcus Crestani  <crestani@informatik.uni-tuebingen.de>
	    Markus Kaltenbach  <makalten@informatik.uni-tuebingen.de>
	    Mike Sperber <mike@xemacs.org>

	updated 2003-07-29

	New KKCC-GC mark algorithm:
	configure flag : --use-kkcc

	For better understanding, first a few words about the mark algorithm
	used up to now:
	Every Lisp_Object type has its own mark method, which calls
	mark_object on the objects to be marked.
	Also, many Lisp_Objects have pdump descriptions (memory_descriptions),
	which are used by the portable dumper. The dumper gets all the
	information it needs about a Lisp_Object from its description.

	The garbage collector can use the information in the pdump
	descriptions as well, so we can get rid of the mark methods.
	That is what we have been doing.

	
	DUMPABLE FLAG
	-------------
	First we added a dumpable flag to lrecord_implementation. It shows
	whether the object is dumpable and should be processed by the dumper.
	The dumpable flag is the third argument of an lrecord_implementation
	definition (DEFINE_LRECORD_IMPLEMENTATION).
	If it is set to 1, the dumper processes the description and dumps
	the object; if it is set to 0, the dumper ignores the object.
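
	The effect of the flag can be illustrated with a small sketch.
	Everything in it is hypothetical: toy_implementation is a
	simplified stand-in for lrecord_implementation, the type names
	are made up, and the real DEFINE_LRECORD_IMPLEMENTATION macro
	takes more arguments than are shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for lrecord_implementation. */
struct toy_implementation
{
  const char *name;
  int dumpable;            /* 1: the dumper processes it, 0: it is skipped */
  const void *description; /* would point at the memory description */
};

/* Made-up types, only here to show both values of the flag. */
static const struct toy_implementation toy_types[] = {
  { "toy-dumped-a", 1, NULL },
  { "toy-dumped-b", 1, NULL },
  { "toy-runtime-only", 0, NULL },
};

/* Count the types a dumper would process: only those flagged dumpable. */
static size_t
toy_count_dumpable (const struct toy_implementation *types, size_t n)
{
  size_t i, count = 0;

  for (i = 0; i < n; i++)
    if (types[i].dumpable)
      count++;
  return count;
}
```

	A dumper built along these lines would simply skip every entry
	whose flag is 0, without looking at its description.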
		

	KKCC MARKING
	------------
	All Lisp_Objects have memory_descriptions now, so we could get
	rid of the mark_object calls.
	The KKCC algorithm manages its own stack. Instead of calling
	mark_object, all live Lisp_Objects are pushed onto the
	kkcc_gc_stack. The elements on the stack are then processed
	according to their descriptions.
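
	The idea of marking with an explicit stack that is driven by
	descriptions can be sketched as follows. This is a minimal
	illustration, not the actual implementation: toy_object,
	toy_description, toy_stack, and toy_mark are hypothetical names,
	the description format is reduced to a list of slot offsets, and
	setting the mark bit at push time is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical description: byte offsets of the reference slots. */
struct toy_description
{
  size_t count;
  size_t offsets[2];
};

struct toy_object
{
  const struct toy_description *desc;
  int marked;
  struct toy_object *car;
  struct toy_object *cdr;
};

static const struct toy_description toy_pair_description = {
  2, { offsetof (struct toy_object, car), offsetof (struct toy_object, cdr) }
};

struct toy_stack
{
  struct toy_object **items;
  size_t size, capacity;
};

/* Mark OBJ and push it, unless it is NULL or already marked. */
static void
toy_push (struct toy_stack *s, struct toy_object *obj)
{
  if (obj == NULL || obj->marked)
    return;
  if (s->size == s->capacity)
    {
      size_t new_capacity = s->capacity ? 2 * s->capacity : 16;
      struct toy_object **new_items =
        realloc (s->items, new_capacity * sizeof *new_items);
      if (new_items == NULL)
        abort ();
      s->items = new_items;
      s->capacity = new_capacity;
    }
  obj->marked = 1;
  s->items[s->size++] = obj;
}

/* Mark everything reachable from ROOT; return the number of objects marked. */
static size_t
toy_mark (struct toy_object *root)
{
  struct toy_stack stack = { NULL, 0, 0 };
  size_t marked = 0;

  toy_push (&stack, root);
  while (stack.size > 0)
    {
      struct toy_object *obj = stack.items[--stack.size];
      size_t i;

      marked++;
      /* Follow the description instead of a per-type mark method. */
      for (i = 0; i < obj->desc->count; i++)
        {
          struct toy_object **slot =
            (struct toy_object **) ((char *) obj + obj->desc->offsets[i]);
          toy_push (&stack, *slot);
        }
    }
  free (stack.items);
  return marked;
}
```

	Because the traversal uses its own stack rather than recursive
	calls, its depth is not limited by the C stack, and cycles are
	handled by the mark bit.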


	TODO
	----
	- For weakness use weak datatypes instead of XD_FLAG_NO_KKCC.
	  XD_FLAG_NO_KKCC occurs in:
		* elhash.c: htentry
		* extents.c: lispobject_gap_array, extent_list, extent_info
		* marker.c: marker     
	  Not everything has to be rewritten. See Ben's comment in lrecord.h.
	- Clean up special case marking (weak_hash_tables, weak_lists,
	  ephemerons).
	- Stack optimization (keep one stack for the whole runtime instead
	  of malloc'ing and freeing it for every garbage collection)
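
	One possible shape for the stack optimization above, given only
	as a sketch: the buffer is allocated once, grows on demand, and
	is emptied rather than freed at the start of each collection.
	The names toy_gc_stack, toy_stack_push, and toy_stack_reset are
	hypothetical and do not exist in the sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical persistent mark stack; not the real kkcc_gc_stack. */
struct toy_gc_stack
{
  void **items;
  size_t size;
  size_t capacity;
};

/* Zero-initialized and kept for the whole run. */
static struct toy_gc_stack toy_stack;

/* Push OBJ, doubling the buffer when it is full. */
static void
toy_stack_push (void *obj)
{
  if (toy_stack.size == toy_stack.capacity)
    {
      size_t new_capacity = toy_stack.capacity ? 2 * toy_stack.capacity : 64;
      void **new_items =
        realloc (toy_stack.items, new_capacity * sizeof *new_items);
      if (new_items == NULL)
        abort ();
      toy_stack.items = new_items;
      toy_stack.capacity = new_capacity;
    }
  toy_stack.items[toy_stack.size++] = obj;
}

/* Start of a collection: empty the stack but keep the buffer. */
static void
toy_stack_reset (void)
{
  toy_stack.size = 0;
}
```

	After the first few collections the buffer has reached its
	working size, so later collections do not need to allocate.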

	For a few Lisp_Objects, the mark method and the pdump description
	differ or are inexact.  All of these Lisp_Objects get dumped (except
	image instances), so their descriptions had been written before we
	started our work:
	* alloc.c: string
	description: size_, data_, and plist are described
	mark: only plist is marked, but flush_cached_extent_info is called.
	      flush_cached_extent_info ->
		free_soe ->
		  free_extent_list ->
		    free_gap_array ->
		      gap_array_delete_all_markers ->
			Add gap_array to the gap_array_marker_freelist

	* glyphs.c: image_instance
	description: device is not set to nil
	mark: mark method sets device to nil if dead
	See comment above the description.