view src/README.kkcc @ 5776:65d65b52d608
Pass character count from coding systems to buffer insertion code.
src/ChangeLog addition:
2014-01-16 Aidan Kehoe <kehoea@parhasard.net>
Pass character count information from the no-conversion and
unicode coding systems to the buffer insertion code, making
#'find-file on large buffers a little snappier (if
ERROR_CHECK_TEXT is not defined).
* file-coding.c:
* file-coding.c (coding_character_tell): New.
* file-coding.c (conversion_coding_stream_description): New.
* file-coding.c (no_conversion_convert):
Update characters_seen when decoding.
* file-coding.c (no_conversion_character_tell): New.
* file-coding.c (lstream_type_create_file_coding): Create the
no_conversion type with data.
* file-coding.c (coding_system_type_create):
Make the character_tell method available here.
* file-coding.h:
* file-coding.h (struct coding_system_methods):
Add a new character_tell() method, passing charcount information
from the coding systems to the buffer code, avoiding duplicate
bytecount-to-charcount work especially with large buffers.
* fileio.c (Finsert_file_contents_internal):
Update this to pass charcount information to
buffer_insert_string_1(), if that is available from the lstream code.
* insdel.c:
* insdel.c (buffer_insert_string_1):
Add a new CCLEN argument, giving the character count of the string
to insert. It can be -1 to indicate that the function should work
it out itself using bytecount_to_charcount(), as it used to.
* insdel.c (buffer_insert_raw_string_1):
* insdel.c (buffer_insert_lisp_string_1):
* insdel.c (buffer_insert_ascstring_1):
* insdel.c (buffer_insert_emacs_char_1):
* insdel.c (buffer_insert_from_buffer_1):
* insdel.c (buffer_replace_char):
Update these functions to use the new calling convention.
* insdel.h:
* insdel.h (buffer_insert_string):
Update this header to reflect the new buffer_insert_string_1()
argument.
* lstream.c (Lstream_character_tell): New.
Return the number of characters *read* and seen by the consumer so
far, taking into account the unget buffer, and buffered reading.
* lstream.c (Lstream_unread):
Update unget_character_count here as appropriate.
* lstream.c (Lstream_rewind):
Reset unget_character_count here too.
* lstream.h:
* lstream.h (struct lstream):
Provide the character_tell method, add a new field,
unget_character_count, giving the number of characters ever passed
to Lstream_unread().
Declare Lstream_character_tell().
Make Lstream_ungetc(), which happens to be unused, an inline
function rather than a macro, in the course of updating it to
modify unget_character_count.
* print.c (output_string):
Use the new argument to buffer_insert_string_1().
* tests.c:
* tests.c (Ftest_character_tell):
New test function.
* tests.c (syms_of_tests):
Make it available.
* unicode.c:
* unicode.c (struct unicode_coding_stream):
* unicode.c (unicode_character_tell):
New method.
* unicode.c (unicode_convert):
Update the character counter as appropriate.
* unicode.c (coding_system_type_create_unicode):
Make the character_tell method available.
author | Aidan Kehoe <kehoea@parhasard.net> |
---|---|
date | Thu, 16 Jan 2014 16:27:52 +0000 |
parents | 3889ef128488 |
children |
2002-07-17  Marcus Crestani  <crestani@informatik.uni-tuebingen.de>
            Markus Kaltenbach  <makalten@informatik.uni-tuebingen.de>
            Mike Sperber  <mike@xemacs.org>

updated 2003-07-29

New KKCC-GC mark algorithm:
configure flag : --use-kkcc

For better understanding, first a few words about the mark algorithm
up to now:
Every Lisp_Object has its own mark method, which calls mark_object
with the stuff to be marked.
Also, many Lisp_Objects have pdump descriptions (memory_descriptions),
which are used by the portable dumper. The dumper gets all the
information it needs about the Lisp_Object from the descriptions.

The garbage collector can also use the information in the pdump
descriptions, so we can get rid of the mark methods.
That is what we have been doing.


DUMPABLE FLAG
-------------
First we added a dumpable flag to lrecord_implementation. It indicates
whether the object is dumpable and should be processed by the dumper.
The dumpable flag is the third argument of a lrecord_implementation
definition (DEFINE_LRECORD_IMPLEMENTATION).
If it is set to 1, the dumper processes the descriptions and dumps
the object; if it is set to 0, the dumper ignores the object.


KKCC MARKING
------------
All Lisp_Objects have memory_descriptions now, so we could get rid of
the mark_object calls.
The KKCC algorithm manages its own stack. Instead of calling
mark_object, all live Lisp_Objects are pushed on the kkcc_gc_stack.
The elements on the stack are then processed according to their
descriptions.


TODO
----
- For weakness use weak datatypes instead of XD_FLAG_NO_KKCC.
  XD_FLAG_NO_KKCC occurs in:
    * elhash.c: htentry
    * extents.c: lispobject_gap_array, extent_list, extent_info
    * marker.c: marker
  Not everything has to be rewritten. See Ben's comment in lrecord.h.
- Clean up special case marking (weak_hash_tables, weak_lists,
  ephemerons).
- Stack optimization (have one stack during runtime instead of
  malloc/free it for every garbage collect)


There are a few Lisp_Objects for which the mark method and the pdump
description differ or are inexact. All these Lisp_Objects get dumped
(except image instances), so their descriptions had been written
before we started our work:
* alloc.c: string
    description: size_, data_, and plist are described
    mark: only plist is marked, but flush_cached_extent_info is called.
          flush_cached_extent_info ->
            free_soe ->
              free_extent_list ->
                free_gap_array ->
                  gap_array_delete_all_markers ->
                    Add gap_array to the gap_array_marker_freelist

* glyphs.c: image_instance
    description: device is not set to nil
    mark: mark method sets device to nil if dead
    See comment above the description.
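
ILLUSTRATION
------------
The idea from the KKCC MARKING section (an explicit stack whose entries
are processed according to their descriptions, instead of recursive
mark methods) can be sketched in a few lines of standalone C. This is
only a simplified model under stated assumptions, not the XEmacs code:
struct object, struct description, struct pair, struct mark_stack,
mark_from_root and demo_mark_cycle are all hypothetical names, and a
"description" here is reduced to a list of field offsets that hold
references.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; not the XEmacs definitions.  */
struct description
{
  size_t nrefs;            /* number of reference fields */
  const size_t *offsets;   /* byte offset of each reference field */
};

struct object
{
  const struct description *desc;
  int marked;
};

/* Example object type: a pair holding two references.  */
struct pair
{
  struct object header;
  struct object *car;
  struct object *cdr;
};

const size_t pair_offsets[] =
  { offsetof (struct pair, car), offsetof (struct pair, cdr) };
const struct description pair_description = { 2, pair_offsets };

/* Explicit mark stack, grown on demand.  */
struct mark_stack
{
  struct object **items;
  size_t len, cap;
};

int
stack_push (struct mark_stack *s, struct object *o)
{
  if (s->len == s->cap)
    {
      size_t ncap = s->cap ? s->cap * 2 : 16;
      struct object **n = realloc (s->items, ncap * sizeof *n);
      if (!n)
        return -1;
      s->items = n;
      s->cap = ncap;
    }
  s->items[s->len++] = o;
  return 0;
}

/* Mark everything reachable from ROOT without recursion.  Returns the
   number of objects newly marked, or -1 on allocation failure.  */
long
mark_from_root (struct object *root)
{
  struct mark_stack s = { NULL, 0, 0 };
  long count = 0;

  if (root && stack_push (&s, root) < 0)
    return -1;
  while (s.len > 0)
    {
      struct object *o = s.items[--s.len];
      size_t i;

      if (o->marked)
        continue;              /* may have been pushed more than once */
      assert (o->desc != NULL);
      o->marked = 1;
      count++;
      /* Walk the description rather than calling a mark method.  */
      for (i = 0; i < o->desc->nrefs; i++)
        {
          struct object *child =
            *(struct object **) ((char *) o + o->desc->offsets[i]);
          if (child && !child->marked && stack_push (&s, child) < 0)
            {
              free (s.items);
              return -1;
            }
        }
    }
  free (s.items);
  return count;
}

/* Build a small cyclic graph plus one unreachable object and mark it.
   Returns the marked count, or -2 if the unreachable object was
   marked by mistake.  */
long
demo_mark_cycle (void)
{
  struct pair a = { { &pair_description, 0 }, NULL, NULL };
  struct pair b = a, c = a, d = a;
  long n;

  a.car = &b.header;
  a.cdr = &c.header;
  b.car = &a.header;           /* cycle back to the root */
  c.cdr = &c.header;           /* self reference */
  n = mark_from_root (&a.header);
  return d.header.marked ? -2 : n;
}
```

Because the traversal is driven by data (the offsets) and uses its own
stack, deeply nested or cyclic structures do not consume C stack, which
is the property the KKCC section relies on. This sketch allocates and
frees its stack on every call, which is the cost the "Stack
optimization" TODO item refers to.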