Mercurial > hg > xemacs-beta
view tests/automated/lisp-reader-tests.el @ 5776:65d65b52d608
Pass character count from coding systems to buffer insertion code.
src/ChangeLog addition:
2014-01-16 Aidan Kehoe <kehoea@parhasard.net>
Pass character count information from the no-conversion and
unicode coding systems to the buffer insertion code, making
#'find-file on large buffers a little snappier (if
ERROR_CHECK_TEXT is not defined).
* file-coding.c:
* file-coding.c (coding_character_tell): New.
* file-coding.c (conversion_coding_stream_description): New.
* file-coding.c (no_conversion_convert):
Update characters_seen when decoding.
* file-coding.c (no_conversion_character_tell): New.
* file-coding.c (lstream_type_create_file_coding): Create the
no_conversion type with data.
* file-coding.c (coding_system_type_create):
Make the character_tell method available here.
* file-coding.h:
* file-coding.h (struct coding_system_methods):
Add a new character_tell() method, passing charcount information
from the coding systems to the buffer code, avoiding duplicate
bytecount-to-charcount work especially with large buffers.
* fileio.c (Finsert_file_contents_internal):
Update this to pass charcount information to
buffer_insert_string_1(), if that is available from the lstream code.
* insdel.c:
* insdel.c (buffer_insert_string_1):
Add a new CCLEN argument, giving the character count of the string
to insert. It can be -1 to indicate that te function should work
it out itself using bytecount_to_charcount(), as it used to.
* insdel.c (buffer_insert_raw_string_1):
* insdel.c (buffer_insert_lisp_string_1):
* insdel.c (buffer_insert_ascstring_1):
* insdel.c (buffer_insert_emacs_char_1):
* insdel.c (buffer_insert_from_buffer_1):
* insdel.c (buffer_replace_char):
Update these functions to use the new calling convention.
* insdel.h:
* insdel.h (buffer_insert_string):
Update this header to reflect the new buffer_insert_string_1()
argument.
* lstream.c (Lstream_character_tell): New.
Return the number of characters *read* and seen by the consumer so
far, taking into account the unget buffer, and buffered reading.
* lstream.c (Lstream_unread):
Update unget_character_count here as appropriate.
* lstream.c (Lstream_rewind):
Reset unget_character_count here too.
* lstream.h:
* lstream.h (struct lstream):
Provide the character_tell method, add a new field,
unget_character_count, giving the number of characters ever passed
to Lstream_unread().
Declare Lstream_character_tell().
Make Lstream_ungetc(), which happens to be unused, an inline
function rather than a macro, in the course of updating it to
modify unget_character_count.
* print.c (output_string):
Use the new argument to buffer_insert_string_1().
* tests.c:
* tests.c (Ftest_character_tell):
New test function.
* tests.c (syms_of_tests):
Make it available.
* unicode.c:
* unicode.c (struct unicode_coding_stream):
* unicode.c (unicode_character_tell):
New method.
* unicode.c (unicode_convert):
Update the character counter as appropriate.
* unicode.c (coding_system_type_create_unicode):
Make the character_tell method available.
author | Aidan Kehoe <kehoea@parhasard.net> |
---|---|
date | Thu, 16 Jan 2014 16:27:52 +0000 |
parents | cc7f8a0e569a |
children | ee27ca517e90 |
line wrap: on
line source
;; Copyright (C) 2005 Martin Kuehl. ;; Author: Martin Kuehl <martin.kuehl@gmail.com> ;; Maintainer: Martin Kuehl <martin.kuehl@gmail.com> ;; Created: 2005 ;; Keywords: tests ;; This file is part of XEmacs. ;; XEmacs is free software: you can redistribute it and/or modify it ;; under the terms of the GNU General Public License as published by the ;; Free Software Foundation, either version 3 of the License, or (at your ;; option) any later version. ;; XEmacs is distributed in the hope that it will be useful, but WITHOUT ;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ;; FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ;; for more details. ;; You should have received a copy of the GNU General Public License ;; along with XEmacs. If not, see <http://www.gnu.org/licenses/>. ;;; Synched up with: Not in FSF. ;;; Commentary: ;; Test the lisp reader. ;; See test-harness.el for instructions on how to run these tests. ;;; Raw Strings ;;; =========== ;; Equality to "traditional" strings ;; --------------------------------- (dolist (strings '((#r"xyz" "xyz") ; no backslashes (#r"\xyz" "\\xyz") ; backslash at start (#r"\\xyz" "\\\\xyz") ; backslashes at start (#r"\nxyz" "\\nxyz") ; escape seq. at start (#r"\"xyz" "\\\"xyz") ; quote at start (#r"xy\z" "xy\\z") ; backslash in middle (#r"xy\\z" "xy\\\\z") ; backslashes in middle (#r"xy\nz" "xy\\nz") ; escape seq. in middle (#r"xy\"z" "xy\\\"z") ; quote in middle ;;(#r"xyz\" "xyz\\") ; backslash at end: error (#r"xyz\\" "xyz\\\\") ; backslashes at end (#r"xyz\n" "xyz\\n") ; escape seq. at end (#r"xyz\"" "xyz\\\"") ; quote at end (#ru"\u00ABxyz" "\u00ABxyz") ; one Unicode escape (#rU"\U000000ABxyz" "\U000000ABxyz") ; another Unicode escape (#rU"xyz\u00AB" "xyz\u00AB") ; one Unicode escape )) (Assert (apply #'string= strings))) ;; Odd number of backslashes at the end ;; ------------------------------------ (dolist (string '("#r\"xyz\\\"" ; `#r"abc\"': escaped delimiter "#r\"xyz\\\\\\\"" ; `#r"abc\\\"': escaped delimiter )) (with-temp-buffer (insert string) (Check-Error end-of-file (eval-buffer)))) ;; Alternate string/regex delimiters ;; --------------------------------- (dolist (string '("#r/xyz/" ; Perl syntax "#r:ix/xyz/" ; Extended Perl syntax "#r|xyz|" ; TeX syntax "#r[xyz]" ; (uncommon) Perl syntax "#r<xyz>" ; Perl6 syntax? "#r(xyz)" ; arbitrary santax "#r{xyz}" ; arbitrary santax "#r,xyz," ; arbitrary santax "#r!xyz!" ; arbitrary santax )) (with-temp-buffer (insert string) (Check-Error-Message invalid-read-syntax "unrecognized raw string" (eval-buffer)))) (when (featurep 'bignum) ;; This failed, up to 20110501. (Assert (eql (1+ most-positive-fixnum) (read (format "+%d" (1+ most-positive-fixnum)))) "checking leading + is handled properly if reading a bignum") ;; This never did. (Assert (eql (1- most-positive-fixnum) (read (format "+%d" (1- most-positive-fixnum)))) "checking leading + is handled properly if reading a fixnum")) ;; Test print-circle. (let ((cons '#1=(1 2 3 4 5 6 . #1#)) (vector #2=[1 2 3 4 5 6 #2#]) (compiled-function #3=#[(argument) "\xc2\x09\x08\"\x87" [pi argument #3#] 3]) (char-table #4=#s(char-table :type generic :data (?\u0080 #4#))) (hash-table #5=#s(hash-table :test eql :data (a b c #5# e f))) (range-table #6=#s(range-table :type start-closed-end-open :data ((#x00 #xff) hello (#x100 #x1ff) #6# (#x200 #x2ff) everyone))) (print-readably t) (print-circle t) deserialized-cons deserialized-vector deserialized-compiled-function deserialized-char-table deserialized-hash-table deserialized-range-table) (Assert (eq (nthcdr 6 cons) cons) "checking basic recursive cons read properly") (Assert (eq vector (aref vector (1- (length vector)))) "checking basic recursive vector read properly") (Assert (eq compiled-function (find-if #'compiled-function-p (compiled-function-constants compiled-function))) "checking basic recursive compiled-function read properly") (Check-Error wrong-number-of-arguments (funcall compiled-function 3)) (Assert (eq char-table (get-char-table ?\u0080 char-table)) "checking basic recursive char table read properly") (Assert (eq hash-table (gethash 'c hash-table)) "checking basic recursive hash table read properly") (Assert (eq range-table (get-range-table #x180 range-table)) "checking basic recursive range table read properly") (setf (gethash 'g hash-table) cons (car cons) hash-table deserialized-hash-table (read (prin1-to-string hash-table))) (Assert (not (eq deserialized-hash-table hash-table)) "checking printing and reading hash-table creates a new object") (Assert (eq deserialized-hash-table (gethash 'c deserialized-hash-table)) "checking the lisp reader handles deserialized hash-table identity") (Assert (eq deserialized-hash-table (car (gethash 'g deserialized-hash-table))) "checking the reader handles deserialization identity, hash-table") (setf (get-char-table ?a char-table) cons (car cons) char-table deserialized-char-table (read (prin1-to-string char-table))) (Assert (not (eq deserialized-char-table char-table)) "checking printing and reading creates a new object") (Assert (eq deserialized-char-table (get-char-table ?\u0080 deserialized-char-table)) "checking the lisp reader handles deserialization identity") (Assert (eq deserialized-char-table (car (get-char-table ?a deserialized-char-table))) "checking the lisp reader handles deserialization identity, mixed") (put-range-table #x1000 #x1010 cons range-table) (setf (car cons) range-table deserialized-range-table (read (prin1-to-string range-table))) (Assert (not (eq deserialized-range-table range-table)) "checking printing and reading creates a new object") (Assert (eq deserialized-range-table (get-range-table #x101 deserialized-range-table)) "checking the lisp reader handles deserialization identity") (Assert (eq deserialized-range-table (car (get-range-table #x1001 deserialized-range-table))) "checking the lisp reader handles deserialization identity, mixed")) (when (featurep 'bignum) (Assert (null (list-length (read (format "#%d=(1 #1=(5) 3 4 . #%d#)" (+ most-positive-fixnum 2) (+ most-positive-fixnum 2))))) "checking bignum object labels don't wrap on reading"))