view tests/automated/case-tests.el @ 4690:257b468bf2ca

Move the #'query-coding-region implementation to C.

This is necessary because there is no reasonable way to access the
corresponding mswindows-multibyte functionality from Lisp, and we need such
functionality if we're going to have a reliable and portable
#'query-coding-region implementation. However, this change doesn't yet
provide #'query-coding-region for the mswindows-multibyte coding systems;
there should be no functional differences between an XEmacs with this
change and one without it.

src/ChangeLog addition:

2009-09-19  Aidan Kehoe  <kehoea@parhasard.net>

	Move the #'query-coding-region implementation to C.
	This is necessary because there is no reasonable way to access the
	corresponding mswindows-multibyte functionality from Lisp, and we
	need such functionality if we're going to have a reliable and
	portable #'query-coding-region implementation. However, this
	change doesn't yet provide #'query-coding-region for the
	mswindows-multibyte coding systems; there should be no functional
	differences between an XEmacs with this change and one without it.

	* mule-coding.c (struct fixed_width_coding_system):
	Add a new coding system type, fixed_width, and implement it. It
	uses the CCL infrastructure but has a much simpler creation API,
	and its own query_method, formerly in lisp/mule/mule-coding.el.
	* unicode.c:
	Move the Unicode query method implementation here from unicode.el.
	* lisp.h:
	Declare Fmake_coding_system_internal, Fcopy_range_table here.
	* intl-win32.c (complex_vars_of_intl_win32):
	Use Fmake_coding_system_internal, not Fmake_coding_system.
	* general-slots.h:
	Add Qsucceeded, Qunencodable, Qinvalid_sequence here.
	* file-coding.h (enum coding_system_variant):
	Add fixed_width_coding_system here.
	(struct coding_system_methods):
	Add query_method and query_lstream_method to the coding system
	methods.
	Provide flags for the query methods.
	Declare the default query method; initialise it correctly in
	INITIALIZE_CODING_SYSTEM_TYPE.
	* file-coding.c (default_query_method):
	New function, the default query method for coding systems that do
	not set it. Moved from coding.el.
	(make_coding_system_1):
	Accept new elements in PROPS in #'make-coding-system; aliases, a
	list of aliases; safe-chars and safe-charsets (these were
	previously accepted but not saved); and category.
	(Fmake_coding_system_internal):
	New function, what used to be #'make-coding-system. On Mule
	builds, we've now moved some of the functionality of this to Lisp.
	(Fcoding_system_canonical_name_p):
	Move this earlier in the file, since it's now called from within
	make_coding_system_1.
	(Fquery_coding_region):
	Move the implementation of this here, from coding.el.
	(complex_vars_of_file_coding):
	Call Fmake_coding_system_internal, not Fmake_coding_system;
	specify safe-charsets properties when we're a mule build.
	* extents.h (mouse_highlight_priority, Fset_extent_priority,
	Fset_extent_face, Fmap_extents):
	Make these available to other C files.

lisp/ChangeLog addition:

2009-09-19  Aidan Kehoe  <kehoea@parhasard.net>

	Move the #'query-coding-region implementation to C.

	* coding.el:
	Consolidate code that depends on the presence or absence of Mule
	at the end of this file.
	(default-query-coding-region, query-coding-region):
	Move these functions to C.
	(default-query-coding-region-safe-charset-skip-chars-map):
	Remove this variable; the corresponding C variable is
	Vdefault_query_coding_region_chartab_cache in file-coding.c.
	(query-coding-string): Update docstring to reflect actual multiple
	values, be more careful about not modifying a range table that
	we're currently mapping over.
	(encode-coding-char): Make the implementation of this simpler.
	(featurep 'mule): Autoload #'make-coding-system from
	mule/make-coding-system.el if we're a mule build; provide an
	appropriate compiler macro.
	Do various non-mule compatibility things if we're not a mule
	build.
	* update-elc.el (additional-dump-dependencies):
	Add mule/make-coding-system as a dump time dependency if we're a
	mule build.
	* unicode.el (ccl-encode-to-ucs-2):
	(decode-char):
	(encode-char):
	Move these earlier in the file, for the sake of some byte compile
	warnings.
	(unicode-query-coding-region):
	Move this to unicode.c.
	* mule/make-coding-system.el:
	New file, not dumped. Contains the functionality to rework the
	arguments necessary for fixed-width coding systems, and contains
	the implementation of #'make-coding-system, which now calls
	#'make-coding-system-internal.
	* mule/vietnamese.el (viscii):
	* mule/latin.el (iso-8859-2):
	(windows-1250):
	(iso-8859-3):
	(iso-8859-4):
	(iso-8859-14):
	(iso-8859-15):
	(iso-8859-16):
	(iso-8859-9):
	(macintosh):
	(windows-1252):
	* mule/hebrew.el (iso-8859-8):
	* mule/greek.el (iso-8859-7):
	(windows-1253):
	* mule/cyrillic.el (iso-8859-5):
	(koi8-r):
	(koi8-u):
	(windows-1251):
	(alternativnyj):
	(koi8-ru):
	(koi8-t):
	(koi8-c):
	(koi8-o):
	* mule/arabic.el (iso-8859-6):
	(windows-1256):
	Move all these coding systems to being of type fixed-width, not of
	type CCL. This allows the distinct query-coding-region for them to
	be in C, something which will eventually allow us to implement
	query-coding-region for the mswindows-multibyte coding systems.
	* mule/general-late.el (posix-charset-to-coding-system-hash):
	Document why we're pre-emptively persuading the byte compiler that
	the ELC for this file needs to be written using escape-quoted.
	Call #'set-unicode-query-skip-chars-args, now the Unicode
	query-coding-region implementation is in C.
	* mule/thai-xtis.el (tis-620):
	Don't bother checking whether we're XEmacs or not here.
	* mule/mule-coding.el:
	Move the eight bit fixed-width functionality from this file to
	make-coding-system.el.

tests/ChangeLog addition:

2009-09-19  Aidan Kehoe  <kehoea@parhasard.net>

	* automated/mule-tests.el:
	Check a coding system's type, not an 8-bit-fixed property, for
	whether that coding system should be treated as a fixed-width
	coding system.
	* automated/query-coding-tests.el:
	Don't test the query coding functionality for mswindows-multibyte
	coding systems; it's not yet implemented.
author Aidan Kehoe <kehoea@parhasard.net>
date Sat, 19 Sep 2009 22:53:13 +0100
parents 1982c8c55632
children 189fb67ca31a

;;; -*- coding: iso-8859-1 -*-

;; Copyright (C) 2000 Free Software Foundation, Inc.

;; Author: Yoshiki Hayashi  <yoshiki@xemacs.org>
;; Maintainer: Yoshiki Hayashi  <yoshiki@xemacs.org>
;; Created: 2000
;; Keywords: tests

;; This file is part of XEmacs.

;; XEmacs is free software; you can redistribute it and/or modify it
;; under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 2, or (at your option)
;; any later version.

;; XEmacs is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
;; General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with XEmacs; see the file COPYING.  If not, write to the Free
;; Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
;; 02111-1307, USA.

;;; Synched up with: Not in FSF.

;;; Commentary:

;; Test case-table related functionality.

(defvar pristine-case-table nil
  "The standard case table, without manipulation from case-tests.el.")

(setq pristine-case-table (or
			   ;; This is the compiled run; we've retained
			   ;; it from the interpreted run.
			   pristine-case-table 
			   ;; This is the interpreted run; set it.
			   (copy-case-table (standard-case-table))))

(Assert (case-table-p (standard-case-table)))
;; Old case table format: a four-element list of 256-character strings,
;; where trailing elements may be nil.
(Assert (case-table-p (list
		       (make-string 256 ?a)
		       nil nil nil)))
(Assert (case-table-p (list
		       (make-string 256 ?a)
		       (make-string 256 ?b)
		       nil nil)))
(Assert (case-table-p (list
		       (make-string 256 ?a)
		       (make-string 256 ?b)
		       (make-string 256 ?c)
		       nil)))
(Assert (case-table-p (list
		       (make-string 256 ?a)
		       (make-string 256 ?b)
		       (make-string 256 ?c)
		       (make-string 256 ?d))))
(Assert (not (case-table-p (list (make-string 256 ?a)
				 (make-string 256 ?b)
				 (make-string 256 ?c)
				 (make-string 254 ?d)))))
(Assert (not (case-table-p (list (make-string 256 ?a)))))

(Assert (case-table-p (set-case-table (current-case-table))))

(defvar string-0-through-32
  (let ((result (make-string 33 (int-to-char 0))))
    (dotimes (i 33)
      (aset result i (int-to-char i)))
    result)
  "String containing characters from code point 0 (NUL) through 32 (SPC).")

(defvar string-127-through-160
  (let ((result (make-string 34 (int-to-char 0))))
    (dotimes (i 34)
      (aset result i (int-to-char (+ 127 i))))
    result)
  "String containing characters from code point 127 (DEL) through 160
\(no-break-space).")

;; Case table sanity check.
(let ((downcase-string
       (concat string-0-through-32
	       "!\"#$%&'()*+,-./0123456789:;<=>?@abcdefghijklmnopqrstuvwxyz[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~"
	       string-127-through-160
		"¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿àáâãäåæçèéêëìíîïðñòóôõö×øùúûüýþßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ"))
       (upcase-string
	(concat string-0-through-32
		"!\"#$%&'()*+,-./0123456789:;<=>?@abcdefghijklmnopqrstuvwxyz[\\]^_`ABCDEFGHIJKLMNOPQRSTUVWXYZ{|}~"
		string-127-through-160
		"¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿àáâãäåæçèéêëìíîïðñòóôõö×øùúûüýþßÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ÷ØÙÚÛÜÝÞÿ"))
       (table (standard-case-table)))
  (dotimes (i 256)
    (Assert (eq (get-case-table 'downcase (int-to-char i) table)
		(aref downcase-string i)))
    (Assert (eq (get-case-table 'upcase (int-to-char i) table)
		(aref upcase-string i)))))

(Check-Error-Message error "Char case must be downcase or upcase"
		     (get-case-table 'foo ?a (standard-case-table)))

(Assert
 (string=
  (upcase "!\"#$%&'()*+,-./0123456789:;<=>?@abcdefghijklmnopqrstuvwxyz")
  "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ"))

(Assert
 (string=
  (upcase "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ")
  "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ"))

(Assert
 (string=
  (upcase " ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿àáâãäåæçèéêëìíîïðñòóôõö×øùúûüýþßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ")
  " ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ÷ØÙÚÛÜÝÞÿ"))

(Assert
 (string=
  (upcase " ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ÷ØÙÚÛÜÝÞÿ")
  " ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ÷ØÙÚÛÜÝÞÿ"))

(Assert
 (string=
  (downcase "!\"#$%&'()*+,-./0123456789:;<=>?@abcdefghijklmnopqrstuvwxyz")
  "!\"#$%&'()*+,-./0123456789:;<=>?@abcdefghijklmnopqrstuvwxyz"))

(Assert
 (string=
  (downcase "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ")
  "!\"#$%&'()*+,-./0123456789:;<=>?@abcdefghijklmnopqrstuvwxyz"))

(Assert
 (string=
  (downcase " ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿àáâãäåæçèéêëìíîïðñòóôõö×øùúûüýþßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ")
  " ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿àáâãäåæçèéêëìíîïðñòóôõö×øùúûüýþßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ"))

(Assert
 (string=
  (downcase " ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ÷ØÙÚÛÜÝÞÿ")
  " ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿àáâãäåæçèéêëìíîïðñòóôõö×øùúûüýþßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ"))

;; Old case table format test.
(with-temp-buffer
  (set-case-table
   (list
    (concat string-0-through-32
	     "!\"#$%&'()*+,-./0123456789:;<=>?@abcdefghijklmnopqrstuvwxyz[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~"
	     string-127-through-160
	     "¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿àáâãäåæçèéêëìíîïðñòóôõö×øùúûüýþßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ")
     nil nil nil))
  (Assert
   (string=
    (upcase "!\"#$%&'()*+,-./0123456789:;<=>?@abcdefghijklmnopqrstuvwxyz")
    "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ"))
  (Assert
   (string=
    (downcase "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ")
    "!\"#$%&'()*+,-./0123456789:;<=>?@abcdefghijklmnopqrstuvwxyz")))

(with-temp-buffer
  (insert "Test Buffer")
  (let ((case-fold-search t))
    (goto-char (point-min))
    (Assert (eq (search-forward "test buffer" nil t) 12))
    (goto-char (point-min))
    (Assert (eq (search-forward "Test buffer" nil t) 12))
    (goto-char (point-min))
    (Assert (eq (search-forward "Test Buffer" nil t) 12))

    (setq case-fold-search nil)
    (goto-char (point-min))
    (Assert (not (search-forward "test buffer" nil t)))
    (goto-char (point-min))
    (Assert (not (search-forward "Test buffer" nil t)))
    (goto-char (point-min))
    (Assert (eq (search-forward "Test Buffer" nil t) 12))))

(with-temp-buffer
  (insert "abcdefghijklmnäopqrstuÄvwxyz")
  ;; case insensitive
  (Assert (not (search-forward "ö" nil t)))
  (goto-char (point-min))
  (Assert (eq 16 (search-forward "ä" nil t)))
  (Assert (eq 24 (search-forward "ä" nil t)))
  (goto-char (point-min))
  (Assert (eq 16 (search-forward "Ä" nil t)))
  (Assert (eq 24 (search-forward "Ä" nil t)))
  (goto-char (point-max))
  (Assert (eq 23 (search-backward "ä" nil t)))
  (Assert (eq 15 (search-backward "ä" nil t)))
  (goto-char (point-max))
  (Assert (eq 23 (search-backward "Ä" nil t)))
  (Assert (eq 15 (search-backward "Ä" nil t)))
  ;; case sensitive
  (setq case-fold-search nil)
  (goto-char (point-min))
  (Assert (not (search-forward "ö" nil t)))
  (goto-char (point-min))
  (Assert (eq 16 (search-forward "ä" nil t)))
  (Assert (not (search-forward "ä" nil t)))
  (goto-char (point-min))
  (Assert (eq 24 (search-forward "Ä" nil t)))
  (goto-char 16)
  (Assert (eq 24 (search-forward "Ä" nil t)))
  (goto-char (point-max))
  (Assert (eq 15 (search-backward "ä" nil t)))
  (goto-char 15)
  (Assert (not (search-backward "ä" nil t)))
  (goto-char (point-max))
  (Assert (eq 23 (search-backward "Ä" nil t)))
  (Assert (not (search-backward "Ä" nil t))))

(with-temp-buffer
  (insert "aaaaäÄäÄäÄäÄäÄbbbb")
  (goto-char (point-min))
  (Assert (eq 15 (search-forward "ää" nil t 5)))
  (goto-char (point-min))
  (Assert (not (search-forward "ää" nil t 6)))
  (goto-char (point-max))
  (Assert (eq 5 (search-backward "ää" nil t 5)))
  (goto-char (point-max))
  (Assert (not (search-backward "ää" nil t 6))))

(when (featurep 'mule)
  (let* ((hiragana-a (make-char 'japanese-jisx0208 36 34))
	 (a-diaeresis ?ä)
	 (case-table (copy-case-table (standard-case-table)))
	 (str-hiragana-a (char-to-string hiragana-a))
	 (str-a-diaeresis (char-to-string a-diaeresis))
	 (string (concat str-hiragana-a str-a-diaeresis)))
    (put-case-table-pair hiragana-a a-diaeresis case-table)
    (with-temp-buffer
      (set-case-table case-table)
      (insert hiragana-a "abcdefg" a-diaeresis)
      ;; forward
      (goto-char (point-min))
      (Assert (not (search-forward "ö" nil t)))
      (goto-char (point-min))
      (Assert (eq 2 (search-forward str-hiragana-a nil t)))
      (goto-char (point-min))
      (Assert (eq 2 (search-forward str-a-diaeresis nil t)))
      (goto-char (1+ (point-min)))
      (Assert (eq (point-max)
		  (search-forward str-hiragana-a nil t)))
      (goto-char (1+ (point-min)))
      (Assert (eq (point-max)
		  (search-forward str-a-diaeresis nil t)))
      ;; backward
      (goto-char (point-max))
      (Assert (not (search-backward "ö" nil t)))
      (goto-char (point-max))
      (Assert (eq (1- (point-max)) (search-backward str-hiragana-a nil t)))
      (goto-char (point-max))
      (Assert (eq (1- (point-max)) (search-backward str-a-diaeresis nil t)))
      (goto-char (1- (point-max)))
      (Assert (eq 1 (search-backward str-hiragana-a nil t)))
      (goto-char (1- (point-max)))
      (Assert (eq 1 (search-backward str-a-diaeresis nil t)))
      (replace-match "a")
      (Assert (looking-at (format "abcdefg%c" a-diaeresis))))
    (with-temp-buffer
      (set-case-table case-table)
      (insert string)
      (insert string)
      (insert string)
      (insert string)
      (insert string)
      (goto-char (point-min))
      (Assert (eq 11 (search-forward string nil t 5)))
      (goto-char (point-min))
      (Assert (not (search-forward string nil t 6)))
      (goto-char (point-max))
      (Assert (eq 1 (search-backward string nil t 5)))
      (goto-char (point-max))
      (Assert (not (search-backward string nil t 6))))))

;; Bug reported in http://mid.gmane.org/y9lk5lu5orq.fsf@deinprogramm.de from
;; Michael Sperber. Fixed 2008-01-29.
(with-string-as-buffer-contents "\n\nDer beruhmte deutsche Flei\xdf\n\n"
  (goto-char (point-min))
  (Assert (search-forward "Flei\xdf")))

(with-temp-buffer
  (let ((target "M\xe9zard")
        (debug-xemacs-searches 1))
    (Assert (not (search-forward target nil t)))
    (insert target)
    (goto-char (point-min))
    ;; #### search-algorithm-used is simple-search after the following,
    ;; which shouldn't be necessary; it should be possible to use
    ;; Boyer-Moore.
    ;;
    ;; But searches for ASCII strings in buffers with nothing above ?\xFF
    ;; use Boyer-Moore with the current implementation, which is the
    ;; important thing for the Gnus use case.
    (Assert (= (1+ (length target)) (search-forward target nil t)))))

(Skip-Test-Unless
 (boundp 'debug-xemacs-searches) ; normal when we have DEBUG_XEMACS
 "not a DEBUG_XEMACS build"
 "checks that the algorithm chosen by #'search-forward is relatively sane"
 (let ((debug-xemacs-searches 1))
   (with-temp-buffer
     (set-case-table pristine-case-table)
     (insert "\n\nDer beruhmte deutsche Fleiss\n\n")
     (goto-char (point-min))
     (Assert (search-forward "Fleiss"))
     (delete-region (point-min) (point-max))
     (insert "\n\nDer beruhmte deutsche Flei\xdf\n\n")
     (goto-char (point-min))
     (Assert (search-forward "Flei\xdf"))
     (Assert (eq 'boyer-moore search-algorithm-used))
     (delete-region (point-min) (point-max))
     (when (featurep 'mule)
       (insert "\n\nDer beruhmte deutsche Flei\xdf\n\n")
       (goto-char (point-min))
       (Assert 
        (search-forward (format "Fle%c\xdf"
                                (make-char 'latin-iso8859-9 #xfd))))
       (Assert (eq 'boyer-moore search-algorithm-used))
       (insert (make-char 'latin-iso8859-9 #xfd))
       (goto-char (point-min))
       (Assert (search-forward "Flei\xdf"))
       (Assert (eq 'simple-search search-algorithm-used)) 
       (goto-char (point-min))
       (Assert (search-forward (format "Fle%c\xdf"
                                       (make-char 'latin-iso8859-9 #xfd))))
       (Assert (eq 'simple-search search-algorithm-used))))))