changeset 4989:d2ec55325515

make utf-8 default for Cygwin 1.7, rewrite init code determining default coding systems

-------------------- ChangeLog entries follow: --------------------

lisp/ChangeLog addition:

2010-02-06  Ben Wing  <ben@xemacs.org>

	* code-init.el:
	* code-init.el (set-eol-detection):
	* code-init.el (coding-system-current-system-configuration):
	* code-init.el (coding-system-default-configuration-table): New.
	* code-init.el (no-mule-no-eol-detection):
	* code-init.el (define-coding-system-default-configuration): New.
	* code-init.el (coding-system-variable-default-value-table): Removed.
	* code-init.el (no-mule-eol-detection):
	* code-init.el (coding-system-default-configuration-list): Removed.
	* code-init.el (coding-system-default-variable-list):
	* code-init.el (get-coding-system-variable):
	* code-init.el (set-coding-system-variable):
	* code-init.el (coding-system-variable-default-value):
	* code-init.el (reset-coding-categories-to-default):
	Significant clean-up, add Cygwin-UTF-8 support.

	1. Shorten the names of the coding system variables to follow
	   what used to be considered the "abbreviations":

	   default-process-coding-system-read	->	process-read
	   default-process-coding-system-write	->	process-write
	   buffer-file-coding-system-for-read	->	bfcs-for-read
	   default-buffer-file-coding-system	->	default-bfcs
	   no-conversion-coding-system-mapping	->	no-conv-cs

	2. Instead of listing all the defaults in a big, strangely organized
	   table, use a new function
	   `define-coding-system-default-configuration' to define a
	   particular configuration.  This uses a hash table stored in
	   `coding-system-default-configuration-table'.  Rewrite
	   `coding-system-variable-default-value' appropriately.

	3. Rename configurations to eliminate `unix' from the name:

	   unix-no-mule-no-eol-detection      ->     no-mule-no-eol-detection
	   unix-no-mule-eol-detection	      ->     no-mule-eol-detection
	   unix-mule			      ->     mule

	   This is because these are really for all systems but Windows,
	   not just Unix.

	4. Add configuration `cygwin-utf-8', enabled when (featurep
	   'cygwin-use-utf-8).  Uses `utf-8' for all defaults except for
	   `bfcs-for-read', which is `undecided'.
author Ben Wing <ben@xemacs.org>
date Sat, 06 Feb 2010 03:59:18 -0600
parents c914214b788d
children 8f0cf4fd3d2c
files lisp/ChangeLog lisp/code-init.el
diffstat 2 files changed, 188 insertions(+), 85 deletions(-)
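
Item 2 of the description replaces a positional table with a hash table that maps each configuration name to a property list. The following is a minimal sketch of that layout, not the code from this changeset. The `sketch-' names are hypothetical and chosen so that nothing in code-init.el is redefined. It uses `append' and `plist-get' where the changeset uses `nconc' and `getf', and it lists only four of the eight variables.

```elisp
;; Sketch only: hypothetical `sketch-' names, not the definitions in
;; code-init.el.

(defvar sketch-configuration-table (make-hash-table :test 'eq)
  "Maps a configuration name (a symbol) to a plist of default values.")

(defun sketch-define-configuration (name doc props)
  "Register configuration NAME with description DOC and plist PROPS.
`append' copies the new head, so the quoted PROPS list is not modified."
  (puthash name (append (list 'doc doc) props)
           sketch-configuration-table))

(sketch-define-configuration
 'cygwin-utf-8
 "Cygwin 1.7 or later."
 '(bfcs-for-read undecided
   default-bfcs  utf-8
   process-read  utf-8
   process-write utf-8))

(defun sketch-default-value (var config)
  "Return the default for VAR under CONFIG.
Signal an error if CONFIG was never registered."
  (let ((props (gethash config sketch-configuration-table)))
    (or props (error "Invalid coding system configuration: %S" config))
    (plist-get props var)))

(sketch-default-value 'bfcs-for-read 'cygwin-utf-8) ; => undecided
(sketch-default-value 'process-write 'cygwin-utf-8) ; => utf-8
```

With this layout, adding a configuration is a single self-contained form, and a lookup no longer depends on the position of the configuration in a separate list.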
--- a/lisp/ChangeLog	Wed Feb 03 02:56:21 2010 -0600
+++ b/lisp/ChangeLog	Sat Feb 06 03:59:18 2010 -0600
@@ -1,3 +1,50 @@
+2010-02-06  Ben Wing  <ben@xemacs.org>
+
+	* code-init.el:
+	* code-init.el (set-eol-detection):
+	* code-init.el (coding-system-current-system-configuration):
+	* code-init.el (coding-system-default-configuration-table): New.
+	* code-init.el (no-mule-no-eol-detection):
+	* code-init.el (define-coding-system-default-configuration): New.
+	* code-init.el (coding-system-variable-default-value-table): Removed.
+	* code-init.el (no-mule-eol-detection):
+	* code-init.el (coding-system-default-configuration-list): Removed.
+	* code-init.el (coding-system-default-variable-list):
+	* code-init.el (get-coding-system-variable):
+	* code-init.el (set-coding-system-variable):
+	* code-init.el (coding-system-variable-default-value):
+	* code-init.el (reset-coding-categories-to-default):
+	Significant clean-up, add Cygwin-UTF-8 support.
+
+	1. Shorten the names of the coding system variables to follow
+	   what used to be considered the "abbreviations":
+
+	   default-process-coding-system-read	->	process-read
+	   default-process-coding-system-write	->	process-write
+	   buffer-file-coding-system-for-read	->	bfcs-for-read
+	   default-buffer-file-coding-system	->	default-bfcs
+	   no-conversion-coding-system-mapping	->	no-conv-cs
+
+	2. Instead of listing all the defaults in a big, strangely organized
+	   table, use a new function
+	   `define-coding-system-default-configuration' to define a
+	   particular configuration.  This uses a hash table stored in
+	   `coding-system-default-configuration-table'.  Rewrite
+	   `coding-system-variable-default-value' appropriately.
+
+	3. Rename configurations to eliminate `unix' from the name:
+
+	   unix-no-mule-no-eol-detection      ->     no-mule-no-eol-detection
+	   unix-no-mule-eol-detection	      ->     no-mule-eol-detection
+	   unix-mule			      ->     mule
+
+	   This is because these are really for all systems but Windows,
+	   not just Unix.
+
+	4. Add configuration `cygwin-utf-8', enabled when (featurep
+	   'cygwin-use-utf-8).  Uses `utf-8' for all defaults except for
+	   `bfcs-for-read', which is `undecided'.
+
 2010-02-01  Aidan Kehoe  <kehoea@parhasard.net>
 
 	* loadhist.el (symbol-file):
--- a/lisp/code-init.el	Wed Feb 03 02:56:21 2010 -0600
+++ b/lisp/code-init.el	Sat Feb 06 03:59:18 2010 -0600
@@ -1,6 +1,6 @@
 ;;; code-init.el --- Handle coding system default values
 
-;; Copyright (C) 2001, 2002, 2003 Ben Wing.
+;; Copyright (C) 2001, 2002, 2003, 2010 Ben Wing.
 
 ;; This file is part of XEmacs.
 
@@ -21,6 +21,8 @@
 
 ;;; Commentary:
 
+;; Author: Ben Wing, 2001?
+
 ;; Placed in a separate file so it can be loaded after the various
 ;; coding systems have been created, because we'll be using them at
 ;; load time.
@@ -60,62 +62,112 @@
 default, but this may change.  NOTE: You *REALLY* should not turn off EOL
 detection on Windows!  Your files will have lots of annoying ^M's in them
 if you do this."
-  (dolist (x '(buffer-file-coding-system-for-read
+  (dolist (x '(bfcs-for-read
 	       keyboard
-	       default-process-coding-system-read
-	       no-conversion-coding-system-mapping))
+	       process-read
+	       no-conv-cs))
     (set-coding-system-variable
      x (coding-system-change-eol-conversion (get-coding-system-variable x)
 					    (if flag nil 'lf)))))
 
 (defun coding-system-current-system-configuration ()
-  (cond ((memq system-type '(windows-nt cygwin32))
+  "Return the name of the default coding system configuration that applies."
+  (cond ((featurep 'cygwin-use-utf-8) 'cygwin-utf-8)
+	((memq system-type '(windows-nt cygwin32))
 	 (if (featurep 'mule) 'windows-mule 'windows-no-mule))
-	((featurep 'mule) 'unix-mule)
-	(eol-detection-enabled-p 'unix-no-mule-eol-detection)
-	(t 'unix-no-mule-no-eol-detection)))
+	((featurep 'mule) 'mule)
+	(eol-detection-enabled-p 'no-mule-eol-detection)
+	(t 'no-mule-no-eol-detection)))
+
+(defvar coding-system-default-configuration-table (make-hash-table))
+
+(defun define-coding-system-default-configuration (name doc props)
+  (puthash name (nconc `(doc ,doc) props)
+	   coding-system-default-configuration-table))
 
 ;; NOTE NOTE NOTE: These values may get overridden when the language
 ;; environment is initialized (set-language-environment-coding-systems).
-(defvar coding-system-variable-default-value-table
-  '((buffer-file-coding-system-for-read
-     binary raw-text undecided raw-text undecided)
-    (default-buffer-file-coding-system
-      ;; #### iso-2022-8 with no eol specified?  can that be OK?
-      binary binary iso-2022-8 raw-text-dos mswindows-multibyte-dos)
-    (native
-     binary binary binary raw-text-dos mswindows-multibyte-system-default-dos)
-    (keyboard
-     binary raw-text undecided-unix raw-text undecided-unix)
-    ;; the `terminal' coding system is used for output to stderr.  such
-    ;; streams do automatic lf->crlf encoding in the C library, so we need
-    ;; to not do the same translations ourselves.
-    (terminal
-     binary binary binary binary mswindows-multibyte-unix)
-    (default-process-coding-system-read
-      binary raw-text undecided raw-text undecided)
-    (default-process-coding-system-write
-      binary binary binary raw-text mswindows-multibyte-system-default)
-    (no-conversion-coding-system-mapping
-     binary raw-text raw-text raw-text mswindows-multibyte)
-    ))
+(define-coding-system-default-configuration
+  'no-mule-no-eol-detection
+  "No Mule support, EOL detection not enabled."
+  '(bfcs-for-read	binary
+    default-bfcs	binary
+    process-read	binary
+    process-write	binary
+    keyboard		binary
+    native		binary
+    no-conv-cs		binary
+    terminal		binary))
+
+(define-coding-system-default-configuration
+  'no-mule-eol-detection
+  "No Mule support, EOL detection enabled."
+  '(bfcs-for-read	raw-text
+    default-bfcs	binary
+    process-read	raw-text
+    process-write	binary
+    keyboard		raw-text
+    native		binary
+    no-conv-cs		raw-text
+    terminal		binary))
+
+(define-coding-system-default-configuration
+  'mule
+  "Mule support enabled."
+  '(bfcs-for-read	undecided
+    default-bfcs	iso-2022-8
+    process-read	undecided
+    process-write	binary
+    keyboard		undecided-unix
+    native		binary
+    no-conv-cs		raw-text
+    terminal		binary))
 
-(defvar coding-system-default-configuration-list
-  '(unix-no-mule-no-eol-detection
-    unix-no-mule-eol-detection
-    unix-mule
-    windows-no-mule
-    windows-mule))
+(define-coding-system-default-configuration
+  'windows-no-mule
+  "Microsoft Windows, no Mule support."
+  '(bfcs-for-read	raw-text
+    default-bfcs	raw-text-dos
+    process-read	raw-text
+    process-write	raw-text
+    keyboard		raw-text
+    native		raw-text-dos
+    no-conv-cs		raw-text
+    terminal		binary))
+
+(define-coding-system-default-configuration
+  'windows-mule
+  "Microsoft Windows, Mule support enabled."
+  '(bfcs-for-read	undecided
+    default-bfcs	mswindows-multibyte-dos
+    process-read	undecided
+    process-write	mswindows-multibyte-system-default
+    keyboard		undecided-unix
+    native		mswindows-multibyte-system-default-dos
+    no-conv-cs		mswindows-multibyte
+    terminal		mswindows-multibyte-unix))
+
+(define-coding-system-default-configuration
+  'cygwin-utf-8
+  "Cygwin 1.7 or later, which uses UTF-8 consistently."
+  '(bfcs-for-read	undecided
+    default-bfcs	utf-8
+    process-read	utf-8
+    process-write	utf-8
+    keyboard		utf-8
+    native		utf-8
+    no-conv-cs		utf-8
+    terminal		utf-8))
 
 (defvar coding-system-default-variable-list
-  '(buffer-file-coding-system-for-read
-    default-buffer-file-coding-system
+  '(bfcs-for-read
+    default-bfcs
     native
     keyboard
     terminal
-    default-process-coding-system-read
-    default-process-coding-system-write
-    no-conversion-coding-system-mapping))
+    process-read
+    process-write
+    no-conv-cs))
 
 (defun get-coding-system-variable (var)
   "Return the value of a basic coding system variable.
@@ -124,15 +176,15 @@
 `coding-system-variable-default-value' for a list of the possible values of
 VAR."
   (case var
-    (buffer-file-coding-system-for-read buffer-file-coding-system-for-read)
-    (default-buffer-file-coding-system
+    (bfcs-for-read buffer-file-coding-system-for-read)
+    (default-bfcs
       (default-value 'buffer-file-coding-system))
     (native (coding-system-aliasee 'native))
     (keyboard (coding-system-aliasee 'keyboard))
     (terminal (coding-system-aliasee 'terminal))
-    (default-process-coding-system-read (car default-process-coding-system))
-    (default-process-coding-system-write (cdr default-process-coding-system))
-    (no-conversion-coding-system-mapping
+    (process-read (car default-process-coding-system))
+    (process-write (cdr default-process-coding-system))
+    (no-conv-cs
      (coding-category-system 'no-conversion))
     (t (error 'invalid-constant "Invalid coding system variable" var))))
 
@@ -143,20 +195,20 @@
 `coding-system-variable-default-value' for a list of the possible values of
 VAR."
   (case var
-    (buffer-file-coding-system-for-read
+    (bfcs-for-read
      (set-buffer-file-coding-system-for-read value))
-    (default-buffer-file-coding-system
+    (default-bfcs
       (set-default-buffer-file-coding-system value))
     (native (define-coding-system-alias 'native value))
     (keyboard (set-keyboard-coding-system value))
     (terminal (set-terminal-coding-system value))
-    (default-process-coding-system-read
+    (process-read
       (setq default-process-coding-system
 	    (cons value (cdr default-process-coding-system))))
-    (default-process-coding-system-write
+    (process-write
       (setq default-process-coding-system
 	    (cons (car default-process-coding-system) value)))
-    (no-conversion-coding-system-mapping
+    (no-conv-cs
      (set-coding-category-system 'no-conversion value))
     (t (error 'invalid-constant "Invalid coding system variable" var))))
 
@@ -170,28 +222,29 @@
 The table of default values looks like this: (see below for abbreviations)
 
 
-               Unix    Unix+EOL  Unix+Mule       MSW           MSW+Mule
------------------------------------------------------------------------------
-bfcs-for-read  binary  raw-text  undecided       raw-text      undecided
-default bfcs   binary  binary    iso-2022-8      raw-text-dos  MSW-MB-dos
-native         binary  binary    binary          raw-text-dos  MSW-MB-SD-dos
-keyboard       binary  raw-text  undecided-unix  raw-text      undecided-unix
-terminal       binary  binary    binary          binary        MSW-MB-unix
-process-read   binary  raw-text  undecided       raw-text      undecided
-process-write  binary  binary    binary          raw-text      MSW-MB-SD
-no-conv-cs     binary  raw-text  raw-text        raw-text      MSW-MB
+              NoMule NoMuleEOL Mule       MSW          MSWMule       CygUTF 
+------------------------------------------------------------------------------
+bfcs-for-read binary raw-text undecided   raw-text     undecided     undecided
+default-bfcs  binary binary   iso-2022-8  raw-text-dos MSW-MB-dos    utf-8
+native        binary binary   binary      raw-text-dos MSW-MB-SD-dos utf-8
+keyboard      binary raw-text undecided-  raw-text     undecided-    utf-8
+                                unix                     unix
+terminal      binary binary   binary      binary       MSW-MB-unix   utf-8
+process-read  binary raw-text undecided   raw-text     undecided     utf-8
+process-write binary binary   binary      raw-text     MSW-MB-SD     utf-8
+no-conv-cs    binary raw-text raw-text    raw-text     MSW-MB        utf-8
 
 
-VAR can be one of: (abbreviations in parens)
+VAR can be one of:
 
-`buffer-file-coding-system-for-read' (bfcs-for-read)
+`bfcs-for-read'
 
   Lisp variable of the same name; the default coding system used when
   reading in a file, in the absence of more specific settings. (See
   `insert-file-contents' for a description of exactly how a file's
   coding system is determined when it's read in.)
 
-`default-buffer-file-coding-system' (default bfcs)
+`default-bfcs'
 
   Default value of `buffer-file-coding-system', the buffer-local
   variable specifying a file's coding system to be used when it is
@@ -201,58 +254,61 @@
   system used to read the file in; the default value applies to newly
   created files.
 
-`native' (native)
+`native'
 
   The coding system named `native'.  Changed using
   `define-coding-system-alias'.  Used internally when passing
   text to or from system API's, unless the particular
   API specifies another coding system.
 
-`keyboard' (keyboard)
+`keyboard'
 
  #### fill in
 
-`terminal' (terminal)
+`terminal'
 
  #### fill in
 
-`default-process-coding-system-read' (process-read)
+`process-read'
 
  #### fill in
 
-`default-process-coding-system-write' (process-write)
+`process-write'
 
  #### fill in
 
-`no-conversion-coding-system-mapping' (no-conv-cs)
+`no-conv-cs'
 
   Coding system used when category `no-conversion' is detected.
 
 
 CONFIG is one of: (abbreviations in parens)
 
-`unix-no-mule-no-eol-detection' (Unix)
+`no-mule-no-eol-detection' (NoMule)
 
-Unix, no Mule support, no automatic EOL detection. (Controlled by
+Non-Windows, no Mule support, no automatic EOL detection. (Controlled by
 `eol-detection-enabled-p', which is set by the command-line flag
 -enable-eol-detection or the configure flag --with-default-eol-detection.)
 
-`unix-no-mule-eol-detection' (Unix+EOL)
+`no-mule-eol-detection' (NoMuleEOL)
 
-Unix, no Mule support, automatic EOL detection.
+Non-Windows, no Mule support, automatic EOL detection.
 
-`unix-mule' (Unix+Mule)
+`mule' (Mule)
 
-Unix, Mule support.
+Non-Windows, Mule support.
 
 `windows-no-mule' (MSW)
 
-MS Windows or Cygwin, no Mule support.
+MS Windows or old Cygwin, no Mule support.
+
+`windows-mule' (MSWMule)
 
-`windows-mule'. (MSW+Mule)
+MS Windows or old Cygwin, Mule support.
 
-MS Windows or Cygwin, Mule support.
+`cygwin-utf-8' (CygUTF)
 
+Cygwin 1.7 or later, which uses UTF-8 consistently.
 
 The following coding system abbreviations are also used in the table:
 
@@ -260,12 +316,12 @@
 MSW-MB = mswindows-multibyte-system-default
 "
   (setq config (or config (coding-system-current-system-configuration)))
-  (let ((defs (cdr (assq var coding-system-variable-default-value-table))))
-    (or defs (error 'invalid-constant "Invalid coding system variable" var))
-    (let ((pos (position config coding-system-default-configuration-list)))
-      (or pos (error 'invalid-constant "Invalid coding system configuration"
+  (or (memq var coding-system-default-variable-list)
+      (error 'invalid-constant "Invalid coding system variable" var))
+  (let ((props (gethash config coding-system-default-configuration-table)))
+    (or props (error 'invalid-constant "Invalid coding system configuration"
 		     config))
-      (nth pos defs))))
+    (getf props var)))
 
 (defun reset-coding-system-defaults (&optional config)
   "Reset all basic coding system variables to their default values.
@@ -332,7 +388,7 @@
     )
   (set-coding-category-system
    'no-conversion
-   (coding-system-variable-default-value 'no-conversion-coding-system-mapping))
+   (coding-system-variable-default-value 'no-conv-cs))
   (set-coding-category-system 'ucs-4 'ucs-4)
   (set-coding-category-system 'utf-8 'utf-8)
   (set-coding-category-system 'utf-8-bom 'utf-8-bom)
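
The order of the tests in `coding-system-current-system-configuration' matters: the `cygwin-use-utf-8' feature is checked before `system-type', so a Cygwin 1.7 build gets `cygwin-utf-8' even though its system type would otherwise select a Windows configuration. The sketch below restates that decision as a pure function so the precedence can be tried in isolation. The function name and its three parameters are hypothetical; the real function takes no arguments and reads `featurep', `system-type' and `eol-detection-enabled-p' directly.

```elisp
;; Sketch only: a hypothetical, argument-driven restatement of the
;; `cond' in `coding-system-current-system-configuration'.

(defun sketch-pick-configuration (features system eol-detection)
  "Return the configuration symbol for the given inputs.
FEATURES is a list of feature symbols, SYSTEM is a symbol such as
the value of `system-type', and EOL-DETECTION is non-nil when EOL
detection is enabled."
  (cond ((memq 'cygwin-use-utf-8 features) 'cygwin-utf-8)
        ((memq system '(windows-nt cygwin32))
         (if (memq 'mule features) 'windows-mule 'windows-no-mule))
        ((memq 'mule features) 'mule)
        (eol-detection 'no-mule-eol-detection)
        (t 'no-mule-no-eol-detection)))

;; The UTF-8 feature wins over the Cygwin system type.
(sketch-pick-configuration '(mule cygwin-use-utf-8) 'cygwin32 nil)
;; => cygwin-utf-8

;; Without that feature, Cygwin falls back to a Windows configuration.
(sketch-pick-configuration '(mule) 'cygwin32 nil)
;; => windows-mule

;; EOL detection is consulted only when Mule is absent.
(sketch-pick-configuration nil 'linux t)
;; => no-mule-eol-detection
```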