view src/README.global-renaming @ 5882:bbe4146603db

author Aidan Kehoe <kehoea@parhasard.net>
date Wed, 01 Apr 2015 14:28:20 +0100
parents 2aa9cd456ae7

README.global-renaming

This file documents the generic scripts that have been used to implement
the recent type renamings, e.g. the "great integral type renaming" and the
"text/char type renaming".  More information about these changes can be
found in the Internals manual.

A sample script to do such renaming is this (used in the great integral
type renaming):

----------------------------------- cut ------------------------------------
files="*.[ch] s/*.h m/*.h config.h.in ../configure.in Makefile.in.in ../lib-src/*.[ch] ../lwlib/*.[ch]"
gr Memory_Count Bytecount $files
gr Lstream_Data_Count Bytecount $files
gr Element_Count Elemcount $files
gr Hash_Code Hashcode $files
gr extcount bytecount $files
gr bufpos charbpos $files
gr bytind bytebpos $files
gr memind membpos $files
gr bufbyte intbyte $files
gr Extcount Bytecount $files
gr Bufpos Charbpos $files
gr Bytind Bytebpos $files
gr Memind Membpos $files
gr Bufbyte Intbyte $files
gr EXTCOUNT BYTECOUNT $files
gr BUFPOS CHARBPOS $files
gr BYTIND BYTEBPOS $files
gr MEMIND MEMBPOS $files
gr BUFBYTE INTBYTE $files
gr MEMORY_COUNT BYTECOUNT $files
gr LSTREAM_DATA_COUNT BYTECOUNT $files
gr ELEMENT_COUNT ELEMCOUNT $files
gr HASH_CODE HASHCODE $files
----------------------------------- cut ------------------------------------


`fixtypes.sh' (the script above) is a Bourne-shell script; it uses `gr',
which follows:


----------------------------------- cut ------------------------------------
#!/bin/sh

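# Usage is like this:

# gr FROM TO FILES ...

# Globally replace FROM with TO in FILES.  FROM is a Perl regular expression
# and TO is its replacement text; both are interpolated into s/FROM/TO/g, so
# neither may contain an unescaped `/'.
# Backup files are stored in the `backup.orig' directory (the global-replace
# default).  File names are passed through echo and xargs, so names
# containing whitespace or quotes are not handled.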
from="$1"
to="$2"
shift 2
echo ${1+"$@"} | xargs global-replace "s/$from/$to/g"
----------------------------------- cut ------------------------------------
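
Because `gr' simply pastes FROM and TO into s/FROM/TO/g, a match is not
limited to whole identifiers, and case matters.  The following is a minimal
sketch of that behavior, not part of the original scripts: `gr_expr' is a
hypothetical helper name, and `sed' stands in for `global-replace' purely for
illustration (for plain identifiers like these the substitution behaves the
same way).

```shell
#!/bin/sh
# Sketch only: gr_expr is a hypothetical helper, not one of the original
# scripts.  It builds the same expression string that gr hands to
# global-replace.
gr_expr () {
    printf 's/%s/%s/g' "$1" "$2"
}

# The expression for one line of the sample script.
gr_expr bufpos charbpos; echo
# prints: s/bufpos/charbpos/g

# sed is an assumption standing in for global-replace.  The lowercase
# pattern rewrites the prefix of a longer identifier but leaves the
# capitalized type name alone, which is why the sample script lists each
# case variant separately.
printf '%s\n' 'Bufpos b = bufpos_to_bytind (buf, pos);' |
    sed "$(gr_expr bufpos charbpos)"
# prints: Bufpos b = charbpos_to_bytind (buf, pos);
```

If only whole words should be renamed, FROM can carry its own anchors, as in
the \bCONST\b example in the usage text of `global-replace'.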


`gr' in turn uses a Perl script to do its real work, `global-replace',
which follows:


----------------------------------- cut ------------------------------------
: #-*- Perl -*-

### global-replace --- modify the contents of a file by a Perl expression

## Copyright (C) 1999 Martin Buchholz.
## Copyright (C) 2001, 2002 Ben Wing.

## Authors: Martin Buchholz <martin@xemacs.org>, Ben Wing <ben@xemacs.org>
## Maintainer: Ben Wing <ben@xemacs.org>
## Current Version: 1.2, March 12, 2002

# This program is free software: you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the
# Free Software Foundation, either version 3 of the License, or (at your
# option) any later version.
# 
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
# for more details.
# 
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0;

use strict;
use FileHandle;
use Carp;
use Getopt::Long;
use File::Basename;

(my $myName = $0) =~ s@.*/@@; my $usage="
Usage: $myName [--help] [--backup-dir=DIR] [--line-mode] [--hunk-mode]
       PERLEXPR FILE ...

Globally modify a file, either line by line or in one big hunk.

Typical usage is like this:

[with GNU find, GNU xargs: guaranteed to handle spaces, quotes, etc.
 in file names]

find . -name '*.[ch]' -print0 | xargs -0 $0 's/\\bCONST\\b/const/g'\n

[with non-GNU find, xargs]

find . -name '*.[ch]' -print | xargs $0 's/\\bCONST\\b/const/g'\n


The file is read in, either line by line (with --line-mode specified)
or in one big hunk (with --hunk-mode specified; it's the default), and
the Perl expression is then evalled with \$_ set to the line or hunk of
text, including the terminating newline if there is one.  It should
destructively modify the value there, storing the changed result in \$_.

Files in which any modifications are made are backed up to the directory
specified using --backup-dir, or to `backup.orig' by default.  To disable
this, use --backup-dir= with no argument.

Hunk mode is the default because it is MUCH MUCH faster than line-by-line.
Use line-by-line only when it matters, e.g. you want to do a replacement
only once per line (the default without the `g' argument).  Conversely,
when using hunk mode, *ALWAYS* use `g'; otherwise, you will only make one
replacement in the entire file!
";

my %options = ();
$Getopt::Long::ignorecase = 0;
# Show the usage message on an unrecognized or malformed option.
GetOptions (
	     \%options,
	     'help', 'backup-dir=s', 'line-mode', 'hunk-mode',
) or die $usage;


die $usage if $options{"help"} or @ARGV <= 1;
my $code = shift;

die $usage if grep (-d || ! -w, @ARGV);

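sub SafeOpen {
  # Two-argument open: any mode prefix (e.g. "< ") is part of $_[0].
  # Check the result of open itself; the handle object is always defined.
  my $fh = new FileHandle;
  open ($fh, $_[0]) or confess "Can't open $_[0]: $!";
  return $fh;
}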

sub SafeClose {
  close $_[0] or confess "Can't close $_[0]: $!";
}

sub FileContents {
  my $fh = SafeOpen ("< $_[0]");
  # Slurp mode; `local' restores the old value of $/ on return.
  local $/ = undef;
  my $contents = <$fh>;
  SafeClose $fh;
  return $contents;
}

sub WriteStringToFile {
  my $fh = SafeOpen ("> $_[0]");
  binmode $fh;
  print $fh $_[1] or confess "$_[0]: $!\n";
  SafeClose $fh;
}

foreach my $file (@ARGV) {
  my $changed_p = 0;
  my $new_contents = "";
  if ($options{"line-mode"}) {
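    my $fh = SafeOpen ("< $file");
    while (<$fh>) {
      my $save_line = $_;
      eval $code;
      # Don't silently ignore a broken PERLEXPR.
      confess "Error evaluating PERLEXPR: $@" if $@;
      $changed_p = 1 if $save_line ne $_;
      $new_contents .= $_;
    }
    SafeClose $fh;
  } else {
    my $orig_contents = $_ = FileContents $file;
    eval $code;
    confess "Error evaluating PERLEXPR: $@" if $@;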
    if ($_ ne $orig_contents) {
      $changed_p = 1;
      $new_contents = $_;
    }
  }

  if ($changed_p) {
    my $backdir = $options{"backup-dir"};
    $backdir = "backup.orig" if !defined ($backdir);
    if ($backdir) {
      my ($name, $path, $suffix) = fileparse ($file, "");
      my $backfulldir = $path . $backdir;
      my $backfile = "$backfulldir/$name";
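      unless (-d $backfulldir) {
        mkdir $backfulldir, 0755 or confess "Can't create $backfulldir: $!";
      }
      print "modifying $file (original saved in $backfile)\n";
      # Never overwrite the file unless the original was saved.
      rename $file, $backfile
        or confess "Can't rename $file to $backfile: $!";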
    }
    WriteStringToFile ($file, $new_contents);
  }
}
----------------------------------- cut ------------------------------------