view src/realpath.c @ 4690:257b468bf2ca

Move the #'query-coding-region implementation to C.

This is necessary because there is no reasonable way to access the
corresponding mswindows-multibyte functionality from Lisp, and we need such
functionality if we're going to have a reliable and portable
#'query-coding-region implementation. However, this change doesn't yet
provide #'query-coding-region for the mswindow-multibyte coding systems,
there should be no functional differences between an XEmacs with this
change and one without it.

src/ChangeLog addition:

2009-09-19  Aidan Kehoe  <kehoea@parhasard.net>

    Move the #'query-coding-region implementation to C.
    This is necessary because there is no reasonable way to access the
    corresponding mswindows-multibyte functionality from Lisp, and we
    need such functionality if we're going to have a reliable and
    portable #'query-coding-region implementation. However, this change
    doesn't yet provide #'query-coding-region for the mswindow-multibyte
    coding systems, there should be no functional differences between an
    XEmacs with this change and one without it.

    * mule-coding.c (struct fixed_width_coding_system):
    Add a new coding system type, fixed_width, and implement it. It
    uses the CCL infrastructure but has a much simpler creation API,
    and its own query_method, formerly in lisp/mule/mule-coding.el.
    * unicode.c:
    Move the Unicode query method implementation here from unicode.el.
    * lisp.h:
    Declare Fmake_coding_system_internal, Fcopy_range_table here.
    * intl-win32.c (complex_vars_of_intl_win32):
    Use Fmake_coding_system_internal, not Fmake_coding_system.
    * general-slots.h:
    Add Qsucceeded, Qunencodable, Qinvalid_sequence here.
    * file-coding.h (enum coding_system_variant):
    Add fixed_width_coding_system here.
    (struct coding_system_methods):
    Add query_method and query_lstream_method to the coding system
    methods.
    Provide flags for the query methods.
    Declare the default query method; initialise it correctly in
    INITIALIZE_CODING_SYSTEM_TYPE.
    * file-coding.c (default_query_method):
    New function, the default query method for coding systems that do
    not set it. Moved from coding.el.
    (make_coding_system_1):
    Accept new elements in PROPS in #'make-coding-system; aliases, a
    list of aliases; safe-chars and safe-charsets (these were
    previously accepted but not saved); and category.
    (Fmake_coding_system_internal):
    New function, what used to be #'make-coding-system--on Mule
    builds, we've now moved some of the functionality of this to
    Lisp.
    (Fcoding_system_canonical_name_p):
    Move this earlier in the file, since it's now called from within
    make_coding_system_1.
    (Fquery_coding_region):
    Move the implementation of this here, from coding.el.
    (complex_vars_of_file_coding):
    Call Fmake_coding_system_internal, not Fmake_coding_system;
    specify safe-charsets properties when we're a mule build.
    * extents.h (mouse_highlight_priority, Fset_extent_priority,
    Fset_extent_face, Fmap_extents):
    Make these available to other C files.

lisp/ChangeLog addition:

2009-09-19  Aidan Kehoe  <kehoea@parhasard.net>

    Move the #'query-coding-region implementation to C.

    * coding.el:
    Consolidate code that depends on the presence or absence of Mule
    at the end of this file.
    (default-query-coding-region, query-coding-region):
    Move these functions to C.
    (default-query-coding-region-safe-charset-skip-chars-map):
    Remove this variable, the corresponding C variable is
    Vdefault_query_coding_region_chartab_cache in file-coding.c.
    (query-coding-string): Update docstring to reflect actual multiple
    values, be more careful about not modifying a range table that
    we're currently mapping over.
    (encode-coding-char): Make the implementation of this simpler.
    (featurep 'mule): Autoload #'make-coding-system from
    mule/make-coding-system.el if we're a mule build; provide an
    appropriate compiler macro.
    Do various non-mule compatibility things if we're not a mule
    build.
    * update-elc.el (additional-dump-dependencies):
    Add mule/make-coding-system as a dump time dependency if we're a
    mule build.
    * unicode.el (ccl-encode-to-ucs-2):
    (decode-char):
    (encode-char):
    Move these earlier in the file, for the sake of some byte compile
    warnings.
    (unicode-query-coding-region):
    Move this to unicode.c
    * mule/make-coding-system.el:
    New file, not dumped. Contains the functionality to rework the
    arguments necessary for fixed-width coding systems, and contains
    the implementation of #'make-coding-system, which now calls
    #'make-coding-system-internal.
    * mule/vietnamese.el (viscii):
    * mule/latin.el (iso-8859-2):
    (windows-1250):
    (iso-8859-3):
    (iso-8859-4):
    (iso-8859-14):
    (iso-8859-15):
    (iso-8859-16):
    (iso-8859-9):
    (macintosh):
    (windows-1252):
    * mule/hebrew.el (iso-8859-8):
    * mule/greek.el (iso-8859-7):
    (windows-1253):
    * mule/cyrillic.el (iso-8859-5):
    (koi8-r):
    (koi8-u):
    (windows-1251):
    (alternativnyj):
    (koi8-ru):
    (koi8-t):
    (koi8-c):
    (koi8-o):
    * mule/arabic.el (iso-8859-6):
    (windows-1256):
    Move all these coding systems to being of type fixed-width, not
    of type CCL. This allows the distinct query-coding-region for
    them to be in C, something which will eventually allow us to
    implement query-coding-region for the mswindows-multibyte coding
    systems.
    * mule/general-late.el (posix-charset-to-coding-system-hash):
    Document why we're pre-emptively persuading the byte compiler
    that the ELC for this file needs to be written using
    escape-quoted.
    Call #'set-unicode-query-skip-chars-args, now the Unicode
    query-coding-region implementation is in C.
    * mule/thai-xtis.el (tis-620):
    Don't bother checking whether we're XEmacs or not here.
    * mule/mule-coding.el:
    Move the eight bit fixed-width functionality from this file to
    make-coding-system.el.

tests/ChangeLog addition:

2009-09-19  Aidan Kehoe  <kehoea@parhasard.net>

    * automated/mule-tests.el:
    Check a coding system's type, not an 8-bit-fixed property, for
    whether that coding system should be treated as a fixed-width
    coding system.
    * automated/query-coding-tests.el:
    Don't test the query coding functionality for mswindows-multibyte
    coding systems, it's not yet implemented.
author Aidan Kehoe <kehoea@parhasard.net>
date Sat, 19 Sep 2009 22:53:13 +0100
parents d00888bfced1
children 19d70297d866

/*
 * realpath.c -- canonicalize pathname by removing symlinks
 * Copyright (C) 1993 Rick Sladkey <jrs@world.std.com>
 * Copyright (C) 2001, 2002, 2004 Ben Wing.
 *

This file is part of XEmacs.

XEmacs is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2, or (at your option) any
later version.

XEmacs is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with XEmacs; see the file COPYING.  If not, write to
the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
Boston, MA 02111-1307, USA.  */

/* Synched up with: Not in FSF. */

/* This file has been Mule-ized, June 2001 by Ben Wing.

   Everything in this file now works in terms of internal, not external,
   data.  This is the only way to be safe, and it makes the code cleaner. */

#include <config.h>
#include "lisp.h"

#include "profile.h"

#include "sysfile.h"
#include "sysdir.h"

#define MAX_READLINKS 32

#ifdef WIN32_ANY
#include "syswindows.h"
#ifndef ELOOP
#define ELOOP 10062 /* = WSAELOOP in winsock.h */
#endif
#endif

Lisp_Object QSin_qxe_realpath;

/* Length of start of absolute filename. */
static int 
abs_start (const Ibyte *name)
{
#ifdef WIN32_ANY
  if (isalpha (*name) && IS_DEVICE_SEP (name[1])
      && IS_DIRECTORY_SEP (name[2]))
    return 3;
  else if (IS_DIRECTORY_SEP (*name))
    return IS_DIRECTORY_SEP (name[1]) ? 2 : 1;
  else 
    return 0;
#else /* not WIN32_ANY */
  return IS_DIRECTORY_SEP (*name) ? 1 : 0;
#endif
}

/* Find real name of a file by resolving symbolic links and/or shortcuts
   under Windows (.LNK links), if such support is enabled.

   If no link found, and LINKS_ONLY is false, look up the correct case in
   the file system of the last component.

   Under Windows, UNC servers and shares are lower-cased.  Directories must
   be given without trailing '/'. One day, this could read Win2K's reparse
   points.

   Returns the number of bytes copied into BUF, or -1 with errno set on
   failure.  DOES NOT ZERO TERMINATE!!!!!
*/

static int
readlink_or_correct_case (const Ibyte *name, Ibyte *buf, Bytecount size,
#ifndef WIN32_ANY
			  Boolint UNUSED (links_only)
#else
			  Boolint links_only
#endif
			  )
{
#ifndef WIN32_ANY
  return qxe_readlink (name, buf, (size_t) size);
#else
# ifdef CYGWIN
  Ibyte *tmp;
  int n = qxe_readlink (name, buf, (size_t) size);
  if (n >= 0 || errno != EINVAL)
    return n;

  /* The file may exist, but isn't a symlink. Try to find the
     right name. */
  tmp =
    alloca_ibytes (cygwin_posix_to_win32_path_list_buf_size ((char *) name));
  cygwin_posix_to_win32_path_list ((char *) name, (char *) tmp);
  name = tmp;
# else
  if (mswindows_shortcuts_are_symlinks)
    {
      Ibyte *tmp = mswindows_read_link (name);

      if (tmp != NULL)
	{
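	  /* BUF is a fixed-size buffer; fail rather than overflow it, and
	     don't leak the string returned by mswindows_read_link. */
	  Bytecount len = qxestrlen (tmp);
	  if (len > size)
	    {
	      xfree (tmp, Ibyte *);
	      errno = ENAMETOOLONG;
	      return -1;
	    }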
	  memcpy (buf, tmp, len);
	  xfree (tmp, Ibyte *);
	  return len;
	}
    }
# endif

  if (links_only)
    {
      errno = EINVAL;
      return -1;
    }

  {
    int len = 0;
    int err = 0;
    const Ibyte *lastname;
    int count = 0;
    const Ibyte *nn;
    DECLARE_EISTRING (result);
  
    assert (*name);
  
    /* Sort of check we have a valid filename. */
    if (qxestrpbrk (name, "*?|<>\""))
      {
	errno = ENOENT;
	return -1;
      }
    else if (qxestrlen (name) >= PATH_MAX_INTERNAL)
      {
	errno = ENAMETOOLONG;
	return -1;
      }
  
    /* Find start of filename */
    lastname = name + qxestrlen (name);
    while (lastname > name && !IS_DIRECTORY_SEP (lastname[-1]))
      --lastname;

    /* Count slashes in unc path */
    if (abs_start (name) == 2)
      for (nn = name; *nn; nn++)
	if (IS_DIRECTORY_SEP (*nn))
	  count++;

    if (count >= 2 && count < 4)
      {
	eicpy_rawz (result, lastname);
	eilwr (result);
      }
    else
      {
	WIN32_FIND_DATAW find_data;
	Extbyte *nameext;
	HANDLE dir_handle;

	C_STRING_TO_TSTR (name, nameext);
	dir_handle = qxeFindFirstFile (nameext, &find_data);
	if (dir_handle == INVALID_HANDLE_VALUE)
	  {
	    errno = ENOENT;
	    return -1;
	  }
	eicpy_ext (result, (Extbyte *) find_data.cFileName, Qmswindows_tstr);
	FindClose (dir_handle);
      }

    if ((len = eilen (result)) <= size)
      {
	DECLARE_EISTRING (eilastname);

	eicpy_rawz (eilastname, lastname);
	if (eicmp_ei (eilastname, result) == 0)
          /* Signal that the name is already OK. */
          err = EINVAL;
	else
	  memcpy (buf, eidata (result), len);
      }
    else
      err = ENAMETOOLONG;

    errno = err;
    return err ? -1 : len;
  }
#endif /* WIN32_ANY */
}

/* Mule Note: This function works with and returns
   internally-formatted strings.

   If LINKS_ONLY is true, don't do case canonicalization under
   Windows.

   Returns RESOLVED_PATH on success, or NULL with errno set on failure. */

Ibyte *
qxe_realpath (const Ibyte *path, Ibyte *resolved_path, Boolint links_only)
{
  Ibyte copy_path[PATH_MAX_INTERNAL];
  Ibyte *new_path = resolved_path;
  Ibyte *max_path;
  Ibyte *retval = NULL;
#if defined (HAVE_READLINK) || defined (WIN32_ANY)
  int readlinks = 0;
  Ibyte link_path[PATH_MAX_INTERNAL];
  int n;
  int abslen = abs_start (path);
#endif

  PROFILE_DECLARE ();

  PROFILE_RECORD_ENTERING_SECTION (QSin_qxe_realpath);

 restart:

  /* Make a copy of the source path since we may need to modify it. */
  qxestrcpy (copy_path, path);
  path = copy_path;
  max_path = copy_path + PATH_MAX_INTERNAL - 2;

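  /* Dummy head of the chain, so that every real case below can be written
     as an `else if' no matter which of the #ifdef'd branches are compiled
     in. */
  if (0)
    ;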
#ifdef WIN32_ANY
  /* Check for c:/... or //server/... */
  else if (abslen == 3 || abslen == 2)
    {
      /* Make sure drive letter is lowercased. */
      if (abslen == 3)
	{
	  *new_path = tolower (*path);
	  new_path++;
	  path++;
	  abslen--;
	}
      /* Coerce directory chars. */
      while (abslen-- > 0)
	{
	  if (IS_DIRECTORY_SEP (*path))
	    *new_path++ = DIRECTORY_SEP;
	  else
	    *new_path++ = *path;
	  path++;
	}
    }
#endif
#ifdef WIN32_NATIVE
  /* No drive letter, but a beginning slash? Prepend drive letter. */
  else if (abslen == 1)
    {
      get_initial_directory (new_path, PATH_MAX_INTERNAL - 1);
      new_path += 3;
      path++;
    }
  /* Just a path name, prepend the current directory */
  else
    {
      get_initial_directory (new_path, PATH_MAX_INTERNAL - 1);
      new_path += qxestrlen (new_path);
      if (!IS_DIRECTORY_SEP (new_path[-1]))
	*new_path++ = DIRECTORY_SEP;
    }
#else
  /* If it's a relative pathname use get_initial_directory for starters. */
  else if (abslen == 0)
    {
      get_initial_directory (new_path, PATH_MAX_INTERNAL - 1);
      new_path += qxestrlen (new_path);
      if (!IS_DIRECTORY_SEP (new_path[-1]))
	*new_path++ = DIRECTORY_SEP;
    }
  else
    {
      /* Copy first directory sep. May have two on cygwin. */
      qxestrncpy (new_path, path, abslen);
      new_path += abslen;
      path += abslen;
    }
#endif
  /* Expand each slash-separated pathname component. */
  while (*path != '\0')
    {
      /* Ignore stray "/". */
      if (IS_DIRECTORY_SEP (*path))
	{
	  path++;
	  continue;
	}

      if (*path == '.')
	{
	  /* Ignore ".". */
	  if (path[1] == '\0' || IS_DIRECTORY_SEP (path[1]))
	    {
	      path++;
	      continue;
	    }

	  /* Handle ".." */
	  if (path[1] == '.' &&
	      (path[2] == '\0' || IS_DIRECTORY_SEP (path[2])))
	    {
	      path += 2;

	      /* Ignore ".." at root. */
	      if (new_path == resolved_path + abs_start (resolved_path))
		continue;

	      /* Handle ".." by backing up. */
	      --new_path;
	      while (!IS_DIRECTORY_SEP (new_path[-1]))
		--new_path;
	      continue;
	    }
	}

      /* Safely copy the next pathname component. */
      while (*path != '\0' && !IS_DIRECTORY_SEP (*path))
	{
	  if (path > max_path)
	    {
	      errno = ENAMETOOLONG;
	      goto done;
	    }
	  *new_path++ = *path++;
	}

#if defined (HAVE_READLINK) || defined (WIN32_ANY)
      /* See if latest pathname component is a symlink or needs case
	 correction. */
      *new_path = '\0';
      n = readlink_or_correct_case (resolved_path, link_path,
				    PATH_MAX_INTERNAL - 1, links_only);

      if (n < 0)
	{
	  /* EINVAL means the file exists but isn't a symlink or doesn't
	     need case correction. */
#ifdef WIN32_ANY
	  if (errno != EINVAL && errno != ENOENT)
#else
	  if (errno != EINVAL) 
#endif
	    goto done;
	}
      else
	{
	  /* Protect against infinite loops. */
	  if (readlinks++ > MAX_READLINKS)
	    {
	      errno = ELOOP;
	      goto done;
	    }

	  /* Note: readlink doesn't add the null byte. */
	  link_path[n] = '\0';

	  /* Make sure the link contents plus the rest of the path fit in
	     LINK_PATH before concatenating them below.  This has to happen
	     before the absolute-symlink case too, since that also
	     concatenates. */
	  if (qxestrlen (path) + n >= PATH_MAX_INTERNAL)
	    {
	      errno = ENAMETOOLONG;
	      goto done;
	    }

	  abslen = abs_start (link_path);
	  if (abslen > 0)
	    {
	      /* Start over for an absolute symlink. */
	      new_path = resolved_path;
	      qxestrcat (link_path, path);
	      path = link_path;
	      goto restart;
	    }

	  /* Otherwise back up over this component. */
	  for (--new_path; !IS_DIRECTORY_SEP (*new_path); --new_path)
	    assert (new_path > resolved_path);

	  /* Insert symlink contents into path. */
	  qxestrcat (link_path, path);
	  qxestrcpy (copy_path, link_path);
	  path = copy_path;
	}
#endif /* HAVE_READLINK || WIN32_ANY */
      *new_path++ = DIRECTORY_SEP;
    }

  /* Delete trailing slash but don't whomp a lone slash. */
  if (new_path != resolved_path + abs_start (resolved_path) &&
      IS_DIRECTORY_SEP (new_path[-1]))
    new_path--;

  /* Make sure it's null terminated. */
  *new_path = '\0';

  retval = resolved_path;
done:
  PROFILE_RECORD_EXITING_SECTION (QSin_qxe_realpath);
  return retval;
}

void
vars_of_realpath (void)
{
  QSin_qxe_realpath =
    build_msg_string ("(in qxe_realpath)");
  staticpro (&QSin_qxe_realpath);
}
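
Illustration (not part of realpath.c): the component loop in qxe_realpath
interleaves three jobs: skipping stray separators and ".", backing up over
"..", and resolving symlinks. The sketch below isolates only the first two
for plain POSIX-style absolute paths, which may make the pointer arithmetic
easier to follow. The name normalize_abs_path and its buffer-size parameter
are hypothetical, and it uses plain char and '/' instead of Ibyte and
IS_DIRECTORY_SEP; it does not call readlink and is not a substitute for
qxe_realpath.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of XEmacs.  Lexically normalize the
   absolute path PATH into OUT: collapse repeated '/', drop "."
   components, and let ".." remove the previous component (".." at the
   root is ignored).  Symlinks are not resolved.

   Returns OUT, or NULL if PATH is not absolute or the result does not
   fit.  The size check is conservative: it also wants room for the
   separator that is temporarily appended after each component. */
char *
normalize_abs_path (const char *path, char *out, size_t outsize)
{
  char *p = out;
  char *end;

  assert (path != NULL && out != NULL);
  if (path[0] != '/' || outsize < 2)
    return NULL;
  end = out + outsize - 1;	/* last slot, reserved for the NUL */

  *p++ = '/';
  while (*path != '\0')
    {
      /* Ignore stray "/". */
      if (*path == '/')
	{
	  path++;
	  continue;
	}

      /* Ignore ".". */
      if (path[0] == '.' && (path[1] == '\0' || path[1] == '/'))
	{
	  path++;
	  continue;
	}

      /* Handle ".." by backing up over the previous component. */
      if (path[0] == '.' && path[1] == '.'
	  && (path[2] == '\0' || path[2] == '/'))
	{
	  path += 2;
	  if (p == out + 1)	/* already at the root */
	    continue;
	  --p;			/* step back over the trailing '/' */
	  while (p[-1] != '/')
	    --p;
	  continue;
	}

      /* Copy the next component, then append a separator. */
      while (*path != '\0' && *path != '/')
	{
	  if (p >= end)
	    return NULL;
	  *p++ = *path++;
	}
      if (p >= end)
	return NULL;
      *p++ = '/';
    }

  /* Delete the trailing slash, but keep a lone "/". */
  if (p != out + 1 && p[-1] == '/')
    p--;
  *p = '\0';
  return out;
}
```

With a 64-byte output buffer, a call such as
normalize_abs_path ("/usr//lib/./../bin/", buf, sizeof buf) leaves
"/usr/bin" in buf. As in qxe_realpath, the output always ends in a
separator between iterations, which is what lets the ".." case back up
with a simple scan for the previous '/'.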