view lib-src/make-msgfile.c @ 4690:257b468bf2ca

Move the #'query-coding-region implementation to C.

This is necessary because there is no reasonable way to access the
corresponding mswindows-multibyte functionality from Lisp, and we need such
functionality if we're going to have a reliable and portable
#'query-coding-region implementation. However, this change doesn't yet
provide #'query-coding-region for the mswindows-multibyte coding systems;
there should be no functional differences between an XEmacs with this
change and one without it.

src/ChangeLog addition:

2009-09-19  Aidan Kehoe  <kehoea@parhasard.net>

	Move the #'query-coding-region implementation to C.
	This is necessary because there is no reasonable way to access the
	corresponding mswindows-multibyte functionality from Lisp, and we
	need such functionality if we're going to have a reliable and
	portable #'query-coding-region implementation. However, this change
	doesn't yet provide #'query-coding-region for the
	mswindows-multibyte coding systems; there should be no functional
	differences between an XEmacs with this change and one without it.

	* mule-coding.c (struct fixed_width_coding_system):
	Add a new coding system type, fixed_width, and implement it. It
	uses the CCL infrastructure but has a much simpler creation API,
	and its own query_method, formerly in lisp/mule/mule-coding.el.
	* unicode.c:
	Move the Unicode query method implementation here from unicode.el.
	* lisp.h:
	Declare Fmake_coding_system_internal, Fcopy_range_table here.
	* intl-win32.c (complex_vars_of_intl_win32):
	Use Fmake_coding_system_internal, not Fmake_coding_system.
	* general-slots.h:
	Add Qsucceeded, Qunencodable, Qinvalid_sequence here.
	* file-coding.h (enum coding_system_variant):
	Add fixed_width_coding_system here.
	(struct coding_system_methods):
	Add query_method and query_lstream_method to the coding system
	methods. Provide flags for the query methods. Declare the default
	query method; initialise it correctly in
	INITIALIZE_CODING_SYSTEM_TYPE.
	* file-coding.c (default_query_method):
	New function, the default query method for coding systems that do
	not set it. Moved from coding.el.
	(make_coding_system_1):
	Accept new elements in PROPS in #'make-coding-system; aliases, a
	list of aliases; safe-chars and safe-charsets (these were
	previously accepted but not saved); and category.
	(Fmake_coding_system_internal):
	New function, what used to be #'make-coding-system--on Mule
	builds, we've now moved some of the functionality of this to Lisp.
	(Fcoding_system_canonical_name_p):
	Move this earlier in the file, since it's now called from within
	make_coding_system_1.
	(Fquery_coding_region):
	Move the implementation of this here, from coding.el.
	(complex_vars_of_file_coding):
	Call Fmake_coding_system_internal, not Fmake_coding_system;
	specify safe-charsets properties when we're a mule build.
	* extents.h (mouse_highlight_priority, Fset_extent_priority,
	Fset_extent_face, Fmap_extents):
	Make these available to other C files.

lisp/ChangeLog addition:

2009-09-19  Aidan Kehoe  <kehoea@parhasard.net>

	Move the #'query-coding-region implementation to C.

	* coding.el:
	Consolidate code that depends on the presence or absence of Mule
	at the end of this file.
	(default-query-coding-region, query-coding-region):
	Move these functions to C.
	(default-query-coding-region-safe-charset-skip-chars-map):
	Remove this variable, the corresponding C variable is
	Vdefault_query_coding_region_chartab_cache in file-coding.c.
	(query-coding-string):
	Update docstring to reflect actual multiple values, be more
	careful about not modifying a range table that we're currently
	mapping over.
	(encode-coding-char):
	Make the implementation of this simpler.
	(featurep 'mule):
	Autoload #'make-coding-system from mule/make-coding-system.el if
	we're a mule build; provide an appropriate compiler macro. Do
	various non-mule compatibility things if we're not a mule build.
	* update-elc.el (additional-dump-dependencies):
	Add mule/make-coding-system as a dump time dependency if we're a
	mule build.
	* unicode.el (ccl-encode-to-ucs-2):
	(decode-char):
	(encode-char):
	Move these earlier in the file, for the sake of some byte compile
	warnings.
	(unicode-query-coding-region):
	Move this to unicode.c
	* mule/make-coding-system.el:
	New file, not dumped. Contains the functionality to rework the
	arguments necessary for fixed-width coding systems, and contains
	the implementation of #'make-coding-system, which now calls
	#'make-coding-system-internal.
	* mule/vietnamese.el (viscii):
	* mule/latin.el (iso-8859-2):
	(windows-1250):
	(iso-8859-3):
	(iso-8859-4):
	(iso-8859-14):
	(iso-8859-15):
	(iso-8859-16):
	(iso-8859-9):
	(macintosh):
	(windows-1252):
	* mule/hebrew.el (iso-8859-8):
	* mule/greek.el (iso-8859-7):
	(windows-1253):
	* mule/cyrillic.el (iso-8859-5):
	(koi8-r):
	(koi8-u):
	(windows-1251):
	(alternativnyj):
	(koi8-ru):
	(koi8-t):
	(koi8-c):
	(koi8-o):
	* mule/arabic.el (iso-8859-6):
	(windows-1256):
	Move all these coding systems to being of type fixed-width, not of
	type CCL. This allows the distinct query-coding-region for them to
	be in C, something which will eventually allow us to implement
	query-coding-region for the mswindows-multibyte coding systems.
	* mule/general-late.el (posix-charset-to-coding-system-hash):
	Document why we're pre-emptively persuading the byte compiler that
	the ELC for this file needs to be written using escape-quoted.
	Call #'set-unicode-query-skip-chars-args, now the Unicode
	query-coding-region implementation is in C.
	* mule/thai-xtis.el (tis-620):
	Don't bother checking whether we're XEmacs or not here.
	* mule/mule-coding.el:
	Move the eight bit fixed-width functionality from this file to
	make-coding-system.el.

tests/ChangeLog addition:

2009-09-19  Aidan Kehoe  <kehoea@parhasard.net>

	* automated/mule-tests.el:
	Check a coding system's type, not an 8-bit-fixed property, for
	whether that coding system should be treated as a fixed-width
	coding system.
	* automated/query-coding-tests.el:
	Don't test the query coding functionality for mswindows-multibyte
	coding systems, it's not yet implemented.
author Aidan Kehoe <kehoea@parhasard.net>
date Sat, 19 Sep 2009 22:53:13 +0100
parents ecf1ebac70d8

/* #### Old code!  Replaced with make-msgfile.lex. */


/* Scan specified C and Lisp files, extracting the following messages:

     C files:
	GETTEXT (...)
	DEFER_GETTEXT (...)
	DEFUN interactive prompts
     Lisp files:
	(gettext ...)
	(dgettext "domain-name" ...)
	(defer-gettext ...)
	(interactive ...)

  The arguments given to this program are all the C and Lisp source files
  of XEmacs.  .el and .c files are allowed.  There is no support for .elc
  files at this time, but they may be specified; the corresponding .el file
  will be used.  Similarly, .o files can also be specified, and the corresponding
  .c file will be used.  This helps the makefile pass the correct list of files.

  The results, which go to standard output or to a file specified with -a or -o
  (-a to append, -o to start from nothing), are quoted strings wrapped in
  gettext(...).  The results can be passed to xgettext to produce a .po message
  file.
*/

#include <stdio.h>
#include <string.h>

#define LINESIZE 256
#define GET_LINE	fgets (line, LINESIZE, infile)
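/* Wrapped in do { ... } while (0) so each macro expands to one statement.
   CHECK_EOL leaves p NULL at end of file; callers do not check for this. */
#define CHECK_EOL(p)	do { if (*(p) == '\0') (p) = GET_LINE; } while (0)
#define SKIP_BLANKS(p)	do { while (*(p) == ' ' || *(p) == '\t') (p)++; } while (0)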

enum filetype { C_FILE, LISP_FILE, INVALID_FILE };
/* some brain-dead headers define this ... */
#undef FALSE
#undef TRUE
enum boolean { FALSE, TRUE };

FILE *infile;
FILE *outfile;
char line[LINESIZE];


void scan_file (char *filename);
void process_C_file (void);
void process_Lisp_file (void);
char *copy_up_to_paren (register char *p);
char *copy_quoted_string (register char *p);
enum boolean no_interactive_prompt (register char *q);
char *skip_blanks (register char *p);


int main (int argc, char *argv[])
{
  register int i;

  outfile = stdout;

  /* If first two args are -o FILE, output to FILE. */
  i = 1;
  if (argc > i + 1 && strcmp (argv[i], "-o") == 0) {
    outfile = fopen (argv[++i], "w");
    ++i;
  }
  /* ...Or if args are -a FILE, append to FILE. */
  if (argc > i + 1 && strcmp (argv[i], "-a") == 0) {
    outfile = fopen (argv[++i], "a");
    ++i;
  }
  if (!outfile) {
    fprintf (stderr, "Unable to open output file %s\n", argv[--i]);
    return 1;			/* Report the failure to the caller */
  }

  for (; i < argc; i++)
    scan_file (argv[i]);

  return 0;
}


void scan_file (char *filename)
{
  enum filetype type = INVALID_FILE;
  size_t len = strlen (filename);
  register char *p = filename + len;

  /* Check the length first so that p - N never points before filename. */
  if (len >= 4 && strcmp (p - 4, ".elc") == 0) {
    *--p = '\0';				/* Use .el file instead */
    type = LISP_FILE;
  } else if (len >= 3 && strcmp (p - 3, ".el") == 0)
    type = LISP_FILE;
  else if (len >= 2 && strcmp (p - 2, ".o") == 0) {
    *--p = 'c';					/* Use .c file instead */
    type = C_FILE;
  } else if (len >= 2 && strcmp (p - 2, ".c") == 0)
    type = C_FILE;

  if (type == INVALID_FILE) {
    fprintf (stderr, "File %s being ignored\n", filename);
    return;
  }
  infile = fopen (filename, "r");
  if (!infile) {
    fprintf (stderr, "Unable to open input file %s\n", filename);
    return;
  }

  fprintf (outfile, "/* %s */\n", filename);
  if (type == C_FILE)
    process_C_file ();
  else
    process_Lisp_file ();
  fputc ('\n', outfile);

  fclose (infile);
}


void process_C_file (void)
{
  register char *p;
  char *gettext, *defun;

  while ((p = GET_LINE) != NULL) {
    gettext = strstr (p, "GETTEXT");
    defun = strstr (p, "DEFUN");
    if (gettext || defun) {
      if (gettext) {
	p = gettext;
	p += 7;			/* Skip over "GETTEXT" */
      }
      else if (defun) {
	p = defun;
	p += 5;			/* Skip over "DEFUN" */
      }

      p = skip_blanks (p);
      if (*p++ != '(')
	continue;

      if (defun) {
	register int i;

	for (i = 0; i < 5; i++)	/* Skip over commas to doc string */
	  while (*p++ != ',')
	    CHECK_EOL (p);
	if (*p == '\n')
	  p = GET_LINE;
      }

      p = skip_blanks (p);
      if (*p != '\"')		/* Make sure there is a quoted string */
	continue;

      if (defun && no_interactive_prompt (p))
	continue;

      fprintf (outfile, "gettext(");
      if (gettext)
	p = copy_up_to_paren (p);
      else
	p = copy_quoted_string (p);
      fprintf (outfile, ")\n");
    }
  }
}


void process_Lisp_file (void)
{
  register char *p;
  char *gettext, *interactive;
  enum boolean dgettext = FALSE;

  while ((p = GET_LINE) != NULL) {
    gettext = strstr (p, "gettext");
    interactive = strstr (p, "(interactive");
    if (gettext || interactive) {
      if (!interactive)
	p = gettext;
      else if (!gettext)
	p = interactive;
      else if (gettext < interactive) {
	p = gettext;
	interactive = NULL;
      } else {
	p = interactive;
	gettext = NULL;
      }

      if (gettext) {
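	/* Recompute for every match so that an earlier dgettext does not
	   make later plain gettext calls look like dgettext. */
	dgettext = (p > line && *(p-1) == 'd') ? TRUE : FALSE;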
	p += 7;		/* Skip over "gettext" */
      } else
	p += 12;	/* Skip over "(interactive" */

      p = skip_blanks (p);
      if (*p != '\"')		/* Make sure there is a quoted string */
	continue;

      if (dgettext) {		/* Skip first quoted string (domain name) */
	while (*++p != '"' && *p != '\0')
	  ;  /* null statement */
	if (*p == '\0')		/* Domain name not terminated on this line */
	  continue;
	++p;
	p = skip_blanks (p);
	if (*p != '\"')		/* Check for second quoted string (message) */
	  continue;
      }

      if (interactive && no_interactive_prompt (p))
	continue;

      fprintf (outfile, "gettext(");
      p = copy_up_to_paren (p);
      fprintf (outfile, ")\n");
    }
  }
}


/* Assuming p points to some character beyond an opening parenthesis, copy
   everything to outfile up to but not including the closing parenthesis.
*/
char *copy_up_to_paren (register char *p)
{
  for (;;) {
    SKIP_BLANKS (p);	/* We don't call skip_blanks() in order to */
    CHECK_EOL (p);	/* preserve blanks at the beginning of the line */
    if (*p == ')')
      break;

    if (*p == '\"')
      p = copy_quoted_string (p);
    else
      fputc (*p++, outfile);
  }
  return p;
}


/* Assuming p points to a quote character, copy the quoted string to outfile.
*/
char *copy_quoted_string (register char *p)
{
  do {
    if (*p == '\\')
      fputc (*p++, outfile);
    fputc (*p++, outfile);
    CHECK_EOL (p);
  } while (*p != '\"');

  fputc (*p++, outfile);
  return p;
}


/* Return TRUE if the interactive specification consists only
   of code letters and no prompt.
*/
enum boolean no_interactive_prompt (register char *q)
{
  while (++q, *q == '*' || *q == '@')
    ; /* null statement */
  if (*q == '\"')
    return TRUE;
 skip_code_letter:
  if (*++q == '\"')
    return TRUE;
  if (*q == '\\' && *++q == 'n') {
    ++q;
    goto skip_code_letter;
  }
  return FALSE;
}


char *skip_blanks (register char *p)
{
  while (*p == ' ' || *p == '\t' || *p == '\n') {
    p++;
    CHECK_EOL (p);
  }
  return p;
}