comparison src/intl-encap-win32.c @ 2367:ecf1ebac70d8

[xemacs-hg @ 2004-11-04 23:05:23 by ben] commit mega-patch configure.in: Turn off -Winline and -Wchar-subscripts. Use the right set of cflags when compiling modules. Rewrite ldap configuration to separate the inclusion of lber (needed in recent Cygwin) from the basic checks for the needed libraries. add a function for MAKE_JUNK_C; initially code was added to generate xemacs.def using this, but it will need to be rewritten. add an rm -f for junk.c to avoid weird Cygwin bug with cp -f onto an existing file. Sort list of auto-detected functions and eliminate unused checks for stpcpy, setlocale and getwd. Add autodetection of Cygwin scanf problems BETA: Rewrite section on configure to indicate what flags are important and what not. digest-doc.c, make-dump-id.c, profile.c, sorted-doc.c: Add proper decls for main(). make-msgfile.c: Document that this is old junk. Move proposal to text.c. make-msgfile.lex: Move proposal to text.c. make-mswin-unicode.pl: Convert error-generating code so that the entire message will be seen as a single unrecognized token. mule/mule-ccl.el: Update docs. lispref/mule.texi: Update CCL docs. ldap/eldap.c: Mule-ize. Use EXTERNAL_LIST_LOOP_2 instead of deleted EXTERNAL_LIST_LOOP. * XEmacs 21.5.18 "chestnut" is released. --------------------------------------------------------------- MULE-RELATED WORK: --------------------------------------------------------------- --------------------------- byte-char conversion --------------------------- buffer.c, buffer.h, insdel.c, text.c: Port FSF algorithm for byte-char conversion, replacing broken previous version. Track the char position of the gap. Add functions to do char-byte conversion downwards as well as upwards. Move comments about algorithm workings to internals manual. 
--------------------------- work on types --------------------------- alloc.c, console-x-impl.h, dump-data.c, dump-data.h, dumper.c, dialog-msw.c, dired-msw.c, doc.c, editfns.c, esd.c, event-gtk.h, event-msw.c, events.c, file-coding.c, file-coding.h, fns.c, glyphs-eimage.c, glyphs-gtk.c, glyphs-msw.c, glyphs-shared.c, glyphs-x.c, glyphs.c, glyphs.h, gui.c, hpplay.c, imgproc.c, intl-win32.c, lrecord.h, lstream.c, keymap.c, lisp.h, libsst.c, linuxplay.c, miscplay.c, miscplay.h, mule-coding.c, nas.c, nt.c, ntheap.c, ntplay.c, objects-msw.c, objects-tty.c, objects-x.c, print.c, process-nt.c, process.c, redisplay.h, select-common.h, select-gtk.c, select-x.c, sgiplay.c, sound.c, sound.h, sunplay.c, sysfile.h, sysdep.c, syswindows.h, text.c, unexnt.c, win32.c, xgccache.c: Further work on types. This creates a full set of types for all the basic semantics of `char' that I have so far identified, so that its semantics can always be identified for the purposes of proper Mule-safe code, and the raw use of `char' always avoided. (1) More type renaming, for consistency of naming. Char_ASCII -> Ascbyte UChar_ASCII -> UAscbyte Char_Binary -> CBinbyte UChar_Binary -> Binbyte SChar_Binary -> SBinbyte (2) Introduce Rawbyte, CRawbyte, Boolbyte, Chbyte, UChbyte, and Bitbyte and use them. (3) New types Itext, Wexttext and Textcount for separating out the concepts of bytes and textual units (different under UTF-16 and UTF-32, which are potential internal encodings). (4) qxestr*_c -> qxestr*_ascii. lisp.h: New; goes with other qxe() functions. #### Maybe goes in a different section. lisp.h: Group generic int-type defs together with EMACS_INT defs. lisp.h: * lisp.h (WEXTTEXT_IS_WIDE) New defns. lisp.h: New type to replace places where int occurs as a boolean. It's signed because occasionally people may want to use -1 as an error value, and because unsigned ints are viral -- see comments in the internals manual against using them. dynarr.c: int -> Bytecount. 
--------------------------- Mule-izing --------------------------- device-x.c: Partially Mule-ize. dumper.c, dumper.h: Mule-ize. Use Rawbyte. Use stderr_out not printf. Use wext_*(). sysdep.c, syswindows.h, text.c: New Wexttext API for manipulation of external text that may be Unicode (e.g. startup code under Windows). emacs.c: Mule-ize. Properly deal with argv in external encoding. Use wext_*() and Wexttext. Use Rawbyte. #if 0 some old junk on SCO that is unlikely to be correct. Rewrite allocation code in run-temacs. emacs.c, symsinit.h, win32.c: Rename win32 init function and call it even earlier, to initialize mswindows_9x_p even earlier, for use in startup code (XEUNICODE_P). process.c: Use _wenviron not environ under Windows, to get Unicode environment variables. event-Xt.c: Mule-ize drag-n-drop related stuff. dragdrop.c, dragdrop.h, frame-x.c: Mule-ize. text.h: Add some more stand-in defines for particular kinds of conversion; use in Mule-ization work in frame-x.c etc. --------------------------- Freshening --------------------------- intl-auto-encap-win32.c, intl-auto-encap-win32.h: Regenerate. --------------------------- Unicode-work --------------------------- intl-win32.c, syswindows.h: Factor out common options to MultiByteToWideChar and WideCharToMultiByte. Add convert_unicode_to_multibyte_malloc() and convert_unicode_to_multibyte_dynarr() and use. Add stuff for alloca() conversion of multibyte/unicode. alloc.c: Use dfc_external_data_len() in case of unicode coding system. alloc.c, mule-charset.c: Don't zero out and reinit charset Unicode tables. This fucks up dump-time loading. Anyway, either we load them at dump time or run time, never both. unicode.c: Dump the blank tables as well. 
--------------------------------------------------------------- DOCUMENTATION, MOSTLY MULE-RELATED: --------------------------------------------------------------- EmacsFrame.c, emodules.c, event-Xt.c, fileio.c, input-method-xlib.c, mule-wnnfns.c, redisplay-gtk.c, redisplay-tty.c, redisplay-x.c, regex.c, sysdep.c: Add comment about Mule work needed. text.h: Add more documentation describing why DFC routines were not written to return their value. Add some other DFC documentation. console-msw.c, console-msw.h: Add pointer to docs in win32.c. emacs.c: Add comments on sources of doc info. text.c, charset.h, unicode.c, intl-win32.c, intl-encap-win32.c, text.h, file-coding.c, mule-coding.c: Collect background comments and related to text matters and internationalization, and proposals for work to be done, in text.c or Internals manual, stuff related to specific textual API's in text.h, and stuff related to internal implementation of Unicode conversion in unicode.c. Put lots of pointers to the comments to make them easier to find. s/mingw32.h, s/win32-common.h, s/win32-native.h, s/windowsnt.h, win32.c: Add bunches of new documentation on the different kinds of builds and environments under Windows and how they work. Collect this info in win32.c. Add pointers to these docs in the relevant s/* files. emacs.c: Document places with long comments. Remove comment about exiting, move to internals manual, put in pointer. event-stream.c: Move docs about event queues and focus to internals manual, put in pointer. events.h: Move docs about event stream callbacks to internals manual, put in pointer. profile.c, redisplay.c, signal.c: Move documentation to the Internals manual. process-nt.c: Add pointer to comment in win32-native.el. lisp.h: Add comments about some comment conventions. lisp.h: Add comment about the second argument. device-msw.c, redisplay-msw.c: @@#### comments are out-of-date. 
--------------------------------------------------------------- PDUMP WORK (MOTIVATED BY UNICODE CHANGES) --------------------------------------------------------------- alloc.c, buffer.c, bytecode.c, console-impl.h, console.c, device.c, dumper.c, lrecord.h, elhash.c, emodules.h, events.c, extents.c, frame.c, glyphs.c, glyphs.h, mule-charset.c, mule-coding.c, objects.c, profile.c, rangetab.c, redisplay.c, specifier.c, specifier.h, window.c, lstream.c, file-coding.h, file-coding.c: PDUMP: Properly implement dump_add_root_block(), which never worked before, and is necessary for dumping Unicode tables. Pdump name changes for accuracy: XD_STRUCT_PTR -> XD_BLOCK_PTR. XD_STRUCT_ARRAY -> XD_BLOCK_ARRAY. XD_C_STRING -> XD_ASCII_STRING. *_structure_* -> *_block_*. lrecord.h: some comments added about dump_add_root_block() vs dump_add_root_block_ptr(). extents.c: remove incorrect comment about pdump problems with gap array. --------------------------------------------------------------- ALLOCATION --------------------------------------------------------------- abbrev.c, alloc.c, bytecode.c, casefiddle.c, device-msw.c, device-x.c, dired-msw.c, doc.c, doprnt.c, dragdrop.c, editfns.c, emodules.c, file-coding.c, fileio.c, filelock.c, fns.c, glyphs-eimage.c, glyphs-gtk.c, glyphs-msw.c, glyphs-x.c, gui-msw.c, gui-x.c, imgproc.c, intl-win32.c, lread.c, menubar-gtk.c, menubar.c, nt.c, objects-msw.c, objects-x.c, print.c, process-nt.c, process-unix.c, process.c, realpath.c, redisplay.c, search.c, select-common.c, symbols.c, sysdep.c, syswindows.h, text.c, text.h, ui-byhand.c: New macros {alloca,xnew}_{itext,{i,ext,raw,bin,asc}bytes} for more convenient allocation of these commonly requested items. Modify functions to use alloca_ibytes, alloca_array, alloca_extbytes, xnew_ibytes, etc. also XREALLOC_ARRAY, xnew. alloc.c: Rewrite the allocation functions to factor out repeated code. Add assertions for freeing dumped data. lisp.h: Moved down and consolidated with other allocation stuff. 
lisp.h, dynarr.c: New functions for allocation that's very efficient when mostly in LIFO order. lisp.h, text.c, text.h: Factor out some stuff for general use by alloca()-conversion funs. text.h, lisp.h: Fill out convenience routines for allocating various kinds of bytes and put them in lisp.h. Use them in place of xmalloc(), ALLOCA(). text.h: Fill out the convenience functions so the _MALLOC() kinds match the alloca() kinds. --------------------------------------------------------------- ERROR-CHECKING --------------------------------------------------------------- text.h: Create ASSERT_ASCTEXT_ASCII() and ASSERT_ASCTEXT_ASCII_LEN() from similar Eistring checkers and change the Eistring checkers to use them instead. --------------------------------------------------------------- MACROS IN LISP.H --------------------------------------------------------------- lisp.h: Redo GCPRO declarations. Create a "base" set of functions that can be used to generate any kind of gcpro sets -- regular, ngcpro, nngcpro, private ones used in GC_EXTERNAL_LIST_LOOP_2. buffer.c, callint.c, chartab.c, console-msw.c, device-x.c, dialog-msw.c, dired.c, extents.c, ui-gtk.c, rangetab.c, nt.c, mule-coding.c, minibuf.c, menubar-msw.c, menubar.c, menubar-gtk.c, lread.c, lisp.h, gutter.c, glyphs.c, glyphs-widget.c, fns.c, fileio.c, file-coding.c, specifier.c: Eliminate EXTERNAL_LIST_LOOP, which does not check for circularities. Use EXTERNAL_LIST_LOOP_2 instead or EXTERNAL_LIST_LOOP_3 or EXTERNAL_PROPERTY_LIST_LOOP_3 or GC_EXTERNAL_LIST_LOOP_2 (new macro). Removed/redid comments on EXTERNAL_LIST_LOOP. --------------------------------------------------------------- SPACING FIXES --------------------------------------------------------------- callint.c, hftctl.c, number-gmp.c, process-unix.c: Spacing fixes. 
--------------------------------------------------------------- FIX FOR GEOMETRY PROBLEM IN FIRST FRAME --------------------------------------------------------------- unicode.c: Add workaround for newlib bug in sscanf() [should be fixed by release 1.5.12 of Cygwin]. toolbar.c: bug fix for problem of initial frame being 77 chars wide on Windows. will be overridden by my other ws. --------------------------------------------------------------- FIX FOR LEAKING PROCESS HANDLES: --------------------------------------------------------------- process-nt.c: Fixes for leaking handles. Inspired by work done by Adrian Aichner <adrian@xemacs.org>. --------------------------------------------------------------- FIX FOR CYGWIN BUG (Unicode-related): --------------------------------------------------------------- unicode.c: Add workaround for newlib bug in sscanf() [should be fixed by release 1.5.12 of Cygwin]. --------------------------------------------------------------- WARNING FIXES: --------------------------------------------------------------- console-stream.c: `reinit' is unused. compiler.h, event-msw.c, frame-msw.c, intl-encap-win32.c, text.h: Add stuff to deal with ANSI-aliasing warnings I got. regex.c: Gather includes together to avoid warning. --------------------------------------------------------------- CHANGES TO INITIALIZATION ROUTINES: --------------------------------------------------------------- buffer.c, emacs.c, console.c, debug.c, device-x.c, device.c, dragdrop.c, emodules.c, eval.c, event-Xt.c, event-gtk.c, event-msw.c, event-stream.c, event-tty.c, events.c, extents.c, faces.c, file-coding.c, fileio.c, font-lock.c, frame-msw.c, glyphs-widget.c, glyphs.c, gui-x.c, insdel.c, lread.c, lstream.c, menubar-gtk.c, menubar-x.c, minibuf.c, mule-wnnfns.c, objects-msw.c, objects.c, print.c, scrollbar-x.c, search.c, select-x.c, text.c, undo.c, unicode.c, window.c, symsinit.h: Call reinit_*() functions directly from emacs.c, for clarity. 
Factor out some redundant init code. Move disallowed stuff that had crept into vars_of_glyphs() into complex_vars_of_glyphs(). Call init_eval_semi_early() from eval.c not in the middle of vars_of_() in emacs.c since there should be no order dependency in the latter calls. --------------------------------------------------------------- ARMAGEDDON: --------------------------------------------------------------- alloc.c, emacs.c, lisp.h, print.c: Rename inhibit_non_essential_printing_operations to inhibit_non_essential_conversion_operations. text.c: Assert on !inhibit_non_essential_conversion_operations. console-msw.c, print.c: Don't do conversion in SetConsoleTitle or FindWindow to avoid problems during armageddon. Put #errors for NON_ASCII_INTERNAL_FORMAT in places where problems would arise. --------------------------------------------------------------- CHANGES TO THE BUILD PROCEDURE: --------------------------------------------------------------- config.h.in, s/cxux.h, s/usg5-4-2.h, m/powerpc.h: Add comment about correct ordering of this file. Rearrange everything to follow this -- put all #undefs together and before the s&m files. Add undefs for HAVE_ALLOCA, C_ALLOCA, BROKEN_ALLOCA_IN_FUNCTION_CALLS, STACK_DIRECTION. Remove unused HAVE_STPCPY, HAVE_GETWD, HAVE_SETLOCALE. m/gec63.h: Deleted; totally broken, not used at all, not in FSF. 
m/7300.h, m/acorn.h, m/alliant-2800.h, m/alliant.h, m/altos.h, m/amdahl.h, m/apollo.h, m/att3b.h, m/aviion.h, m/celerity.h, m/clipper.h, m/cnvrgnt.h, m/convex.h, m/cydra5.h, m/delta.h, m/delta88k.h, m/dpx2.h, m/elxsi.h, m/ews4800r.h, m/gould.h, m/hp300bsd.h, m/hp800.h, m/hp9000s300.h, m/i860.h, m/ibmps2-aix.h, m/ibmrs6000.h, m/ibmrt-aix.h, m/ibmrt.h, m/intel386.h, m/iris4d.h, m/iris5d.h, m/iris6d.h, m/irist.h, m/isi-ov.h, m/luna88k.h, m/m68k.h, m/masscomp.h, m/mg1.h, m/mips-nec.h, m/mips-siemens.h, m/mips.h, m/news.h, m/nh3000.h, m/nh4000.h, m/ns32000.h, m/orion105.h, m/pfa50.h, m/plexus.h, m/pmax.h, m/powerpc.h, m/pyrmips.h, m/sequent-ptx.h, m/sequent.h, m/sgi-challenge.h, m/symmetry.h, m/tad68k.h, m/tahoe.h, m/targon31.h, m/tekxd88.h, m/template.h, m/tower32.h, m/tower32v3.h, m/ustation.h, m/vax.h, m/wicat.h, m/xps100.h: Delete C_ALLOCA, HAVE_ALLOCA, STACK_DIRECTION, BROKEN_ALLOCA_IN_FUNCTION_CALLS. All of this is auto-detected. When in doubt, I followed recent FSF sources, which also have these things deleted.
author ben
date Thu, 04 Nov 2004 23:08:28 +0000
parents 09e68196904a
children 3d8143fc88e1
2366:2a392e0c390a 2367:ecf1ebac70d8
1 /* Unicode-encapsulation of Win32 library functions. 1 /* Unicode-encapsulation of Win32 library functions.
2 Copyright (C) 2000, 2001, 2002 Ben Wing. 2 Copyright (C) 2000, 2001, 2002, 2004 Ben Wing.
3 3
4 This file is part of XEmacs. 4 This file is part of XEmacs.
5 5
6 XEmacs is free software; you can redistribute it and/or modify it 6 XEmacs is free software; you can redistribute it and/or modify it
7 under the terms of the GNU General Public License as published by the 7 under the terms of the GNU General Public License as published by the
35 #include "lisp.h" 35 #include "lisp.h"
36 36
37 #include "console-msw.h" 37 #include "console-msw.h"
38 38
39 int no_mswin_unicode_lib_calls; 39 int no_mswin_unicode_lib_calls;
40
41 /* The golden rules of writing Unicode-safe code:
42
43 -- There are no preprocessor games going on.
44
45 -- Do not set the UNICODE constant.
46
47 -- You need to change your code to call the Windows API prefixed with "qxe"
48 functions (when they exist) and use the ...W structs instead of the
49 generic ones. String arguments in the qxe functions are of type Extbyte
50 *.
51
52 -- Your code is responsible for conversion of text arguments. We try to
53 handle everything else -- the argument differences, the copying back and
54 forth of structures, etc. Use Qmswindows_tstr and macros such as
55 C_STRING_TO_TSTR. You are also responsible for interpreting and
56 specifying string sizes, which have not been changed. Usually these are
57 in characters, meaning you need to divide by XETCHAR_SIZE. (But, some
58 functions want sizes in bytes, even with Unicode strings. Look in the
59 documentation.) Use XETEXT when specifying string constants, so that
60 they show up in Unicode as necessary.
61
62 -- If you need to process external strings (in general you should not do
63 this; do all your manipulations in internal format and convert at the
64 point of entry into or exit from the function), use the xet...()
65 functions.
66
67 more specifically:
68
69 Unicode support is important for supporting many languages under
70 Windows, such as Cyrillic, without resorting to translation tables for
71 particular Windows-specific code pages. Internally, all characters in
72 Windows can be represented in two encodings: code pages and Unicode.
73 With Unicode support, we can seamlessly support all Windows
74 characters. Currently, the test in the drive to support Unicode is if
75 IME input works properly, since it is being converted from Unicode.
76
77 Unicode support also requires that the various Windows API's be
78 "Unicode-encapsulated", so that they automatically call the ANSI or
79 Unicode version of the API call appropriately and handle the size
80 differences in structures. What this means is:
81
82 -- first, note that Windows already provides a sort of encapsulation
83 of all API's that deal with text. All such API's are underlyingly
84 provided in two versions, with an A or W suffix (ANSI or "wide"
85 i.e. Unicode), and the compile-time constant UNICODE controls which is
86 selected by the unsuffixed API. Same thing happens with structures, and
87 also with types, where the generic types have names beginning with T --
88 TCHAR, LPTSTR, etc.. Unfortunately, this is compile-time only, not
89 run-time, so not sufficient. (Creating the necessary run-time encoding
90 is not conceptually difficult, but very time-consuming to write. It
91 adds no significant overhead, and the only reason it's not standard in
92 Windows is conscious marketing attempts by Microsoft to cripple Windows
93 95. FUCK MICROSOFT! They even describe in a KnowledgeBase article
94 exactly how to create such an API [although we don't exactly follow
95 their procedure], and point out its usefulness; the procedure is also
96 described more generally in Nadine Kano's book on Win32
97 internationalization -- written SIX YEARS AGO! Obviously Microsoft has
98 such an API available internally.)
99
100 -- what we do is provide an encapsulation of each standard Windows API call
101 that is split into A and W versions. current theory is to avoid all
102 preprocessor games; so we name the function with a prefix -- "qxe"
103 currently -- and require callers to use the prefixed name. Callers need
104 to explicitly use the W version of all structures, and convert text
105 themselves using Qmswindows_tstr. the qxe encapsulated version will
106 automatically call the appropriate A or W version depending on whether
107 we're running on 9x or NT (you can force use of the A calls on NT,
108 e.g. for testing purposes, using the command- line switch -nuni aka
109 -no-unicode-lib-calls), and copy data between W and A versions of the
110 structures as necessary.
111
112 -- We require the caller to handle the actual translation of text to
113 avoid possible overflow when dealing with fixed-size Windows
114 structures. There are no such problems when copying data between
115 the A and W versions because ANSI text is never larger than its
116 equivalent Unicode representation.
117
118 NOTE NOTE NOTE: As of August 2001, Microsoft (finally! See my nasty
119 comment above) released their own Unicode-encapsulation library, called
120 Microsoft Layer for Unicode on Windows 95/98/Me Systems. It tries to be
121 more transparent than we are, in that
122
123 -- its routines do ANSI/Unicode string translation, while we don't, for
124 efficiency (we already have to do internal/external conversion so it's
125 no extra burden to do the proper conversion directly rather than always
126 converting to Unicode and then doing a second conversion to ANSI as
127 necessary)
128
129 -- rather than requiring separately-named routines (qxeFooBar), they
130 physically override the existing routines at the link level. it also
131 appears that they do this BADLY, in that if you link with the MLU, you
132 get an application that runs ONLY on Win9x!!! (hint -- use
133 GetProcAddress()). there's still no way to create a single binary!
134 fucking losers.
135
136 -- they assume you compile with UNICODE defined, so there's no need for the
137 application to explicitly use ...W structures, as we require.
138
139 -- they also intercept windows procedures to deal with notify messages as
140 necessary, which we don't do yet.
141
142 -- they (of course) don't use Extbyte.
143
144 at some point (especially when they fix the single-binary problem!), we
145 should consider switching. for the meantime, we'll stick with what i've
146 already written. perhaps we should think about adopting some of the
147 greater transparency they have; but i opted against transparency on
148 purpose, to make the code easier to follow for someone who's not familiar
149 with it. until our library is really complete and bug-free, we should
150 think twice before doing this.
151 */
152 40
153 41
154 /************************************************************************/ 42 /************************************************************************/
155 /* auto-generation */ 43 /* auto-generation */
156 /************************************************************************/ 44 /************************************************************************/
2511 classnew = *lpWndClass; 2399 classnew = *lpWndClass;
2512 classnew.lpfnWndProc = intercepted_wnd_proc; 2400 classnew.lpfnWndProc = intercepted_wnd_proc;
2513 if (XEUNICODE_P) 2401 if (XEUNICODE_P)
2514 return RegisterClassW (&classnew); 2402 return RegisterClassW (&classnew);
2515 else 2403 else
2516 return RegisterClassA ((CONST WNDCLASSA *) &classnew); 2404 /* The intermediate cast fools gcc into not outputting strict-aliasing
2405 complaints */
2406 return RegisterClassA ((CONST WNDCLASSA *) (void *) &classnew);
2517 } 2407 }
2518 2408
2519 BOOL 2409 BOOL
2520 qxeUnregisterClass (const Extbyte * lpClassName, HINSTANCE hInstance) 2410 qxeUnregisterClass (const Extbyte * lpClassName, HINSTANCE hInstance)
2521 { 2411 {
2555 classnew = *lpWndClass; 2445 classnew = *lpWndClass;
2556 classnew.lpfnWndProc = intercepted_wnd_proc; 2446 classnew.lpfnWndProc = intercepted_wnd_proc;
2557 if (XEUNICODE_P) 2447 if (XEUNICODE_P)
2558 return RegisterClassExW (&classnew); 2448 return RegisterClassExW (&classnew);
2559 else 2449 else
2560 return RegisterClassExA ((CONST WNDCLASSEXA *) &classnew); 2450 /* The intermediate cast fools gcc into not outputting strict-aliasing
2451 complaints */
2452 return RegisterClassExA ((CONST WNDCLASSEXA *) (void *) &classnew);
2561 } 2453 }
2562 2454
2563 2455
2564 /************************************************************************/ 2456 /************************************************************************/
2565 /* COMMCTRL.H */ 2457 /* COMMCTRL.H */