comparison man/internals/internals.texi @ 3322:cf02a1da936a
[xemacs-hg @ 2006-03-31 17:51:18 by stephent]
Miscellaneous doc cleanup. <87u09eqzja.fsf@tleepslib.sk.tsukuba.ac.jp>
author | stephent |
---|---|
date | Fri, 31 Mar 2006 17:51:39 +0000 |
parents | 971e3c687f18 |
children | 15fb91e3a115 |
3321:4309d96fb8b7 | 3322:cf02a1da936a |
---|---|
474 * Internal Text APIs:: | 474 * Internal Text APIs:: |
475 * Coding for Mule:: | 475 * Coding for Mule:: |
476 * CCL:: | 476 * CCL:: |
477 * Microsoft Windows-Related Multilingual Issues:: | 477 * Microsoft Windows-Related Multilingual Issues:: |
478 * Modules for Internationalization:: | 478 * Modules for Internationalization:: |
479 * The Great Mule Merge of March 2002:: | |
479 | 480 |
480 Encodings | 481 Encodings |
481 | 482 |
482 * Japanese EUC (Extended Unix Code):: | 483 * Japanese EUC (Extended Unix Code):: |
483 * JIS7:: | 484 * JIS7:: |
519 * More about locales:: | 520 * More about locales:: |
520 * Unicode support under Windows:: | 521 * Unicode support under Windows:: |
521 * The golden rules of writing Unicode-safe code:: | 522 * The golden rules of writing Unicode-safe code:: |
522 * The format of the locale in setlocale():: | 523 * The format of the locale in setlocale():: |
523 * Random other Windows I18N docs:: | 524 * Random other Windows I18N docs:: |
525 | |
526 The Great Mule Merge of March 2002 | |
527 | |
528 * List of changed files in new Mule workspace:: | |
529 * Changes to the MULE subsystems:: | |
530 * Pervasive changes throughout XEmacs sources:: | |
531 * Changes to specific subsystems:: | |
532 * Mule changes by theme:: | |
533 * File-coding rewrite:: | |
534 * General User-Visible Changes:: | |
535 * General Lisp-Visible Changes:: | |
536 * User documentation:: | |
537 * General internal changes:: | |
538 * Ben's TODO list:: Probably obsolete. | |
539 * Ben's README:: Probably obsolete. | |
524 | 540 |
525 Consoles; Devices; Frames; Windows | 541 Consoles; Devices; Frames; Windows |
526 | 542 |
527 * Introduction to Consoles; Devices; Frames; Windows:: | 543 * Introduction to Consoles; Devices; Frames; Windows:: |
528 * Point:: | 544 * Point:: |
575 * Creating an Lstream:: Creating an lstream object. | 591 * Creating an Lstream:: Creating an lstream object. |
576 * Lstream Types:: Different sorts of things that are streamed. | 592 * Lstream Types:: Different sorts of things that are streamed. |
577 * Lstream Functions:: Functions for working with lstreams. | 593 * Lstream Functions:: Functions for working with lstreams. |
578 * Lstream Methods:: Creating new lstream types. | 594 * Lstream Methods:: Creating new lstream types. |
579 | 595 |
596 Subprocesses | |
597 | |
598 * Ben's separate stderr notes:: Probably obsolete. | |
599 | |
580 Interface to MS Windows | 600 Interface to MS Windows |
581 | 601 |
582 * Different kinds of Windows environments:: | 602 * Different kinds of Windows environments:: |
583 * Windows Build Flags:: | 603 * Windows Build Flags:: |
584 * Windows I18N Introduction:: | 604 * Windows I18N Introduction:: |
585 * Modules for Interfacing with MS Windows:: | 605 * Modules for Interfacing with MS Windows:: |
606 * CHANGES from 21.4-windows branch:: Probably obsolete. | |
586 | 607 |
587 Interface to the X Window System | 608 Interface to the X Window System |
588 | 609 |
589 * Lucid Widget Library:: An interface to various widget sets. | 610 * Lucid Widget Library:: An interface to various widget sets. |
590 * Modules for Interfacing with X Windows:: | 611 * Modules for Interfacing with X Windows:: |
10371 * Internal Text APIs:: | 10392 * Internal Text APIs:: |
10372 * Coding for Mule:: | 10393 * Coding for Mule:: |
10373 * CCL:: | 10394 * CCL:: |
10374 * Microsoft Windows-Related Multilingual Issues:: | 10395 * Microsoft Windows-Related Multilingual Issues:: |
10375 * Modules for Internationalization:: | 10396 * Modules for Internationalization:: |
10397 * The Great Mule Merge of March 2002:: | |
10376 @end menu | 10398 @end menu |
10377 | 10399 |
10378 @node Introduction to Multilingual Issues #1, Introduction to Multilingual Issues #2, Multilingual Support, Multilingual Support | 10400 @node Introduction to Multilingual Issues #1, Introduction to Multilingual Issues #2, Multilingual Support, Multilingual Support |
10379 @section Introduction to Multilingual Issues #1 | 10401 @section Introduction to Multilingual Issues #1 |
10380 @cindex introduction to multilingual issues #1 | 10402 @cindex introduction to multilingual issues #1 |
14078 definition with a call to the macro XETEXT. This appropriately makes a | 14100 definition with a call to the macro XETEXT. This appropriately makes a |
14079 string of either regular or wide chars, which is to say this string may be | 14101 string of either regular or wide chars, which is to say this string may be |
14080 prepended with an L (causing it to be a wide string) depending on | 14102 prepended with an L (causing it to be a wide string) depending on |
14081 XEUNICODE_P. | 14103 XEUNICODE_P. |
14082 | 14104 |
14083 @node Modules for Internationalization, , Microsoft Windows-Related Multilingual Issues, Multilingual Support | 14105 @node Modules for Internationalization, The Great Mule Merge of March 2002, Microsoft Windows-Related Multilingual Issues, Multilingual Support |
14084 @section Modules for Internationalization | 14106 @section Modules for Internationalization |
14085 @cindex modules for internationalization | 14107 @cindex modules for internationalization |
14086 @cindex internationalization, modules for | 14108 @cindex internationalization, modules for |
14087 | 14109 |
14088 @example | 14110 @example |
14157 @file{iso-wide.h} | 14179 @file{iso-wide.h} |
14158 @end example | 14180 @end example |
14159 | 14181 |
14160 This contains leftover code from an earlier implementation of | 14182 This contains leftover code from an earlier implementation of |
14161 Asian-language support, and is not currently used. | 14183 Asian-language support, and is not currently used. |
14184 | |
14185 | |
14186 @c | |
14187 @c DO NOT CHANGE THE NAME OF THIS NODE; ChangeLogs refer to it. | |
14188 @c Well, of course you're welcome to seek them out and fix them, too. | |
14189 @c | |
14190 | |
14191 @node The Great Mule Merge of March 2002, , Modules for Internationalization, Multilingual Support | |
14192 @section The Great Mule Merge of March 2002 | |
14193 @cindex The Great Mule Merge | |
14194 @cindex Mule Merge, The Great | |
14195 | |
14196 In March 2002, just after the release of XEmacs 21.5 beta 5, Ben Wing | |
14197 merged what was nominally a very large refactoring of the ``Mule'' | |
14198 multilingual support code into the mainline. This merge added robust | |
14199 support for Unicode on all platforms, and by providing support for Win32 | |
14200 Unicode APIs made the Mule support on the Windows platform a reality. | |
14201 This merge also included a large number of other changes and | |
14202 improvements, not necessarily related to internationalization. | |
14203 | |
14204 This node basically amounts to the ChangeLog for 2002-03-12. | |
14205 | |
14206 Some effort has been put into proper markup for code and file names, and | |
14207 some reorganization according to themes of revision. However, much | |
14208 remains to be done. | |
14209 | |
14210 @menu | |
14211 * List of changed files in new Mule workspace:: | |
14212 * Changes to the MULE subsystems:: | |
14213 * Pervasive changes throughout XEmacs sources:: | |
14214 * Changes to specific subsystems:: | |
14215 * Mule changes by theme:: | |
14216 * File-coding rewrite:: | |
14217 * General User-Visible Changes:: | |
14218 * General Lisp-Visible Changes:: | |
14219 * User documentation:: | |
14220 * General internal changes:: | |
14221 * Ben's TODO list:: Probably obsolete. | |
14222 * Ben's README:: Probably obsolete. | |
14223 @end menu | |
14224 | |
14225 | |
14226 @node List of changed files in new Mule workspace, Changes to the MULE subsystems, , The Great Mule Merge of March 2002 | |
14227 @subsection List of changed files in new Mule workspace | |
14228 | |
14229 This node lists the files that were touched in the Great Mule Merge. | |
14230 | |
14231 @heading Deleted files | |
14232 | |
14233 @example | |
14234 src/iso-wide.h | |
14235 src/mule-charset.h | |
14236 src/mule.c | |
14237 src/ntheap.h | |
14238 src/syscommctrl.h | |
14239 lisp/files-nomule.el | |
14240 lisp/help-nomule.el | |
14241 lisp/mule/mule-help.el | |
14242 lisp/mule/mule-init.el | |
14243 lisp/mule/mule-misc.el | |
14244 nt/config.h | |
14245 @end example | |
14246 | |
14247 @heading Other deleted files | |
14248 | |
14249 These files were all zero-length and accidentally present. | |
14250 | |
14251 @example | |
14252 src/events-mod.h | |
14253 tests/Dnd/README.OffiX | |
14254 tests/Dnd/dragtest.el | |
14255 netinstall/README.xemacs | |
14256 lib-src/srcdir-symlink.stamp | |
14257 @end example | |
14258 | |
14259 @heading New files | |
14260 | |
14261 @example | |
14262 CHANGES-ben-mule | |
14263 README.ben-mule-21-5 | |
14264 README.ben-separate-stderr | |
14265 TODO.ben-mule-21-5 | |
14266 etc/TUTORIAL.@{cs,es,nl,sk,sl@} | |
14267 etc/unicode/* | |
14268 lib-src/make-mswin-unicode.pl | |
14269 lisp/code-init.el | |
14270 lisp/resize-minibuffer.el | |
14271 lisp/unicode.el | |
14272 lisp/mule/china-util.el | |
14273 lisp/mule/cyril-util.el | |
14274 lisp/mule/devan-util.el | |
14275 lisp/mule/devanagari.el | |
14276 lisp/mule/ethio-util.el | |
14277 lisp/mule/indian.el | |
14278 lisp/mule/japan-util.el | |
14279 lisp/mule/korea-util.el | |
14280 lisp/mule/lao-util.el | |
14281 lisp/mule/lao.el | |
14282 lisp/mule/mule-locale.txt | |
14283 lisp/mule/mule-msw-init.el | |
14284 lisp/mule/thai-util.el | |
14285 lisp/mule/thai.el | |
14286 lisp/mule/tibet-util.el | |
14287 lisp/mule/tibetan.el | |
14288 lisp/mule/viet-util.el | |
14289 src/charset.h | |
14290 src/intl-auto-encap-win32.c | |
14291 src/intl-auto-encap-win32.h | |
14292 src/intl-encap-win32.c | |
14293 src/intl-win32.c | |
14294 src/intl-x.c | |
14295 src/mule-coding.c | |
14296 src/text.c | |
14297 src/text.h | |
14298 src/unicode.c | |
14299 src/s/win32-common.h | |
14300 src/s/win32-native.h | |
14301 @end example | |
14302 | |
14303 @heading Changed files | |
14304 | |
14305 ``Too numerous to mention.'' (Ben didn't write that, I did, but it's a | |
14306 good guess that's the intent....) | |
14307 | |
14308 | |
14309 @node Changes to the MULE subsystems, Pervasive changes throughout XEmacs sources, List of changed files in new Mule workspace, The Great Mule Merge of March 2002 | |
14310 @subsection Changes to the MULE subsystems | |
14311 | |
14312 @heading configure changes | |
14313 | |
14314 @itemize | |
14315 @item | |
14316 file-coding always compiled in. eol detection is off by default on | |
14317 unix, non-mule, but can be enabled with configure option | |
14318 @code{--with-default-eol-detection} or command-line flag @code{-eol}. | |
14319 | |
14320 @item | |
14321 code that selects which files are compiled is mostly moved to | |
14322 @file{Makefile.in.in}. see comment in @file{Makefile.in.in}. | |
14323 | |
14324 @item | |
14325 vestigial i18n3 code deleted. | |
14326 | |
14327 @item | |
14328 new cygwin mswin libs imm32 (input methods), mpr (user name | |
14329 enumeration). | |
14330 | |
14331 @item | |
14332 check for @code{link}, @code{symlink}. | |
14333 | |
14334 @item | |
14335 @code{vfork}-related code deleted. | |
14336 | |
14337 @item | |
14338 fix @file{configure.usage}. (delete @code{--with-file-coding}, | |
14339 @code{--no-doc-file}, add @code{--with-default-eol-detection}, | |
14340 @code{--quick-build}). | |
14341 | |
14342 @item | |
14343 @file{nt/config.h} has been eliminated and everything in it merged into | |
14344 @file{config.h.in} and @file{s/windowsnt.h}. see @file{config.h.in} for | |
14345 more info. | |
14346 | |
14347 @item | |
14348 massive rewrite of @file{s/windowsnt.h}, @file{m/windowsnt.h}, | |
14349 @file{s/cygwin32.h}, @file{s/mingw32.h}. common code moved into | |
14350 @file{s/win32-common.h}, @file{s/win32-native.h}. | |
14351 | |
14352 @item | |
14353 in @file{nt/xemacs.mak}, @file{nt/config.inc.samp}, variable is called | |
14354 @code{MULE}, not @code{HAVE_MULE}, for consistency with sources. | |
14355 | |
14356 @item | |
14357 define @code{TABDLY}, @code{TAB3} in @file{freebsd.h} (#### from where?) | |
14358 @end itemize | |
14359 | |
14360 | |
14361 @node Pervasive changes throughout XEmacs sources, Changes to specific subsystems, Changes to the MULE subsystems, The Great Mule Merge of March 2002 | |
14362 @subsection Pervasive changes throughout XEmacs sources | |
14363 | |
14364 @itemize | |
14365 @item | |
14366 all @code{#ifdef FILE_CODING} statements removed from code. | |
14367 @end itemize | |
14368 | |
14369 @heading Changes to string processing | |
14370 | |
14371 @itemize | |
14372 @item | |
14373 new @samp{qxe()} string functions that accept @code{Intbyte *} as | |
14374 arguments. These work exactly like the standard @code{strcmp()}, | |
14375 @code{strcpy()}, @code{sprintf()}, etc. except for the argument | |
14376 declaration differences. We use these whenever we have @code{Intbyte *} | |
14377 strings, which is quite often. | |
14378 | |
14379 @item | |
14380 new fun @code{build_intstring()} takes an @code{Intbyte *}. also new | |
14381 funs @code{build_msg_intstring} (like @code{build_intstring()}) and | |
14382 @code{build_msg_string} (like @code{build_string()}) to do a | |
14383 @code{GETTEXT()} before building the string. (elimination of old | |
14384 @code{build_translated_string()}, replaced by | |
14385 @code{build_msg_string()}). | |
14386 | |
14387 @item | |
14388 function @code{intern_int()} for @code{Intbyte *} arguments, like | |
14389 @code{intern()}. | |
14390 | |
14391 @item | |
14392 numerous places throughout code where @code{char *} replaced with | |
14393 something else, e.g. @code{Char_ASCII *}, @code{Intbyte *}, | |
14394 @code{Char_Binary *}, etc. same with unsigned @code{char *}, going to | |
14395 @code{UChar_Binary *}, etc. | |
14396 @end itemize | |
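The idea behind the string wrappers in the list above can be sketched as follows. This is a minimal illustration, not the XEmacs implementation: the `Intbyte` typedef is a stand-in for the real type, and the `sketch_` function names are hypothetical. The wrappers behave like the standard routines and differ only in their argument declarations.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the internal-text byte type (an assumption of this sketch). */
typedef unsigned char Intbyte;

/* Hypothetical wrapper: behaves like strcmp(), but takes Intbyte pointers
   so callers holding internal-format text need no casts. */
static int sketch_qxestrcmp(const Intbyte *a, const Intbyte *b)
{
  assert(a != NULL && b != NULL);
  return strcmp((const char *) a, (const char *) b);
}

/* Hypothetical wrapper: behaves like strcpy(). */
static Intbyte *sketch_qxestrcpy(Intbyte *dst, const Intbyte *src)
{
  assert(dst != NULL && src != NULL);
  return (Intbyte *) strcpy((char *) dst, (const char *) src);
}

/* Hypothetical wrapper: behaves like strlen(). */
static size_t sketch_qxestrlen(const Intbyte *s)
{
  assert(s != NULL);
  return strlen((const char *) s);
}
```

Keeping the casts inside the wrappers means the type system can still distinguish internal-format text from plain `char *` data at every call site.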
14397 | |
14398 | |
14399 @node Changes to specific subsystems, Mule changes by theme, Pervasive changes throughout XEmacs sources, The Great Mule Merge of March 2002 | |
14400 @subsection Changes to specific subsystems | |
14401 | |
14402 @heading Changes to the init code | |
14403 | |
14404 @itemize | |
14405 @item | |
14406 lots of init code rewritten to be mule-correct. | |
14407 @end itemize | |
14408 | |
14409 @heading Changes to processes | |
14410 | |
14411 @itemize | |
14412 @item | |
14413 always call @code{egetenv()}, never @code{getenv()}, for mule | |
14414 correctness. | |
14415 @end itemize | |
14416 | |
14417 @heading command line (@file{startup.el}, @file{emacs.c}) | |
14418 | |
14419 @itemize | |
14420 @item | |
14421 new option @code{-eol} to enable auto EOL detection under non-mule unix. | |
14422 | |
14423 @item | |
14424 new option @code{-nuni} (@code{--no-unicode-lib-calls}) to force use of | |
14425 non-Unicode API's under Windows NT, mostly for debugging purposes. | |
14426 @end itemize | |
14427 | |
14428 | |
14429 @node Mule changes by theme, File-coding rewrite, Changes to specific subsystems, The Great Mule Merge of March 2002 | |
14430 @subsection Mule changes by theme | |
14431 | |
14432 @itemize | |
14433 @item | |
14434 the code that handles the details of processing multilingual text has | |
14435 been consolidated to make it easier to extend it. it has been yanked | |
14436 out of various files (@file{buffer.h}, @file{mule-charset.h}, | |
14437 @file{lisp.h}, @file{insdel.c}, @file{fns.c}, @file{file-coding.c}, | |
14438 etc.) and put into @file{text.c} and @file{text.h}. | |
14439 @file{mule-charset.h} has also been renamed @file{charset.h}. all long | |
14440 comments concerning the representations and their processing have been | |
14441 consolidated into @file{text.c}. | |
14442 | |
14443 @item | |
14444 major rewriting of file-coding. it's mostly abstracted into coding | |
14445 systems that are defined by methods (similar to devices and specifiers), | |
14446 with the ultimate aim being to allow non-i18n coding systems such as | |
14447 gzip. there is a ``chain'' coding system that allows multiple coding | |
14448 systems to be chained together. (it doesn't yet have the concept that | |
14449 either end of a coding system can be bytes or chars; this needs to be | |
14450 added.) | |
14451 | |
14452 @item | |
14453 large amounts of code throughout the code base have been Mule-ized, not | |
14454 just Windows code. | |
14455 | |
14456 @item | |
14457 total rewriting of OS locale code. it notices your locale at startup | |
14458 and sets the language environment accordingly, and calls | |
14459 @code{setlocale()} and sets @code{LANG} when you change the language | |
14460 environment. new language environment properties @code{locale}, | |
14461 @code{mswindows-locale}, @code{cygwin-locale}, | |
14462 @code{native-coding-system}, to determine langenv from locale and | |
14463 vice-versa; fix all language environments (lots of language files). | |
14464 langenv startup code rewritten. many new functions to convert between | |
14465 locales, language environments, etc. | |
14466 | |
14467 @item | |
14468 major overhaul of the way default values for the various coding system | |
14469 variables are handled. all default values are collected into one | |
14470 location, a new file @file{code-init.el}, which provides a unified | |
14471 mechanism for setting and querying what i call ``basic coding system | |
14472 variables'' (which may be aliases, parts of conses, etc.) and a | |
14473 mechanism of different configurations (Windows w/Mule, Windows w/o Mule, | |
14474 Unix w/Mule, Unix w/o Mule, unix w/o Mule but w/auto EOL), each of which | |
14475 specifies a set of default values. we determine the configuration at | |
14476 startup and set all the values in one place. (@file{code-init.el}, | |
14477 @file{code-files.el}, @file{coding.el}, ...) | |
14478 | |
14479 @item | |
14480 i copied the remaining language-specific files from fsf. i made some | |
14481 minor changes in certain cases but for the most part the stuff was just | |
14482 copied and may not work. | |
14483 | |
14484 @item | |
14485 ms windows mule support, with full unicode support. required font, | |
14486 redisplay, event, other changes. ime support from ikeyama. | |
14487 @end itemize | |
14488 | |
14489 @heading Lisp-Visible Changes | |
14490 | |
14491 @itemize | |
14492 @item | |
14493 ensure that @code{escape-quoted} works correctly even without Mule | |
14494 support and use it for all auto-saves. (@file{auto-save.el}, | |
14495 @file{fileio.c}, @file{coding.el}, @file{files.el}) | |
14496 | |
14497 @item | |
14498 new var @code{buffer-file-coding-system-when-loaded} specifies the | |
14499 actual coding system used when the file was loaded | |
14500 (@code{buffer-file-coding-system} is usually the same, but may be | |
14501 changed because it controls how the file is written out). use it in | |
14502 revert-buffer (@file{files.el}, @file{code-files.el}) and in new submenu | |
14503 File->Revert Buffer with Specified Encoding (@file{menubar-items.el}). | |
14504 | |
14505 @item | |
14506 improve docs on how the coding system is determined when a file is read | |
14507 in; improved docs are in both @code{find-file} and | |
14508 @code{insert-file-contents} and a reference to where to find them is in | |
14509 @code{buffer-file-coding-system-for-read}. (@file{files.el}, | |
14510 @file{code-files.el}) | |
14511 | |
14512 @item | |
14513 new (brain-damaged) FSF way of calling post-read-conversion (only one | |
14514 arg, not two) is supported, along with our two-argument way, as best we | |
14515 can. (@file{code-files.el}) | |
14516 | |
14517 @item | |
14518 add inexplicably missing var @code{default-process-coding-system}. use | |
14519 it. get rid of former hacked-up way of setting these defaults using | |
14520 @code{comint-exec-hook}. also fun | |
14521 @code{set-buffer-process-coding-system}. (@file{code-process.el}, | |
14522 @file{code-cmds.el}, @file{process.c}) | |
14523 | |
14524 @item | |
14525 remove function @code{set-default-coding-systems}; replace with | |
14526 @code{set-default-output-coding-systems}, which affects only the output | |
14527 defaults (@code{buffer-file-coding-system}, output half of | |
14528 @code{default-process-coding-system}). the input defaults should not be | |
14529 set by this because they should always remain @code{undecided} in normal | |
14530 circumstances. fix @code{prefer-coding-system} to use the new function | |
14531 and correct its docs. | |
14532 | |
14533 @item | |
14534 fix bug in @code{coding-system-change-eol-conversion} | |
14535 (@file{code-cmds.el}) | |
14536 | |
14537 @item | |
14538 recognize all eol types in @code{prefer-coding-system} | |
14539 (@file{code-cmds.el}) | |
14540 | |
14541 @item | |
14542 rewrite @code{coding-system-category} to be correct (@file{coding.el}) | |
14543 @end itemize | |
14544 | |
14545 @heading Internal Changes | |
14546 | |
14547 @itemize | |
14548 @item | |
14549 major improvements to eistring code, fleshing out of missing funs. | |
14550 @end itemize | |
14551 | |
14552 @itemize | |
14553 @item | |
14554 Separate encoding and decoding lstreams have been combined into a single | |
14555 coding lstream. Functions @samp{make_encoding_*_stream} and | |
14556 @samp{make_decoding_*_stream} have been combined into | |
14557 @samp{make_coding_*_stream}, which takes an argument specifying whether | |
14558 encode or decode is wanted. | |
14559 | |
14560 @item | |
14561 remove last vestiges of I18N3, I18N4 code. | |
14562 | |
14563 @item | |
14564 ascii optimization for strings: we keep track of the number of ascii | |
14565 chars at the beginning and use this to optimize byte<->char conversion | |
14566 on strings. | |
14567 | |
14568 @item | |
14569 @file{mule-misc.el}, @file{mule-init.el} deleted; code in there either | |
14570 deleted, rewritten, or moved to another file. | |
14571 | |
14572 @item | |
14573 @file{mule.c} deleted. | |
14574 | |
14575 @item | |
14576 move non-Mule-specific code out of @file{mule-cmds.el} into | |
14577 @file{code-cmds.el}. (@code{coding-system-change-text-conversion}; | |
14578 remove duplicate @code{coding-system-change-eol-conversion}) | |
14579 | |
14580 @item | |
14581 remove duplicate @code{set-buffer-process-coding-system} | |
14582 (@file{code-cmds.el}) | |
14583 | |
14584 @item | |
14585 add some commented-out code from FSF @file{mule-cmds.el} | |
14586 (@code{find-coding-systems-region-subset-p}, | |
14587 @code{find-coding-systems-region}, @code{find-coding-systems-string}, | |
14588 @code{find-coding-systems-for-charsets}, | |
14589 @code{find-multibyte-characters}, @code{last-coding-system-specified}, | |
14590 @code{select-safe-coding-system}, @code{select-message-coding-system}) | |
14591 (@file{code-cmds.el}) | |
14592 | |
14593 @item | |
14594 remove obsolete alias @code{pathname-coding-system}, function | |
14595 @code{set-pathname-coding-system} (@file{coding.el}) | |
14596 | |
14597 @item | |
14598 remove coding-system property @code{doc-string}; split into | |
14599 @code{description} (short, for menu items) and @code{documentation} | |
14600 (long); correct coding system defns (@file{coding.el}, | |
14601 @file{file-coding.c}, lots of language files) | |
14602 | |
14603 @item | |
14604 move @code{coding-system-base} into C and make use of internal info | |
14605 (@file{coding.el}, @file{file-coding.c}) | |
14606 | |
14607 @item | |
14608 move @code{undecided} defn into C (@file{coding.el}, | |
14609 @file{file-coding.c}) | |
14610 | |
14611 @item | |
14612 use @code{define-coding-system-alias}, not @code{copy-coding-system} | |
14613 (@file{coding.el}) | |
14614 | |
14615 @item | |
14616 new coding system @code{iso-8859-6} for arabic | |
14617 | |
14618 @item | |
14619 delete windows-1251 support from @file{cyrillic.el}; we do it | |
14620 automatically | |
14621 | |
14622 @item | |
14623 remove @samp{setup-*-environment} as per FSF 21 | |
14624 | |
14625 @item | |
14626 rewrite @file{european.el} with lang envs for each language, so we can | |
14627 specify the locale | |
14628 | |
14629 @item | |
14630 fix corruption in @file{greek.el} | |
14631 | |
14632 @item | |
14633 sync @file{japanese.el} with FSF 20.6 | |
14634 | |
14635 @item | |
14636 fix warnings in @file{mule-ccl.el} | |
14637 | |
14638 @item | |
14639 move FSF compat Mule fns from @file{obsolete.el} to | |
14640 @file{mule-charset.el} | |
14641 | |
14642 @item | |
14643 eliminate unused @samp{truncate-string@{-to-width@}} | |
14644 | |
14645 @item | |
14646 @code{make-coding-system} accepts (but ignores) the additional | |
14647 properties present in the fsf version, for compatibility. | |
14648 | |
14649 @item | |
14650 i fixed the iso2022 handling so it will correctly read in files | |
14651 containing unknown charsets, creating a ``temporary'' charset which can | |
14652 later be overwritten by the real charset when it's defined. this allows | |
14653 iso2022 elisp files with literals in strange languages to compile | |
14654 correctly under mule. i also added a hack that will correctly read in | |
14655 and write out the emacs-specific ``composition'' escape sequences, | |
14656 i.e. @samp{ESC 0} through @samp{ESC 4}. this means that my workspace | |
14657 correctly compiles the new file @file{devanagari.el} that i added. | |
14658 | |
14659 @item | |
14660 elimination of @code{string-to-char-list} (use @code{string-to-list}) | |
14661 | |
14662 @item | |
14663 elimination of junky @code{define-charset} | |
14664 @end itemize | |
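The ASCII optimization for strings mentioned in the list above can be sketched as follows. This is only an illustration under stated assumptions: the struct and function names are hypothetical, and UTF-8 is used as a stand-in for the internal multibyte format. The point is that any character index inside the cached ASCII prefix equals its byte index, so only the tail needs a scan.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical string record: the text plus a cached count of the
   ASCII characters at its beginning. */
struct sketch_string {
  const unsigned char *data;  /* NUL-terminated; UTF-8 in this sketch */
  size_t ascii_begin;         /* number of leading bytes below 0x80 */
};

/* Count the leading ASCII bytes once, when the string is created. */
static size_t sketch_count_ascii_begin(const unsigned char *s)
{
  size_t n = 0;
  while (s[n] != '\0' && s[n] < 0x80)
    n++;
  return n;
}

/* Convert a character index to a byte index. */
static size_t sketch_charidx_to_byteidx(const struct sketch_string *s,
                                        size_t charidx)
{
  size_t bytes, chars;

  if (charidx <= s->ascii_begin)
    return charidx;            /* fast path: one byte per character */

  /* Slow path: start scanning at the end of the ASCII prefix. */
  bytes = chars = s->ascii_begin;
  while (chars < charidx) {
    assert(s->data[bytes] != '\0');   /* index must be within the string */
    bytes++;                          /* skip the lead byte */
    while ((s->data[bytes] & 0xC0) == 0x80)
      bytes++;                        /* skip continuation bytes */
    chars++;
  }
  return bytes;
}
```

For mostly-ASCII text the fast path covers nearly every lookup, which is why caching a single count pays off.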
14665 | |
14666 @heading Selection | |
14667 | |
14668 @itemize | |
14669 @item | |
14670 fix msw selection code for Mule. proper encoding for | |
14671 @code{RegisterClipboardFormat}. store selection as | |
14672 @code{CF_UNICODETEXT}, which will get converted to the other formats. | |
14673 don't respond to destroy messages from @code{EmptyClipboard()}. | |
14674 @end itemize | |
14675 | |
14676 @heading Menubar | |
14677 | |
14678 @itemize | |
14679 @item | |
14680 new items @samp{Open With Specified Encoding}, | |
14681 @samp{Revert Buffer with Specified Encoding} | |
14682 | |
14683 @item | |
14684 split Mule menu into @samp{Encoding} (non-Mule-specific; includes new | |
14685 item to control EOL auto-detection) and @samp{International} submenus on | |
14686 @samp{Options}, @samp{International} on @samp{Help} | |
14687 | |
14688 @end itemize | |
14689 | |
14690 @heading Unicode support | |
14691 | |
14692 @itemize | |
14693 @item | |
14694 translation tables added in @file{etc/unicode} | |
14695 | |
14696 @item | |
14697 new files @file{unicode.c}, @file{unicode.el} containing unicode coding | |
14698 systems and support; old code ripped out of @file{file-coding.c} | |
14699 | |
14700 @item | |
14701 translation tables read in at startup (NEEDS WORK TO MAKE IT MORE | |
14702 EFFICIENT) | |
14703 | |
14704 @item | |
14705 support @code{CF_TEXT}, @code{CF_UNICODETEXT} in @file{select.el} | |
14706 | |
14707 @item | |
14708 encapsulation code added so that we can support both Windows 9x and NT | |
14709 in a single executable, determining at runtime whether to call the | |
14710 Unicode or non-Unicode API. encapsulated routines in | |
14711 @file{intl-encap-win32.c} (non-auto-generated) and | |
14712 @file{intl-auto-encap-win32.[ch]} (auto-generated). code generator in | |
14713 @file{lib-src/make-mswin-unicode.pl}. changes throughout the code to | |
14714 use the wide structures (W suffix) and call the encapsulated Win32 API | |
14715 routines (@samp{qxe} prefix). calling code needs to do proper | |
14716 conversion of text using new coding systems @code{Qmswindows_tstr}, | |
14717 @code{Qmswindows_unicode}, or @code{Qmswindows_multibyte}. (the first | |
14718 points to one of the other two.) | |
14719 @end itemize | |
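The runtime choice between the Unicode and non-Unicode API described in the list above can be sketched portably as follows. Nothing here is the real encapsulation layer: the flag, the macro, and the function are hypothetical stand-ins (the flag plays the role of a check like `XEUNICODE_P`, the macro is modeled loosely on `XETEXT`), and standard C string routines stand in for the paired Win32 entry points.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical runtime flag: nonzero when the wide (Unicode) API
   should be used. */
static int sketch_unicode_p = 0;

/* Hypothetical text-literal macro: yields the wide or the narrow
   literal depending on the runtime flag. */
#define SKETCH_TEXT(s) \
  (sketch_unicode_p ? (const void *) L##s : (const void *) s)

/* Hypothetical encapsulated routine: one entry point for callers,
   dispatching at runtime to the wide or narrow implementation. */
static size_t sketch_qxe_tstr_len(const void *tstr)
{
  assert(tstr != NULL);
  if (sketch_unicode_p)
    return wcslen((const wchar_t *) tstr);
  return strlen((const char *) tstr);
}
```

Because the decision is made at runtime inside the wrapper, a single executable can serve both kinds of system; callers only have to hand over text in the matching representation.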
14720 | |
14721 | |
14722 @node File-coding rewrite, General User-Visible Changes, Mule changes by theme, The Great Mule Merge of March 2002 | |
14723 @subsection File-coding rewrite | |
14724 | |
14725 The coding system code has undergone a major rewrite. It's abstracted into | |
14726 coding systems that are defined by methods (similar to devices and | |
14727 specifiers). The types of conversions have also been generalized. | |
14728 Formerly, decoding always converted bytes to characters and encoding the | |
14729 reverse (these are now called ``text file converters''), but conversion | |
14730 can now happen either to or from bytes or characters. This allows | |
14731 coding systems such as @code{gzip} and @code{base64} to be written. | |
14732 When specifying such a coding system to an operation that expects a text | |
14733 file converter (such as reading in or writing out a file), the | |
14734 appropriate coding systems to convert between bytes and characters are | |
14735 automatically inserted into the conversion chain as necessary. To | |
14736 facilitate creating such chains, a special coding system called | |
14737 ``chain'' has been created, which chains together two or more coding | |
14738 systems. | |
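A toy sketch of the chaining idea, assuming each coding system is reduced to a single in-place byte transform. All names are hypothetical, and real coding systems are stream-based and may change the length or type of the data, which this sketch ignores.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical converter method: transforms buf[0..len) in place. */
typedef void (*sketch_convert_fn)(unsigned char *buf, size_t len);

/* Toy converter: rotate ASCII letters by 13 places. */
static void sketch_rot13(unsigned char *buf, size_t len)
{
  size_t i;
  for (i = 0; i < len; i++) {
    if (buf[i] >= 'a' && buf[i] <= 'z')
      buf[i] = (unsigned char) ('a' + (buf[i] - 'a' + 13) % 26);
    else if (buf[i] >= 'A' && buf[i] <= 'Z')
      buf[i] = (unsigned char) ('A' + (buf[i] - 'A' + 13) % 26);
  }
}

/* Toy converter: upcase ASCII letters. */
static void sketch_upcase(unsigned char *buf, size_t len)
{
  size_t i;
  for (i = 0; i < len; i++)
    buf[i] = (unsigned char) toupper(buf[i]);
}

/* A chain is itself a converter made of two or more links. */
struct sketch_chain {
  const sketch_convert_fn *links;
  size_t count;
};

/* Run every link in order, feeding each one the previous output. */
static void sketch_chain_convert(const struct sketch_chain *chain,
                                 unsigned char *buf, size_t len)
{
  size_t i;
  assert(chain != NULL && chain->count >= 2);
  for (i = 0; i < chain->count; i++)
    chain->links[i](buf, len);
}
```

Treating the chain as just another converter is what lets byte-to-character converters be slotted in automatically around something like `gzip`.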
14739 | |
14740 Encoding detection has also been abstracted. Detectors are logically | |
14741 separate from coding systems, and each detector defines one or more | |
14742 categories. (For example, the detector for Unicode defines categories | |
14743 such as UTF-8, UTF-16, UCS-4, and UTF-7.) When a particular detector is | |
14744 given a piece of text to detect, it determines likeliness values (seven | |
14745 of them, from 3 [most likely] to -3 [least likely]; specific criteria | |
14746 are defined for each possible value). All detectors are run in parallel | |
14747 on a particular piece of text, and the results tabulated together to | |
14748 determine the actual encoding of the text. | |
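The detection scheme can be sketched as follows. Several things here are assumptions made for illustration: each detector reports on a single category, the two detectors and their criteria are invented, and "tabulating" is reduced to picking the highest likeliness value. Only the range from 3 (most likely) to -3 (least likely) comes from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Likeliness range described in the text. */
enum { SKETCH_LEAST_LIKELY = -3, SKETCH_MOST_LIKELY = 3 };

/* Hypothetical categories for this sketch. */
enum sketch_category {
  SKETCH_CAT_UNDECIDED,
  SKETCH_CAT_ASCII,
  SKETCH_CAT_UTF8
};

struct sketch_result {
  enum sketch_category category;
  int likeliness;
};

typedef struct sketch_result (*sketch_detector_fn)(const unsigned char *text,
                                                   size_t len);

/* Invented criterion: pure 7-bit text is most likely ASCII. */
static struct sketch_result sketch_detect_ascii(const unsigned char *text,
                                                size_t len)
{
  struct sketch_result r = { SKETCH_CAT_ASCII, SKETCH_MOST_LIKELY };
  size_t i;
  for (i = 0; i < len; i++)
    if (text[i] >= 0x80) {
      r.likeliness = SKETCH_LEAST_LIKELY;
      break;
    }
  return r;
}

/* Invented criterion: a leading byte-order mark strongly suggests UTF-8;
   otherwise this detector stays neutral. */
static struct sketch_result sketch_detect_utf8_bom(const unsigned char *text,
                                                   size_t len)
{
  struct sketch_result r = { SKETCH_CAT_UTF8, 0 };
  if (len >= 3 && text[0] == 0xEF && text[1] == 0xBB && text[2] == 0xBF)
    r.likeliness = SKETCH_MOST_LIKELY;
  return r;
}

/* Run every detector on the same text and keep the best-scoring category. */
static enum sketch_category sketch_detect(const unsigned char *text,
                                          size_t len)
{
  static const sketch_detector_fn detectors[] = {
    sketch_detect_ascii, sketch_detect_utf8_bom
  };
  struct sketch_result best = { SKETCH_CAT_UNDECIDED, SKETCH_LEAST_LIKELY - 1 };
  size_t i;

  assert(text != NULL || len == 0);
  for (i = 0; i < sizeof detectors / sizeof detectors[0]; i++) {
    struct sketch_result r = detectors[i](text, len);
    if (r.likeliness > best.likeliness)
      best = r;
  }
  return best.category;
}
```

Keeping detectors separate from coding systems means a new detector can be added without touching any converter.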
14749 | |
14750 Encoding and decoding are now completely parallel operations, and the | |
14751 former ``encoding'' and ``decoding'' lstreams have been combined into a | |
14752 single ``coding'' lstream. Coding system methods that were formerly | |
14753 split in such a fashion have also been combined. | |
14754 | |
14755 | |
14756 @node General User-Visible Changes, General Lisp-Visible Changes, File-coding rewrite, The Great Mule Merge of March 2002 | |
14757 @subsection General User-Visible Changes | |
14758 | |
14759 @heading Search | |
14760 | |
14761 @itemize | |
14762 @item | |
14763 make regex routines reentrant, since they're sometimes called | |
14764 reentrantly. (see @file{regex.c} for a description of how.) all global | |
14765 variables used by the regex routines get pushed onto a stack by the | |
14766 callers before being set, and are restored when finished. redo the | |
14767 preprocessor flags controlling @code{REL_ALLOC} in conjunction with | |
14768 this. | |
14769 @end itemize | |
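The save-and-restore discipline described above can be sketched as follows. The globals, the fixed-size stack, and the function names are hypothetical; the actual mechanism is described in `regex.c`. The sketch only shows why pushing the globals before setting them makes a nested call harmless.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical globals standing in for state the regex routines consult. */
static const char *sketch_regex_subject;
static int sketch_regex_translate;

struct sketch_saved {
  const char *subject;
  int translate;
};

#define SKETCH_STACK_MAX 16
static struct sketch_saved sketch_stack[SKETCH_STACK_MAX];
static size_t sketch_depth;

/* Save the current values of the globals before a caller overwrites them. */
static void sketch_push_regex_state(void)
{
  assert(sketch_depth < SKETCH_STACK_MAX);
  sketch_stack[sketch_depth].subject = sketch_regex_subject;
  sketch_stack[sketch_depth].translate = sketch_regex_translate;
  sketch_depth++;
}

/* Restore the values saved by the matching push. */
static void sketch_pop_regex_state(void)
{
  assert(sketch_depth > 0);
  sketch_depth--;
  sketch_regex_subject = sketch_stack[sketch_depth].subject;
  sketch_regex_translate = sketch_stack[sketch_depth].translate;
}

/* A caller that may be entered reentrantly.  Returns 1 if its own
   settings survived the nested call, 0 otherwise. */
static int sketch_search(const char *subject, int translate, int nested)
{
  int intact;

  sketch_push_regex_state();
  sketch_regex_subject = subject;
  sketch_regex_translate = translate;

  if (nested > 0)
    sketch_search("inner", !translate, nested - 1);  /* overwrites, then restores */

  intact = (sketch_regex_subject == subject
            && sketch_regex_translate == translate);
  sketch_pop_regex_state();
  return intact;
}
```

Each level restores exactly what it saved, so the outer caller sees its own settings again as soon as the nested call returns.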
14770 | |
14771 @heading Menubar | |
14772 | |
14773 @itemize | |
14774 @item | |
14775 move menu-splitting code (@code{menu-split-long-menu}, etc.) from | |
14776 @file{font-menu.el} to @file{menubar-items.el} and redo its algorithm; | |
14777 use in various items with long generated menus; rename to remove | |
14778 @samp{font-} from beginning of functions but keep old names as aliases | |
14779 | |
14780 @item | |
14781 new fn @code{menu-sort-menu} | |
14782 | |
14783 @item | |
14784 redo items @samp{Grep All Files in Current Directory @{and Below@}} | |
14785 using stuff from sample @file{init.el} | |
14786 | |
14787 @item | |
14788 @samp{Debug on Error} and friends now affect current session only; not | |
14789 saved | |
14790 | |
14791 @item | |
14792 @code{maybe-add-init-button} -> @code{init-menubar-at-startup} and call | |
14793 explicitly from @file{startup.el} | |
14794 | |
14795 @item | |
14796 don't use @code{charset-registry} in @file{msw-font-menu.el}; it's only | |
14797 for X | |
14798 @end itemize | |
14799 | |
14800 @heading Changes to key bindings | |
14801 | |
14802 These changes are primarily found in @file{keymap.c}, @file{keydefs.el}, | |
14803 and @file{help.el}, but are found in many other files. | |
14804 | |
14805 @itemize | |
14806 @item | |
14807 @kbd{M-home}, @kbd{M-end} now move forward and backward in buffers; with | |
14808 @key{Shift}, stay within current group (e.g. all C files; same grouping | |
14809 as the gutter tabs). (bindings | |
14810 @samp{switch-to-@{next/previous@}-buffer[-in-group]} in @file{files.el}) | |
14811 | |
14812 needed to move code from @file{gutter-items.el} to @file{buff-menu.el} | |
14813 that's used by these bindings, since @file{gutter-items.el} is loaded | |
14814 only when the gutter is active and these bindings (and hence the code) | |
14815 are not (any more) gutter specific. | |
14816 | |
14817 @item | |
14818 new global vars @code{global-tty-map} and @code{global-window-system-map} specify key | |
14819 bindings for use only on TTY's or window systems, respectively. this is | |
14820 used to make @kbd{ESC ESC} be keyboard-quit on window systems, but | |
14821 @kbd{ESC ESC ESC} on TTY's, where @key{Meta + arrow} keys may appear as | |
14822 @kbd{ESC ESC O A} or whatever. @kbd{C-z} on window systems is now | |
14823 @code{zap-up-to-char}, and @code{iconify-frame} is moved to @kbd{C-Z}. | |
14824 @kbd{ESC ESC} is @code{isearch-quit}. (@file{isearch-mode.el}) | |
14825 | |
14826 @item | |
14827 document @samp{global-@{tty,window-system@}-map} in various places; | |
14828 display them when you do @kbd{C-h b}. | |
14829 | |
14830 @item | |
14831 fix up function documentation in general for keyboard primitives. | |
14832 e.g. key-bindings now contains a detailed section on the steps prior to | |
14833 looking up in keymaps, i.e. @code{function-key-map}, | |
14834 @code{keyboard-translate-table}, etc. @code{define-key} and other | |
14835 obvious starting points indicate where to look for more info. | |
14836 | |
14837 @item | |
14838 eliminate use and mention of grody @code{advertised-undo} and | |
14839 @code{deprecated-help}. (@file{simple.el}, @file{startup.el}, | |
14840 @file{picture.el}, @file{menubar-items.el}) | |
14841 @end itemize | |
14842 | |
14843 | |
14844 @node General Lisp-Visible Changes, User documentation, General User-Visible Changes, The Great Mule Merge of March 2002 | |
14845 @subsection General Lisp-Visible Changes | |
14846 | |
14847 @heading gzip support | |
14848 | |
14849 The gzip protocol is now partially supported as a coding system. | |
14850 | |
14851 @itemize | |
14852 @item | |
14853 new coding system @code{gzip} (bytes -> bytes); unfortunately, not quite | |
14854 working yet because it handles only the raw zlib format and not the | |
14855 higher-level gzip format (the zlib library is brain-damaged in that it | |
14856 provides low-level, stream-oriented API's only for raw zlib, and for | |
14857 gzip you have only high-level API's, which aren't useful for xemacs). | |
14858 | |
14859 @item | |
14860 configure support (@code{--with-zlib}). | |
14861 @end itemize | |
14862 | |
14863 | |
14864 @node User documentation, General internal changes, General Lisp-Visible Changes, The Great Mule Merge of March 2002 | |
14865 @subsection User documentation | |
14866 | |
14867 @heading Tutorial | |
14868 | |
14869 @itemize | |
14870 @item | |
14871 massive rewrite; sync to FSF 21.0.106, switch focus to window systems, | |
14872 new sections on terminology and multiple frames, lots of fixes for | |
14873 current xemacs idioms. | |
14874 | |
14875 @item | |
14876 German version from Adrian mostly matching my changes. | |
14877 | |
14878 @item | |
14879 copy new tutorials from FSF (Spanish, Dutch, Slovak, Slovenian, Czech); | |
14880 not updated yet though. | |
14881 | |
14882 @item | |
14883 eliminate @file{help-nomule.el} and @file{mule-help.el}; merge into one | |
14884 single tutorial function, fix lots of problems, put back in | |
14885 @file{help.el} where it belongs. (there was some random junk in | |
14886 @file{help-nomule.el}, @code{string-width} and @code{make-char}. | |
14887 @code{string-width} is now in @file{subr.el} with a single definition, | |
14888 and @code{make-char} in @file{text.c}.) | |
14889 @end itemize | |
14890 | |
14891 @heading Sample init file | |
14892 | |
14893 @itemize | |
14894 @item | |
14895 remove forward/backward buffer code, since it's now standard. | |
14896 | |
14897 @item | |
14898 when disabling @kbd{C-x C-c}, make it display a message saying how to | |
14899 exit, not just beep and complain ``undefined''. | |
14900 @end itemize | |
14901 | |
14902 | |
14903 @node General internal changes, Ben's TODO list, User documentation, The Great Mule Merge of March 2002 | |
14904 @subsection General internal changes | |
14905 | |
14906 @heading Changes to gnuclient and gnuserv | |
14907 | |
14908 @itemize | |
14909 @item | |
14910 clean up headers a bit. | |
14911 | |
14912 @item | |
14913 use proper ms win idiom for checking for temp directory (@code{TEMP} or | |
14914 @code{TMP}, not @code{TMPDIR}). | |
14915 @end itemize | |
14916 | |
14917 @heading Process changes | |
14918 | |
14919 @itemize | |
14920 @item | |
14921 Move @code{setenv} from packages; synch @code{setenv}/@code{getenv} with | |
14922 21.0.105 | |
14923 @end itemize | |
14924 | |
14925 @heading Changes to I/O internals | |
14926 | |
14927 @itemize | |
14928 @item | |
14929 use @code{PATH_MAX} consistently instead of @code{MAXPATHLEN}, | |
14930 @code{MAX_PATH}, etc. | |
14931 | |
14932 @item | |
14933 all code that does preprocessor games with C lib I/O functions (open, | |
14934 read) has been removed. The code has been changed to call the correct | |
14935 function directly. Functions that accept @code{Intbyte *} arguments for | |
14936 filenames and such and do automatic conversion to or from external | |
14937 format will be prefixed @samp{qxe...()}. Functions that are retrying in | |
14938 case of @code{EINTR} are prefixed @samp{retry_...()}. | |
14939 @code{DONT_ENCAPSULATE} is long-gone. | |
14940 | |
14941 @item | |
14942 never call @code{getcwd()} any more. use our shadowed value always. | |
14943 @end itemize | |
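A wrapper that retries on `EINTR` might look like the sketch below. The name `sketch_retry_read` is made up for illustration and only follows the `retry_...()` naming convention mentioned above; it is not taken from the XEmacs sources.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: call read() again whenever it was interrupted
   by a signal before any data was transferred. */
static ssize_t sketch_retry_read(int fd, void *buf, size_t count)
{
  ssize_t n;

  assert(buf != NULL);
  do
    n = read(fd, buf, count);
  while (n == -1 && errno == EINTR);
  return n;
}
```

Any other error, and any successful or short read, is passed straight back to the caller.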
14944 | |
14945 @heading Changes to string processing | |
14946 | |
14947 @itemize | |
14948 @item | |
14949 the @file{doprnt.c} external entry points have been completely rewritten | |
14950 to be more useful and have more sensible names. We now have, for | |
14951 example, versions that work exactly like @code{sprintf()} but return a | |
14952 @code{malloc()}ed string. | |
14953 | |
14954 @item | |
14955 code in @file{print.c} that handles @code{stdout}, @code{stderr} | |
14956 rewritten. | |
14957 | |
14958 @item | |
14959 places that print to @code{stderr} directly replaced with | |
14960 @code{stderr_out()}. | |
14961 | |
14962 @item | |
14963 new convenience functions @code{write_fmt_string()}, | |
14964 @code{write_fmt_string_lisp()}, @code{stderr_out_lisp()}, | |
14965 @code{write_string()}. | |
14966 @end itemize | |
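As a rough illustration of an entry point that behaves like `sprintf()` but returns a `malloc()`ed string, here is a minimal sketch built on the standard C99 `vsnprintf()`. The name `sketch_malloc_sprintf` is hypothetical, and the real doprnt.c entry points handle internal-format text, which this sketch does not attempt.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: format into a freshly allocated string.
   Returns NULL on a formatting or allocation failure; the caller
   frees the result with free(). */
static char *sketch_malloc_sprintf(const char *fmt, ...)
{
  va_list args, copy;
  int len;
  char *result;

  assert(fmt != NULL);
  va_start(args, fmt);
  va_copy(copy, args);
  len = vsnprintf(NULL, 0, fmt, copy);   /* first pass: measure only */
  va_end(copy);
  if (len < 0)
    {
      va_end(args);
      return NULL;
    }
  result = malloc((size_t) len + 1);
  if (result != NULL)
    vsnprintf(result, (size_t) len + 1, fmt, args);   /* second pass: write */
  va_end(args);
  return result;
}
```

The two-pass approach trades a second formatting pass for never having to guess a buffer size.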
14967 | |
14968 @heading Changes to Allocation, Objects, and the Lisp Interpreter | |
14969 | |
14970 @itemize | |
14971 @item | |
14972 automatically use ``managed lcrecord'' code when allocating. any | |
14973 lcrecord can be put on a free list with @code{free_lcrecord()}. | |
14974 | |
14975 @item | |
14976 @code{record_unwind_protect()} returns the old spec depth. | |
14977 | |
14978 @item | |
14979 @code{unbind_to()} now takes only one arg. use @code{unbind_to_1()} if | |
14980 you want the 2-arg version, with GC protection of second arg. | |
14981 | |
14982 @item | |
14983 new funs to easily inhibit GC. (@code{@{begin,end@}_gc_forbidden()}) | |
14984 use them in places where gc is currently being inhibited in a more ugly | |
14985 fashion. also, we disable GC in certain strategic places where string | |
14986 data is often passed in, e.g. @samp{dfc} functions, @samp{print} | |
14987 functions. | |
14988 | |
14989 @item | |
14990 @code{make_buffer()} -> @code{wrap_buffer()} for consistency with other | |
14991 objects; same for @code{make_frame()} -> @code{wrap_frame()} and | |
14992 @code{make_console()} -> @code{wrap_console()}. | |
14993 | |
14994 @item | |
14995 better documentation in condition-case. | |
14996 | |
14997 @item | |
14998 new convenience funs @code{record_unwind_protect_freeing()} and | |
14999 @code{record_unwind_protect_freeing_dynarr()} for conveniently setting | |
15000 up an unwind-protect to @code{xfree()} or @code{Dynarr_free()} a | |
15001 pointer. | |
15002 @end itemize | |
15003 | |
15004 @heading s/m files: | |
15005 | |
15006 @itemize | |
15007 @item | |
15008 removal of unused @code{DATA_END}, @code{TEXT_END}, | |
15009 @code{SYSTEM_PURESIZE_EXTRA}, @code{HAVE_ALLOCA} (automatically | |
15010 determined) | |
15011 | |
15012 @item | |
15013 removal of @code{vfork} references (we no longer use @code{vfork}) | |
15014 @end itemize | |
15015 | |
15016 @heading @file{make-docfile}: | |
15017 | |
15018 @itemize | |
15019 @item | |
15020 clean up headers a bit. | |
15021 | |
15022 @item | |
15023 allow @file{.obj} to mean equivalent @file{.c}, just like for @file{.o}. | |
15024 | |
15025 @item | |
15026 allow specification of a ``response file'' (a command-line argument | |
15027 beginning with @@, specifying a file containing further command-line | |
15028 arguments) -- a standard mswin idiom to avoid potential command-line | |
15029 limits and to simplify makefiles. use this in @file{xemacs.mak}. | |
15030 @end itemize | |
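The response-file idiom can be sketched as below: any argument beginning with an at sign names a file whose whitespace-separated words are spliced into the argument list. This is an assumption-laden illustration, not the make-docfile code; all function names are hypothetical, words are limited to 255 characters, and quoting inside the response file is not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc()ed copy of S, or NULL. */
static char *sketch_copy_string(const char *s)
{
  char *copy = malloc(strlen(s) + 1);
  if (copy != NULL)
    strcpy(copy, s);
  return copy;
}

/* Append the whitespace-separated words found in the file PATH to OUT,
   starting at index COUNT.  Returns the new count, or -1 on error. */
static int sketch_read_response_file(const char *path, char **out,
                                     int count, int max)
{
  char word[256];
  FILE *fp = fopen(path, "r");

  if (fp == NULL)
    return -1;
  while (count < max && fscanf(fp, "%255s", word) == 1)
    {
      out[count] = sketch_copy_string(word);
      if (out[count] == NULL)
        {
          fclose(fp);
          return -1;
        }
      count++;
    }
  fclose(fp);
  return count;
}

/* Copy ARGV into OUT, expanding each argument that starts with '@'.
   Returns the number of entries stored, or -1 on error. */
static int sketch_expand_args(int argc, char **argv, char **out, int max)
{
  int i, count = 0;

  for (i = 0; i < argc && count < max; i++)
    {
      if (argv[i][0] == '@')
        {
          count = sketch_read_response_file(argv[i] + 1, out, count, max);
          if (count < 0)
            return -1;
        }
      else
        {
          out[count] = sketch_copy_string(argv[i]);
          if (out[count] == NULL)
            return -1;
          count++;
        }
    }
  return count;
}
```

A makefile can then write a long object list to a file once and pass a single short argument on the command line.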
15031 | |
15032 @heading debug support | |
15033 | |
15034 @itemize | |
15035 @item | |
15036 (@file{cmdloop.el}) new var @code{breakpoint-on-error}, which breaks into the C | |
15037 debugger when an unhandled error occurs noninteractively. useful when | |
15038 debugging errors coming out of complicated make scripts, e.g. package | |
15039 compilation, since you can set this through an env var. | |
15040 | |
15041 @item | |
15042 (@file{startup.el}) new env var @code{XEMACSDEBUG}, specifying a Lisp | |
15043 form executed early in the startup process; meant to be used for turning | |
15044 on debug flags such as @code{breakpoint-on-error} or | |
15045 @code{stack-trace-on-error}, to track down noninteractive errors. | |
15046 | |
15047 @item | |
15048 (@file{cmdloop.el}) removed non-working code in @code{command-error} to | |
15049 display a backtrace on @code{debug-on-error}. use | |
15050 @code{stack-trace-on-error} instead to get this. | |
15051 | |
15052 @item | |
15053 (@file{process.c}) new var @code{debug-process-io} displays data sent to | |
15054 and received from a process. | |
15055 | |
15056 @item | |
15057 (@file{alloc.c}) staticpros have name stored with them for easier | |
15058 debugging. | |
15059 | |
15060 @item | |
15061 (@file{emacs.c}) code that handles fatal errors consolidated and | |
15062 rewritten. much more robust and correctly handles all fatal exits on | |
15063 mswin (e.g. aborts, not previously handled right). | |
15064 @end itemize | |
15065 | |
15066 @heading @file{startup.el} | |
15067 | |
15068 @itemize | |
15069 @item | |
15070 move init routines from @code{before-init-hook} or | |
15071 @code{after-init-hook}; just call them directly | |
15072 (@code{init-menubar-at-startup}, @code{init-mule-at-startup}). | |
15073 | |
15074 @item | |
15075 help message fixed up (divided into sections), existing problem causing | |
15076 incomplete output fixed, undocumented options documented. | |
15077 @end itemize | |
15078 | |
15079 @heading @file{frame.el} | |
15080 | |
15081 @itemize | |
15082 @item | |
15083 delete old commented-out code. | |
15084 @end itemize | |
15085 | |
15086 | |
15087 @node Ben's TODO list, Ben's README, General internal changes, The Great Mule Merge of March 2002 | |
15088 @subsection Ben's TODO list (probably obsolete) | |
15089 | |
15090 These notes substantially overlap those in @ref{Ben's README}. They | |
15091 should probably be combined. | |
15092 | |
15093 @heading April 11, 2002 | |
15094 | |
15095 Priority: | |
15096 | |
15097 @enumerate | |
15098 @item | |
15099 Finish checking in current mule ws. | |
15100 | |
15101 @item | |
15102 Start working on bugs reported by others and noticed by me: | |
15103 | |
15104 @itemize | |
15105 @item | |
15106 problems cutting and pasting binary data, e.g. from byte-compiler | |
15107 instructions | |
15108 | |
15109 @item | |
15110 test suite failures | |
15111 | |
15112 @item | |
15113 process i/o problems w.r.t. eol: |uniq (e.g.) leaves ^M's at end of | |
15114 line; running "bash" as shell-file-name doesn't work because it doesn't | |
15115 like the extra ^M's. | |
15116 @end itemize | |
15117 @end enumerate | |
15118 | |
15119 @heading March 20, 2002 | |
15120 | |
15121 bugs: | |
15122 | |
15123 @itemize | |
15124 @item | |
15125 TTY-mode problem. When you start up in TTY mode, XEmacs goes through | |
15126 the loadup process and appears to be working -- you see the startup | |
15127 screen pulsing through the different screens, and it appears to be | |
15128 listening (hitting a key stops the screen motion), but it's frozen -- | |
15129 the screen won't get off the startup, key commands don't cause anything | |
15130 to happen. STATUS: In progress. | |
15131 | |
15132 @item | |
15133 Memory ballooning in some cases. Not yet understood. | |
15134 | |
15135 @item | |
15136 other test suite failures? | |
15137 | |
15138 @item | |
15139 need to review the handling of sounds. seems that not everything is | |
15140 documented, not everything is consistently used where it's supposed to be, | |
15141 some sounds are ugly, etc. add sounds to `completer' as well. | |
15142 | |
15143 @item | |
15144 redo with-trapping-errors so that the backtrace is stored away and only | |
15145 outputted when an error actually occurs (i.e. in the condition-case | |
15146 handler). test. (use ding of various sorts as a helpful way of checking | |
15147 out what's going on.) | |
15148 | |
15149 @item | |
15150 problems with process input: |uniq (for example) leaves ^M's at end of | |
15151 line. | |
15152 | |
15153 @item | |
15154 carefully review looking up of fonts by charset, esp. wrt the last | |
15155 element of a font spec. | |
15156 | |
15157 @item | |
15158 add package support to ignore certain files -- *-util.el for languages. | |
15159 | |
15160 @item | |
15161 review use of escape-quoted in auto_save_1() vs. the buffer's own coding | |
15162 system. | |
15163 | |
15164 @item | |
15165 figure out how to get the total amount of data memory (i.e. everything | |
15166 but the code, or even including the code if can't distinguish) used by | |
15167 the process on each different OS, and use it in a new algorithm for | |
15168 triggering GC: trigger only when a certain % of the data size has been | |
15169 consed up; in addition, have a minimum. | |
15170 | |
15171 @item | |
15172 fixed bugs??? | |
15173 | |
15174 @itemize | |
15175 @item | |
15176 Occasional crash when freeing display structures. The problem seems to | |
15177 be this: A window has a "display line dynarr"; each display line has a | |
15178 "display block dynarr". Sometimes this display block dynarr is getting | |
15179 freed twice. It appears from looking at the code that sometimes a | |
15180 display line from somewhere in the dynarr gets added to the end -- hence | |
15181 two pointers to the same display block dynarr. need to review this | |
15182 code. | |
15183 @end itemize | |
15184 @end itemize | |
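The GC-triggering idea in the list above (collect only once a certain percentage of the process's data size has been consed, subject to a minimum) reduces to a small predicate. The sketch below is hypothetical throughout: the function name, its parameters, and the assumption that the caller can already obtain the total data size are not taken from the XEmacs sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical predicate: nonzero when enough has been consed since
   the last GC.  The threshold is PERCENT of TOTAL_DATA_SIZE, but never
   less than MINIMUM bytes. */
static int sketch_should_gc(size_t consed_since_gc, size_t total_data_size,
                            unsigned percent, size_t minimum)
{
  size_t threshold;

  assert(percent <= 100);
  threshold = total_data_size / 100 * percent;   /* divide first to avoid overflow */
  if (threshold < minimum)
    threshold = minimum;
  return consed_since_gc >= threshold;
}
```

The minimum keeps a small process from collecting constantly, while the percentage lets a large one cons proportionally more between collections.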
15185 | |
15186 @heading August 29, 2001 | |
15187 | |
15188 This is the most current list of priorities in `ben-mule-21-5'. | |
15189 Updated often. | |
15190 | |
15191 high-priority: | |
15192 | |
15193 @table @strong | |
15194 | |
15195 @item [input] | |
15196 | |
15197 @itemize | |
15198 @item | |
15199 support for WM_IME_CHAR. IME input can work under -nuni if we use | |
15200 WM_IME_CHAR. probably we should always be using this, instead of | |
15201 snarfing input using WM_COMPOSITION. i'll check this out. | |
15202 | |
15203 @item | |
15204 Russian C-x problem. see above. | |
15205 @end itemize | |
15206 | |
15207 @item [clean-up] | |
15208 | |
15209 @itemize | |
15210 @item | |
15211 make sure it compiles and runs under non-mule. remember that some | |
15212 code needs the unicode support, or at least a simple version of it. | |
15213 | |
15214 @item | |
15215 make sure it compiles and runs under pdump. see below. | |
15216 | |
15217 @item | |
15218 make sure it compiles and runs under cygwin. see below. | |
15219 | |
15220 @item | |
15221 clean up mswindows-multibyte, TSTR_TO_C_STRING. expand dfc | |
15222 optimizations to work across chain. | |
15223 | |
15224 @item | |
15225 eliminate last vestiges of codepage<->charset conversion and similar | |
15226 stuff. | |
15227 @end itemize | |
15228 | |
15229 @item [other] | |
15230 | |
15231 @itemize | |
15232 @item | |
15233 test the "file-coding is binary only on Unix, no-Mule" stuff. | |
15234 | |
15235 @item | |
15236 test that things work correctly in -nuni if the system environment | |
15237 is set to e.g. japanese -- i should get japanese menus, japanese | |
15238 file names, etc. same for russian, hebrew ... | |
15239 | |
15240 @item | |
15241 cut and paste. see below. | |
15242 | |
15243 @item | |
15244 misc issues with handling lang environments. see also August 25, | |
15245 "finally: working on the @kbd{C-x} in ...". | |
15246 | |
15247 @itemize | |
15248 @item | |
15249 when switching lang env, needs to set keyboard layout. | |
15250 | |
15251 @item | |
15252 user var to control whether, when moving into text of a | |
15253 particular language, we set the appropriate keyboard layout. we | |
15254 would need to have a lisp api for retrieving and setting the | |
15255 keyboard layout, set text properties to indicate the layout of | |
15256 text, and have a way of dealing with text with no property on | |
15257 it. (e.g. saved text has no text properties on it.) basically, | |
15258 we need to get a keyboard layout from a charset; getting a | |
15259 language would do. Perhaps we need a table that maps charsets | |
15260 to language environments. | |
15261 | |
15262 @item | |
15263 test that the lang env is properly set at startup. test that | |
15264 switching the lang env properly sets the C locale (call | |
15265 @code{setlocale()}, set @code{LANG}, etc.) -- a spawned subprogram | |
15266 should have the new locale in its environment. | |
15267 @end itemize | |
15268 | |
15269 @item | |
15270 look through everything below and see if anything is missed in this | |
15271 priority list, and if so add it. create a separate file for the | |
15272 priority list, so it can be updated as appropriate. | |
15273 @end itemize | |
15274 @end table | |
15275 | |
15276 mid-priority: | |
15277 | |
15278 @itemize | |
15279 @item | |
15280 clean up the chain coding system. its list should specify decode | |
15281 order, not encode; i now think this way is more logical. it should | |
15282 check the endpoints to make sure they make sense. it should also | |
15283 allow for the specification of "reverse-direction coding systems": | |
15284 use the specified coding system, but invert the sense of decode and | |
15285 encode. | |
15286 | |
15287 @item | |
15288 along with that, places that take an arbitrary coding system and | |
15289 expect the ends to be anything specific need to check this, and add | |
15290 the appropriate conversions from byte->char or char->byte. | |
15291 | |
15292 @item | |
15293 get some support for arabic, thai, vietnamese, japanese jisx 0212: | |
15294 at least get the unicode information in place and make sure we have | |
15295 things tied together so that we can display them. worry about r2l | |
15296 some other time. | |
15297 | |
15298 @item | |
15299 check the handling of @kbd{C-c}. can XEmacs itself be interrupted with | |
15300 @kbd{C-c}? is that impossible now that we are a window, not a console, | |
15301 app? at least we should work something out with @file{i} so that if it | |
15302 receives a @kbd{C-c} or @kbd{C-break}, it interrupts XEmacs, too. check | |
15303 out how process groups work and if they apply only to console apps. | |
15304 also redo the way that XEmacs sends @kbd{C-c} to other apps. the | |
15305 business of injecting code should be last resort. we should try | |
15306 @kbd{C-c} first, and if that doesn't work, then the next time we try to | |
15307 interrupt the same process, use the injection method. | |
15308 @end itemize | |
15309 | |
15310 @node Ben's README, , Ben's TODO list, The Great Mule Merge of March 2002 | |
15311 @subsection Ben's README (probably obsolete) | |
15312 | |
15313 These notes substantially overlap those in @ref{Ben's TODO list}. They | |
15314 should probably be combined. | |
15315 | |
15316 This may be of some historical interest as a record of Ben at work. | |
15317 There may also be some useful suggestions as yet unimplemented. | |
15318 | |
15319 @heading oct 27, 2001 | |
15320 | |
15321 -------- proposal for better buffer-switching commands: | |
15322 | |
15323 implement what VC++ currently has. you have a single "switch" command | |
15324 like @kbd{CTRL-TAB}, which as long as you hold the @key{CTRL} button | |
15325 down, brings successive buffers that are "next in line" into the current | |
15326 position, bumping the rest forward. once you release the @key{CTRL} | |
15327 key, the chain is broken, and further @kbd{CTRL-TAB}s will start from | |
15328 the beginning again. this way, frequently used buffers naturally move | |
15329 toward the front of the chain, and you can switch back and forth between | |
15330 two buffers using @kbd{CTRL-TAB}. the only thing about @kbd{CTRL-TAB} | |
15331 is it's a bit awkward. the way to implement is to have modifier-up | |
15332 strokes fire off a hook, like modifier-up-hook. this is driven by event | |
15333 dispatch, so there are no synchronization issues. when @kbd{C-tab} is | |
15334 pressed, the binding function does something like set a one-shot handler | |
15335 on the modifier-up-hook (perhaps separate hooks for separate | |
15336 modifiers?). | |
15337 | |
15338 to do this, we'd also want to change the buffer tabs so that they maintain | |
15339 their own order. in particular, they start out synched to the regular | |
15340 order, but as you make changes, you don't want the tabs to change | |
15341 order. (in fact, they may already do this.) selecting a particular buffer | |
15342 from the buffer tabs DOES make the buffer go to the head of the line. the | |
15343 invariant is that if the tabs are displaying X items, those X items are the | |
15344 first X items in the standard buffer list, but may be in a different | |
15345 order. (it looks like the tabs may already implement all of this.) | |
15346 | |
15347 @heading oct 26, 2001 | |
15348 | |
15349 necessary testing/changes: | |
15350 | |
15351 @itemize | |
15352 @item | |
15353 test all eol detection stuff under windows w/ and w/o mule, unix w/ and | |
15354 w/o mule. (test configure flag, command-line flag, menu option) may need | |
15355 a way of pretending to be unix under cygwin. | |
15356 | |
15357 @item | |
15358 test under windows w/ and w/o mule, cygwin w/ and w/o mule, cygwin x | |
15359 windows w/ and w/o mule. | |
15360 | |
15361 @item | |
15362 test undecided-dos/unix/mac. | |
15363 | |
15364 @item | |
15365 check @kbd{ESC ESC} works as @code{isearch-quit} under TTY's. | |
15366 | |
15367 @item | |
15368 test @code{coding-system-base} and all its uses (grep for them). | |
15369 | |
15370 @item | |
15371 menu item to revert to most recent auto save. | |
15372 | |
15373 @item | |
15374 consider renaming @code{build_string} -> @code{build_intstring} and | |
15375 @code{build_c_string} to @code{build_string}. (consistent with | |
15376 @code{build_msg_string} et al; many more @code{build_c_string} than | |
15377 @code{build_string}) | |
15378 @end itemize | |
15379 | |
15380 @heading oct 20, 2001 | |
15381 | |
15382 fixed problem causing crash due to invalid internal-format data, fixed | |
15383 an existing bug in @code{valid_char_p}, and added checks to more quickly | |
15384 catch when invalid chars are generated. still need to investigate why | |
15385 @code{mswindows-multibyte} is being detected. | |
15386 | |
15387 i now see why -- we only process 65536 bytes due to a constant | |
15388 @code{MAX_BYTES_PROCESSED_FOR_DETECTION}. instead, we should have no | |
15389 limit as long as we have a seekable stream. we also need to write | |
15390 @code{stderr_out_lisp()}, used in the debug info routines i wrote. | |
15391 | |
15392 check once more about @code{DEBUG_XEMACS}. i think debugging info | |
15393 should be ON by default. make sure it is. check that nothing untoward | |
15394 will result in a production system, e.g. presumably @code{assert()}s | |
15395 should not really @code{abort()}. (!! Actually, this should be runtime | |
15396 settable! Use a variable for this, and it can be set using the same | |
15397 @code{XEMACSDEBUG} method. In fact, now that I think of it, I'm sure | |
15398 that debugging info should be on always, with runtime ways of turning on | |
15399 or off any funny behavior.) | |
15400 | |
15401 @heading oct 19, 2001 | |
15402 | |
15403 fixed various bugs preventing packages from being able to be built. | |
15404 still another bug, with @file{psgml/etc/cdtd/docbook}, which contains | |
15405 some strange characters starting around char pos 110,000. It gets | |
15406 detected as @code{mswindows-multibyte} (wrong! why?) and then invalid | |
15407 internal-format data is generated. need to fix | |
15408 @code{mswindows-multibyte} (and possibly add something that signals an | |
15409 error as well; need to work on this error-signalling mechanism) and | |
15410 figure out why it's getting detected as such. what i should do is add a | |
15411 debug var that outputs blow-by-blow info of the detection process. | |
15412 | |
15413 @heading oct 9, 2001 | |
15414 | |
15415 the stuff with @code{global-window-system-map} doesn't appear to work. in any | |
15416 case it needs better documentation. [DONE] | |
15417 | |
15418 @kbd{M-home}, @kbd{M-end} do work, but cause cl-macs to get loaded. why? | |
15419 | |
15420 @heading oct 8, 2001 | |
15421 | |
15422 finished the coding system changes and they finally work! | |
15423 | |
15424 need to implement undecided-unix/dos/mac. they should be easy to do; it | |
15425 should be enough to specify an eol-type but not do-eol, but check this. | |
15426 | |
15427 consider making the standard naming be foo-lf/crlf/cr, with unix/dos/mac as | |
15428 aliases. | |
15429 | |
15430 print methods for coding systems should include some of the generic | |
15431 properties. (also then fix print_..._within_print_method). [DONE] | |
15432 | |
15433 in a little while, go back and delete the | |
15434 @code{text-file-wrapper-coding-system} code. (it'll be in CVS if | |
15435 necessary to get at it.) [DONE] | |
15436 | |
15437 need to verify at some point that non-text-file coding systems work | |
15438 properly when specified. when gzip is working, this would be a good test | |
15439 case. (and consider creating base64 as well!) | |
15440 | |
15441 remove extra crap from @code{coding-system-category} that checks for | |
15442 chain coding systems. [DONE] | |
15443 | |
15444 perhaps make a primitive that gets at | |
15445 @code{coding-system-canonical}. [DONE] | |
15446 | |
15447 need to test cygwin, compiling the mule packages, get unix-eol stuff | |
15448 working. frank from germany says he doesn't see a lisp backtrace when he | |
15449 gets an error during temacs? verify that this actually gets outputted. | |
15450 | |
15451 consider putting the current language on the modeline, mousable so it can | |
15452 be switched. also consider making the coding system be mousable and the | |
15453 line number (pick a line) and the percentage (pick a percentage). | |
15454 | |
15455 @heading oct 6, 2001 | |
15456 | |
15457 added code so that @code{debug_print()} will output a newline to the | |
15458 mswindows debugging output, not just the console. need to test. [DONE] | |
15459 | |
15460 working on problem where all files are being detected as binary. the | |
15461 problem may be that the undecided coding system is getting wrapped with | |
15462 an auto-eol coding system, which it shouldn't be -- but even in this | |
15463 situation, we should get the right results! check the | |
15464 canonicalize-after-coding methods. also, | |
15465 @code{determine_real_coding_system} appears to be getting called even | |
15466 when we're not detecting encoding. also, undecided needs a print method | |
15467 to show its params, and chain needs to be updated to show | |
15468 @code{canonicalize_after_coding}. check others as well. [DONE] | |
15469 | |
15470 @heading oct 5, 2001 | |
15471 | |
15472 finished up coding system changes, testing. | |
15473 | |
15474 errors byte-compiling files in @code{iso-2022-7-bit}. perhaps it's not | |
15475 correctly detecting the encoding? | |
15476 | |
15477 noticed a problem in the dfc macros: we call | |
15478 @code{get_coding_system_for_text_file} with @code{eol_wrap == 1}, to | |
15479 allow for auto-detection of the eol type; but this defeats the check and | |
15480 short-circuit for unicode. | |
15481 | |
15482 still need to implement calling @code{determine_real_coding_system()} | |
15483 for non-seekable streams. to implement correctly, we need to do our own | |
15484 buffering. [DONE, BUT WITHOUT BUFFERING] | |
15485 | |
15486 @heading oct 4, 2001 | |
15487 | |
15488 implemented most stuff below. | |
15489 | |
15490 need to finish up changes to @code{make_coding_system_1}. (i changed the | |
15491 way internal coding systems were handled; i need to create subsidiaries | |
15492 for all types of coding systems, not just text ones.) there's a nasty | |
15493 @code{xfree()} crash i was hitting; perhaps it'll go away once all stuff | |
15494 has been rewritten. | |
15495 | |
15496 check under cygwin to make sure that when an error occurs during loadup, a | |
15497 backtrace is output. | |
15498 | |
15499 as soon as andy releases his new setup, we should put it onto various | |
15500 standard windows software repositories. | |
15501 | |
15502 @heading oct 3, 2001 | |
15503 | |
15504 added @code{global-tty-map} and @code{global-window-system-map}. add | |
15505 some stuff to the maps, e.g. @kbd{C-x ESC} for repeat vs. @kbd{C-x ESC | |
15506 ESC} on TTY's, and of course @kbd{ESC ESC} on window systems | |
15507 vs. @kbd{ESC ESC ESC} on TTY's. [TEST] | |
15508 | |
15509 was working on integrating the two @code{help-for-tutorial} versions (mule, | |
15510 non-mule). [DONE, but test under non-Mule] | |
15511 | |
15512 was working on the file-coding changes. need to think more about | |
15513 @code{text-file-wrapper}. conclusion i think is that | |
15514 @code{get_coding_system_for_text_file} should wrap using a special | |
15515 coding system type called a @code{text-file-wrapper}, which inherits | |
15516 from chain, and implements @code{canonicalize-after-decoding} to just | |
15517 return the unwrapped coding system. We need to implement inheritance of | |
15518 coding systems, which will certainly come in extremely useful when | |
15519 coding systems get implemented in Lisp, which should happen at some | |
15520 point. (see existing docs about this.) essentially, we have a way of | |
15521 declaring that we inherit from some system, and the appropriate data | |
15522 structures get created, perhaps just an extra inheritance pointer. but | |
15523 when we create the coding system, the extra data needs to be a stretchy | |
15524 array of offsets, pointing to the type-specific data for the coding | |
15525 system type and all its parents. that means that in the methods | |
15526 structure for a coding system (which perhaps should be expanded beyond | |
15527 method, it's just a "class structure") is the index in these arrays of | |
15528 offsets. @code{CODING_SYSTEM_DATA()} can take any of the coding system | |
15529 classes (rename type to class!) that make up this class. similarly, a | |
15530 coding system class inherits its methods from the class above unless | |
15531 specifying its own method, and can call the superclass method at any | |
15532 point by either just invoking its name, or conceivably by some macro | |
15533 like | |
15534 | |
15535 @samp{CALL_SUPER (method, (args))} | |
15536 | |
15537 similar mods would have to be made to coding stream structures. | |
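
A minimal sketch of the inheritance idea above, in C. Every name here (`cs_class`, `make_coding_system`, `find_canonicalize`, `CALL_SUPER_CANONICALIZE`, the fixed `MAX_DEPTH`) is made up for illustration and is not the existing XEmacs layout; a fixed-size array stands in for the stretchy array of offsets, and `CODING_SYSTEM_DATA` is only modeled on the macro named in the notes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not the real XEmacs layout. */
#define MAX_DEPTH 4
#define ALIGN_UP(n) (((n) + (size_t) 15) & ~(size_t) 15)

typedef const char *(*canon_fn) (void);

struct cs_class
{
  const char *name;
  const struct cs_class *super;  /* inheritance pointer; NULL at the root */
  int depth;                     /* this class's index in the offset array */
  size_t data_size;              /* size of its type-specific data */
  canon_fn canonicalize;         /* NULL means "inherit from super" */
};

struct coding_system
{
  const struct cs_class *cls;
  size_t offsets[MAX_DEPTH];     /* one entry per class in the chain */
  char *data;                    /* one block holding every class's data */
};

/* Works for the object's own class or any of its parents. */
#define CODING_SYSTEM_DATA(cs, klass) \
  ((void *) ((cs)->data + (cs)->offsets[(klass)->depth]))

static struct coding_system *
make_coding_system (const struct cs_class *cls)
{
  struct coding_system *cs = calloc (1, sizeof (*cs));
  const struct cs_class *c;
  size_t total = 0;

  assert (cs != NULL && cls->depth < MAX_DEPTH);
  cs->cls = cls;
  /* Walk from the leaf class to the root, reserving a slot for each. */
  for (c = cls; c != NULL; c = c->super)
    {
      cs->offsets[c->depth] = total;
      total += ALIGN_UP (c->data_size);
    }
  cs->data = calloc (1, total ? total : 1);
  assert (cs->data != NULL);
  return cs;
}

/* Method lookup: use the nearest class that defines the method. */
static canon_fn
find_canonicalize (const struct cs_class *cls)
{
  for (; cls != NULL; cls = cls->super)
    if (cls->canonicalize != NULL)
      return cls->canonicalize;
  return NULL;
}

/* Rough stand-in for CALL_SUPER: start the lookup one level up. */
#define CALL_SUPER_CANONICALIZE(klass) (find_canonicalize ((klass)->super) ())

static const char *chain_canon (void) { return "chain"; }
static const char *wrapper_canon (void) { return "unwrapped"; }

static const struct cs_class chain_class =
  { "chain", NULL, 0, sizeof (int), chain_canon };
static const struct cs_class wrapper_class =
  { "text-file-wrapper", &chain_class, 1, sizeof (double), wrapper_canon };
```

With this shape a `text-file-wrapper` object can reach its own data and the data of `chain` through the same macro, and a class that leaves a method slot empty picks up its parent's method.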
15538 | |
15539 perhaps for the immediate we can just sort of fake things like we currently | |
15540 do with undecided calling some stuff from chain. | |
15541 | |
15542 @heading oct 2, 2001 | |
15543 | |
15544 need to implement support for iso-8859-15, i.e. iso-8859-1 + euro symbol. | |
15545 figure out how to fall back to iso-8859-1 as necessary. | |
15546 | |
15547 leave the current bindings the way they are for the moment, but bump off | |
15548 @kbd{M-home} and @kbd{M-end} (hardly used), and substitute my buffer | |
15549 movement stuff there. [DONE, but test] | |
15550 | |
15551 there's something to be said for combining block of 6 and paragraph, | |
15552 esp. if we make the definition of "paragraph" be so that it skips by 6 when | |
15553 within code. hmm. | |
15554 | |
15555 eliminate @code{advertised-undo} crap, and similar hacks. [DONE] | |
15556 | |
15557 think about obsolete stuff to be eliminated. think about eliminating or | |
15558 dimming obsolete items from @code{hyper-apropos} and something similar | |
15559 in completion buffers. | |
15560 | |
15561 @heading sep 30, 2001 | |
15562 | |
15563 synched up the tutorials with FSF 21.0.105. was rewriting them to favor | |
15564 the cursor keys over the older @kbd{C-p}, etc. keys. | |
15565 | |
15566 Got thinking about key bindings again. | |
15567 | |
15568 @enumerate | |
15569 @item | |
15570 I think that @kbd{M-up/down} and @kbd{M-C-up/down} should be reversed. I use | |
15571 scroll-up/down much more often than motion by paragraph. | |
15572 | |
15573 @item | |
15574 Should we eliminate move by block (of 6) and substitute it for paragraph? | |
15575 This would have the advantage that I could make bindings for buffer | |
15576 change (forward/back buffer, perhaps @kbd{M-C-up/down}). with shift, | |
15577 @kbd{M-C-S-up/down} only goes within the same type (C files, etc.). | |
15578 alternatively, just bump off @code{beginning-of-defun} from | |
15579 @kbd{C-M-home}, since it's on @kbd{C-M-a} already. | |
15580 @end enumerate | |
15581 | |
15582 need someone to go over the other tutorials (five new ones, from FSF | |
15583 21.0.105) and fix them up to correspond to the english one. | |
15584 | |
15585 shouldn't shift-motion work with @kbd{C-a} and such as well as arrows? | |
15586 | |
15587 @heading sep 29, 2001 | |
15588 | |
15589 @code{charcount_to_bytecount} can also be made to scream -- as can | |
15590 @code{scan_buffer}, @code{buffer_mule_signal_inserted_region}, others? | |
15591 we should start profiling though before going too far down this line. | |
15592 | |
15593 Debug code that causes no slowdown should in general remain in the | |
15594 executable even in the release version because it may be useful | |
15595 (e.g. for people to see the event output). so @code{DEBUG_XEMACS} | |
15596 should be rethought. things like use of @file{msvcrtd.dll} should be | |
15597 controlled by error_checking on. maybe @code{DEBUG_XEMACS} controls | |
15598 general debug code (e.g. use of @file{msvcrtd.dll}, asserts abort, error | |
15599 checking), and the actual debugging code should remain always, or be | |
15600 conditionalized on something else (e.g. @samp{DEBUGGING_FUNS_PRESENT}). | |
15601 | |
15602 doc strings in dumped files are displayed with an extra blank line between | |
15603 each line. presumably this is recent? i assume either the change to | |
15604 detect-coding-region or the double-wrapping mentioned below. | |
15605 | |
15606 error with @code{coding-system-property} on @code{iso-2022-jp-dos}. | |
15607 problem is that that coding system is wrapped, so its type shows up as | |
15608 @code{chain}, not @code{iso-2022}. this is a general problem, and i | |
15609 think the way to fix it is to in essence do late canonicalization -- | |
15610 similar in spirit to what was done long ago, | |
15611 @code{canonicalize_when_code}, except that the new coding system (the | |
15612 wrapper) is created only once, either when the original cs is created or | |
15613 when first needed. this way, operations on the coding system work like | |
15614 expected, and you get the same results as currently when | |
15615 decoding/encoding. the only thing tricky is handling | |
15616 @code{canonicalize-after-coding} and the ever-tricky double-wrapping | |
15617 problem mentioned below. i think the proper solution is to move the | |
15618 autodetection of eol into the main autodetect type. it can be asked to | |
15619 autodetect eol, coding, or both. for just coding, it does like it | |
15620 currently does. for just eol, it does similar to what it currently does | |
15621 but runs the detection code that @code{convert-eol} currently does, and | |
15622 selects the appropriate @code{convert-eol} system. when it does both | |
15623 eol and coding, it does something on the order of creating two more | |
15624 autodetect coding systems, one for eol only and one for coding only, and | |
15625 chains them together. when each has detected the appropriate value, the | |
15626 results are combined. this automatically eliminates the double-wrapping | |
15627 problem, removes the need for complicated | |
15628 @code{canonicalize-after-coding} stuff in chain, and fixes the problem | |
15629 of autodetect not having a seekable stream because hidden inside of a | |
15630 chain. (we presume that in the both-eol-and-coding case, the various | |
15631 autodetect coding streams can communicate with each other | |
15632 appropriately.) | |
15633 | |
15634 also, we should solve the problem of internal coding systems floating | |
15635 around and clogging up the list simply by having an "internal" property | |
15636 on cs's and an internal param to @code{coding-system-list} (optional; if | |
15637 not given, you don't get the internal ones). [DONE] | |
15638 | |
15639 we should try to reduce the size of the from-unicode tables (the dominant | |
15640 memory hog in the tables). one obvious thing is to not store a whole | |
15641 emchar as the mapped-to value, but a short that encodes the octets. [DONE] | |
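
One way the "short that encodes the octets" could look, as a sketch only. The helper names are hypothetical, and it assumes the charset itself is known from elsewhere (for example one table per charset), so that only the two position octets need storing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: store two position octets in 16 bits instead of
   a full character value.  Assumes the charset is implied by the table. */
static uint16_t
pack_octets (unsigned o1, unsigned o2)
{
  assert (o1 <= 0xFF && o2 <= 0xFF);
  return (uint16_t) ((o1 << 8) | o2);
}

static unsigned
unpack_octet1 (uint16_t v)
{
  return (unsigned) (v >> 8);
}

static unsigned
unpack_octet2 (uint16_t v)
{
  return (unsigned) (v & 0xFF);
}
```

Each table entry then takes two bytes rather than the four of a 32-bit character value.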
15642 | |
15643 @heading sep 28, 2001 | |
15644 | |
15645 need to merge up to latest in trunk. | |
15646 | |
15647 add unicode charsets for all non-translatable unicode chars; probably | |
15648 want to extend the concept of charsets to allow for dimension 3 and | |
15649 dimension 4 charsets. for the moment we should stick with just | |
15650 dimension 3 charsets; otherwise we run past the current maximum of 4 | |
15651 bytes per emchar. (most code would work automatically since it | |
15652 uses @code{MAX_EMCHAR_LEN}; the trickiness is in certain code that has | |
15653 intimate knowledge of the representation. | |
15654 e.g. @code{bufpos_to_bytind()} has to multiply or divide by 1, 2, 3, or | |
15655 4, and has special ways of handling each number. with 5 or 6 bytes per | |
15656 char, we'd have to change that code in various ways.) 96x96x96 = 884,000 | |
15657 or so, so with two 96x96x96 charsets, we could tackle all Unicode values | |
15658 representable by UTF-16 and then some -- and only these codepoints will | |
15659 ever have assigned chars, as far as we know. | |
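
The arithmetic above can be checked with a small sketch. The linear mapping below is purely an illustrative assumption (the notes only say that two 96x96x96 charsets have enough room), as are the names `pos3`, `unicode_to_pos3` and `pos3_to_unicode` and the choice of 32..127 as the octet range of a 96-position set.

```c
#include <assert.h>

#define CELLS 96                              /* positions per octet */
#define PER_CHARSET (CELLS * CELLS * CELLS)   /* 884,736 */
#define UNICODE_LIMIT 0x110000L               /* 1,114,112 code points */

/* Hypothetical position in one of two dimension-3 charsets. */
struct pos3 { int charset; int o1, o2, o3; };

static struct pos3
unicode_to_pos3 (long cp)
{
  struct pos3 p;
  long r;

  assert (cp >= 0 && cp < UNICODE_LIMIT);
  p.charset = (int) (cp / PER_CHARSET);       /* 0 or 1 */
  r = cp % PER_CHARSET;
  p.o1 = (int) (r / (CELLS * CELLS)) + 32;    /* octets assumed in 32..127 */
  p.o2 = (int) (r / CELLS % CELLS) + 32;
  p.o3 = (int) (r % CELLS) + 32;
  return p;
}

static long
pos3_to_unicode (struct pos3 p)
{
  return (long) p.charset * PER_CHARSET
    + (long) (p.o1 - 32) * CELLS * CELLS
    + (long) (p.o2 - 32) * CELLS
    + (p.o3 - 32);
}
```

Two such charsets give 1,769,472 slots, which is more than the 1,114,112 code points reachable through UTF-16.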
15660 | |
15661 need an easy way of showing the current language environment. some menus | |
15662 need to have the current one checked or whatever. [DONE] | |
15663 | |
15664 implement unicode surrogates. | |
15665 | |
15666 implement @code{buffer-file-coding-system-when-loaded} -- make sure | |
15667 @code{find-file}, @code{revert-file}, etc. set the coding system [DONE] | |
15668 | |
15669 verify all the menu stuff [DONE] | |
15670 | |
15671 implemented the entirely-ascii check in buffers. not sure how much gain | |
15672 it'll get us as we already have a known range inside of which is | |
15673 constant time, and with pure-ascii files the known range spans the whole | |
15674 buffer. improved the comment about how @code{bufpos-to-bytind} and | |
15675 vice-versa work. [DONE] | |
15676 | |
15677 fix double-wrapping of @code{convert-eol}: when undecided converts | |
15678 itself to something with a non-autodetect eol, it needs to tell the | |
15679 adjacent @code{convert-eol} to reduce itself to nothing. | |
15680 | |
15681 need menu item for find file with specified encoding. [DONE] | |
15682 | |
15683 renamed coding systems mswindows-### to windows-### to follow the standard | |
15684 in rfc1345. [DONE] | |
15685 | |
15686 implemented @code{coding-system-subsidiary-parent} [DONE] | |
15687 @code{HAVE_MULE} -> @code{MULE} in files in @file{nt/} so that depend | |
15688 checking works [DONE] | |
15689 | |
15690 need to take the smarter @code{search-all-files-in-dir} stuff from my | |
15691 sample init file and put it on the grep menu [DONE] | |
15692 | |
15693 added item for revert w/specified encoding; mostly works, but needs | |
15694 fixes. in particular, you get the correct results, but | |
15695 @code{buffer-file-coding-system} does not reflect things right. also, | |
15696 there are too many entries. need to split into submenus. there is | |
15697 already split code out there; see if it's generalized and if not make it | |
15698 so. it should only split when there's more than a specified number, and | |
15699 when splitting, split into groups of a specified size, not into a | |
15700 specified number of groups. [DONE] | |
15701 | |
15702 too many entries in the langenv menus; need to split. [DONE] | |
15703 | |
15704 @heading sep 27, 2001 | |
15705 | |
15706 NOTE: @kbd{M-x grep} for make-string causes crash now. something | |
15707 definitely to do with string changes. check very carefully the diffs | |
15708 and put in those sledgehammer checks. [DONE] | |
15709 | |
15710 fix font-lock bug i introduced. [DONE] | |
15711 | |
15712 added optimization to strings (keeps track of # of bytes of ascii at the | |
15713 beginning of a string). perhaps should also keep an all-ascii flag to deal | |
15714 with really large (> 2 MB) strings. rewrite code to count ascii-begin to | |
15715 use the 4-or-8-at-a-time stuff in @code{bytecount_to_charcount}. | |
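
A sketch of counting the leading ASCII bytes eight at a time. The function name `count_ascii_begin` is hypothetical, and this is not the code in `bytecount_to_charcount`; it only shows the word-at-a-time test for high bits that the note alludes to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: number of ASCII bytes at the start of S. */
static size_t
count_ascii_begin (const unsigned char *s, size_t len)
{
  const uint64_t high_bits = UINT64_C (0x8080808080808080);
  size_t i = 0;

  assert (s != NULL || len == 0);
  /* Skip 8 bytes per step while none of them has its high bit set. */
  while (i + 8 <= len)
    {
      uint64_t w;
      memcpy (&w, s + i, 8);   /* memcpy sidesteps alignment problems */
      if (w & high_bits)
        break;
      i += 8;
    }
  /* Finish (or locate the exact non-ASCII byte) one byte at a time. */
  while (i < len && s[i] < 0x80)
    i++;
  return i;
}
```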
15716 | |
15717 Error: @kbd{M-q} is causing Invalid Regexp error on the above paragraph. | |
15718 It's not working. I assume it's a side effect of the string stuff. | |
15719 VERIFY! Write sledgehammer checks for strings. [DONE] | |
15720 | |
15721 revamped the locale/init stuff so that it tries much harder to get things | |
15722 right. should test a bit more. in particular, test out Describe Language | |
15723 on the various created environments and make sure everything looks right. | |
15724 | |
15725 should change the menus: move the submenus on @samp{Edit->Mule} directly | |
15726 under @samp{Edit}. add a menu entry on @samp{File} to say "Reload with | |
15727 specified encoding ->". [DONE] | |
15728 | |
15729 Also @samp{Find File} with specified encoding -> Also entry to change | |
15730 the EOL settings for Unix, and implement it. | |
15731 | |
15732 @code{decode-coding-region} isn't working because it needs to insert a | |
15733 binary (char->byte) converter. [DONE] | |
15734 | |
15735 chain should be rearranged to be in decoding order; similar for | |
15736 source/sink-type, other things? | |
15737 | |
15738 the detector should check for a magic cookie even without a seekable input. | |
15739 (currently its input is not seekable, because it's hidden within a chain. | |
15740 #### See what we can do about this.) | |
15741 | |
15742 provide a way to display various settings, e.g. the current category | |
15743 mappings and priority (see mule-diag; get this working so it's in the | |
15744 path); also a way to print out the likeliness results from a detection, | |
15745 perhaps a debug flag. | |
15746 | |
15747 problem with `env', which causes path issues due to `env' in packages. | |
15748 move env code to process, sync with fsf 21.0.105, check that the autoloads | |
15749 in `env' don't cause problems. [DONE] | |
15750 | |
15751 8-bit iso2022 detection appears broken; or at least, mule-canna.c is not so | |
15752 detected. | |
15753 | |
15754 @heading sep 25, 2001 | |
15755 | |
15756 something else to do is review the font selection and fix it so that (e.g.) | |
15757 JISX-0212 can be displayed. | |
15758 | |
15759 also, text in widgets needs to be drawn by us so that the correct fonts | |
15760 will be displayed even in multi-lingual text. | |
15761 | |
15762 @heading sep 24, 2001 | |
15763 | |
15764 the detection system is now properly abstracted. the detectors have been | |
15765 rewritten to include multiple levels of abstraction. now we just need | |
15766 detectors for ascii, binary, and latin-x, as well as more sophisticated | |
15767 detectors in general and further review of the general algorithm for doing | |
15768 detection. (#### Is this written up anywhere?) after that, consider adding | |
15769 error-checking to decoding (VERY IMPORTANT) and verifying the binary | |
15770 correctness of things under unix no-mule. | |
15771 | |
15772 @heading sep 23, 2001 | |
15773 | |
15774 began to fix the detection system -- adding multiple levels of likelihood | |
15775 and properly abstracting the detectors. the system is in place except for | |
15776 the abstraction of the detector-specific data out of the struct | |
15777 detection_state. we should get things working first before tackling that | |
15778 (which should not be too hard). i'm rewriting algorithms here rather than | |
15779 just converting code, so it's harder. mostly done with everything, but i | |
15780 need to review all detectors except iso2022 and make them properly follow | |
15781 the new way. also write a no-conversion detector. also need to look into | |
15782 the `recode' package and see how (if?) they handle detection, and maybe | |
15783 copy some of the algorithms. also look at recent FSF 21.0 and see if their | |
15784 algorithms have improved. | |
15785 | |
15786 @heading sep 22, 2001 | |
15787 | |
15788 @itemize | |
15789 @item | |
15790 fixed gc bugs from yesterday. | |
15791 | |
15792 @item | |
15793 fixed truename bug. | |
15794 | |
15795 @item | |
15796 close/finalize stuff works. | |
15797 | |
15798 @item | |
15799 eliminated notyet stuff in syswindows.h. | |
15800 | |
15801 @item | |
15802 eliminated special code in tstr_to_c_string. | |
15803 | |
15804 @item | |
15805 fixed pdump problems. (many of them, mostly latent bugs, ugh) | |
15806 | |
15807 @item | |
15808 fixed cygwin @code{sscanf} problems in | |
15809 @code{parse-unicode-translation-table}. (NOT a @code{sscanf} bug, but | |
15810 subtly different behavior w.r.t. whitespace in the format string, | |
15811 combined with a debugger that sucks ROCKS!! and consistently outputs | |
15812 garbage for variable values.) | |
15813 @end itemize | |
15814 | |
15815 main stuff to test is the handling of EOF recognition vs. binary | |
15816 (i.e. check what the default settings are under Unix). then we may have | |
15817 something that WORKS on all platforms!!! (Also need to test Windows | |
15818 non-Mule) | |
15819 | |
15820 @heading sep 21, 2001 | |
15821 | |
15822 finished redoing the close/finalize stuff in the lstream code. but i | |
15823 encountered again the nasty bug mentioned on sep 15 that disappeared on | |
15824 its own then. the problem seems to be that the finalize method of some | |
15825 of the lstreams is calling @code{Lstream_delete()}, which calls | |
15826 @code{free_managed_lcrecord()}, which is a no-no when we're inside of | |
15827 garbage-collection and the object passed to | |
15828 @code{free_managed_lcrecord()} is unmarked, and about to be released by | |
15829 the gc mechanism -- the free lists will end up with @code{xfree()}d | |
15830 objects on them, which is very bad. we need to modify | |
15831 @code{free_managed_lcrecord()} to check if we're in gc and the object is | |
15832 unmarked, and ignore it rather than move it to the free list. [DONE] | |
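
The check described above might look roughly like this. The record type, the globals and the function name are simplified stand-ins, not the real `free_managed_lcrecord()` or its data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the real lcrecord machinery. */
struct record
{
  int marked;
  struct record *next_free;
};

static int gc_in_progress;
static struct record *free_list;

/* Returns 1 if R went onto the free list, 0 if it was left alone. */
static int
free_managed_record (struct record *r)
{
  assert (r != NULL);
  if (gc_in_progress && !r->marked)
    return 0;   /* the sweeper is about to release it; don't chain it */
  r->next_free = free_list;
  free_list = r;
  return 1;
}
```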
15833 | |
15834 (#### What we really need to do is do what Java and C# do w.r.t. their | |
15835 finalize methods: For objects with finalizers, when they're about to be | |
15836 freed, leave them marked, run the finalizer, and set another bit on them | |
15837 indicating that the finalizer has run. Next GC cycle, the objects will | |
15838 again come up for freeing, and this time the sweeper notices that the | |
15839 finalize method has already been called, and frees them for good (provided | |
15840 that a finalize method didn't do something to make the object alive | |
15841 again).) | |
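
A toy version of that two-cycle scheme, to make the bookkeeping concrete. The object layout and `sweep_one` are invented for the sketch; a real collector would fold this into its sweep loop.

```c
#include <assert.h>
#include <stddef.h>

/* Invented object header for illustration only. */
struct obj
{
  int marked;      /* set by the mark phase when the object is reachable */
  int finalized;   /* "finalizer has already run" bit */
  int freed;
  void (*finalize) (struct obj *);   /* NULL if no finalizer */
};

static int finalize_calls;

static void
count_finalize (struct obj *o)
{
  (void) o;
  finalize_calls++;
}

/* Sweep step for a single object. */
static void
sweep_one (struct obj *o)
{
  assert (o != NULL);
  if (o->marked)
    {
      o->marked = 0;          /* live: just reset for the next cycle */
      return;
    }
  if (o->finalize != NULL && !o->finalized)
    {
      o->finalize (o);        /* first time unreachable: finalize, keep */
      o->finalized = 1;
      return;
    }
  o->freed = 1;               /* no finalizer, or it already ran */
}
```

An object that the finalizer makes reachable again simply gets marked in the next cycle and survives, with its finalized bit still set.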
15842 | |
15843 @heading sep 20, 2001 | |
15844 | |
15845 redid the lstream code so there is only one coding stream. combined the | |
15846 various doubled coding stream methods into one; i'm a little bit unsure | |
15847 of this last part, though, as the results of combining the two together | |
15848 seem unclean. got it to compile, but it crashes in loadup. need to go | |
15849 through and rehash the close vs. finalize stuff, as the problem was | |
15850 stuff getting freed too quickly, before the canonicalize-after-decoding | |
15851 was run. should eliminate entirely @code{CODING_STATE_END} and use a | |
15852 different method (close coding stream). rewrite to use these two. make | |
15853 sure they're called in the right places. @code{Lstream_close} on a | |
15854 stream should *NOT* do finalizing. finalize only on delete. [DONE] | |
15855 | |
15856 in general i'd like to see the flags eliminated and converted to | |
15857 bit-fields. also, rewriting the methods to take advantage of rejecting | |
15858 should make it possible to eliminate much of the state in the various | |
15859 methods, esp. including the flags. need to test this is working, though -- | |
15860 reduce the buffer size down very low and try files with only CRLF's in | |
15861 them, with one offset by a byte from the other, and see if we correctly | |
15862 handle rejection. | |
15863 | |
15864 still have the problem with incorrectly truenaming files. | |
15865 | |
15866 | |
15867 @heading sep 19, 2001 | |
15868 | |
15869 bug reported: crash while closing lstreams. | |
15870 | |
15871 the lstream/coding system close code needs revamping. we need to document | |
15872 that order of closing lstreams is very important, and make sure we're | |
15873 consistent. furthermore, chain and undecided lstreams need to close their | |
15874 underneath lstreams when they receive the EOF signal (there may be data in | |
15875 the underneath streams waiting to come out), not when they themselves are | |
15876 closed. [DONE] | |
15877 | |
15878 (if only we had proper inheritance. i think in any case we should | |
15879 simulate it for the chain coding stream -- write things in such a way that | |
15880 undecided can use the chain coding stream and not have to duplicate | |
15881 anything itself.) | |
15882 | |
15883 in general we need to carefully think through the closing process to make | |
15884 sure everything always works correctly and in the right order. also check | |
15885 very carefully to make sure there are no dangling pointers to deleted | |
15886 objects floating around. | |
15887 | |
15888 move the docs for the lstream functions to the functions themselves, not | |
15889 the header files. document more carefully what exactly | |
15890 @code{Lstream_delete()} means and how it's used, what the connections | |
15891 are between @code{Lstream_close()}, @code{Lstream_delete()}, | |
15892 @code{Lstream_flush()}, @code{lstream_finalize}, etc. [DONE] | |
15893 | |
15894 additional error-checking: consider deadbeefing the memory in objects | |
15895 stored in lcrecord free lists; furthermore, consider whether lifo or | |
15896 fifo is correct; under error-checking, we should perhaps be doing fifo, | |
15897 and setting a minimum number of objects on the lists that's quite large | |
15898 so that it's highly likely that any erroneous accesses to freed objects | |
15899 will go into such deadbeefed memory and cause crashes. also, at the | |
15900 earliest available opportunity, go through all freed memory and check | |
15901 for any consistency failures (overwrites of the deadbeef), crashing if | |
15902 so. perhaps we could have some sort of id for each block, to easier | |
15903 trace where the offending block came from. (all of these ideas are | |
15904 present in the debug system malloc from VC++, plus more stuff.) there's | |
15905 similar code i wrote sitting somewhere (in @file{free-hook.c}? doesn't | |
15906 appear so. we need to delete the blocking stuff out of there!). also | |
15907 look into using the debug system malloc from VC++, which has lots of | |
15908 cool stuff in it. we even have the sources. that means compiling under | |
15909 pdump, which would be a good idea anyway. set it as the default. (but | |
15910 then, we need to remove the requirement that Xpm be a DLL, which is | |
15911 extremely annoying. look into this.) | |
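
The deadbeefing idea in its simplest form, as a sketch. The two helper names are hypothetical; the fill would run when a block goes onto a free list and the check when it comes off again or during a periodic scan.

```c
#include <assert.h>
#include <stddef.h>

static const unsigned char deadbeef_pattern[4] = { 0xDE, 0xAD, 0xBE, 0xEF };

/* Overwrite a freed block with a recognizable pattern. */
static void
deadbeef_fill (void *p, size_t n)
{
  unsigned char *b = p;
  size_t i;

  assert (p != NULL || n == 0);
  for (i = 0; i < n; i++)
    b[i] = deadbeef_pattern[i % 4];
}

/* Returns 1 if the pattern is untouched, 0 if something wrote to the
   block after it was freed. */
static int
deadbeef_intact (const void *p, size_t n)
{
  const unsigned char *b = p;
  size_t i;

  for (i = 0; i < n; i++)
    if (b[i] != deadbeef_pattern[i % 4])
      return 0;
  return 1;
}
```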
15912 | |
15913 test the windows code page coding systems recently created. | |
15914 | |
15915 problems reading my mail files -- 1personal appears to hang, others come up | |
15916 with lots of ^M's. investigate. | |
15917 | |
15918 test the enum functions i just wrote, and finish them. | |
15919 | |
15920 still pdump problems. | |
15921 | |
15922 @heading sep 18, 2001 | |
15923 | |
15924 critical-quit broken sometime after aug 25. | |
15925 | |
15926 @itemize | |
15927 @item | |
15928 fixed critical quit. | |
15929 | |
15930 @item | |
15931 fixed process problems. | |
15932 | |
15933 @item | |
15934 print routines work. (no routine for ccl, though) | |
15935 | |
15936 @item | |
15937 can read and write unicode files, and they can still be read by some | |
15938 other program | |
15939 | |
15940 @item | |
15941 defaults should come up correctly -- mswindows-multibyte is general. | |
15942 @end itemize | |
15943 | |
15944 still need to test matej's stuff. | |
15945 seems ok with multibyte stuff but needs more testing. | |
15946 | |
15947 @heading sep 17, 2001 | |
15948 | |
15949 !!!!! something broken with processes !!!!! cannot send mail anymore. must | |
15950 investigate. | |
15951 | |
15952 @heading sep 17, 2001 | |
15953 | |
15954 on mon/wed nights, stop *BEFORE* 11pm. Otherwise i just start getting | |
15955 woozy and can't concentrate. | |
15956 | |
15957 just finished getting assorted fixups to the main branch committed, so it | |
15958 will compile under C++ (Andy committed some code that broke C++ builds). | |
15959 cup'd the code into the fixtypes workspace, updated the tags appropriately. | |
15960 i've created the appropriate log message, sitting in fixtypes.txt in | |
15961 /src/xemacs; perhaps it should go into a README. now i just have to build | |
15962 on everything (it's currently building), verify it's ok, run patcher-mail, | |
15963 commit, send. | |
15964 | |
15965 my mule ws is also very close. need to: | |
15966 | |
15967 @itemize | |
15968 @item | |
15969 test the new print routines. | |
15970 | |
15971 @item | |
15972 test it can read and write unicode files, and they can still be read by | |
15973 some other program. | |
15974 | |
15975 @item | |
15976 try to see if unicode can be auto-detected properly. | |
15977 | |
15978 @item | |
15979 test it can read and write multibyte files in a few different formats. | |
15980 currently can't recognize them, but if you set the cs right, it should | |
15981 work. | |
15982 | |
15983 @item | |
15984 examine the test files sent by matej and see if we can handle them. | |
15985 @end itemize | |
15986 | |
15987 @heading sep 15, 2001 | |
15988 | |
15989 more eol fixing. this stuff is utter crap. | |
15990 | |
15991 currently we wrap coding systems with @code{convert-eol-autodetect} when we create | |
15992 them in @code{make_coding_system_1}. i had a feeling that this would be a | |
15993 problem, and indeed it is -- when autodetecting with `undecided', for | |
15994 example, we end up with multiple layers of eol conversion. to avoid this, | |
15995 we need to do the eol wrapping *ONLY* when we actually retrieve a coding | |
15996 system in places such as @code{insert-file-contents}. these places are | |
15997 @code{insert-file-contents}, load, process input, @code{call-process-internal}, | |
15998 @samp{encode/decode/detect-coding-region}, database input, ... | |
15999 | |
16000 (later) it's fixed, and things basically work. NOTE: for some reason, | |
16001 adding code to wrap coding systems with @code{convert-eol-lf} when | |
16002 @code{eol-type == lf} results in crashing during garbage collection in | |
16003 some pretty obscure place -- an lstream is free when it shouldn't be. | |
16004 this is a bad sign. i guess something might be getting initialized too | |
16005 early? | |
16006 | |
16007 we still need to fix the canonicalization-after-decoding code to avoid | |
16008 problems with coding systems like `internal-7' showing up. basically, | |
16009 when @code{eol==lf} is detected, nil should be returned, and the callers | |
16010 should handle it appropriately, eliding when necessary. chain needs to | |
16011 recognize when it's got only one (or even 0) items in the chain, and | |
16012 elide out the chain. | |
16013 | |
16014 @heading sep 11, 2001: the day that will live in infamy | |
16015 | |
16016 rewrite of sep 9 entry about formats: | |
16017 | |
16018 when calling @samp{make-coding-system}, the name can be a cons of @samp{(format1 . | |
16019 format2)}, specifying that it decodes @samp{format1->format2} and encodes the other | |
16020 way. if only one name is given, that is assumed to be @samp{format1}, and the | |
16021 other is either `external' or `internal' depending on the end type. | |
16022 normally the user when decoding gives the decoding order in formats, but | |
16023 can leave off the last one, `internal', which is assumed. a multichain | |
16024 might look like gzip|multibyte|unicode, using the coding systems named | |
16025 `gzip', `(unicode . multibyte)' and `unicode'. the way this actually works | |
16026 is by searching for gzip->multibyte; if not found, look for gzip->external | |
16027 or gzip->internal. (In general we automatically do conversion between | |
16028 internal and external as necessary: thus gzip|crlf does the expected, and | |
16029 maps to gzip->external, external->internal, crlf->internal, which when | |
16030 fully specified would be gzip|external:external|internal:crlf|internal -- | |
16031 see below.) To forcibly fit together two converters that have explicitly | |
16032 specified and incompatible names (say you have unicode->multibyte and | |
16033 iso8859-1->ebcdic and you know that the multibyte and iso8859-1 in this | |
16034 case are compatible), you can force-cast using :, like this: | |
16035 ebcdic|iso8859-1:multibyte|unicode. (again, if you force-cast between | |
16036 internal and external formats, the conversion happens automatically.) | |
16037 | |
16038 | |
16039 @heading sep 10, 2001 | |
16040 | |
16041 moved the autodetection stuff (both codesys and eol) into particular coding | |
16042 systems -- `undecided' and `convert-eol' (type == `autodetect'). needs | |
16043 lots of work. still need to search through the rest of the code and find | |
16044 any remaining auto-detect code and move it into the undecided coding | |
16045 system. need to modify make-coding-system so that it spits out | |
16046 auto-detecting versions of all text-file coding systems unless we say not | |
16047 to. need to eliminate entirely the EOF flag from both the stream info and the | |
16048 coding system; have only the original-eof flag. in | |
16049 coding_system_from_mask, need to check that the returned value is not of | |
16050 type `undecided', falling back to no-conversion if so. also need to make | |
16051 sure we wrap everything appropriate for text-files -- i removed the | |
16052 wrapping on set-coding-category-list or whatever (need to check all those | |
16053 files to make sure all wrapping is removed). need to review carefully the | |
16054 new code in `undecided' to make sure it works and preserves the same logic | |
16055 as previously. need to review the closing and rewinding behavior of chain | |
16056 and undecided (same -- should really consolidate into helper routines, so | |
16057 that any coding system can embed a chain in it) -- make sure the dynarr's | |
16058 are getting their data flushed out as necessary, rewound/closed in the | |
16059 right order, no missing steps, etc. | |
16060 | |
16061 also split out mule stuff into @file{mule-coding.c}. work done on | |
16062 @file{configure}/@file{xemacs.mak}/@file{Makefile}s not done yet. work | |
16063 on @file{emacs.c}/@file{symsinit.h} to interface with the new init | |
16064 functions not done yet. | |
16065 | |
16066 also put in a few declarations of the way i think the abstracted detection | |
16067 stuff ought to go. DON'T WORK ON THIS MORE UNTIL THE REST IS DEALT WITH | |
16068 AND WE HAVE A WORKING XEMACS AGAIN WITH ALL EOL ISSUES NAILED. | |
16069 | |
16070 really need a version of @file{cvs-mods} that reports only the current | |
16071 directory. WRITE THIS! use it to implement a better | |
16072 @file{cvs-checkin}. | |
16073 | |
16074 @heading sep 9, 2001 | |
16075 | |
16076 implemented a gzip coding system. unfortunately, doesn't quite work right | |
16077 because it doesn't handle the gzip headers -- it just reads and writes raw | |
16078 zlib data. there's no function in the library to skip past the header, but | |
16079 we do have some code out of the library that we can snarf that implements | |
16080 header parsing. we need to snarf that, store it, and output it again at | |
16081 the beginning when encoding. in the process, we should create a "get next | |
16082 byte" macro that bails out when there are no more. using this, we set up a | |
16083 nice way of doing most stuff statelessly -- if we have to bail, we reject | |
16084 everything back to the sync point. also need to fix up the autodetection | |
16085 of zlib in configure.in. | |
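
A sketch of the "get next byte" macro with rejection back to a sync point. The reader struct, the macro and `parse_gzip_prefix` are all hypothetical names; only the first four header bytes are handled (the two magic bytes 0x1f 0x8b, then the compression method and flags bytes), not the full header.

```c
#include <assert.h>
#include <stddef.h>

enum { PARSE_OK = 0, PARSE_NEED_MORE = -1, PARSE_BAD_MAGIC = -2 };

struct reader
{
  const unsigned char *buf;
  size_t len;    /* bytes currently available */
  size_t pos;    /* next byte to read */
  size_t sync;   /* last point at which the parse was consistent */
};

/* Fetch the next byte into VAR; if none is left, reject everything
   read since the sync point and bail out of the calling function. */
#define NEXT_BYTE(r, var)                         \
  do {                                            \
    if ((r)->pos >= (r)->len)                     \
      {                                           \
        (r)->pos = (r)->sync;                     \
        return PARSE_NEED_MORE;                   \
      }                                           \
    (var) = (r)->buf[(r)->pos++];                 \
  } while (0)

static int
parse_gzip_prefix (struct reader *r, int *method, int *flags)
{
  unsigned char id1, id2, cm, flg;

  assert (r != NULL && r->sync <= r->pos);
  NEXT_BYTE (r, id1);
  NEXT_BYTE (r, id2);
  NEXT_BYTE (r, cm);
  NEXT_BYTE (r, flg);
  if (id1 != 0x1f || id2 != 0x8b)
    {
      r->pos = r->sync;
      return PARSE_BAD_MAGIC;
    }
  *method = cm;
  *flags = flg;
  r->sync = r->pos;   /* commit: later rejects rewind only to here */
  return PARSE_OK;
}
```

Because a short read leaves no partial state behind, the caller can simply call the parser again once more data has arrived.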
16086 | |
16087 BIG problems with eol. finished up everything i thought i would need to | |
16088 get eol stuff working, but no -- when you have mswindows-unicode, with its | |
16089 eol set to autodetect, the detection routines themselves do the autodetect | |
16090 (first), and fail (they report CR on CRLF because of the NULL byte between | |
16091 the CR and the LF) since they're not looking at ascii data. with a chain | |
16092 it's similarly bad. for mswindows-multibyte, for example, which is a chain | |
16093 unicode->unicode-to-multibyte, autodetection happens inside of the chain, | |
16094 both when unicode and unicode-to-multibyte are active. we could twiddle | |
16095 around with the eol flags to try to deal with this, but it's gonna be a | |
16096 big mess, which is exactly what we're trying to avoid. what we | |
16097 basically want is to entirely rip out all EOL settings from either the | |
16098 coding system or the stream (yes, there are two! one might say | |
16099 autodetect, and then the stream contains the actual detected value). | |
16100 instead, we simply create an eol-autodetect coding system -- or rather, | |
16101 it's part of the convert-eol coding system. convert-eol, type = | |
16102 autodetect, does autodetection the first time it gets data sent to it to | |
16103 decode, and thereafter sets a stream parameter indicating the actual eol | |
16104 type for this stream. this means that all autodetect coding systems, as | |
16105 created by @code{make-coding-system}, really are chains with a | |
16106 convert-eol at the beginning. only subsidiary xxx-unix has no wrapping | |
16107 at all. this should allow eol detection of gzip, unicode, etc. for | |
16108 that matter, general autodetection should be entirely encapsulated | |
16109 inside of the `autodetect' coding system, with no eol-autodetection -- | |
16110 the chain becomes convert-eol (autodetect) -> autodetect or perhaps | |
16111 backwards. the generic autodetect similarly has a coding-system in its | |
16112 stream methods, and needs somehow or other to insert the detected | |
16113 coding-system into the chain. either it contains a chain inside of it | |
16114 (perhaps it *IS* a chain), or there's some magic involving | |
16115 canonicalization-type switcherooing in the middle of a decode. either | |
16116 way, once everything is good and done and we want to save the coding | |
16117 system so it can be used later, we need to do another sort of | |
16118 canonicalization -- converting auto-detect-type coding systems into the | |
16119 detected systems. again, a coding-system method, with some magic | |
16120 currently so that subsidiaries get properly used rather than something | |
16121 that's new but equivalent to subsidiaries. (#### perhaps we could use a | |
16122 hash table to avoid recreating coding systems when not necessary. but | |
16123 that would require that coding systems be immutable from external, and | |
16124 i'm not sure that's the case.) | |
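The CR-on-CRLF failure described above is easy to reproduce outside XEmacs. The sketch below uses hypothetical function names and little-endian UTF-16 input: it runs the same "first line ending wins" logic once over raw bytes and once over decoded 16-bit code units. The byte-wise detector sees the zero byte that follows 0x0D and reports CR; the code-unit detector sees CR followed by LF and reports CRLF. This is the argument for doing EOL detection in a convert-eol step that sits on the character side of the chain.

```c
#include <stddef.h>

enum eol_type { EOL_UNKNOWN, EOL_LF, EOL_CRLF, EOL_CR };

/* Byte-wise detection: correct for ASCII-compatible data only. */
enum eol_type
detect_eol_bytes (const unsigned char *p, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    {
      if (p[i] == '\n')
        return EOL_LF;
      if (p[i] == '\r')
        return (i + 1 < n && p[i + 1] == '\n') ? EOL_CRLF : EOL_CR;
    }
  return EOL_UNKNOWN;
}

/* The same logic applied to UTF-16LE code units, i.e. after the
   Unicode layer has been peeled off.  (A real detector would also
   have to cope with a CR that is the last unit in the buffer.) */
enum eol_type
detect_eol_utf16le (const unsigned char *p, size_t n)
{
  size_t i;
  for (i = 0; i + 1 < n; i += 2)
    {
      unsigned c = p[i] | ((unsigned) p[i + 1] << 8);
      if (c == '\n')
        return EOL_LF;
      if (c == '\r')
        {
          if (i + 3 < n && (p[i + 2] | ((unsigned) p[i + 3] << 8)) == '\n')
            return EOL_CRLF;
          return EOL_CR;
        }
    }
  return EOL_UNKNOWN;
}
```

For the UTF-16LE encoding of "a", CR, LF, "b" (bytes 61 00 0D 00 0A 00 62 00), the first function returns @code{EOL_CR} and the second returns @code{EOL_CRLF}.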
16125 | |
16126 i really think, after all, that i should reverse the naming of everything | |
16127 in chain and source-sink-type -- they should be decoding-centric. later | |
16128 on, if/when we come up with the proper way to make it totally symmetrical, | |
16129 we'll be fine whether before then we were encoding or decoding centric. | |
16130 | |
16131 | |
16132 @heading sep 9, 2001 | |
16133 | |
16134 investigated eol parameter. | |
16135 | |
16136 implemented handling in @code{make-coding-system} of @code{eol-cr} and | |
16137 @code{eol-crlf}. fixed calls everywhere to @code{Fget_coding_system} / | |
16138 @code{Ffind_coding_system} to reject non-char->byte coding systems. | |
16139 | |
16140 still need to handle "query eol type using coding-system-property" so it | |
16141 magically returns the right type by parsing the chain. | |
16142 | |
16143 no work done on formats, as mentioned below. we should consider using : | |
16144 instead of || to indicate casting. | |
16145 | |
16146 @heading early sep 9, 2001 | |
16147 | |
16148 renamed some codesys properties: `list' in chain -> chain; `subtype' in | |
16149 unicode -> type. everything compiles again and sort of works; some CRLF | |
16150 problems that may resolve themselves when i finish the convert-eol stuff. | |
16151 the stuff to create subsidiaries has been rewritten to use chains; but i | |
16152 still need to investigate how the EOL type parameter is used. also, still | |
16153 need to implement this: when a coding system is created, and its eol type | |
16154 is not autodetect or lf, a chain needs to be created and returned. i think | |
16155 that what needs to happen is that the eol type can only be set to | |
16156 autodetect or lf; later on this should be changed to simply be either | |
16157 autodetect or not (but that would require ripping out the eol converting | |
16158 stuff in the various coding systems), and eventually we will do the work on | |
16159 the detection mechanism so it can do chain detection; then we won't need an | |
16160 eol autodetect setting at all. i think there's a way to query the eol type | |
16161 of a coding system; this should check to see if the coding system is a | |
16162 chain and there's a convert-eol at the front; if so, the eol type comes | |
16163 from the type of the convert-eol. | |
16164 | |
16165 also check out everywhere that @code{Fget_coding_system} or | |
16166 @code{Ffind_coding_system} is called, and see whether anything but a | |
16167 char->byte system can be tolerated. create a new function for all the | |
16168 places that only want char->byte, something like | |
16169 @samp{get_coding_system_char_to_byte_only}. | |
16170 | |
16171 think about specifying formats in make-coding-system. perhaps the name can | |
16172 be a cons of (format1, format2), specifying that it encodes | |
16173 format1->format2 and decodes the other way. if only one name is given, | |
16174 that is assumed to be format2, and the other is either `byte' or `char' | |
16175 depending on the end type. normally the user when decoding gives the | |
16176 decoding order in formats, but can leave off the last one, `char', which is | |
16177 assumed. perhaps we should say `internal' instead of `char' and `external' | |
16178 instead of byte. a multichain might look like gzip|multibyte|unicode, | |
16179 using the coding systems named `gzip', `(unicode . multibyte)' and | |
16180 `unicode'. we would have to allow something where one format is given only | |
16181 as generic byte/char or internal/external to fit with any of the same | |
16182 byte/char type. when forcibly fitting together two converters that have | |
16183 explicitly specified and incompatible names (say you have | |
16184 unicode->multibyte and iso8859-1->ebcdic and you know that the multibyte | |
16185 and iso8859-1 in this case are compatible), you can force-cast using ||, | |
16186 like this: ebcdic|iso8859-1||multibyte|unicode. this will also force | |
16187 external->internal translation as necessary: | |
16188 unicode|multibyte||crlf|internal does unicode->multibyte, | |
16189 external->internal, crlf->internal. perhaps you'd need to put in the | |
16190 internal translation, like this: unicode|multibyte|internal||crlf|internal, | |
16191 which means unicode->multibyte, external->internal (multibyte is compatible | |
16192 with external); force-cast to crlf format and convert crlf->internal. | |
16193 | |
16194 @heading even later: Sep 8, 2001 | |
16195 | |
16196 chain doesn't need to set character mode, that happens automatically when | |
16197 the coding systems are created. fixed chain to return correct source/sink | |
16198 type for itself and to check the compatibility of source/sink types in its | |
16199 chain. fixed decode/encode-coding-region to check the source and sink | |
16200 types of the coding system performing the conversion and insert appropriate | |
16201 byte->char/char->byte converters (aka "binary" coding system). fixed | |
16202 set-coding-category-system to only accept the traditional | |
16203 encode-char-to-byte types of coding systems. | |
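As an illustration only, the compatibility check on a chain can be sketched as follows. The types and names here (@code{unit_kind}, @code{link_type}, @code{chain_end_type}) are hypothetical and are not the ones used in the XEmacs sources. Each link is described in the encoding direction by what it consumes and what it produces; a chain is well formed when every link's sink matches the next link's source, and the chain's own type is the first source paired with the last sink.

```c
#include <stddef.h>

enum unit_kind { UNIT_CHAR, UNIT_BYTE };

/* Source and sink of one coding system, in the encoding direction. */
struct link_type
{
  enum unit_kind source;
  enum unit_kind sink;
};

/* Return 1 and fill *out with the end types of the whole chain if all
   adjacent links are compatible; return 0 otherwise (or if n == 0). */
int
chain_end_type (const struct link_type *links, size_t n,
                struct link_type *out)
{
  size_t i;

  if (n == 0)
    return 0;
  for (i = 0; i + 1 < n; i++)
    if (links[i].sink != links[i + 1].source)
      return 0;
  out->source = links[0].source;
  out->sink = links[n - 1].sink;
  return 1;
}
```

Under this model a chain of a char->byte link followed by a byte->byte link (the shape described elsewhere in these notes for mswindows-multibyte) comes out as char->byte overall. A caller such as decode-coding-region could compare the result against what it needs and add a byte<->char converter at whichever end does not match.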
16204 | |
16205 still need to extend chain to specify the parameters mentioned below, | |
16206 esp. "reverse". also need to extend the print mechanism for chain so it | |
16207 prints out the chain. probably this should be general: have a new method | |
16208 to return all properties, and output those properties. you could also | |
16209 implement a read syntax for coding systems this way. | |
16210 | |
16211 still need to implement @code{convert-eol} and finish up the rest of the | |
16212 eol stuff mentioned below. | |
16213 | |
16214 @heading later September 7, 2001 (more like Sep 8) | |
16215 | |
16216 moved many @code{Lisp_Coding_System *} params to @code{Lisp_Object}. In | |
16217 general this is the way to go, and if we ever implement a copying GC, we | |
16218 will never want to be passing direct pointers around. With no | |
16219 error-checking, we lose no cycles using @code{Lisp_Object}s in place of | |
16220 pointers -- the @code{Lisp_Object} itself is nothing but a pointer, and | |
16221 so all the casts and "dereferences" boil down to nothing. | |
16222 | |
16223 Clarified and cleaned up the "character mode" on streams, and documented | |
16224 who (caller or object itself) has the right to be setting character mode | |
16225 on a stream, depending on whether it's a read or write stream. changed | |
16226 @code{conversion_end_type} method and @code{enum source_sink_type} to | |
16227 return encoding-centric values, rather than decoding-centric. for the | |
16228 moment, we're going to be entirely encoding-centric in everything; we | |
16229 can rethink later. fixed coding systems so that the decode and encode | |
16230 methods are guaranteed to receive only full characters, if that's the | |
16231 source type of the data, as per conversion_end_type. | |
16232 | |
16233 still need to fix the chain method so that it correctly sets the | |
16234 character mode on all the lstreams in it and checks the source/sink | |
16235 types to be compatible. also fix @code{decode-coding-string} and | |
16236 friends to put the appropriate byte->character | |
16237 (i.e. @code{no-conversion}) coding systems on the ends as necessary so | |
16238 that the final ends are both character. also add to chain a parameter | |
16239 giving the ability to switch the direction of conversion of any | |
16240 particular item in the chain (i.e. swap encoding and decoding). i think | |
16241 what we really want to do is allow for arbitrary parameters to be put | |
16242 onto a particular coding system in the chain, of which the only one so | |
16243 far is swap-encode-decode. don't need too much codage here for that, | |
16244 but make the design extendable. | |
16245 | |
16246 | |
16247 | |
16248 @heading September 7, 2001 | |
16249 | |
16250 just added a return value from the decode and encode methods of a coding | |
16251 system, so that some of the data can get rejected. fixed the calling | |
16252 routines to handle this. need to investigate when and whether the coding | |
16253 lstream is set to character mode, so that the decode/encode methods only | |
16254 get whole characters. if not, we should do so, according to the source | |
16255 type of these methods. also need to implement the convert_eol coding | |
16256 system, and fix the subsidiary coding systems (and in general, any coding | |
16257 system where the eol type is specified and is not LF) to be chains | |
16258 involving convert_eol. | |
16259 | |
16260 after everything is working, need to remove eol handling from encode/decode | |
16261 methods and eventually consider rewriting (simplifying) them given the | |
16262 reject ability. | |
16263 | |
16264 @heading September 5, 2001 | |
16265 | |
16266 @itemize | |
16267 @item | |
16268 need to organize this. get everything below into the TODO list. | |
16269 CVS the TODO list frequently so i can delete old stuff. prioritize | |
16270 it!!!!!!!!! | |
16271 | |
16272 @item | |
16273 move @file{README.ben-mule...} to @file{STATUS.ben-mule...}; use | |
16274 @file{README} for intro, overview of what's new, what's broken, how to | |
16275 use the features, etc. | |
16276 | |
16277 @item | |
16278 need a global and local @samp{coding-category-precedence} list, which | |
16279 get merged. | |
16280 | |
16281 @item | |
16282 finished the BOM support. also finished something not listed below, | |
16283 expansion to the auto-generator of Unicode-encapsulation to support | |
16284 bracketing code with @samp{#if ... #endif}, e.g. to work around Cygwin | |
16285 and MinGW problems. This is tested and appears to work. | |
16286 | |
16287 @item | |
16288 need to add more multibyte coding systems now that we have various | |
16289 properties to specify them. need to add DEFUN's for mac-code-page | |
16290 and ebcdic-code-page for completeness. need to rethink the whole | |
16291 way that the priority list works. it will continue to be total | |
16292 junk until multiple levels of likeliness get implemented. | |
16293 | |
16294 @item | |
16295 need to finish up the stuff about the various defaults. [need to | |
16296 investigate more generally where all the different default values | |
16297 are that control encoding. (there are six places or so.) need to | |
16298 list them in @code{make-coding-system} docs and put pointers | |
16299 elsewhere. [[[[#### what interface to specify that this default | |
16300 should be unicode? a "Unicode" language environment seems too | |
16301 drastic, as the language environment controls much more.]]]] even | |
16302 skipping the Unicode stuff here, we need to survey and list the | |
16303 variables that control coding page behavior and determine how they | |
16304 need to be set for various possible scenarios: | |
16305 | |
16306 @itemize | |
16307 @item | |
16308 total binary: no detection at all. | |
16309 | |
16310 @item | |
16311 raw-text only: wants only autodetection of line endings, nothing else. | |
16312 | |
16313 @item | |
16314 "standard Windows environment": tries for Unicode, falls back on | |
16315 code page encoding. | |
16316 | |
16317 @item | |
16318 some sort of East European environment, and Russian. | |
16319 | |
16320 @item | |
16321 some sort of standard Japanese Windows environment. | |
16322 | |
16323 @item | |
16324 standard Chinese Windows environments (traditional and simplified) | |
16325 | |
16326 @item | |
16327 various Unix environments (European, Japanese, Russian, etc.) | |
16328 | |
16329 @item | |
16330 Unicode support in all of these when it's reasonable | |
16331 @end itemize | |
16332 @end itemize | |
16333 | |
16334 These really require multiple likelihood levels to be fully | |
16335 implementable. We should see what can be done ("gracefully fall | |
16336 back") with single likelihood level. need lots of testing. | |
16337 | |
16338 @itemize | |
16339 @item | |
16340 need to fix the truename problem. | |
16341 | |
16342 @item | |
16343 lots of testing: need to test all of the stuff above and below that's | |
16344 recently been implemented. | |
16345 @end itemize | |
16346 | |
16347 | |
16348 @heading September 4, 2001 | |
16349 | |
16350 mostly everything compiles. currently there is a crash in | |
16351 @code{parse-unicode-translation-table}, and Cygwin/Mule won't run. it | |
16352 may well be a bug in the @code{sscanf()} in Cygwin. | |
16353 | |
16354 working on today: | |
16355 | |
16356 @itemize | |
16357 @item | |
16358 adding BOM support for Unicode coding systems. mostly there, but | |
16359 need to finish adding BOM support to the detection routines. then test. | |
16360 | |
16361 @item | |
16362 adding properties to @code{unicode-to-multibyte} to specify the coding | |
16363 system in various flexible ways, e.g. directly specified code page or | |
16364 ansi or oem code page of specified locale, current locale, user-default | |
16365 or system-default locale. need to test. | |
16366 | |
16367 @item | |
16368 creating a `multibyte' coding system, with the same parameters as | |
16369 unicode-to-multibyte and which resolves at coding-system-creation | |
16370 time to the appropriate chain. creating the underlying mechanism | |
16371 to allow such behind-the-scenes switcheroo. need to test. | |
16372 | |
16373 @item | |
16374 set default-value of @code{buffer-file-coding-system} to | |
16375 mswindows-multibyte, as Matej said it should be. need to test. | |
16376 need to investigate more generally where all the different default | |
16377 values are that control encoding. (there are six places or so.) | |
16378 need to list them in make-coding-system docs and put pointers | |
16379 elsewhere. #### what interface to specify that this default should | |
16380 be unicode? a "Unicode" language environment seems too drastic, as | |
16381 the language environment controls much more. | |
16382 | |
16383 @item | |
16384 thinking about adding multiple levels of certainty to the detection | |
16385 schemes, instead of just a mask. eventually, we need to totally | |
16386 abstract things, but that can more easily be done in many steps. (we | |
16387 need multiple levels of likelihood to more reasonably support a | |
16388 Windows environment with code-page type files. currently, in order | |
16389 to get them detected, we have to put them first, because they can | |
16390 look like lots of other things; but then, other encodings don't get | |
16391 detected. with multiple levels of likelihood, we still put the | |
16392 code-page categories first, but they will return low levels of | |
16393 likelihood. Lower-down encodings may be able to return higher | |
16394 levels of likelihood, and will get taken preferentially.) | |
16395 | |
16396 @item | |
16397 making it so you cannot disable file-coding, but you get an | |
16398 equivalent default on Unix non-Mule systems where all defaults are | |
16399 `binary'. need to test!!!!!!!!! | |
16400 @end itemize | |
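The "multiple levels of certainty" item in the list above can be sketched roughly as follows. Every name here is hypothetical, and the two detectors are deliberately trivial stand-ins. Categories are still tried in priority order, but each detector returns a likelihood level instead of a yes/no bit, and the highest level wins, with ties going to the earlier category. A code-page category placed first then no longer shadows a later, more certain category.

```c
#include <stddef.h>

enum likelihood { DET_IMPOSSIBLE = 0, DET_LOW, DET_MEDIUM, DET_HIGH };

typedef enum likelihood (*detector_fn) (const unsigned char *data,
                                        size_t len);

struct category
{
  const char *name;
  detector_fn detect;
};

/* Any byte sequence is valid in a single-byte code page, so this
   detector can never rule itself out; it only claims a low level. */
enum likelihood
detect_code_page (const unsigned char *data, size_t len)
{
  (void) data;
  (void) len;
  return DET_LOW;
}

/* UTF-16 with a byte-order mark: a BOM is strong evidence. */
enum likelihood
detect_utf16_bom (const unsigned char *data, size_t len)
{
  if (len >= 2
      && ((data[0] == 0xFF && data[1] == 0xFE)
          || (data[0] == 0xFE && data[1] == 0xFF)))
    return DET_HIGH;
  return DET_IMPOSSIBLE;
}

/* Return the index of the best category, or -1 if none is possible.
   Earlier categories win ties because only a strictly higher level
   replaces the current best. */
int
pick_category (const struct category *cats, size_t ncats,
               const unsigned char *data, size_t len)
{
  int best = -1;
  enum likelihood best_level = DET_IMPOSSIBLE;
  size_t i;

  for (i = 0; i < ncats; i++)
    {
      enum likelihood level = cats[i].detect (data, len);
      if (level > best_level)
        {
          best_level = level;
          best = (int) i;
        }
    }
  return best;
}
```

With the code-page category listed first and the BOM category second, data starting with FF FE selects the second category, while plain text with no BOM falls back to the code page.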
16401 | |
16402 Matej (mostly, + some others) notes the following problems, and here | |
16403 are possible solutions: | |
16404 | |
16405 @itemize | |
16406 @item | |
16407 he wants the defaults to work right. [figure out what those | |
16408 defaults are. i presume they are auto-detection of data in current | |
16409 code page and in unicode, and new files have current code page set | |
16410 as their output encoding.] | |
16411 | |
16412 @item | |
16413 too easy to lose data with incorrect encodings. [need to set up an | |
16414 error system for encoding/decoding. extremely important but a | |
16415 little tricky to implement so let's deal with other issues now.] | |
16416 | |
16417 @item | |
16418 EOL isn't always detected correctly. [#### ?? need examples] | |
16419 | |
16420 @item | |
16421 truename isn't working: @file{c:\t.txt} and @file{c:\tmp.txt} have the | |
16422 same truename. [should be easy to fix] | |
16423 | |
16424 @item | |
16425 unicode files lose the BOM mark. [working on this] | |
16426 | |
16427 @item | |
16428 command-line utilities use OEM. [actually it seems more | |
16429 complicated. it seems they use the codepage of the console. we | |
16430 may be able to set that, e.g. to UTF8, before we invoke a command. | |
16431 need to investigate.] | |
16432 | |
16433 @item | |
16434 no way to handle unicode characters not recognized as charsets. [we | |
16435 need to create something like 8 private 2-dimensional charsets to | |
16436 handle all BMP Unicode chars. Obviously this is a stopgap | |
16437 solution. Switching to Unicode internal will ultimately make life | |
16438 far easier and remove the BMP limitation. but for now it will | |
16439 work. we translate all characters where we have charsets into | |
16440 chars in those charsets, and the remainder in a unicode charset. | |
16441 that way we can save them out again and guarantee no data loss with | |
16442 unicode. this creates font problems, though ...] | |
16443 | |
16444 @item | |
16445 problems with xemacs font handling. [xemacs font handling is not | |
16446 sophisticated enough. it goes on a charset granularity basis and | |
16447 only looks for a font whose name contains the corresponding windows | |
16448 charset in it. with unicode this fails in various ways. for one | |
16449 the granularity needs to be single character, so that those unicode | |
16450 charsets mentioned above work; and it needs to query the font to | |
16451 see what unicode ranges it supports, rather than just looking at | |
16452 the charset ending.] | |
16453 @end itemize | |
16454 | |
16455 | |
16456 @heading August 28, 2001 | |
16457 | |
16458 working on getting everything to compile again: Cygwin, non-MULE, | |
16459 pdump. not there yet. | |
16460 | |
16461 @code{mswindows-multibyte} is now defined using chain, and works. | |
16462 removed most vestiges of the @code{mswindows-multibyte} coding system | |
16463 type. | |
16464 | |
16465 file-coding is on by default; should default to binary only on Unix. | |
16466 Need to test. (Needs to compile first :-) | |
16467 | |
16468 @heading August 26, 2001 | |
16469 | |
16470 I've fixed the issue of inputting non-ASCII text under -nuni, and done | |
16471 some of the work on the Russian @key{C-x} problem -- we now compute the | |
16472 other possibilities. We still need to fix the key-lookup code, though, | |
16473 and that code is unfortunately a bit ugly. the best way, it seems, is | |
16474 to expand the command-builder structure so you can specify different | |
16475 interpretations for keys. (if we do find an alternative binding, though, | |
16476 we need to mess with both the command builder and this-command-keys, as | |
16477 does the function-key stuff. probably need to abstract that munging | |
16478 code.) | |
16479 | |
16480 high-priority: | |
16481 | |
16482 @table @strong | |
16483 | |
16484 @item [currently doing] | |
16485 | |
16486 @itemize | |
16487 @item | |
16488 support for @code{WM_IME_CHAR}. IME input can work under @code{-nuni} | |
16489 if we use @code{WM_IME_CHAR}. probably we should always be using this, | |
16490 instead of snarfing input using @code{WM_IME_COMPOSITION}. i'll check this | |
16491 out. | |
16492 | |
16493 @item | |
16494 Russian @key{C-x} problem. see above. | |
16495 @end itemize | |
16496 | |
16497 @item [clean-up] | |
16498 | |
16499 @itemize | |
16500 @item | |
16501 make sure it compiles and runs under non-mule. remember that some | |
16502 code needs the unicode support, or at least a simple version of it. | |
16503 | |
16504 @item | |
16505 make sure it compiles and runs under pdump. see below. | |
16506 | |
16507 @item | |
16508 clean up @code{mswindows-multibyte}, @code{TSTR_TO_C_STRING}. see | |
16509 below. [DONE] | |
16510 | |
16511 @item | |
16512 eliminate last vestiges of codepage<->charset conversion and similar stuff. | |
16513 @end itemize | |
16514 | |
16515 @item [other] | |
16516 | |
16517 @itemize | |
16518 @item | |
16519 cut and paste. see below. | |
16520 @item | |
16521 misc issues with handling lang environments. see also August 25, | |
16522 "finally: working on the C-x in ...". | |
16523 @itemize | |
16524 @item | |
16525 when switching lang env, needs to set keyboard layout. | |
16526 @item | |
16527 user var to control whether, when moving into text of a | |
16528 particular language, we set the appropriate keyboard layout. we | |
16529 would need to have a lisp api for retrieving and setting the | |
16530 keyboard layout, set text properties to indicate the layout of | |
16531 text, and have a way of dealing with text with no property on | |
16532 it. (e.g. saved text has no text properties on it.) basically, | |
16533 we need to get a keyboard layout from a charset; getting a | |
16534 language would do. Perhaps we need a table that maps charsets | |
16535 to language environments. | |
16536 @item | |
16537 test that the lang env is properly set at startup. test that | |
16538 switching the lang env properly sets the C locale (call | |
16539 setlocale(), set LANG, etc.) -- a spawned subprogram should have | |
16540 the new locale in its environment. | |
16541 @end itemize | |
16542 @item | |
16543 look through everything below and see if anything is missed in this | |
16544 priority list, and if so add it. create a separate file for the | |
16545 priority list, so it can be updated as appropriate. | |
16546 @end itemize | |
16547 @end table | |
16548 | |
16549 mid-priority: | |
16550 | |
16551 @itemize | |
16552 @item | |
16553 clean up the chain coding system. its list should specify decode | |
16554 order, not encode; i now think this way is more logical. it should | |
16555 check the endpoints to make sure they make sense. it should also | |
16556 allow for the specification of "reverse-direction coding systems": | |
16557 use the specified coding system, but invert the sense of decode and | |
16558 encode. | |
16559 | |
16560 @item | |
16561 along with that, places that take an arbitrary coding system and | |
16562 expect the ends to be anything specific need to check this, and add | |
16563 the appropriate conversions from byte->char or char->byte. | |
16564 | |
16565 @item | |
16566 get some support for arabic, thai, vietnamese, japanese jisx 0212: | |
16567 at least get the unicode information in place and make sure we have | |
16568 things tied together so that we can display them. worry about r2l | |
16569 some other time. | |
16570 @end itemize | |
16571 | |
16572 @heading August 25, 2001 | |
16573 | |
16574 There is actually more non-Unicode-ized stuff, but it's basically | |
16575 inconsequential. (See previous note.) You can check using the file | |
16576 nmkun.txt (#### RENAME), which is just a list of all the routines that | |
16577 have been split. (It was generated from the output of `nmake | |
16578 unicode-encapsulate', after removing everything from the output but | |
16579 the function names.) Use something like | |
16580 | |
16581 @example | |
16582 fgrep -f ../nmkun.txt -w [a-hj-z]*.[ch] |m | |
16583 @end example | |
16584 | |
16585 in the source directory, which does a word match and skips | |
16586 @file{intl-unicode-win32.[ch]} and @file{intl-win32.[ch]}, which have a | |
16587 whole lot of references to these, unavoidably. It effectively detects | |
16588 what needs to be changed because changed versions either begin | |
16589 @samp{qxe...} or end with A or W, and in each case there's no whole-word | |
16590 match. | |
16591 | |
16592 The nasty bug has been fixed below. The @code{-nuni} option now works | |
16593 -- all specially-written code to handle the encapsulation has been | |
16594 tested by some operation (fonts by loadup and checking the output of | |
16595 @code{(list-fonts "")}; devmode by printing; dragdrop tests other | |
16596 stuff). | |
16597 | |
16598 NOTE: for @code{-nuni} (Win 95), the following areas need work: | |
16599 | |
16600 @itemize | |
16601 @item | |
16602 cut and paste. we should be able to receive Unicode text if it's there, | |
16603 and we should be able to receive it even in Win 95 or @code{-nuni}. we | |
16604 should just check in all circumstances. also, under 95, when we put | |
16605 some text in the clipboard, it may or may not also be automatically | |
16606 enumerated as unicode. we need to test this out and/or just go ahead | |
16607 and manually do the unicode enumeration. | |
16608 | |
16609 @item | |
16610 receiving keyboard input. we get only a single byte, but we should | |
16611 be able to correlate the language of the keyboard layout to a | |
16612 particular code page, so we can then decode it correctly. | |
16613 | |
16614 @item | |
16615 @code{mswindows-multibyte}. still implemented as its own thing. should | |
16616 be done as a chain of (encoding) unicode | unicode-to-multibyte. need | |
16617 to turn this on, get it working, and look into optimizations in the dfc | |
16618 stuff. (#### perhaps there's a general way to do these optimizations??? | |
16619 something like having a method on a coding system that can specify | |
16620 whether a pure-ASCII string gets rendered as pure-ASCII bytes and | |
16621 vice-versa.) | |
16622 @end itemize | |
16623 | |
16624 ALSO: | |
16625 | |
16626 @itemize | |
16627 @item | |
16628 we have special macros @code{TSTR_TO_C_STRING} and such because formerly | |
16629 the @samp{DFC} macros didn't know about external stuff that was Unicode | |
16630 encoded and would call @code{strlen()} on them. this is fixed, so now | |
16631 we should undo the special macros, make them normal, remove the comments | |
16632 about this, and make sure it works. [DONE] | |
16633 | |
16634 | |
16635 @item | |
16636 finally: working on the @kbd{C-x} in Russian key layout problem. in the | |
16637 process will probably end up doing work on cleaning up the handling | |
16638 of keyboard layouts, integrating or deleting the FSF stuff, adding | |
16639 code to change the keyboard layout as we move in and out of text in | |
16640 different languages (implemented as a post-command-hook; we need | |
16641 something like internal-post-command-hook if not already there, for | |
16642 internal stuff that doesn't want to get mixed up with the regular | |
16643 post-command-hook; similar for pre-command-hook). also, when | |
16644 langenv changes, ways to set the keyboard layout appropriately. | |
16645 | |
16646 @item | |
16647 i think the stuff above is higher priority than the other stuff | |
16648 mentioned below. what i'm aiming for is to be able to input and | |
16649 work with multiple languages without weird glitches, both under 95 | |
16650 and NT. the problems above are all basic impediments to such work. | |
16651 we assume for the moment that the user can make use of the existing | |
16652 file i/o conversion stuff, and put that lower in priority, after | |
16653 the basic input is working. | |
16654 | |
16655 @item | |
16656 i should get my modem connected and write up what's going on and | |
16657 send it to the lists; also cvs commit my workspaces and get more | |
16658 testers. | |
16659 @end itemize | |
16660 | |
16661 @heading August 24, 2001 | |
16662 | |
16663 All code has been Unicode-ized except for some stuff in console-msw.c | |
16664 that deals with console output. Much of the Unicode-encapsulation | |
16665 stuff, particularly the hand-written stuff, really needs testing. I | |
16666 added a new command-line option, @code{-nuni}, to force use of all ANSI | |
16667 calls -- @code{XE_UNICODEP} evaluates to false in this case. | |
16668 | |
16669 There is a nasty bug that appeared recently, probably when the event | |
16670 code got Unicode-ized -- bad interactions with OS sticky modifiers. | |
16671 Hold the shift key down and release it, then instead of affecting the | |
16672 next char only, it gets permanently stuck on (until you do a regular | |
16673 shift+char stroke). This needs to be debugged. | |
16674 | |
16675 Other things on agenda: | |
16676 | |
16677 @itemize | |
16678 @item | |
16679 go through and prioritize what's listed below. | |
16680 | |
16681 @item | |
16682 make sure the pdump code can compile and work. for the moment we | |
16683 just don't try to dump any Unicode tables and load them up each | |
16684 time. this is certainly fast but ... | |
16685 | |
16686 @item | |
16687 there's the problem that XEmacs can't be run in a directory with | |
16688 non-ASCII/Latin-1 chars in it, since it will be doing Unicode processing | |
16689 before we've had a chance to load the tables. In fact, even finding the | |
16690 tables in such a situation is problematic using the normal commands. my | |
16691 idea is to eventually load the stuff extremely extremely early, at the | |
16692 same time as the pdump data gets loaded. in fact, the unicode table | |
16693 data (stored in an efficient binary format) can even be stuck into the | |
16694 pdump file (which would mean as a resource to the executable, for | |
16695 windows). we'd need to extend pdump a bit: to allow for attaching extra | |
16696 data to the pdump file. (something like @code{pdump_attach_extra_data | |
16697 (addr, length)} returns a number of some sort, an index into the file, | |
16698 which you can then retrieve with @code{pdump_load_extra_data()}, which | |
16699 returns an addr (@code{mmap()}ed or loaded), and later you | |
16700 @code{pdump_unload_extra_data()} when finished. we'd probably also need | |
16701 @code{pdump_attach_extra_data_append()}, which appends data to the data | |
16702 just written out with @code{pdump_attach_extra_data()}. this way, | |
16703 multiple tables in memory can be written out into one contiguous | |
16704 table. (we'd use the tar-like trick of allowing new blocks to be written | |
16705 without going back to change the old blocks -- we just rely on the end | |
16706 of file/end of memory.) this same mechanism could be extracted out of | |
16707 pdump and used to handle the non-pdump situation (or alternatively, we | |
16708 could just dump either the memory image of the tables themselves or the | |
16709 compressed binary version). in the case of extra unicode tables not | |
16710 known about at compile time that get loaded before dumping, we either | |
16711 just dump them into the image (pdump and all) or extract them into the | |
16712 compressed binary format, free the original tables, and treat them like | |
16713 all other tables. | |
16714 | |
16715 @item | |
16716 @kbd{C-x b} when using a Russian keyboard layout. XEmacs currently | |
16717 tries to interpret @samp{C+cyrillic char}, which causes an error. We | |
16718 want @kbd{C-x b} to still work even when the keyboard normally generates | |
16719 Cyrillic. What we should do is expand the keyboard event structure so | |
16720 that it contains not only the actual char, but what the char would have | |
16721 been in various other keyboard layouts, and in contexts where only | |
16722 certain keystrokes make sense (creating control chars, and looking up in | |
16723 keymaps), we proceed in order, processing each of them until we get | |
16724 something. order should be something like: current keyboard layout; | |
16725 layout of the current language environment; layout of the user's default | |
16726 language; layout of the system default language; layout of US English. | |
16727 | |
16728 @item | |
16729 reading and writing Unicode files. multiple problems: | |
16730 | |
16731 @itemize | |
16732 @item | |
16733 EOL's aren't handled right. for the moment, just fix the | |
16734 Unicode coding systems; later on, create EOL-only coding | |
16735 systems: | |
16736 | |
16737 @enumerate | |
16738 @item | |
16739 they would be character->character and operate next to the | |
16740 internal data; this means that coding systems need to be able | |
16741 to handle ends of lines that are either CR, LF, or CRLF. | |
16742 usually this isn't a problem, as they are just characters | |
16743 like any other and get encoded appropriately. however, | |
16744 coding systems that are line-oriented need to recognize any | |
16745 of the three as line endings. | |
16746 | |
16747 @item | |
16748 we'd also have to complete the stuff that handles coding | |
16749 systems where either end can be byte or char (four | |
16750 possibilities total; use a single enum such as | |
16751 @code{ENCODES_CHAR_TO_BYTE}, @code{ENCODES_BYTE_TO_BYTE}, etc.). | |
16752 | |
16753 @item | |
16754 we'd need ways of specifying the chaining of coding systems. | |
16755 e.g. when reading a coding system, a user can specify more | |
16756 than one with a | symbol between them. when a context calls | |
16757 for a coding system and a chain is needed, the `chain' coding | |
16758 system is useful; but we should really expand the contexts | |
16759 where a list of coding systems can be given, and whenever | |
16760 possible try to inline the chain instead of using a | |
16761 surrounding @code{chain} coding system. | |
16762 | |
16763 @item | |
16764 the @code{chain} needs some work so that it passes all sorts of | |
16765 lstream commands down to the chain inside it -- it should be | |
16766 entirely transparent and the fact that there's actually a | |
16767 surrounding coding system should be invisible. more general | |
16768 coding system methods might need to be created. | |
16769 | |
16770 @item | |
16771 important: we need a way of specifying how detecting works | |
16772 when we have more than one coding system. we might need more | |
16773 than a single priority list. need to think about this. | |
16774 @end enumerate | |
16775 | |
16776 @item | |
16777 Unicode files beginning with the BOM are not recognized as such. | |
16778 we need to fix this; but to make things sensible, we really need | |
16779 to add the idea of different levels of confidence regarding | |
16780 what's detected. otherwise, Unicode says "yes this is me" but | |
16781 others higher up do too. in the process we should probably | |
16782 finish abstracting the detection system and fix up some | |
16783 stupidities in it. | |
16784 | |
16785 @item | |
16786 When writing a file, we need error detection; otherwise somebody | |
16787 will create a Unicode file without realizing the coding system | |
16788 of the buffer is Raw, and then lose all the non-ASCII/Latin-1 | |
16789 text when it's written out. We need two levels: | |
16790 | |
16791 @enumerate | |
16792 @item | |
16793 first, a "safe-charset" level that checks before any actual | |
16794 encoding to see if all characters in the document can safely | |
16795 be represented using the given coding system. FSF has a | |
16796 "safe-charset" property of coding systems, but it's stupid | |
16797 because this information can be automatically derived from | |
16798 the coding system, at least the vast majority of the time. | |
16799 What we need is some sort of | |
16800 alternative-coding-system-precedence-list, langenv-specific, | |
16801 where everything on it can be checked for safe charsets and | |
16802 then the user given a list of possibilities. When the user | |
16803 does "save with specified encoding", they should see the same | |
16804 precedence list. Again like with other precedence lists, | |
16805 there's also a global one, and presumably all coding systems | |
16806 not on the other lists get appended to the end (and perhaps not | |
16807 checked at all when doing safe-checking?). safe-checking | |
16808 should work something like this: compile a list of all | |
16809 charsets used in the buffer, along with a count of chars | |
16810 used. that way, "slightly unsafe" charsets can perhaps be | |
16811 presented at the end, which will lose only a few characters | |
16812 and are perhaps what the users were looking for. | |
16813 | |
16814 @item | |
16815 when actually writing out, we need error checking in case an | |
16816 individual char in a charset can't be written even though the | |
16817 charsets are safe. again, the user gets the choice of other | |
16818 reasonable coding systems. | |
16819 | |
16820 @item | |
16821 same thing (error checking, list of alternatives, etc.) needs | |
16822 to happen when reading! all of this will be a lot of work! | |
16823 @end enumerate | |
16824 @end itemize | |
16825 @end itemize | |
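
To make the "levels of confidence" idea for BOM detection concrete, here is a minimal C sketch. It is not code from the workspace: the enum names, @code{struct bom_result} and @code{detect_bom()} are hypothetical and exist only for illustration. The point is that a detector reports how sure it is rather than a bare yes/no, so that a Unicode detector that has seen a BOM can outrank a detector that merely did not object to the bytes.

```c
#include <stddef.h>

/* Hypothetical confidence levels; a real detector might want more. */
enum detect_confidence {
  DETECT_NONE = 0,        /* no evidence either way */
  DETECT_LIKELY,          /* evidence, but another reading is possible */
  DETECT_NEAR_CERTAIN     /* evidence that is very hard to explain otherwise */
};

enum bom_kind {
  BOM_NONE, BOM_UTF8, BOM_UTF16_LE, BOM_UTF16_BE, BOM_UCS4_LE, BOM_UCS4_BE
};

struct bom_result {
  enum bom_kind kind;
  enum detect_confidence confidence;
};

/* Inspect the first bytes of a file for a byte-order mark.
   The four-byte forms are checked first because FF FE 00 00 also
   begins with the two-byte UTF-16 little-endian mark. */
struct bom_result detect_bom(const unsigned char *p, size_t n)
{
  struct bom_result r = { BOM_NONE, DETECT_NONE };

  if (n >= 4 && p[0] == 0x00 && p[1] == 0x00 && p[2] == 0xFE && p[3] == 0xFF) {
    r.kind = BOM_UCS4_BE;
    r.confidence = DETECT_NEAR_CERTAIN;
  } else if (n >= 4 && p[0] == 0xFF && p[1] == 0xFE && p[2] == 0x00 && p[3] == 0x00) {
    /* Could also be a UTF-16 LE mark followed by U+0000, so hedge. */
    r.kind = BOM_UCS4_LE;
    r.confidence = DETECT_LIKELY;
  } else if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF) {
    r.kind = BOM_UTF8;
    r.confidence = DETECT_NEAR_CERTAIN;
  } else if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF) {
    r.kind = BOM_UTF16_BE;
    r.confidence = DETECT_NEAR_CERTAIN;
  } else if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE) {
    r.kind = BOM_UTF16_LE;
    r.confidence = DETECT_NEAR_CERTAIN;
  }
  return r;
}
```

A caller would collect one such result per candidate coding system and pick the highest confidence, falling back to the priority list only to break ties.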
16826 | |
16827 | |
16828 | |
16829 @heading Announcement, August 20, 2001: | |
16830 | |
16831 I'm looking for testers. There is a complete and fast implementation | |
16832 in C of Unicode conversion, translations for almost all of the | |
16833 standardly-defined charsets that load up automatically and | |
16834 instantaneously at runtime, coding systems supporting the common | |
16835 external representations of Unicode [utf-16, ucs-4, utf-8, | |
16836 little-endian versions of utf-16 and ucs-4; utf-7 is sitting there | |
16837 with abort[]s where the coding routines should go, just waiting for | |
16838 somebody to implement], and a nice set of primitives for translating | |
16839 characters<->codepoints and setting the priority lists used to control | |
16840 codepoint->char lookup. | |
16841 | |
16842 It's so far hooked into one place: the Windows IME. Currently I can | |
16843 select the Japanese IME from the thing on my tray pad in the lower | |
16844 right corner of the screen, and type Japanese into XEmacs, and you get | |
16845 Japanese in XEmacs -- regardless of whether you set either your | |
16846 current or global system locale to Japanese, and regardless of whether | |
16847 you set your XEmacs lang env as Japanese. This should work for many | |
16848 other languages, too -- Cyrillic, Chinese either Traditional or | |
16849 Simplified, and many others, but YMMV. There may be some lurking | |
16850 bugs (hardly surprising for something so raw). | |
16851 | |
16852 To get at this, checkout using `ben-mule-21-5', NOT the simpler | |
16853 `mule-21-5'. For example | |
16854 | |
16855 cvs -d :pserver:xemacs@@cvs.xemacs.org:/usr/CVSroot checkout -r ben-mule-21-5 xemacs | |
16856 | |
16857 or you get the idea. the `-r ben-mule-21-5' is important. | |
16858 | |
16859 I keep track of my progress in a file called README.ben-mule-21-5 in | |
16860 the root directory of the source tree. | |
16861 | |
16862 WARNING: Pdump might not work. Will be fixed rsn. | |
16863 | |
16864 @heading August 20, 2001 | |
16865 | |
16866 @itemize | |
16867 @item | |
16868 still need to sort out demand loading, binary format, etc. figure | |
16869 out what the goals are and how we're going to achieve them. for | |
16870 the moment let's just say that running XEmacs in a directory with | |
16871 Japanese or other weird characters in the name is likely to cause | |
16872 problems under MS Windows, but once XEmacs is initialized (and | |
16873 before processing init files), all Unicode support is there. | |
16874 | |
16875 @item | |
16876 wrote the size computation routines, although not yet tested. | |
16877 | |
16878 @item | |
16879 lots more abstraction of coding systems; almost done. | |
16880 | |
16881 @item | |
16882 UNICODE WORKS!!!!! | |
16883 @end itemize | |
16884 | |
16885 @heading August 19, 2001 | |
16886 | |
16887 Still needed on the Unicode support: | |
16888 | |
16889 @itemize | |
16890 @item | |
16891 demand loading: load the Unicode table data the first time a | |
16892 conversion needs to be done. | |
16893 | |
16894 @item | |
16895 maybe: table size computation: figure out how big the in-memory | |
16896 tables actually are. | |
16897 | |
16898 @item | |
16899 maybe: create a space-efficient binary format for the data, and a | |
16900 way to dump out an existing charset's data into this binary format. | |
16901 it should allow for many such groups of data to be appended | |
16902 together in one file, such that you can just append the new data | |
16903 onto the end and not have to go back and modify anything | |
16904 previously. (like how tar archives work, and how the UDF format for | |
16905 CD-R's and CD-RW's works.) | |
16906 | |
16907 @item | |
16908 maybe: figure out how to be able to access the Unicode tables at | |
16909 @code{init_intl()} time, before we know how to get at data-directory; | |
16910 that way we can handle the need for unicode conversions that come up | |
16911 very early, for example if XEmacs is run from a directory containing | |
16912 Japanese in it. Presumably we'd want to generalize the stuff in | |
16913 @file{pdump.c} that deals with the dumper file, so that it can handle | |
16914 other files -- putting the file either in the directory of the | |
16915 executable or in a resource, maybe actually attached to the pdump file | |
16916 itself -- or maybe we just dump the data into the actual executable. | |
16917 With pdump we could extend pdump to allow for data that's in the pdump | |
16918 file but not actually mapped at startup, separate from the data that | |
16919 does get mapped -- and then at runtime the pointer gets restored not | |
16920 with a real pointer but an offset into the file; another pdump call and | |
16921 we get some way to access the data. (tricky because it might be in a | |
16922 resource, not a file. we might have to just tell pdump to mmap or | |
16923 whatever the data in, and then tell pdump to release it.) | |
16924 | |
16925 @item | |
16926 fix multibyte to use unicode. at first, just reverse | |
16927 @code{mswindows-multibyte-to-unicode} to be @code{unicode-to-multibyte}; | |
16928 later implement something in chain to allow for reversal, for declaring | |
16929 the ends of the coding systems, etc. | |
16930 | |
16931 @item | |
16932 actually make sure that the IME stuff is working!!! | |
16933 @end itemize | |
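
The append-only binary format mentioned above can be sketched as follows. This is only an illustration under stated assumptions, not the format that was implemented: @code{struct block_header}, @code{append_block()} and @code{find_block()} are hypothetical names, the "file" is just a memory buffer, and the payload is opaque bytes rather than real table data. Each block carries its own length, so new blocks are written at the end without touching earlier ones, and a reader simply walks headers until it runs out of data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-block header: which charset the table belongs to,
   and how many payload bytes follow. */
struct block_header {
  uint32_t charset_id;
  uint32_t length;
};

/* Append one block at offset *used (which must not exceed cap).
   Returns 0 on success, -1 if the buffer is too small.
   Nothing already written is modified. */
int append_block(unsigned char *buf, size_t cap, size_t *used,
                 uint32_t charset_id, const void *data, uint32_t length)
{
  struct block_header h;

  h.charset_id = charset_id;
  h.length = length;
  if (cap - *used < sizeof h || cap - *used - sizeof h < length)
    return -1;
  memcpy(buf + *used, &h, sizeof h);
  memcpy(buf + *used + sizeof h, data, length);
  *used += sizeof h + length;
  return 0;
}

/* Walk the blocks from the start and return a pointer to the payload of
   the first block for charset_id, or NULL if there is none (or the data
   is truncated).  The end of the data is the only terminator. */
const unsigned char *find_block(const unsigned char *buf, size_t used,
                                uint32_t charset_id, uint32_t *length_out)
{
  size_t off = 0;

  while (used - off >= sizeof(struct block_header)) {
    struct block_header h;

    memcpy(&h, buf + off, sizeof h);
    off += sizeof h;
    if (used - off < h.length)
      return NULL;                 /* truncated final block */
    if (h.charset_id == charset_id) {
      *length_out = h.length;
      return buf + off;
    }
    off += h.length;
  }
  return NULL;
}
```

A real format would also have to fix the byte order of the header fields; the sketch writes them in host order for brevity.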
16934 | |
16935 Other things before announcing: | |
16936 | |
16937 @itemize | |
16938 @item | |
16939 change so that the Unicode tables are not pdumped. This means we need | |
16940 to free any table data out there. Make sure that pdump compiles and try | |
16941 to finish the pretty-much-already-done stuff already with | |
16942 @code{XD_STRUCT_ARRAY} and dynamic size computation; just need to see | |
16943 what's going on with @code{LO_LINK}. | |
16944 @end itemize | |
16945 | |
16946 @heading August 14, 2001 | |
16947 | |
16948 To do a diff between this workspace and the mainline, use the most recent sync tags, currently: | |
16949 | |
16950 @example | |
16951 cvs diff -r main-branch-ben-mule-21-5-aug-11-2001-sync -r ben-mule-21-5-post-aug-11-2001-sync | |
16952 @end example | |
16953 | |
16954 Unicode support: | |
16955 | |
16956 Unicode support is important for supporting many languages under | |
16957 Windows, such as Cyrillic, without resorting to translation tables for | |
16958 particular Windows-specific code pages. Internally, all characters in | |
16959 Windows can be represented in two encodings: code pages and Unicode. | |
16960 With Unicode support, we can seamlessly support all Windows | |
16961 characters. Currently, the test in the drive to support Unicode is whether | |
16962 IME input works properly, since it is being converted from Unicode. | |
16963 | |
16964 Unicode support also requires that the various Windows API's be | |
16965 "Unicode-encapsulated", so that they automatically call the ANSI or | |
16966 Unicode version of the API call appropriately and handle the size | |
16967 differences in structures. What this means is: | |
16968 | |
16969 @itemize | |
16970 @item | |
16971 first, note that Windows already provides a sort of encapsulation | |
16972 of all API's that deal with text. All such API's are underlyingly | |
16973 provided in two versions, with an A or W suffix (ANSI or "wide" | |
16974 i.e. Unicode), and the compile-time constant UNICODE controls which | |
16975 is selected by the unsuffixed API. Same thing happens with | |
16976 structures. Unfortunately, this is compile-time only, not | |
16977 run-time, so not sufficient. (Creating the necessary run-time | |
16978 encapsulation is not conceptually difficult, but very time-consuming to | |
16979 write. It adds no significant overhead, and the only reason it's | |
16980 not standard in Windows is conscious marketing attempts by | |
16981 Microsoft to cripple Windows 95. FUCK MICROSOFT! They even | |
16982 describe in a KnowledgeBase article exactly how to create such an | |
16983 API [although we don't exactly follow their procedure], and point | |
16984 out its usefulness; the procedure is also described more generally | |
16985 in Nadine Kano's book on Win32 internationalization -- written SIX | |
16986 YEARS AGO! Obviously Microsoft has such an API available | |
16987 internally.) | |
16988 | |
16989 @item | |
16990 what we do is provide an encapsulation of each standard Windows API | |
16991 call that is split into A and W versions. current theory is to | |
16992 avoid all preprocessor games; so we name the function with a prefix | |
16993 -- "qxe" currently -- and require callers to use the prefixed name. | |
16994 Callers need to explicitly use the W version of all structures, and | |
16995 convert text themselves using @code{Qmswindows_tstr}. the qxe | |
16996 encapsulated version will automatically call the appropriate A or W | |
16997 version depending on whether we're running on 9x or NT, and copy | |
16998 data between W and A versions of the structures as necessary. | |
16999 | |
17000 @item | |
17001 We require the caller to handle the actual translation of text to | |
17002 avoid possible overflow when dealing with fixed-size Windows | |
17003 structures. There are no such problems when copying data between | |
17004 the A and W versions because ANSI text is never larger than its | |
17005 equivalent Unicode representation. | |
17006 | |
17007 @item | |
17008 We allow for incremental creation of the encapsulated routines by using | |
17009 the coding system @code{Qmswindows_tstr_notyet}. This is an alias for | |
17010 @code{Qmswindows_multibyte}, i.e. it always converts to ANSI; but it | |
17011 indicates that it will be changed to @code{Qmswindows_tstr} when we have | |
17012 a qxe version of the API call that the data is being passed to and | |
17013 change the code to use the new function. | |
17014 @end itemize | |
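
The shape of such an encapsulated call can be sketched in portable C. This is a mock, not the real thing: it does not include any Windows header, and @code{mock_unicode_p}, @code{mock_TextLengthA()}, @code{mock_TextLengthW()} and @code{qxe_mock_TextLength()} are hypothetical stand-ins for a run-time "are we on NT?" flag, an A/W API pair, and its prefixed wrapper. The caller has already converted its text to the platform's form (what @code{Qmswindows_tstr} is for), so the wrapper only has to pick the right underlying function.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical run-time flag: nonzero when the wide (W) entry points
   should be used, i.e. when running on NT rather than 9x.  Real code
   would set this once at startup. */
int mock_unicode_p = 1;

/* Stand-ins for the A and W versions of one text-taking API. */
static size_t mock_TextLengthA(const char *s)
{
  return strlen(s);
}

static size_t mock_TextLengthW(const wchar_t *s)
{
  return wcslen(s);
}

/* The encapsulated version.  It takes an untyped pointer because the
   text is either ANSI or wide depending on the platform; the decision
   is made at run time, not by the compile-time UNICODE constant. */
size_t qxe_mock_TextLength(const void *tstr)
{
  if (mock_unicode_p)
    return mock_TextLengthW((const wchar_t *) tstr);
  return mock_TextLengthA((const char *) tstr);
}
```

For an API that takes a structure, the same wrapper would additionally copy fields between the W layout the caller filled in and the A layout the 9x entry point expects.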
17015 | |
17016 Besides creating the encapsulation, the following needs to be done for | |
17017 Unicode support: | |
17018 | |
17019 @itemize | |
17020 @item | |
17021 No actual translation tables are fed into XEmacs. We need to | |
17022 provide glue code to read the tables in @file{etc/unicode}. See | |
17023 @file{etc/unicode/README} for the interface to implement. | |
17024 | |
17025 @item | |
17026 Fix pdump. The translation tables for Unicode characters function as | |
17027 unions of structures with different numbers of indirection levels, in | |
17028 order to be efficient. pdump doesn't yet support such unions. | |
17029 @file{charset.h} has a general description of how the translation tables | |
17030 work, and the pdump code has constants added for the new required data | |
17031 types, and descriptions of how these should work. | |
17032 | |
17033 @item | |
17034 ultimately, there's no end to additional work (composition, bidi | |
17035 reordering, glyph shaping/ordering, etc.), but the above is enough | |
17036 to get basic translation working. | |
17037 @end itemize | |
17038 | |
17039 Merging this workspace into the trunk requires some work. ChangeLogs | |
17040 have not yet been created. Also, there is a lot of additional code in | |
17041 this workspace other than just Windows and Unicode stuff. Some of the | |
17042 changes have been somewhat disruptive to the code base, in particular: | |
17043 | |
17044 @itemize | |
17045 @item | |
17046 the code that handles the details of processing multilingual text has | |
17047 been consolidated to make it easier to extend it. it has been yanked | |
17048 out of various files (@file{buffer.h}, @file{mule-charset.h}, | |
17049 @file{lisp.h}, @file{insdel.c}, @file{fns.c}, @file{file-coding.c}, | |
17050 etc.) and put into @file{text.c} and @file{text.h}. | |
17051 @file{mule-charset.h} has also been renamed @file{charset.h}. all long | |
17052 comments concerning the representations and their processing have been | |
17053 consolidated into @file{text.c}. | |
17054 | |
17055 @item | |
17056 @file{nt/config.h} has been eliminated and everything in it merged into | |
17057 @file{config.h.in} and @file{s/windowsnt.h}. see @file{config.h.in} for | |
17058 more info. | |
17059 | |
17060 @item | |
17061 @file{s/windowsnt.h} has been completely rewritten, and | |
17062 @file{s/cygwin32.h} and @file{s/mingw32.h} have been largely rewritten. | |
17063 tons of dead weight has been removed, and stuff common to more than one | |
17064 file has been isolated into @file{s/win32-common.h} and | |
17065 @file{s/win32-native.h}, similar to what's already done for usg | |
17066 variants. | |
17067 | |
17068 @item | |
17069 large amounts of code throughout the code base have been Mule-ized, | |
17070 not just Windows code. | |
17071 | |
17072 @item | |
17073 @file{file-coding.c/.h} have been largely rewritten (although still | |
17074 mostly syncable); see below. | |
17075 @end itemize | |
17076 | |
17077 | |
17078 @heading June 26, 2001 | |
17079 | |
17080 ben-mule-21-5 | |
17081 | |
17082 this contains all the mule work i've been doing. this includes mostly | |
17083 work done to get mule working under ms windows, but in the process | |
17084 i've [of course] fixed a whole lot of other things as well, mostly | |
17085 mule issues. the specifics: | |
17086 | |
17087 @itemize | |
17088 @item | |
17089 it compiles and runs under windows and should basically work. the | |
17090 stuff remaining to do is (a) improved unicode support (see below) | |
17091 and (b) smarter handling of keyboard layouts. in particular, it | |
17092 should (1) set the right keyboard layout when you change your | |
17093 language environment; (2) optionally (a user var) set the | |
17094 appropriate keyboard layout as you move the cursor into text in a | |
17095 particular language. | |
17096 | |
17097 @item | |
17098 i added a bunch of code to better support OS locales. it tries to | |
17099 notice your locale at startup and set the language environment | |
17100 accordingly (this more or less works), and call setlocale() and set | |
17101 LANG when you change the language environment (may or may not work). | |
17102 | |
17103 @item | |
17104 major rewriting of file-coding. it's mostly abstracted into coding | |
17105 systems that are defined by methods (similar to devices and | |
17106 specifiers), with the ultimate aim being to allow non-i18n coding | |
17107 systems such as gzip. there is a "chain" coding system that allows | |
17108 multiple coding systems to be chained together. (it doesn't yet | |
17109 have the concept that either end of a coding system can be bytes or | |
17110 chars; this needs to be added.) | |
17111 | |
17112 @item | |
17113 unicode support. very raw. a few days ago i wrote a complete and | |
17114 efficient implementation of unicode translation. it should be very | |
17115 fast, and fairly memory-efficient in its tables. it allows for | |
17116 charset priority lists, which should be language-environment | |
17117 specific (but i haven't yet written the glue code). it works in | |
17118 preliminary testing, but obviously needs more testing and work. | |
17119 as of yet there is no translation data added for the standard charsets. | |
17120 the tables are in etc/unicode, and all we need is a bit of glue code | |
17121 to process them. see etc/unicode/README for the interface to | |
17122 implement. | |
17123 | |
17124 @item | |
17125 support for unicode in windows is partly there. this will work even | |
17126 on windows 95. the basic model is implemented but it needs finishing | |
17127 up. | |
17128 | |
17129 @item | |
17130 there is a preliminary implementation of windows ime support courtesy | |
17131 of ikeyama. | |
17132 | |
17133 @item | |
17134 if you want to get cyrillic working under windows (it appears to "work" | |
17135 but the wrong chars currently appear), the best way is to add unicode | |
17136 support for iso-8859-5 and use it in redisplay-msw.c. we are already | |
17137 passing unicode codepoints to the text-draw routine (ExtTextOutW). | |
17138 (ExtTextOutW and GetTextExtentPoint32W are implemented on both 95 and NT.) | |
17139 | |
17140 @item | |
17141 i fixed the iso2022 handling so it will correctly read in files | |
17142 containing unknown charsets, creating a "temporary" charset which can | |
17143 later be overwritten by the real charset when it's defined. this allows | |
17144 iso2022 elisp files with literals in strange languages to compile | |
17145 correctly under mule. i also added a hack that will correctly read in | |
17146 and write out the emacs-specific "composition" escape sequences, | |
17147 i.e. @samp{ESC 0} through @samp{ESC 4}. this means that my workspace correctly | |
17148 compiles the new file @file{devanagari.el} that i added (see below). | |
17149 | |
17150 @item | |
17151 i copied the remaining language-specific files from fsf. i made | |
17152 some minor changes in certain cases but for the most part the stuff | |
17153 was just copied and may not work. | |
17154 | |
17155 @item | |
17156 i fixed @code{post-read-conversion} in coding systems to follow fsf | |
17157 conventions. (i also support our convention, for the moment. a | |
17158 kludge, of course.) | |
17159 | |
17160 @item | |
17161 @code{make-coding-system} accepts (but ignores) the additional properties | |
17162 present in the fsf version, for compatibility. | |
17163 @end itemize | |
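
The idea behind the "chain" coding system mentioned above can be illustrated with a small C sketch. It is a simplification under loud assumptions: the real coding systems are method-based objects working on lstreams, whereas here each stage is a plain function over a byte buffer, UTF-8 stands in for the internal text format, and all names (@code{convert_fn}, @code{latin1_to_utf8()}, @code{crlf_to_lf()}, @code{run_chain()}, @code{CHAIN_BUF}) are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* One stage of a chain: convert n input bytes into at most cap output
   bytes and return the number of bytes produced. */
typedef size_t (*convert_fn)(const unsigned char *in, size_t n,
                             unsigned char *out, size_t cap);

/* Decode Latin-1 into UTF-8 (used here as a stand-in internal format). */
size_t latin1_to_utf8(const unsigned char *in, size_t n,
                      unsigned char *out, size_t cap)
{
  size_t i, o = 0;

  for (i = 0; i < n; i++) {
    if (in[i] < 0x80) {
      if (o + 1 > cap)
        break;
      out[o++] = in[i];
    } else {
      if (o + 2 > cap)
        break;
      out[o++] = (unsigned char) (0xC0 | (in[i] >> 6));
      out[o++] = (unsigned char) (0x80 | (in[i] & 0x3F));
    }
  }
  return o;
}

/* EOL stage: turn each CR LF pair into a single LF. */
size_t crlf_to_lf(const unsigned char *in, size_t n,
                  unsigned char *out, size_t cap)
{
  size_t i, o = 0;

  for (i = 0; i < n && o < cap; i++) {
    if (in[i] == '\r' && i + 1 < n && in[i + 1] == '\n')
      continue;                    /* drop the CR of a CR LF pair */
    out[o++] = in[i];
  }
  return o;
}

#define CHAIN_BUF 256

/* Run the stages in order, feeding each stage's output to the next.
   Two scratch buffers are swapped between stages; input longer than
   CHAIN_BUF is truncated in this sketch. */
size_t run_chain(const convert_fn *chain, size_t count,
                 const unsigned char *in, size_t n,
                 unsigned char *out, size_t cap)
{
  unsigned char a[CHAIN_BUF], b[CHAIN_BUF];
  unsigned char *src = a, *dst = b, *tmp;
  size_t i, len;

  len = n < CHAIN_BUF ? n : CHAIN_BUF;
  memcpy(src, in, len);
  for (i = 0; i < count; i++) {
    len = chain[i](src, len, dst, CHAIN_BUF);
    tmp = src;
    src = dst;
    dst = tmp;
  }
  if (len > cap)
    len = cap;
  memcpy(out, src, len);
  return len;
}
```

With the chain @code{latin1_to_utf8} followed by @code{crlf_to_lf}, the four input bytes @code{'a'}, CR, LF, 0xE9 come out as @code{'a'}, LF, 0xC3, 0xA9. Because the caller only sees @code{run_chain()}, the individual stages stay invisible, which is the transparency the chain is supposed to provide.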
17164 | |
14162 | 17165 |
14163 | 17166 |
14164 @node Consoles; Devices; Frames; Windows, The Redisplay Mechanism, Multilingual Support, Top | 17167 @node Consoles; Devices; Frames; Windows, The Redisplay Mechanism, Multilingual Support, Top |
14165 @chapter Consoles; Devices; Frames; Windows | 17168 @chapter Consoles; Devices; Frames; Windows |
14166 @cindex consoles; devices; frames; windows | 17169 @cindex consoles; devices; frames; windows |
17398 @item tty_name | 20401 @item tty_name |
17399 The name of the terminal that the subprocess is using, | 20402 The name of the terminal that the subprocess is using, |
17400 or @code{nil} if it is using pipes. | 20403 or @code{nil} if it is using pipes. |
17401 @end table | 20404 @end table |
17402 | 20405 |
20406 @menu | |
20407 * Ben's separate stderr notes:: Probably obsolete. | |
20408 @end menu | |
20409 | |
20410 | |
20411 @node Ben's separate stderr notes, , , Subprocesses | |
20412 @subsection Ben's separate stderr notes (probably obsolete) | |
20413 | |
20414 This node contains some notes that Ben kept on his separate subprocess | |
20415 workspace. These notes probably describe changes and features that have | |
20416 already been included in XEmacs 21.5; somebody should check and/or ask | |
20417 Ben. | |
20418 | |
20419 @heading ben-separate-stderr-improved-error-trapping | |
20420 | |
20421 this is an old workspace, very close to being done, containing | |
20422 | |
20423 @itemize | |
20424 @item | |
20425 subprocess stderr output can be read separately; needed to fully | |
20426 implement call-process with asynch. subprocesses. | |
20427 | |
20428 @item | |
20429 huge improvements to the internal error-trapping routines (i.e. the | |
20430 routines that call Lisp code and trap errors); Lisp code can now be | |
20431 called from within redisplay. | |
20432 | |
20433 @item | |
20434 cleanup and simplification of C-g handling; some things work now | |
20435 that never used to. | |
20436 | |
20437 @item | |
20438 see the ChangeLogs in the workspace. | |
20439 @end itemize | |
20440 | |
20441 | |
17403 @node Interface to MS Windows, Interface to the X Window System, Subprocesses, Top | 20442 @node Interface to MS Windows, Interface to the X Window System, Subprocesses, Top |
17404 @chapter Interface to MS Windows | 20443 @chapter Interface to MS Windows |
17405 @cindex MS Windows, interface to | 20444 @cindex MS Windows, interface to |
17406 @cindex Windows, interface to | 20445 @cindex Windows, interface to |
17407 | 20446 |
17408 @menu | 20447 @menu |
17409 * Different kinds of Windows environments:: | 20448 * Different kinds of Windows environments:: |
17410 * Windows Build Flags:: | 20449 * Windows Build Flags:: |
17411 * Windows I18N Introduction:: | 20450 * Windows I18N Introduction:: |
17412 * Modules for Interfacing with MS Windows:: | 20451 * Modules for Interfacing with MS Windows:: |
20452 * CHANGES from 21.4-windows branch:: Probably obsolete. | |
17413 @end menu | 20453 @end menu |
17414 | 20454 |
17415 @node Different kinds of Windows environments, Windows Build Flags, Interface to MS Windows, Interface to MS Windows | 20455 @node Different kinds of Windows environments, Windows Build Flags, Interface to MS Windows, Interface to MS Windows |
17416 @section Different kinds of Windows environments | 20456 @section Different kinds of Windows environments |
17417 @cindex different kinds of Windows environments | 20457 @cindex different kinds of Windows environments |
17873 definition with a call to the macro XETEXT. This appropriately makes a | 20913 definition with a call to the macro XETEXT. This appropriately makes a |
17874 string of either regular or wide chars, which is to say this string may be | 20914 string of either regular or wide chars, which is to say this string may be |
17875 prepended with an L (causing it to be a wide string) depending on | 20915 prepended with an L (causing it to be a wide string) depending on |
17876 XEUNICODE_P. | 20916 XEUNICODE_P. |
17877 | 20917 |
17878 @node Modules for Interfacing with MS Windows, , Windows I18N Introduction, Interface to MS Windows | 20918 @node Modules for Interfacing with MS Windows, CHANGES from 21.4-windows branch, Windows I18N Introduction, Interface to MS Windows |
17879 @section Modules for Interfacing with MS Windows | 20919 @section Modules for Interfacing with MS Windows |
17880 @cindex modules for interfacing with MS Windows | 20920 @cindex modules for interfacing with MS Windows |
17881 @cindex interfacing with MS Windows, modules for | 20921 @cindex interfacing with MS Windows, modules for |
17882 @cindex MS Windows, modules for interfacing with | 20922 @cindex MS Windows, modules for interfacing with |
17883 @cindex Windows, modules for interfacing with | 20923 @cindex Windows, modules for interfacing with |
17934 @item intl-auto-encap-win32.c | 20974 @item intl-auto-encap-win32.c |
17935 Auto-generated Unicode encapsulation functions | 20975 Auto-generated Unicode encapsulation functions |
17936 @item intl-auto-encap-win32.h | 20976 @item intl-auto-encap-win32.h |
17937 Auto-generated Unicode encapsulation headers | 20977 Auto-generated Unicode encapsulation headers |
17938 @end table | 20978 @end table |
20979 | |
20980 | |
20981 @node CHANGES from 21.4-windows branch, , Modules for Interfacing with MS Windows, Interface to MS Windows | |
20982 @section CHANGES from 21.4-windows branch (probably obsolete) | |
20983 | |
20984 This node contains the @file{CHANGES-msw} log that Andy Piper kept while | |
20985 he was maintaining the Windows branch of 21.4. These changes have | |
20986 (presumably) long since been merged to both 21.4 and 21.5, but let's not | |
20987 throw the list away yet. | |
20988 | |
20989 @heading CHANGES-msw | |
20990 | |
20991 This file briefly describes all mswindows-specific changes to XEmacs | |
20992 in the OXYMORON series of releases. The mswindows release branch | |
20993 contains additional changes on top of the mainline XEmacs | |
20994 release. These changes are deemed necessary for XEmacs to be fully | |
20995 functional under mswindows. It is not intended that these changes | |
20996 cause problems on UNIX systems, but they have not been tested on UNIX | |
20997 platforms. Caveat Emptor. | |
20998 | |
20999 See the file @file{CHANGES-release} for a full list of mainline changes. | |
21000 | |
21001 @heading to XEmacs 21.4.9 "Informed Management (Windows)" | |
21002 | |
21003 @itemize | |
21004 @item | |
21005 Fix layout of widgets so that the search dialog works. | |
21006 | |
21007 @item | |
21008 Fix focus capture of widgets under X. | |
21009 @end itemize | |
21010 | |
21011 @heading to XEmacs 21.4.8 "Honest Recruiter (Windows)" | |
21012 | |
21013 @itemize | |
21014 @item | |
21015 All changes from 21.4.6 and 21.4.7. | |
21016 | |
21017 @item | |
21018 Make sure revert temporaries are not visiting files. Suggested by | |
21019 Mike Alexander. | |
21020 | |
21021 @item | |
21022 File renaming fix from Mathias Grimmberger. | |
21023 | |
21024 @item | |
21025 Fix printer metrics on windows 95 from Jonathan Harris. | |
21026 | |
21027 @item | |
21028 Fix layout of widgets so that the search dialog works. | |
21029 | |
21030 @item | |
21031 Fix focus capture of widgets under X. | |
21032 | |
21033 @item | |
21034 Buffers tab doc fixes from John Palmieri. | |
21035 | |
21036 @item | |
21037 Sync with FSF custom @code{:set-after} behavior. | |
21038 | |
21039 @item | |
21040 Virtual window manager freeze fix from Rick Rankin. | |
21041 | |
21042 @item | |
21043 Fix various printing problems. | |
21044 | |
21045 @item | |
21046 Enable windows printing on cygwin. | |
21047 @end itemize | |
21048 | |
21049 @heading to XEmacs 21.4.7 "Economic Science (Windows)" | |
21050 | |
21051 @itemize | |
21052 @item | |
21053 All changes from 21.4.6. | |
21054 | |
21055 @item | |
21056 Fix problems with auto-revert with noconfirm. | |
21057 | |
21058 @item | |
21059 Undo autoconf 2.5x changes. | |
21060 | |
21061 @item | |
21062 Undo 21.4.7 process change. | |
21063 @end itemize | |
21064 | |
21065 @heading to XEmacs 21.4.6 "Common Lisp (Windows)" | |
21066 | |
21067 @itemize | |
21068 @item | |
21069 Made native registry entries match the installer. | |
21070 | |
21071 @item | |
21072 Fixed mousewheel lockups. | |
21073 | |
21074 @item | |
21075 Frame iconification fix from Adrian Aichner. | |
21076 | |
21077 @item | |
21078 Fixed some printing problems. | |
21079 | |
21080 @item | |
21081 Netinstaller updated to support kit revisions. | |
21082 | |
21083 @item | |
21084 Fixed customize popup menus. | |
21085 | |
21086 @item | |
21087 Fixed problems with too many dialog popups. | |
21088 | |
21089 @item | |
21090 Netinstaller fixed to correctly upgrade shortcuts when upgrading | |
21091 core XEmacs. | |
21092 | |
21093 @item | |
21094 Fix for virtual window managers from Adrian Aichner. | |
21095 | |
21096 @item | |
21097 Installer registers all C++ file types. | |
21098 | |
21099 @item | |
21100 Short-filename fix from Peter Arius. | |
21101 | |
21102 @item | |
21103 Fix for GC assertions from Adrian Aichner. | |
21104 | |
21105 @item | |
21106 Winclient DDE client from Alastair Houghton. | |
21107 | |
21108 @item | |
21109 Fix event assert from Mike Alexander. | |
21110 | |
21111 @item | |
21112 Warning removal noticed by Ben Wing. | |
21113 | |
21114 @item | |
21115 Redisplay glyph height fix from Ben Wing. | |
21116 | |
21117 @item | |
21118 Printer margin fix from Jonathan Harris. | |
21119 | |
21120 @item | |
21121 Error dialog fix suggested by Thomas Vogler. | |
21122 | |
21123 @item | |
21124 Fixed revert-buffer to not revert in the case that there is | |
21125 nothing to be done. | |
21126 | |
21127 @item | |
21128 Glyph-baseline fix from Nix. | |
21129 | |
21130 @item | |
21131 Fixed clipping of wide glyphs in non-zero-length extents. | |
21132 | |
21133 @item | |
21134 Windows build fixes. | |
21135 | |
21136 @item | |
21137 Fixed @code{:initial-focus} so that it works. | |
21138 @end itemize | |
21139 | |
21140 @heading to XEmacs 21.4.5 "Civil Service (Windows)" | |
21141 | |
21142 @itemize | |
21143 @item | |
21144 Fixed a scrollbar problem when selecting the frame with focus. | |
21145 | |
21146 @item | |
21147 Fixed @code{mswindows-shell-execute} under cygwin. | |
21148 | |
21149 @item | |
21150 Added a new function @code{mswindows-cygwin-to-win32-path} for JDE. | |
21151 | |
21152 @item | |
21153 Added support for dialog-based directory selection. | |
21154 | |
21155 @item | |
21156 The installer version has been updated to the 21.5 netinstaller. The 21.5 | |
21157 installer now does proper DDE file association and adds uninstall | |
21158 capability. | |
21159 | |
21160 @item | |
21161 Handle leak fix from Mike Alexander. | |
21162 | |
21163 @item | |
21164 New release build script. | |
21165 @end itemize | |
21166 | |
21167 | |
17939 | 21168 |
17940 @node Interface to the X Window System, Dumping, Interface to MS Windows, Top | 21169 @node Interface to the X Window System, Dumping, Interface to MS Windows, Top |
17941 @chapter Interface to the X Window System | 21170 @chapter Interface to the X Window System |
17942 @cindex X Window System, interface to the | 21171 @cindex X Window System, interface to the |
17943 | 21172 |